The Heart of DolphinScheduler: In-Depth Analysis of the Quartz Scheduling Framework

by William Guo, November 20th, 2024

Too Long; Didn't Read

Delve into Quartz, the mighty open-source Java framework for scheduling tasks, and its dynamic partnership with DolphinScheduler through QuartzExecutorImpl. Find out how they work together to orchestrate workflows and manage timings in our in-depth exploration.

Quartz is an open-source Java job scheduling framework that provides powerful capabilities for scheduling tasks. In DolphinScheduler, Quartz is used to implement task scheduling and management. DolphinScheduler integrates with Quartz through the QuartzExecutorImpl class, combining workflow and schedule management operations with Quartz's scheduling framework to achieve task execution.


This article provides a detailed analysis of Quartz's principles and implementation within DolphinScheduler.


Quartz Entity-Relationship Diagram


  1. QRTZ_JOB_DETAILS and QRTZ_TRIGGERS are the central tables, defining the relationship between jobs and triggers.


  2. The QRTZ_TRIGGERS table links with multiple trigger-type tables, such as QRTZ_SIMPLE_TRIGGERS and QRTZ_CRON_TRIGGERS, to enable different trigger mechanisms.


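  3. QRTZ_FIRED_TRIGGERS records triggers that have fired and whose jobs are still executing, referencing both the job and the trigger. Rows are removed once execution completes, so it is not a long-term history table.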


  4. QRTZ_CALENDARS defines calendar exclusion rules for triggers, while QRTZ_PAUSED_TRIGGER_GRPS manages the pause state of trigger groups.


  5. QRTZ_SCHEDULER_STATE and QRTZ_LOCKS are used for scheduling coordination in clustered environments, ensuring high availability.
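To make the first two relationships concrete, here is a minimal in-memory sketch in which each map stands in for one table and lookups follow the foreign keys. The column names in the comments come from the Quartz 2.x schema; the class `QuartzTablesSketch`, its methods, and the sample values (`sched`, `job_1`, `jobgroup_1`) are hypothetical and only illustrate the shape of the data.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: three maps stand in for three Quartz tables.
final class QuartzTablesSketch {
    // QRTZ_JOB_DETAILS: (SCHED_NAME, JOB_NAME, JOB_GROUP) -> JOB_CLASS_NAME
    final Map<List<String>, String> jobDetails = new HashMap<>();
    // QRTZ_TRIGGERS: (SCHED_NAME, TRIGGER_NAME, TRIGGER_GROUP) -> {JOB_NAME, JOB_GROUP, TRIGGER_TYPE}
    final Map<List<String>, String[]> triggers = new HashMap<>();
    // QRTZ_CRON_TRIGGERS: same key as QRTZ_TRIGGERS -> CRON_EXPRESSION
    final Map<List<String>, String> cronTriggers = new HashMap<>();

    // Builds the three-column composite key shared by these tables.
    static List<String> key(String sched, String name, String group) {
        return Arrays.asList(sched, name, group);
    }

    // Follows both foreign keys: trigger -> job, and trigger -> its cron row.
    String resolve(String sched, String triggerName, String triggerGroup) {
        List<String> triggerKey = key(sched, triggerName, triggerGroup);
        String[] trigger = triggers.get(triggerKey);
        if (trigger == null) {
            return null; // no such trigger
        }
        String jobClass = jobDetails.get(key(sched, trigger[0], trigger[1]));
        String cron = "CRON".equals(trigger[2]) ? cronTriggers.get(triggerKey) : null;
        return jobClass + " @ " + cron;
    }

    public static void main(String[] args) {
        QuartzTablesSketch db = new QuartzTablesSketch();
        db.jobDetails.put(key("sched", "job_1", "jobgroup_1"), "ProcessScheduleTask");
        db.triggers.put(key("sched", "job_1", "jobgroup_1"),
                new String[] {"job_1", "jobgroup_1", "CRON"});
        db.cronTriggers.put(key("sched", "job_1", "jobgroup_1"), "0 0 * * * ?");
        System.out.println(db.resolve("sched", "job_1", "jobgroup_1"));
    }
}
```

The trigger row carries the job's name and group, while the cron row reuses the trigger's own key. That is why a base trigger row plus one type-specific row is enough to describe a schedule.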


Quartz in DolphinScheduler

This article focuses on how Quartz works inside DolphinScheduler, so the usage steps are covered only briefly. There are four of them:


  • Creating a New SHELL Task
  • Defining and Configuring Workflow Scheduling
  • Scheduling Activation
  • Workflow Instance Execution

Principle Analysis

Creating a Schedule

org.apache.dolphinscheduler.api.controller.SchedulerController#createSchedule  
--org.apache.dolphinscheduler.api.service.impl.SchedulerServiceImpl#insertSchedule  
...  
Schedule scheduleObj = new Schedule();  
Date now = new Date();  
scheduleObj.setTenantCode(tenantCode);  
scheduleObj.setProjectName(project.getName());  
scheduleObj.setProcessDefinitionCode(processDefineCode);  
scheduleObj.setProcessDefinitionName(processDefinition.getName());  
ScheduleParam scheduleParam = JSONUtils.parseObject(schedule, ScheduleParam.class);  
scheduleObj.setCrontab(scheduleParam.getCrontab());  
scheduleObj.setTimezoneId(scheduleParam.getTimezoneId());  
scheduleObj.setWarningType(warningType);  
scheduleObj.setWarningGroupId(warningGroupId);  
scheduleObj.setFailureStrategy(failureStrategy);  
scheduleObj.setCreateTime(now);  
scheduleObj.setUpdateTime(now);  
scheduleObj.setUserId(loginUser.getId());  
scheduleObj.setUserName(loginUser.getUserName());  
scheduleObj.setReleaseState(ReleaseState.OFFLINE);  
scheduleObj.setProcessInstancePriority(processInstancePriority);  
scheduleObj.setWorkerGroup(workerGroup);  
scheduleObj.setEnvironmentCode(environmentCode);  
scheduleMapper.insert(scheduleObj);  
...

At its core, this operation inserts a new row into the schedule table. Note that the schedule is created with ReleaseState.OFFLINE, so nothing is registered with Quartz yet.

Activating a Schedule

org.apache.dolphinscheduler.api.controller.SchedulerController#publishScheduleOnline  
--org.apache.dolphinscheduler.api.service.impl.SchedulerServiceImpl#onlineScheduler  
----org.apache.dolphinscheduler.api.service.impl.SchedulerServiceImpl#doOnlineScheduler  
------org.apache.dolphinscheduler.scheduler.quartz.QuartzScheduler#insertOrUpdateScheduleTask  

// Simplified code:  
JobKey jobKey = QuartzTaskUtils.getJobKey(schedule.getId(), projectId);  
Map<String, Object> jobDataMap = QuartzTaskUtils.buildDataMap(projectId, schedule);  
String cronExpression = schedule.getCrontab();  
String timezoneId = schedule.getTimezoneId();  

Date startDate = DateUtils.transformTimezoneDate(schedule.getStartTime(), timezoneId);  
Date endDate = DateUtils.transformTimezoneDate(schedule.getEndTime(), timezoneId);  
JobDetail jobDetail = newJob(ProcessScheduleTask.class).withIdentity(jobKey).build();  
jobDetail.getJobDataMap().putAll(jobDataMap);  
scheduler.addJob(jobDetail, false, true);  

TriggerKey triggerKey = new TriggerKey(jobKey.getName(), jobKey.getGroup());  
CronTrigger cronTrigger = newTrigger()  
                    .withIdentity(triggerKey)  
                    .startAt(startDate)  
                    .endAt(endDate)  
                    .withSchedule(  
                            cronSchedule(cronExpression)  
                                    .withMisfireHandlingInstructionIgnoreMisfires()  
                                    .inTimeZone(DateUtils.getTimezone(timezoneId)))  
                    .forJob(jobDetail).build();  

scheduler.scheduleJob(cronTrigger);  
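One detail in the code above is easy to miss: the start and end dates pass through DateUtils.transformTimezoneDate before they reach the trigger. The sketch below shows one plausible reading of that step, assuming the helper keeps the wall-clock fields of the stored date and reattaches them to the schedule's time zone. `WallClockShift` and `reinterpret` are hypothetical names, and this is not a copy of DolphinScheduler's implementation.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.util.Date;

// Hypothetical sketch of a wall-clock-preserving time zone change.
final class WallClockShift {

    // Reads the date's wall-clock fields in sourceZone, then interprets
    // those same fields as a local time in targetZone.
    static Date reinterpret(Date date, ZoneId sourceZone, ZoneId targetZone) {
        if (date == null) {
            return null; // a schedule may have no start or end date
        }
        LocalDateTime wallClock = LocalDateTime.ofInstant(date.toInstant(), sourceZone);
        return Date.from(wallClock.atZone(targetZone).toInstant());
    }

    public static void main(String[] args) {
        // Stored as midnight on a UTC server...
        Date stored = Date.from(Instant.parse("2024-01-01T00:00:00Z"));
        // ...but the user meant midnight in Shanghai (UTC+8).
        Date start = reinterpret(stored, ZoneId.of("UTC"), ZoneId.of("Asia/Shanghai"));
        System.out.println(start.toInstant()); // 2023-12-31T16:00:00Z
    }
}
```

Under that assumption, "start at 00:00" means midnight in the schedule's own time zone rather than midnight on whichever server stored the row. This matches the trigger's cron schedule, which is also evaluated with inTimeZone.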

Activating the schedule writes rows to three Quartz tables:

  • Job Details Table (QRTZ_JOB_DETAILS): stores the details of each job, including its job class and data map.
  • Trigger Base Table (QRTZ_TRIGGERS): stores the information common to all trigger types.
  • Cron Trigger Table (QRTZ_CRON_TRIGGERS): stores the cron expression and time zone of each cron trigger.


Schedule Execution

org.apache.dolphinscheduler.scheduler.quartz.ProcessScheduleTask

protected void executeInternal(JobExecutionContext context) {  
    JobDataMap dataMap = context.getJobDetail().getJobDataMap();  
    int projectId = dataMap.getInt(QuartzTaskUtils.PROJECT_ID);  
    int scheduleId = dataMap.getInt(QuartzTaskUtils.SCHEDULE_ID);  

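    // scheduledFireTime: when the trigger was due; fireTime: when Quartz actually fired it
    Date scheduledFireTime = context.getScheduledFireTime();
    Date fireTime = context.getFireTime();

    // Omitted in this excerpt: loading schedule (by scheduleId) and its processDefinition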

    Command command = new Command();  
    command.setCommandType(CommandType.SCHEDULER);  
    command.setExecutorId(schedule.getUserId());  
    command.setFailureStrategy(schedule.getFailureStrategy());  
    command.setProcessDefinitionCode(schedule.getProcessDefinitionCode());  
    command.setScheduleTime(scheduledFireTime);  
    command.setStartTime(fireTime);  
    command.setWarningGroupId(schedule.getWarningGroupId());  
    String workerGroup = StringUtils.isEmpty(schedule.getWorkerGroup()) ? Constants.DEFAULT_WORKER_GROUP  
            : schedule.getWorkerGroup();  
    command.setWorkerGroup(workerGroup);  
    command.setTenantCode(schedule.getTenantCode());  
    command.setEnvironmentCode(schedule.getEnvironmentCode());  
    command.setWarningType(schedule.getWarningType());  
    command.setProcessInstancePriority(schedule.getProcessInstancePriority());  
    command.setProcessDefinitionVersion(processDefinition.getVersion());  

    commandService.createCommand(command);  
}

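Essentially, executeInternal is the callback that Quartz invokes each time the trigger fires. It does not run the workflow itself; it only creates a Command that records what should run, the time slot it was scheduled for (scheduleTime), and when the trigger actually fired (startTime).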
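The difference between the two timestamps shows up after downtime. The trigger was built with withMisfireHandlingInstructionIgnoreMisfires(), so Quartz fires every missed slot as soon as it can and invokes the callback once per slot. The sketch below simulates that with JDK classes only. It assumes a fixed interval instead of a cron expression, and `CatchUpSketch`, `SketchCommand`, `onFire`, and `recover` are hypothetical names, not DolphinScheduler or Quartz APIs.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of catch-up firing under the ignore-misfires policy.
final class CatchUpSketch {

    // Stand-in for DolphinScheduler's Command, reduced to its two timestamps.
    static final class SketchCommand {
        final Instant scheduleTime; // slot the trigger was planned for
        final Instant startTime;    // moment the callback actually ran

        SketchCommand(Instant scheduleTime, Instant startTime) {
            this.scheduleTime = scheduleTime;
            this.startTime = startTime;
        }
    }

    // Plays the role of executeInternal: one firing becomes one queued command.
    static void onFire(Instant scheduledFireTime, Instant fireTime, List<SketchCommand> queue) {
        queue.add(new SketchCommand(scheduledFireTime, fireTime));
    }

    // Replays every slot after lastFired up to and including now, all fired at "now".
    static List<SketchCommand> recover(Instant lastFired, Duration interval, Instant now) {
        List<SketchCommand> queue = new ArrayList<>();
        for (Instant slot = lastFired.plus(interval); !slot.isAfter(now); slot = slot.plus(interval)) {
            onFire(slot, now, queue);
        }
        return queue;
    }

    public static void main(String[] args) {
        Instant lastFired = Instant.parse("2024-11-20T00:00:00Z");
        Instant now = Instant.parse("2024-11-20T03:30:00Z");
        for (SketchCommand c : recover(lastFired, Duration.ofHours(1), now)) {
            System.out.println("scheduleTime=" + c.scheduleTime + " startTime=" + c.startTime);
        }
    }
}
```

With an hourly interval and three and a half hours of downtime, the simulation queues three commands for the 01:00, 02:00, and 03:00 slots, all with the same start time. In a real deployment the start times would differ slightly, but the point holds: scheduleTime identifies the slot a run belongs to, while startTime only says when the callback ran.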