Clustering Quartz Jobs

I was looking for a scale-out option for scheduling jobs and, having used Quartz previously, found that it is pretty easy to get clustering up and running. The only caveat is that clustering is possible only with the JDBC job store. The sample I tried it with is a straightforward job that just prints the current time and the ID of the scheduler instance that triggered it.

Sample Job:

import org.quartz.*;

@PersistJobDataAfterExecution
public class PrintJob implements Job {

   public void execute(JobExecutionContext jobExecutionContext) throws JobExecutionException {
      try {
         System.out.println("Print : "+System.currentTimeMillis()+" , "+jobExecutionContext.getScheduler().getSchedulerInstanceId());
      } catch (SchedulerException e) {
         e.printStackTrace();
      }
   }
}

Sample Trigger:

import org.quartz.*;
import org.quartz.impl.StdSchedulerFactory;

import java.io.*;
import java.util.Properties;

import static org.quartz.JobBuilder.newJob;
import static org.quartz.SimpleScheduleBuilder.simpleSchedule;

public class PrintScheduler {

	private Scheduler scheduler;
	public PrintScheduler(String instanceId) {
		try {
			Properties properties = loadProperties();
			properties.put("org.quartz.scheduler.instanceId",instanceId);
			scheduler = new StdSchedulerFactory(properties).getScheduler();
			scheduler.start();
		} catch (Exception e) {
			e.printStackTrace();
		}
	}

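	private Properties loadProperties() throws IOException {
		Properties properties = new Properties();
		// The name is resolved relative to this class's package on the classpath
		try (InputStream fis = PrintScheduler.class.getResourceAsStream("quartz.properties")) {
			if (fis == null) {
				// getResourceAsStream returns null instead of throwing when the file is missing
				throw new FileNotFoundException("quartz.properties not found on the classpath");
			}
			properties.load(fis);
		}
		return properties;
	}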

	public void schedule() throws SchedulerException {
		JobDetail job = newJob(PrintJob.class).withIdentity("printjob", "printjobgroup").build();
		// Fire immediately, then every 100 ms with no end
		Trigger trigger = TriggerBuilder.newTrigger().withIdentity("printTrigger", "printtriggergroup")
				.startNow().withSchedule(simpleSchedule().withIntervalInMilliseconds(100L).repeatForever()).build();
		scheduler.scheduleJob(job, trigger);
	}

	public void stopScheduler() throws SchedulerException {
		scheduler.shutdown();
	}

	public static void main(String[] args) {
		PrintScheduler printScheduler = new PrintScheduler(args[0]);
		try {
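			// Uncomment on the node that should register the job. The job and trigger are
			// persisted in the JDBC job store, so the other nodes only need to start their scheduler.
//			printScheduler.schedule();
			// Keep this node alive for a minute so it can pick up triggers
			Thread.sleep(60000L);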
			printScheduler.stopScheduler();
		} catch (Exception e) {
			e.printStackTrace();
		}
	}

}

Please note that I have used Quartz 2.x for this example.

On the configuration side, it remains more or less the same as for a single node, with a couple of exceptions:

org.quartz.scheduler.instanceName = PRINT_SCHEDULER1
org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount = 4
org.quartz.threadPool.threadsInheritContextClassLoaderOfInitializingThread = true

#specify the jobstore used
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.useProperties = false

#The datasource for the jobstore that is to be used
org.quartz.jobStore.dataSource = myDS

#quartz table prefixes in the database
org.quartz.jobStore.tablePrefix = qrtz_
org.quartz.jobStore.misfireThreshold = 60000
org.quartz.jobStore.isClustered = true
org.quartz.scheduler.instanceId = PRINT_SCHEDULER1

#The details of the datasource specified previously
org.quartz.dataSource.myDS.driver = com.mysql.jdbc.Driver
org.quartz.dataSource.myDS.URL = jdbc:mysql://localhost:3307/blog_test
org.quartz.dataSource.myDS.user = root
org.quartz.dataSource.myDS.password = root
org.quartz.dataSource.myDS.maxConnections = 20

The cluster-specific settings here are org.quartz.jobStore.isClustered and org.quartz.scheduler.instanceId. For a single-node instance, org.quartz.jobStore.isClustered is set to false; for a cluster setup, it is changed to true. The second property, instanceId, is a name/ID that uniquely identifies the scheduler instance within the cluster. It can be set to AUTO, in which case each scheduler instance is automatically assigned a unique value, or you can provide a value of your own (which I find useful since it helps me identify where the job is running). Either way, every instance in the cluster must end up with a distinct instanceId. The instanceName, on the other hand, has to be the same on all nodes so that they are treated as one logical scheduler.
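The two cluster-specific keys can also be set in code on top of a shared quartz.properties, much like the PrintScheduler constructor does for instanceId. Here is a minimal sketch of that idea using only java.util.Properties. The class name ClusterProperties, the method forNode and the ID NODE_A are made up for illustration; only the property keys and the AUTO value come from the configuration above.

```java
import java.util.Properties;

public class ClusterProperties {

    // Returns a copy of the shared base properties with the two cluster-specific keys set.
    // A null or blank instanceId falls back to AUTO, so Quartz generates a unique ID itself.
    public static Properties forNode(Properties base, String instanceId) {
        Properties props = new Properties();
        props.putAll(base);
        props.setProperty("org.quartz.jobStore.isClustered", "true");
        String id = (instanceId == null || instanceId.trim().isEmpty()) ? "AUTO" : instanceId;
        props.setProperty("org.quartz.scheduler.instanceId", id);
        return props;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        // instanceName stays identical on every node of the cluster
        base.setProperty("org.quartz.scheduler.instanceName", "PRINT_SCHEDULER1");

        Properties named = forNode(base, "NODE_A"); // hypothetical hand-picked ID
        Properties auto = forNode(base, null);      // let Quartz pick the ID

        System.out.println(named.getProperty("org.quartz.scheduler.instanceId"));
        System.out.println(auto.getProperty("org.quartz.scheduler.instanceId"));
    }
}
```

The resulting Properties object could then be handed to new StdSchedulerFactory(properties), as in the PrintScheduler constructor.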

One requirement for this to work is that the clocks of the nodes running the scheduler instances are kept in sync; otherwise there might be issues with the schedule. Also, clustering does not guarantee that the load is distributed equally among the nodes. As per the documentation, when the schedulers are not busy (few triggers), Quartz tends to favor running the job on the same node.
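To get a feel for how the executions were actually spread over the nodes, one option is to count the PrintJob output lines per instance ID. Below is a minimal sketch, assuming the console output of all nodes has been collected into a single list of lines in the "Print : <millis> , <instanceId>" format that PrintJob prints. The class name FireCountTally and the sample lines are made up for illustration and are not real measurements.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FireCountTally {

    // Counts how many "Print : <millis> , <instanceId>" lines each instance produced.
    // Lines that do not match the PrintJob output format are ignored.
    public static Map<String, Integer> tally(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            if (!line.startsWith("Print : ")) {
                continue;
            }
            int separator = line.lastIndexOf(" , ");
            if (separator < 0) {
                continue;
            }
            String instanceId = line.substring(separator + 3).trim();
            counts.merge(instanceId, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Illustrative sample lines only
        List<String> sample = Arrays.asList(
                "Print : 1700000000000 , NODE_A",
                "Print : 1700000000100 , NODE_A",
                "Print : 1700000000200 , NODE_B");
        System.out.println(tally(sample)); // prints {NODE_A=2, NODE_B=1}
    }
}
```

With only one lightly loaded job like this, a lopsided count would not be surprising given the behaviour described above.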

Code @ https://github.com/vageeshhoskere/blog/tree/master/quartz
