Distributed Kronos

Ankit Nanglia edited this page Apr 19, 2019 · 8 revisions

Setting up distributed Kronos

Here, we will set up Kronos in distributed mode: the scheduler and executor run as separate processes on the same machine, with Kafka as the message queue.

Prerequisite

  • Ensure Java version 8 or above is installed.
  • Ensure Kafka is up and running.
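
The two prerequisites can be checked from a terminal before going further. The following is a minimal sketch, not part of the Kronos distribution: it assumes bash (for the /dev/tcp connection test) and a broker on localhost:9092, the address used in queue.yaml later in this guide. The function names java_major and port_open are made up for this illustration, and an open port only shows that something is listening there, not that it is Kafka.

```shell
#!/usr/bin/env bash
# Hypothetical prerequisite check; assumes bash and a broker on localhost:9092.

# Map a Java version string to its major version:
# "1.8.0_292" -> 8, "11.0.2" -> 11, "17-ea" -> 17.
java_major() {
  local v="$1"
  case "$v" in
    1.*) v="${v#1.}"; echo "${v%%.*}" ;;   # legacy "1.x" scheme
    *)   echo "${v%%[.-]*}" ;;             # modern scheme
  esac
}

# Return 0 if a TCP connection to host ($1) and port ($2) can be opened.
port_open() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

if command -v java >/dev/null 2>&1; then
  # `java -version` writes its banner to stderr.
  raw=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  major=$(java_major "$raw")
  if [ "${major:-0}" -ge 8 ] 2>/dev/null; then
    echo "Java $raw found (major version $major)"
  else
    echo "Java 8 or above is required, found: ${raw:-unknown}"
  fi
else
  echo "java not found on PATH"
fi

if port_open localhost 9092; then
  echo "Port 9092 on localhost accepts connections"
else
  echo "Could not connect to localhost:9092"
fi
```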

Follow these steps to get started

  • Download a distribution of Kronos from the release page.
  • Unzip the downloaded distribution and change directory into it.

The Kronos distribution contains the following directories:

FOLDER   DESCRIPTION
sbin     scripts to start/stop the Kronos web application
conf     configuration files for the Kronos web application
lib      jar dependencies for Kronos and Kronos extensions
webapp   web files for the Kronos web server

The conf directory contains these files:

  • scheduler.yaml - Used by scheduler to configure itself. (Read more)
  • executor.yaml - Used by executor to configure itself. (Read more)
  • queue.yaml - Used to configure the messaging queue used by scheduler and executor. (Read more)
  • log4j.properties - Used to configure log4j for logging.

The following properties in env.sh, located under the sbin directory, are used to configure the Kronos web server.

export HOST="localhost"
export PORT=8080
export HEAP_OPTS="-Xmx128m -Xms128m"
  • Configure Kafka as the message queue to enable distributed mode. Update the queue.yaml file located under the conf directory.
producerConfig:
  producerClass: com.cognitree.kronos.queue.producer.KafkaProducerImpl
  config:
    kafkaProducerConfig:
      bootstrap.servers : localhost:9092
      key.serializer : org.apache.kafka.common.serialization.StringSerializer
      value.serializer : org.apache.kafka.common.serialization.StringSerializer
consumerConfig:
  consumerClass: com.cognitree.kronos.queue.consumer.KafkaConsumerImpl
  config:
    kafkaConsumerConfig:
      bootstrap.servers : localhost:9092
      group.id: Kronos
      key.deserializer : org.apache.kafka.common.serialization.StringDeserializer
      value.deserializer : org.apache.kafka.common.serialization.StringDeserializer
    pollTimeoutInMs: 5000
  pollIntervalInMs: 5000
taskStatusQueue: taskstatus

Update the bootstrap.servers field in both kafkaProducerConfig and kafkaConsumerConfig to point to a running Kafka broker. Any property set in the kafkaConsumerConfig or kafkaProducerConfig section is passed as is when the Kafka Consumer and Producer are created.

Note: The Kafka queue extension is part of the release distribution. If you are setting up Kronos manually, add the extension to the classpath for it to work.
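
Since the broker address appears in both the producer and the consumer section, it can help to list what is actually configured after editing. The following is a rough sketch, assuming it is run from the unzipped distribution directory; bootstrap_servers is a hypothetical helper name, and it does a plain-text match on lines of the form shown above rather than parsing YAML.

```shell
# Print the unique bootstrap.servers values found in a queue.yaml-style file.
# Plain-text match, not a YAML parser: expects one
# "bootstrap.servers : host:port" entry per line, as in the sample above.
bootstrap_servers() {
  sed -n 's/^[[:space:]]*bootstrap\.servers[[:space:]]*:[[:space:]]*//p' "$1" | sort -u
}

conf_file="conf/queue.yaml"   # relative to the distribution directory
if [ -f "$conf_file" ]; then
  bootstrap_servers "$conf_file"
else
  echo "$conf_file not found; run this from the distribution directory"
fi
```

If more than one line is printed, the producer and consumer sections point to different broker addresses.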

  • Start the Kronos scheduler.
$ ./sbin/kronos.sh start scheduler
  • Start the Kronos executor.
$ ./sbin/kronos.sh start executor

The Kronos application is now running and can be accessed through a browser at the URL below, which opens a Swagger UI listing the available resources.

http://localhost:8080/
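
The same check can be done from the command line, for example in a startup script that needs to wait for the web server. The following is a minimal sketch, assuming curl is installed and the default HOST and PORT from env.sh are unchanged; wait_for_http is a hypothetical helper, not something shipped with Kronos.

```shell
# Poll a URL until it answers without an HTTP error, or give up.
# Arguments: URL, number of attempts (default 30), seconds between attempts (default 2).
wait_for_http() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -f: treat HTTP errors (4xx/5xx) as failure, -o /dev/null: discard the body
    if curl -sf -o /dev/null "$url"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

if wait_for_http "http://localhost:8080/" 5 1; then
  echo "Kronos web server is up"
else
  echo "Kronos web server did not respond"
fi
```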
  • Stop the Kronos application.
$ ./sbin/kronos.sh stop scheduler
$ ./sbin/kronos.sh stop executor

Note: The script currently stops both the scheduler and the executor if they are running on the same node. This issue will be fixed soon.

This concludes setting up the Kronos web application in distributed mode. Tasks scheduled by the scheduler are picked up by the executor from the message queue. You can deploy multiple executor instances to distribute the load across them. This setup uses an in-memory (RAM) store, so Kronos does not retain its state across restarts.

Kronos allows you to plug in a store to persist its state and restore it on restart. As a reference, we have added a store plugin that uses an embedded HSQL DB to persist state on disk. The HSQL DB store extension jar is already available under lib as part of the Kronos distribution. Update the Kronos scheduler configuration to use the embedded HSQL store to maintain its state.

Update the scheduler.yaml file available under the conf directory:

storeServiceConfig:
  storeServiceClass: com.cognitree.kronos.scheduler.store.jdbc.EmbeddedHSQLStoreService
  config:
    # directory to keep the kronos data
    dbPath: /tmp
    #username:
    #password:
    #minIdleConnection:
    #maxIdleConnection:

The configuration is explained in detail here.

What next? Head over to the running the example guide to schedule a simple workflow that executes shell command tasks.