Distributed Kronos

Ankit Nanglia edited this page Jun 14, 2019 · 8 revisions

Setting up distributed Kronos

Here, we set up Kronos in distributed mode: the scheduler and executor run as separate processes on the same machine, with Kafka as the message queue.

Prerequisite

  • Ensure Java version 8 or above is installed.
  • Ensure a Kafka broker is up and running.
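
A minimal sketch for checking the Java prerequisite from a shell. The helper names `java_major` and `installed_java_version` are illustrative, not part of Kronos; the parsing assumes the usual `java -version` output, which prints the version in double quotes.

```shell
#!/usr/bin/env bash
# Sketch: extract the major Java version from a version string.
# Handles the legacy scheme ("1.8.0_201" means Java 8) and the
# current one ("11.0.2" means Java 11).
java_major() {
  local v="$1"
  v="${v#1.}"            # drop the legacy "1." prefix used up to Java 8
  echo "${v%%[._-]*}"    # keep everything before the first ".", "_" or "-"
}

# Read the quoted version string printed by `java -version` (written to stderr).
installed_java_version() {
  java -version 2>&1 | sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1
}

# Usage (requires java on PATH):
#   [ "$(java_major "$(installed_java_version)")" -ge 8 ] && echo "Java OK"
```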

Follow these steps to get started

  • Download the Kronos distribution from the release page.
  • Unzip the downloaded distribution and change into its directory.

The Kronos distribution contains the following directories:

FOLDER   DESCRIPTION
sbin     scripts to start/stop the Kronos web application
conf     configuration files for the Kronos web application
lib      jar dependencies for Kronos
ext      jar dependencies for Kronos extensions
webapp   web files for the Kronos web server

The conf directory contains these files:

  • scheduler.yaml - Used by scheduler to configure itself. (Read more)
  • executor.yaml - Used by executor to configure itself. (Read more)
  • queue.yaml - Used to configure the messaging queue used by scheduler and executor. (Read more)
  • log4j.properties - Used to configure log4j for logging.
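
The directory and file layout described above can be verified after unzipping with a short check. This is only a sketch: `check_kronos_layout` is a hypothetical helper, and it assumes every listed directory and conf file is present in a fresh distribution.

```shell
#!/usr/bin/env bash
# Sketch: report any directory or conf file from the layout above that is
# missing under the given distribution root (defaults to the current dir).
check_kronos_layout() {
  local root="${1:-.}" missing=0 d f
  for d in sbin conf lib ext webapp; do
    [ -d "$root/$d" ] || { echo "missing directory: $d"; missing=1; }
  done
  for f in scheduler.yaml executor.yaml queue.yaml log4j.properties; do
    [ -f "$root/conf/$f" ] || { echo "missing file: conf/$f"; missing=1; }
  done
  return "$missing"   # 0 when the layout is complete, 1 otherwise
}

# Usage, from inside the unzipped distribution:
#   check_kronos_layout . && echo "layout OK"
```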

To configure the Kronos web server, set the following environment variables. The defaults are:

export HOST="localhost"
export PORT=8080
export HEAP_OPTS="-Xmx128m -Xms128m"
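
As a small illustration, the address the web server should end up on can be derived from the same variables. `kronos_url` is a hypothetical helper; it assumes the start script reads HOST and PORT from the environment at launch and falls back to the defaults listed above.

```shell
#!/usr/bin/env bash
# Sketch: print the URL implied by HOST and PORT, using the documented
# defaults when a variable is unset.
kronos_url() {
  echo "http://${HOST:-localhost}:${PORT:-8080}/"
}

# Example: override the port before starting Kronos, then print the URL.
#   export PORT=9090
#   kronos_url
```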
  • Configure Kronos extensions

The Kafka queue and embedded HSQL store extensions are not distributed as part of Kronos and need to be downloaded and configured separately. Download all the available kronos-extension releases from the extensions release page and extract all the jars into the ext folder.
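
One way to script the copy step, as a sketch: `install_extensions` is a hypothetical helper, and the source directory is assumed to be wherever you extracted the downloaded extensions.

```shell
#!/usr/bin/env bash
# Sketch: copy every jar found under a source directory (searched
# recursively) into the Kronos ext folder. Non-jar files are ignored.
install_extensions() {
  local src="$1" ext_dir="$2"
  mkdir -p "$ext_dir" || return 1
  find "$src" -type f -name '*.jar' -exec cp {} "$ext_dir"/ \;
}

# Usage, from inside the Kronos distribution directory, where
# ~/Downloads/kronos-extensions is an assumed extraction location:
#   install_extensions ~/Downloads/kronos-extensions ./ext
```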

  • Configure Kafka as the message queue to enable distributed mode by updating the queue.yaml file in the conf directory.
producerConfig:
  producerClass: com.cognitree.kronos.queue.producer.KafkaProducerImpl
  config:
    kafkaProducerConfig:
      bootstrap.servers : localhost:9092
      key.serializer : org.apache.kafka.common.serialization.StringSerializer
      value.serializer : org.apache.kafka.common.serialization.StringSerializer
consumerConfig:
  consumerClass: com.cognitree.kronos.queue.consumer.KafkaConsumerImpl
  config:
    kafkaConsumerConfig:
      bootstrap.servers : localhost:9092
      group.id: Kronos
      key.deserializer : org.apache.kafka.common.serialization.StringDeserializer
      value.deserializer : org.apache.kafka.common.serialization.StringDeserializer
    pollTimeoutInMs: 5000
  pollIntervalInMs: 5000
taskStatusQueue: taskstatus

Update the bootstrap.servers field in kafkaProducerConfig and kafkaConsumerConfig to point to a running Kafka broker. Any property in the kafkaConsumerConfig or kafkaProducerConfig section is passed as is when the Kafka Consumer and Producer are created.

Note: The Kafka queue extension is not part of the release distribution and must be set up manually by downloading the extensions into the ext folder.
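
Before starting Kronos, it can help to confirm that the brokers named in bootstrap.servers accept connections. The sketch below is an assumption-laden convenience, not a Kronos tool: both helper names are hypothetical, the probe relies on bash's /dev/tcp feature and the GNU `timeout` command, and an open TCP port does not prove the broker is healthy.

```shell
#!/usr/bin/env bash
# Sketch: print each distinct host:port listed under bootstrap.servers
# in a queue.yaml-style file (comma-separated lists are split).
list_bootstrap_servers() {
  sed -n 's/^[[:space:]]*bootstrap\.servers[[:space:]]*:[[:space:]]*//p' "$1" \
    | tr ',' '\n' \
    | sed -e 's/[[:space:]]//g' -e '/^$/d' \
    | sort -u
}

# Try to open a TCP connection to host:port, giving up after 3 seconds.
broker_reachable() {
  local host="${1%:*}" port="${1##*:}"
  timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Usage:
#   for b in $(list_bootstrap_servers conf/queue.yaml); do
#     broker_reachable "$b" && echo "$b reachable" || echo "$b NOT reachable"
#   done
```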

  • Start the Kronos scheduler.
$ ./sbin/kronos.sh start scheduler
  • Start the Kronos executor.
$ ./sbin/kronos.sh start executor

The Kronos application is now running and can be accessed through a browser at the URL below, which opens a Swagger UI listing the available resources.

http://localhost:8080/
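
The web server may take a moment to come up after the start commands. A minimal polling sketch, assuming `curl` is installed and that the root URL answers with a non-error HTTP status once Kronos is ready; `wait_for_url` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: poll a URL until it responds without an HTTP error, or give up
# after a fixed number of attempts.
wait_for_url() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    if curl --silent --fail --output /dev/null --max-time 5 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage:
#   wait_for_url http://localhost:8080/ && echo "Kronos is up"
```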
  • Stop the Kronos application.
$ ./sbin/kronos.sh stop scheduler
$ ./sbin/kronos.sh stop executor
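
Since both processes are started and stopped with the same script, a thin wrapper can apply one action to both. This is only a convenience sketch: `kronos_all` and the `KRONOS_HOME` variable are hypothetical, and the wrapper uses nothing beyond the start and stop commands shown above.

```shell
#!/usr/bin/env bash
# Sketch: run "start" or "stop" for the scheduler and then the executor.
# KRONOS_HOME (an assumed variable) points at the unzipped distribution
# and defaults to the current directory.
kronos_all() {
  local action="$1" home="${KRONOS_HOME:-.}" component
  case "$action" in
    start|stop) ;;
    *) echo "usage: kronos_all start|stop" >&2; return 2 ;;
  esac
  for component in scheduler executor; do
    "$home/sbin/kronos.sh" "$action" "$component" || return 1
  done
}

# Usage:
#   kronos_all stop && kronos_all start
```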

This concludes setting up the Kronos web application in distributed mode. Tasks scheduled by the scheduler are picked up by the executor from the message queue. You can deploy multiple executor instances to distribute the load. This setup uses an in-memory store that keeps all its data in RAM, so it cannot retain its state across restarts.

Kronos allows you to plug in a store to persist its state and restore it on restart. As a reference, we have added a store plugin that uses an embedded HSQL DB to persist state on disk. The HSQL DB store extension jar is already available under lib as part of the Kronos distribution. Update the Kronos scheduler configuration to use the embedded HSQL store to maintain its state.

Update the scheduler.yaml file in the conf directory:

storeServiceConfig:
  storeServiceClass: com.cognitree.kronos.scheduler.store.jdbc.EmbeddedHSQLStoreService
  config:
    # directory to keep the kronos data
    dbPath: /tmp
    #username:
    #password:
    #minIdleConnection:
    #maxIdleConnection:

The configuration is explained in detail here.
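
The sample dbPath points at /tmp, which many systems clear on reboot, so a dedicated directory may be a better fit if the state must survive restarts. A sketch, where `prepare_db_dir` is a hypothetical helper and the path in the usage line is only an example:

```shell
#!/usr/bin/env bash
# Sketch: create a data directory if needed, confirm it is writable, and
# print the matching dbPath line to paste into scheduler.yaml.
prepare_db_dir() {
  local dir="$1"
  mkdir -p "$dir" || return 1
  [ -w "$dir" ] || { echo "not writable: $dir" >&2; return 1; }
  echo "dbPath: $dir"
}

# Usage (example path, adjust to your environment):
#   prepare_db_dir "$HOME/kronos-data"
```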

What next? Head over to the running the example guide to schedule a simple workflow that executes shell command tasks.