This project demonstrates how to set up a Kafka cluster using KRaft mode (without Zookeeper) and integrate it with Apache Druid for data analytics and visualization.
- Kafka in KRaft mode: A modern setup for Kafka that replaces Zookeeper with a self-managed metadata quorum.
- Apache Druid: A high-performance, real-time analytics database integrated with Kafka for efficient querying and visualization.
- Fully containerized Kafka cluster in KRaft mode (no Zookeeper).
- Apache Druid cluster for real-time ingestion and querying.
- Integration with PostgreSQL for Druid metadata storage.
- Kafka (KRaft Mode)
- Apache Druid
- Docker and Docker Compose installed on your machine.
- Minimum hardware requirements:
  - 4 GB RAM
  - Quad-core processor
git clone https://github.com/evanmathew/Apache-Kafka-Kraft-and-Apache-Druid.git
cd Apache-Kafka-Kraft-and-Apache-Druid
Ensure the following variables are set in the environment file:
- PostgreSQL Database:
  - POSTGRES_PASSWORD: Set the password for the druid user.
  - POSTGRES_USER: The database username (default: druid).
  - POSTGRES_DB: The database name (default: druid).
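Before starting the stack, it can help to confirm that the environment file defines all three variables. The following is a minimal sketch, assuming the file uses plain KEY=VALUE lines; the sample content and its password are placeholders, not values from this repository.

```python
REQUIRED_KEYS = ("POSTGRES_PASSWORD", "POSTGRES_USER", "POSTGRES_DB")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or set to an empty value."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


# Placeholder content for illustration; replace with the text of your own file.
sample = """
# PostgreSQL settings for Druid metadata storage
POSTGRES_USER=druid
POSTGRES_DB=druid
POSTGRES_PASSWORD=change-me
"""

print(missing_keys(sample))
```

To check a real file, pass its contents instead of the sample, for example `missing_keys(open(path).read())` with `path` pointing at your environment file.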
docker-compose up -d
- Druid Router: http://localhost:8888
- Kafka Brokers: Exposed on ports 29092, 39092.
- Create a virtual environment: python -m venv venv
- Activate it: venv/Scripts/activate (on Windows; on Linux or macOS, use source venv/bin/activate)
- Run the producer code, which generates random sample data and streams it to Kafka.
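For orientation, a producer along these lines might look like the sketch below. It is not the repository's script: the event fields are hypothetical, it assumes the kafka-python package (`pip install kafka-python`), and it assumes the brokers are reachable from the host on the exposed ports 29092 and 39092.

```python
import json
import random
import time
import uuid

TOPIC = "ecommerce_event_data"
# Host-side ports listed in this README; adjust if your compose file differs.
BOOTSTRAP_SERVERS = ["localhost:29092", "localhost:39092"]

# Hypothetical event vocabulary, for illustration only.
EVENT_TYPES = ["view", "add_to_cart", "purchase"]
CATEGORIES = ["books", "electronics", "clothing"]


def make_event(rng=random):
    """Build one random e-commerce event as a plain dict."""
    return {
        "event_id": str(uuid.uuid4()),
        "timestamp": int(time.time() * 1000),  # epoch milliseconds
        "event_type": rng.choice(EVENT_TYPES),
        "category": rng.choice(CATEGORIES),
        "price": round(rng.uniform(1.0, 500.0), 2),
    }


def encode(event):
    """Serialize an event to UTF-8 JSON bytes for Kafka."""
    return json.dumps(event).encode("utf-8")


def stream_to_kafka(count=100, delay_seconds=0.5):
    """Send `count` random events to the topic, pausing between sends."""
    # Imported lazily so the helpers above work without kafka-python installed.
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers=BOOTSTRAP_SERVERS,
        value_serializer=encode,
    )
    try:
        for _ in range(count):
            producer.send(TOPIC, make_event())
            time.sleep(delay_seconds)
    finally:
        producer.flush()
        producer.close()
```

Calling `stream_to_kafka()` with the cluster running sends 100 events at roughly two per second. Emitting the timestamp as epoch milliseconds keeps it easy for Druid to parse during ingestion.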
- Access the Druid Router UI (http://localhost:8888).
- Navigate to Load Data and select Apache Kafka.
- Configure the Kafka topic to ingest data.
  - bootstrap server: broker-1:19092,broker-2:19092
  - topic name: ecommerce_event_data
  - Start parsing the data
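The UI steps above end up producing a Kafka supervisor spec. As a rough sketch, a similar spec can be built and submitted from Python through Druid's supervisor endpoint. The timestamp column name, the granularity settings, and the use of the topic name as the datasource name are assumptions for illustration; the spec the UI generates for your data will likely differ.

```python
import json
import urllib.request

DRUID_ROUTER = "http://localhost:8888"


def build_supervisor_spec(topic, bootstrap_servers, timestamp_column="timestamp"):
    """Build a minimal Kafka supervisor spec for JSON messages."""
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": topic,  # assumption: datasource named after the topic
                "timestampSpec": {"column": timestamp_column, "format": "auto"},
                # An empty list leaves dimension discovery to Druid.
                "dimensionsSpec": {"dimensions": []},
                "granularitySpec": {
                    "segmentGranularity": "hour",
                    "queryGranularity": "none",
                    "rollup": False,
                },
            },
            "ioConfig": {
                "type": "kafka",
                "topic": topic,
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
                "inputFormat": {"type": "json"},
                "useEarliestOffset": True,
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


def submit_supervisor(spec, router=DRUID_ROUTER):
    """POST the spec to the supervisor API and return the decoded response."""
    request = urllib.request.Request(
        f"{router}/druid/indexer/v1/supervisor",
        data=json.dumps(spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


spec = build_supervisor_spec("ecommerce_event_data", "broker-1:19092,broker-2:19092")
print(json.dumps(spec, indent=2))
```

The bootstrap servers use the in-network addresses (broker-1:19092, broker-2:19092) because Druid connects to Kafka from inside the Docker network, not from the host. With the cluster running, `submit_supervisor(spec)` would start the ingestion.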
- controller.quorum.voters: Defines the controller quorum.
- process.roles: Specifies whether a node acts as a broker, a controller, or both (broker,controller).
- node.id: Unique identifier for each node in the cluster.
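These three settings have to agree with each other: controller.quorum.voters is a comma-separated list of id@host:port entries, a node with the controller role must appear in that list under its own node.id, and a broker-only node must not. The sketch below checks this; the hostnames and port in the example are hypothetical, not taken from this repository's docker-compose.yml.

```python
def parse_voters(value):
    """Parse 'id@host:port,...' into a dict of {node_id: (host, port)}."""
    voters = {}
    for entry in value.split(","):
        node_id, _, address = entry.strip().partition("@")
        host, _, port = address.rpartition(":")
        if not node_id.isdigit() or not host or not port.isdigit():
            raise ValueError(f"malformed voter entry: {entry!r}")
        if int(node_id) in voters:
            raise ValueError(f"duplicate node id in voters: {node_id}")
        voters[int(node_id)] = (host, int(port))
    return voters


def check_node(node_id, process_roles, voters):
    """Return a list of problems with one node's settings (empty if none)."""
    roles = {role.strip() for role in process_roles.split(",")}
    unknown = roles - {"broker", "controller"}
    if unknown:
        raise ValueError(f"unknown roles: {sorted(unknown)}")
    problems = []
    if "controller" in roles and node_id not in voters:
        problems.append(f"node {node_id} is a controller but is not listed in the voters")
    if roles == {"broker"} and node_id in voters:
        problems.append(f"node {node_id} is broker-only but is listed in the voters")
    return problems


# Hypothetical quorum of three controllers.
voters = parse_voters("1@controller-1:9093,2@controller-2:9093,3@controller-3:9093")
print(check_node(1, "controller", voters))
print(check_node(4, "broker", voters))
print(check_node(4, "broker,controller", voters))
```

The first two calls return an empty list, while the third reports that node 4 claims the controller role without being in the quorum.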
- Druid relies on Kafka for real-time data ingestion.
- Metadata is stored in PostgreSQL, mounted as a volume.
- Kafka is not starting: Ensure the controller.quorum.voters setting is correct in docker-compose.yml.
- Druid UI not accessible: Verify that the ports are not blocked or in use by other applications.
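As a quick first check for either problem, the sketch below tests whether anything accepts TCP connections on the ports this README mentions. It only shows that some process is listening; it cannot tell whether that process is Druid, Kafka, or another application that has taken the port.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


SERVICES = [
    ("Druid Router", 8888),
    ("Kafka broker", 29092),
    ("Kafka broker", 39092),
]

for name, port in SERVICES:
    state = "listening" if is_listening("localhost", port) else "not reachable"
    print(f"{name} (port {port}): {state}")
```

If a port reports as listening while the containers are stopped, another application is probably using it.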
Contributions are welcome! Please fork the repository and submit a pull request.
Happy Streaming and Querying! 🚀