An example distributed system for Machine Learning model inference using Kafka.
The system serves user requests to predict objects from input images. The word "distributed" here does not imply techniques for dividing a model or input data for parallel execution. Instead, it refers to a loosely coupled architecture where workers can perform model inference in parallel to maximize the throughput of request completion.
In general, a server handles images uploaded by users and delivers inference requests via a Kafka broker. A group of workers receives the requests and runs a Machine Learning model to predict objects. Workers deliver the prediction results back to Kafka, and the server accumulates them so that users can later query the results.
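For a concrete picture of this flow, here is a minimal sketch using the plain Kafka clients API. The topic names (`inference-requests`, `inference-results`), the bootstrap address, and the string-encoded result are illustrative assumptions, not necessarily what this repo uses.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class FlowSketch {
    public static void main(String[] args) {
        // Server side: publish an uploaded image as an inference request.
        Properties p = new Properties();
        p.put("bootstrap.servers", "localhost:9092");
        p.put("key.serializer", StringSerializer.class.getName());
        p.put("value.serializer", ByteArraySerializer.class.getName());
        try (KafkaProducer<String, byte[]> producer = new KafkaProducer<>(p)) {
            byte[] imageBytes = new byte[0]; // placeholder for the uploaded image
            producer.send(new ProducerRecord<>("inference-requests", "request-42", imageBytes));
        }

        // Server side: collect prediction results for later queries.
        Properties c = new Properties();
        c.put("bootstrap.servers", "localhost:9092");
        c.put("group.id", "result-collector");
        c.put("key.deserializer", StringDeserializer.class.getName());
        c.put("value.deserializer", StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c)) {
            consumer.subscribe(List.of("inference-results"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> r : records) {
                System.out.printf("request %s -> %s%n", r.key(), r.value()); // persist instead
            }
        }
    }
}
```

In the actual system both sides run as long-lived services; the single `poll()` here only keeps the sketch short.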
The system is composed of:

- A Spring Boot REST server which connects to a MariaDB database via JPA.
- A Kafka cluster deployable in a docker environment.
- Java programs:
  - to process users' input and transform data in the Kafka Streams manner, and
  - to run inference with a ResNet model and produce messages to Kafka.

A Node.js client script is included to demonstrate how to interact with the system.
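The preprocessing program described above could look roughly like the following Kafka Streams topology. The topic names (`raw-uploads`, `inference-requests`) and the identity `preprocess` function are placeholders for whatever transformation the preprocessor project actually performs.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class PreprocessorSketch {
    // Hypothetical stand-in for the actual image preprocessing (resize, normalize, ...).
    static byte[] preprocess(byte[] rawImage) {
        return rawImage;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "preprocessor-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.ByteArray().getClass());

        // Read raw uploads, transform each record, and forward it to the workers' topic.
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, byte[]> uploads = builder.stream("raw-uploads");
        uploads.mapValues(PreprocessorSketch::preprocess).to("inference-requests");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```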
| folder | description |
| --- | --- |
| docker | Contains Docker Compose file(s) to run services such as Kafka and MariaDB in Docker. |
| inference-server | A Maven project - Spring Boot REST server. |
| inference-worker | A Maven project - Java program for ML model inference. |
| preprocessor | A Maven project - Java program for preprocessing input from the server. |
| client-demo | Contains a Node.js client script for interacting with the system. |
- Follow kafka/README.md to run the Kafka cluster in docker containers.
- Follow mariadb/README.md to run the database(s) in docker containers.
- Follow mongodb/README.md to run MongoDB instance(s) in docker containers.
- Follow message/README.md to install the common Java package `com.ah.message` required by the other Java projects in this repo.
- Follow preprocessor/README.md to run multiple Java programs for input preprocessing tasks.
- Follow inference-worker/README.md to run multiple Java programs for model inference tasks (a minimal worker loop is sketched after this list).
- Follow inference-server/README.md to run the REST API server that handles user requests.
- Try out the system using the client script here!
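As a rough illustration of what an inference worker does, the sketch below consumes requests, calls a placeholder `predict` function in place of the real ResNet model, and produces the result back to Kafka. The topic names, group id, and JSON result format are assumptions for the example.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class WorkerSketch {
    // Placeholder for running the ResNet model; the real worker loads an actual model.
    static String predict(byte[] image) {
        return "{\"label\": \"cat\", \"confidence\": 0.97}";
    }

    public static void main(String[] args) {
        Properties cp = new Properties();
        cp.put("bootstrap.servers", "localhost:9092");
        cp.put("group.id", "inference-workers"); // shared group -> requests balanced across workers
        cp.put("key.deserializer", StringDeserializer.class.getName());
        cp.put("value.deserializer", ByteArrayDeserializer.class.getName());

        Properties pp = new Properties();
        pp.put("bootstrap.servers", "localhost:9092");
        pp.put("key.serializer", StringSerializer.class.getName());
        pp.put("value.serializer", StringSerializer.class.getName());

        try (KafkaConsumer<String, byte[]> consumer = new KafkaConsumer<>(cp);
             KafkaProducer<String, String> producer = new KafkaProducer<>(pp)) {
            consumer.subscribe(List.of("inference-requests"));
            while (true) {
                // Consume a request, run inference, and publish the prediction result.
                for (ConsumerRecord<String, byte[]> r : consumer.poll(Duration.ofSeconds(1))) {
                    producer.send(new ProducerRecord<>("inference-results", r.key(), predict(r.value())));
                }
            }
        }
    }
}
```

Because all workers share one consumer group, Kafka balances the request partitions across them, which is the loosely coupled parallelism described in the introduction.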