An example distributed system for Machine Learning model inference using Kafka.
The system serves user requests to predict objects from input images. The word "distributed" here does not imply techniques for dividing a model or input data for parallel execution. Instead, it refers to a loosely coupled architecture where workers can perform model inference in parallel to maximize the throughput of request completion.
In general, a server handles images uploaded by users and delivers inference requests via a Kafka broker. A group of workers receives the requests and runs a Machine Learning model to predict objects. Workers deliver the prediction results back to Kafka, and the server accumulates them so that users can later query the results.
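For a concrete picture of this flow, here is a minimal sketch using the plain Kafka clients API. The topic names (`inference-requests`, `inference-results`), the bootstrap address, and the string-encoded result are illustrative assumptions, not necessarily what this repo uses.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class FlowSketch {
    public static void main(String[] args) {
        // Server side: publish an uploaded image as an inference request.
        Properties p = new Properties();
        p.put("bootstrap.servers", "localhost:9092");
        p.put("key.serializer", StringSerializer.class.getName());
        p.put("value.serializer", ByteArraySerializer.class.getName());
        try (KafkaProducer<String, byte[]> producer = new KafkaProducer<>(p)) {
            byte[] imageBytes = new byte[0]; // placeholder for the uploaded image
            producer.send(new ProducerRecord<>("inference-requests", "request-42", imageBytes));
        }

        // Server side: collect prediction results for later queries.
        Properties c = new Properties();
        c.put("bootstrap.servers", "localhost:9092");
        c.put("group.id", "result-collector");
        c.put("key.deserializer", StringDeserializer.class.getName());
        c.put("value.deserializer", StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c)) {
            consumer.subscribe(List.of("inference-results"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> r : records) {
                System.out.printf("request %s -> %s%n", r.key(), r.value()); // persist instead
            }
        }
    }
}
```

In the actual system both sides run as long-lived services; the single `poll()` here only keeps the sketch short.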
The system is composed of:

- A Spring Boot REST server which connects to a MariaDB database via JPA.
- A Kafka cluster deployable in a docker environment.
- Java programs:
  - to process users' input and transform data in the Kafka Streams manner, and
  - to run inference with a ResNet model and produce messages to Kafka.

A Node.js client script is included to demonstrate how to interact with the system.
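The preprocessing program described above could look roughly like the following Kafka Streams topology. The topic names (`raw-uploads`, `inference-requests`) and the identity `preprocess` function are placeholders for whatever transformation the preprocessor project actually performs.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class PreprocessorSketch {
    // Hypothetical stand-in for the actual image preprocessing (resize, normalize, ...).
    static byte[] preprocess(byte[] rawImage) {
        return rawImage;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "preprocessor-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.ByteArray().getClass());

        // Read raw uploads, transform each record, and forward it to the workers' topic.
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, byte[]> uploads = builder.stream("raw-uploads");
        uploads.mapValues(PreprocessorSketch::preprocess).to("inference-requests");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```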
| folder | description |
| --- | --- |
| docker | Contains Docker Compose file(s) to run services such as Kafka and MariaDB in Docker. |
| inference-server | A Maven project - Spring Boot REST server. |
| inference-worker | A Maven project - Java program for ML model inference. |
| preprocessor | A Maven project - Java program for preprocessing input from the server. |
| client-demo | Contains a Node.js client script for interacting with the system. |
- Follow kafka/README.md to run the Kafka cluster in docker containers.
- Follow mariadb/README.md to run the database(s) in docker containers.
- Follow mongodb/README.md to run MongoDB instance(s) in docker containers.
- Follow message/README.md to install the common Java package `com.ah.message` required by the other Java projects in this repo.
- Follow preprocessor/README.md to run multiple Java programs for input preprocessing tasks.
- Follow inference-worker/README.md to run multiple Java programs for model inference tasks (a minimal worker loop is sketched after this list).
- Follow inference-server/README.md to run the REST API server that handles user requests.
- Try out the system using the client script here!
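As a rough illustration of what an inference worker does, the sketch below consumes requests, calls a placeholder `predict` function in place of the real ResNet model, and produces the result back to Kafka. The topic names, group id, and JSON result format are assumptions for the example.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class WorkerSketch {
    // Placeholder for running the ResNet model; the real worker loads an actual model.
    static String predict(byte[] image) {
        return "{\"label\": \"cat\", \"confidence\": 0.97}";
    }

    public static void main(String[] args) {
        Properties cp = new Properties();
        cp.put("bootstrap.servers", "localhost:9092");
        cp.put("group.id", "inference-workers"); // shared group -> requests balanced across workers
        cp.put("key.deserializer", StringDeserializer.class.getName());
        cp.put("value.deserializer", ByteArrayDeserializer.class.getName());

        Properties pp = new Properties();
        pp.put("bootstrap.servers", "localhost:9092");
        pp.put("key.serializer", StringSerializer.class.getName());
        pp.put("value.serializer", StringSerializer.class.getName());

        try (KafkaConsumer<String, byte[]> consumer = new KafkaConsumer<>(cp);
             KafkaProducer<String, String> producer = new KafkaProducer<>(pp)) {
            consumer.subscribe(List.of("inference-requests"));
            while (true) {
                // Consume a request, run inference, and publish the prediction result.
                for (ConsumerRecord<String, byte[]> r : consumer.poll(Duration.ofSeconds(1))) {
                    producer.send(new ProducerRecord<>("inference-results", r.key(), predict(r.value())));
                }
            }
        }
    }
}
```

Because all workers share one consumer group, Kafka balances the request partitions across them, which is the loosely coupled parallelism described in the introduction.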