Notifi-Scaler

Overview

The notifi-scaler project is a Kubernetes-based system that scales work horizontally across workers based on the number of values in a table. The project includes Kubernetes manifests to deploy a control plane, workers, and database components. Each worker processes a subset of the table, incrementing a Value column and updating the CurrentWorker column as it goes.

The primary objective is to balance the workload across multiple worker nodes and to adapt dynamically to changes in the table or in the number of workers. To achieve this, a consistent hashing algorithm is implemented.

System Requirement

Given a table of values with the following schema, the system scales workers up and down horizontally based on the number of values in the table. When a worker processing a subset of the table is removed, its workload is distributed evenly across the remaining workers until a new worker node is added.
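
The source does not state the exact rule that maps table size to worker count. As a rough sketch only, one plausible rule is a fixed number of rows per worker, clamped to a minimum and maximum replica count. The function name desiredWorkers and all of its parameters are hypothetical and do not come from the project.

```go
package main

import "fmt"

// desiredWorkers returns ceil(rows/perWorker), clamped to the range [lo, hi].
// Hypothetical helper: the real control plane may use a different rule.
func desiredWorkers(rows, perWorker, lo, hi int) int {
	if perWorker < 1 {
		perWorker = 1 // guard against division by zero
	}
	n := (rows + perWorker - 1) / perWorker // integer ceiling division
	if n < lo {
		n = lo
	}
	if n > hi {
		n = hi
	}
	return n
}

func main() {
	// Assumed numbers for illustration: 10 rows per worker, 1 to 5 workers.
	for _, rows := range []int{0, 10, 25, 1000} {
		fmt.Println(rows, "rows ->", desiredWorkers(rows, 10, 1, 5), "workers")
	}
}
```

A result computed this way could then be applied through the worker.replicaCount value used in the Scale step.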

Table Schema:

ID: uuid
Value: integer
CurrentWorker: string

While a worker is processing a subset, it will increment the Value by 1 every second and set the CurrentWorker value to the name of the worker pod.
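
As an illustration of the per-second update, the sketch below builds a single UPDATE statement that increments Value and stamps CurrentWorker for every ID a worker owns. The table name work_items is taken from the monitoring query in Quick Start; the lowercase column names (id, value, currentworker) and the helper name buildTick are assumptions, not the project's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// buildTick returns one UPDATE that increments value and sets currentworker
// for every ID in ids, using PostgreSQL-style $n placeholders.
// It returns an empty query when the worker owns no rows.
func buildTick(worker string, ids []string) (string, []any) {
	if len(ids) == 0 {
		return "", nil
	}
	args := make([]any, 0, len(ids)+1)
	args = append(args, worker) // $1 is the worker pod name
	marks := make([]string, len(ids))
	for i, id := range ids {
		marks[i] = fmt.Sprintf("$%d", i+2) // $2, $3, ... are the row IDs
		args = append(args, id)
	}
	q := "UPDATE work_items SET value = value + 1, currentworker = $1 WHERE id IN (" +
		strings.Join(marks, ", ") + ")"
	return q, args
}

func main() {
	q, args := buildTick("worker-abc", []string{"id-1", "id-2"})
	fmt.Println(q)
	fmt.Println(args...)
}
```

In a worker, the query and arguments could be passed to ExecContext on a database/sql handle once per tick of a time.Ticker.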

Every second, the worker prints the list of IDs it is watching in the table. This list remains stable as long as neither the number of values in the table nor the number of available workers changes. If either changes, the IDs are redistributed using a consistent hashing algorithm, which minimizes the number of IDs that move between workers.
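
The redistribution property described above can be sketched with a small consistent-hash ring. This is a minimal illustration, not the project's implementation: the FNV-1a hash, the use of virtual nodes, and the names Ring, Add, Remove, and Owner are all assumptions.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
	"strconv"
)

// Ring is a minimal consistent-hash ring with virtual nodes.
type Ring struct {
	replicas int               // virtual nodes per worker
	hashes   []uint32          // sorted virtual-node hashes
	owner    map[uint32]string // virtual-node hash -> worker name
}

func NewRing(replicas int) *Ring {
	return &Ring{replicas: replicas, owner: make(map[uint32]string)}
}

func hashKey(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// Add places the worker's virtual nodes on the ring.
func (r *Ring) Add(worker string) {
	for i := 0; i < r.replicas; i++ {
		h := hashKey(worker + "#" + strconv.Itoa(i))
		if _, taken := r.owner[h]; taken {
			continue // hash collision: keep the existing owner
		}
		r.owner[h] = worker
		r.hashes = append(r.hashes, h)
	}
	sort.Slice(r.hashes, func(i, j int) bool { return r.hashes[i] < r.hashes[j] })
}

// Remove deletes the worker's virtual nodes; other nodes stay in place.
func (r *Ring) Remove(worker string) {
	kept := r.hashes[:0]
	for _, h := range r.hashes {
		if r.owner[h] == worker {
			delete(r.owner, h)
			continue
		}
		kept = append(kept, h)
	}
	r.hashes = kept
}

// Owner returns the worker at the first virtual node clockwise from id's hash.
func (r *Ring) Owner(id string) string {
	if len(r.hashes) == 0 {
		return ""
	}
	h := hashKey(id)
	i := sort.Search(len(r.hashes), func(i int) bool { return r.hashes[i] >= h })
	if i == len(r.hashes) {
		i = 0 // wrap around the ring
	}
	return r.owner[r.hashes[i]]
}

func main() {
	r := NewRing(100)
	for _, w := range []string{"worker-0", "worker-1", "worker-2"} {
		r.Add(w)
	}
	before := make(map[string]string)
	for i := 0; i < 1000; i++ {
		id := "row-" + strconv.Itoa(i)
		before[id] = r.Owner(id)
	}
	r.Remove("worker-1")
	moved := 0
	for id, w := range before {
		if r.Owner(id) != w {
			moved++
		}
	}
	fmt.Printf("%d of %d rows changed owner\n", moved, len(before))
}
```

Only rows that belonged to the removed worker change owner; every other row keeps its assignment, which is why each worker's printed ID list stays mostly stable across scaling events.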

Structure

➜  notifi-scaler git:(main) ✗ tree
.
├── Chart.yaml
├── README.md
├── images
│   ├── control-plane
│   │   ├── Dockerfile
│   │   └── main.go
│   └── worker
│       ├── Dockerfile
│       └── main.go
├── static
│   └── architecture.png
├── templates
│   ├── control-plane-deployment.yaml
│   ├── control-plane-service.yaml
│   ├── db-configmap.yaml
│   ├── db-pv.yaml
│   ├── db-pvc.yaml
│   ├── db-secret.yaml
│   ├── db-service.yaml
│   ├── db-statefulset.yaml
│   ├── db-storageclass.yaml
│   ├── redis-deployment.yaml
│   ├── redis-service.yaml
│   ├── worker-deployment.yaml
│   └── worker-service.yaml
└── values.yaml

Requirements

Helm v3.x
Kubernetes v1.27.x

Architecture

Quick Start

Deploy

git clone https://github.com/yourusername/notifi-scaler.git
cd notifi-scaler
helm install notifi-scaler .

Monitor

# k8s
while true; do clear; kubectl get pods; sleep 1; done

# control-plane
while true; do kubectl logs $(kubectl get pods -l app=control-plane -o jsonpath='{.items[0].metadata.name}'); sleep 1; done

# all workers
for hash in $(kubectl get pods -l app=worker -o=jsonpath='{.items[*].metadata.labels.pod-template-hash}'); do kubectl logs -l app=worker,pod-template-hash=$hash --all-containers=true; done

# single worker
kubectl logs <worker-pod-name>

# database
while true; do clear; kubectl exec -it postgres-0 -- psql -U user -d mydatabase -c "SELECT * FROM public.work_items ORDER BY currentWorker;"; sleep 1; done

# redis
POD_NAME=$(kubectl get pods -l app=redis -o=jsonpath='{.items[0].metadata.name}' -n default) && while true; do clear; kubectl exec -it "$POD_NAME" -- redis-cli hgetall workers; sleep 1; done

Scale

helm upgrade notifi-scaler . --set worker.replicaCount=<number-of-replicas>

Areas of Improvement

Concurrency is currently handled with atomic operations in Redis. This is only a first line of defense against issues like race conditions and deadlocks. More robust protection would require additional synchronization, such as locks at the Go or process level, or database-level transactions.
During testing, both the control plane and worker nodes demonstrated their ability to handle sudden spikes in workload. However, if we want to be fully prepared for extreme variations in load, we should implement rate-limiting as well as back-off and retry strategies.
For resilience, we must incorporate failover mechanisms for our key components (control plane, Redis, and Postgres). This may involve setting up the control plane in a clustered manner and enabling data replication for the databases, making our architecture much more resistant to individual component failures.
Currently, we have basic monitoring and logging in place, but we could greatly benefit from a more comprehensive setup that provides real-time insights into system performance and health.
On the security front, it is crucial to secure communication channels with SSL/TLS encryption and enforce Role-Based Access Control (RBAC) for system interactions. This will enhance system integrity.
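
The back-off and retry strategy mentioned above could look roughly like the following sketch, which uses exponential back-off with random jitter. It is not part of the project; the names retry and errTransient and all delay values are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// errTransient stands in for a temporary failure, e.g. a dropped connection.
var errTransient = errors.New("transient failure")

// retry calls op up to attempts times. Between failures it sleeps for a delay
// that starts at base and doubles each time (capped at maxDelay), plus up to
// 50% random jitter so that many workers do not retry in lockstep.
func retry(attempts int, base, maxDelay time.Duration, op func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	delay := base
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		if i == attempts-1 {
			break // no sleep after the final attempt
		}
		jitter := time.Duration(rand.Int63n(int64(delay)/2 + 1))
		time.Sleep(delay + jitter)
		if delay *= 2; delay > maxDelay {
			delay = maxDelay
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := retry(5, 10*time.Millisecond, 200*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient // fail the first two calls
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

A wrapper like this would fit around the workers' Redis and Postgres calls, so that a brief outage slows processing down instead of failing it outright.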
