A curated list of awesome big data frameworks, resources and other awesomeness.
Updated May 7, 2024
Apache Kafka® running on Kubernetes
A lightweight stream processing library for Go
Probabilistic data structures for processing continuous, unbounded streams.
NIST Certified SCAP 1.2 toolkit
An open-source framework for real-time anomaly detection using Python, Elasticsearch and Kibana
A stream processing API for Go (alpha)
Series and Panels for Real-time and Exploratory Analysis of Data Streams
Conduit streams data between data stores. Kafka Connect replacement. No JVM required.
Public tracker for Scramjet Cloud Platform, a platform that brings data from many environments together.
Data stream analytics: Implement online learning methods to address concept drift and model drift in data streams using the River library. Code for the paper entitled "PWPAE: An Ensemble Framework for Concept Drift Adaptation in IoT Data Streams" published in IEEE GlobeCom 2021.
The Open Source Time-Series Data Historian
Kafka-ML: connecting the data stream with ML/AI frameworks (now TensorFlow and PyTorch!)
The Tornado 🌪️ framework, designed and implemented for adaptive online learning and data stream mining in Python.
Probabilistic deep learning for data streams.
Event Based Applications [DEPRECATED]
Full-stack, highly scalable, cloud-native machine learning system for demand forecasting with real-time data streaming, inference, a retraining loop, and more
An online learning method used to address concept drift and model drift. Code for the paper entitled "A Lightweight Concept Drift Detection and Adaptation Framework for IoT Data Streams" published in IEEE Internet of Things Magazine.
Explore Apache Kafka data pipelines in Kubernetes.
Learn how to use Kinesis Firehose, AWS Glue, S3, and Amazon Athena by streaming and analyzing Reddit comments in real time. 100-200 level tutorial.