A Very Long, Never-Ending Learning around Data Engineering & Machine Learning

Weekly Digest

The Data Engineering

Level 0

Level 1

Gyaan

Infrastructure

Machine Learning

MLOPS

Project

Insightful

Paper

Distributed System

Crazy

The Snowflake Paper - Core idea is to build an enterprise-ready #datawarehouse solution for the #cloud 🎉📰📕
Most important points around Distributed #dataengineering Platform
Fundamentals of #distributedsystems Scaling - Avoiding Coordination 🎊♨️🔆
Technical Debt in #dataengineering #softwareengineering 🔕💡🔕
Paper on Wander Join: Online Aggregation via Random Walks (the Join Problem) 📃💭📑
The Delta Lake Paper - High-Performance ACID Table Storage 📋💡📋
Dynamo - Amazon's Highly Available Key-value Store #distributedsystem 💬💡🎉
An Efficient and Syntactically Idiomatic Approach to Management of Streams and Tables, A Single SQL for all 💡📩📩
Secure & Robust Machine Learning in #healthcare 💊🧪🥳
Progress in Medical Science using #deeplearning 💊💡💉
The Amazon Redshift Paper - A fast, fully managed, petabyte-scale data warehouse solution that makes it simple and cost-effective to efficiently analyze large volumes of data using existing #businessintelligence tools 📂📰💭
Advancing #drugdiscovery via Artificial Intelligence 💊🏥🏥
Apache Calcite is a dynamic data management framework 🎉📚🎉
Lakehouse - A Paper on a New Generation of #datawarehouse Technology 💡🔎💡
Calvin: Fast Distributed Transactions for Partitioned Database Systems 📝📝
Presto or Trino - #SQL on Everything (The Design, Motivation & Performance) #presto 💭🎊💡
Design - Exactly Once Delivery & Transactional Messaging in Apache Kafka
Apache Kafka Paper: A Distributed Messaging System for Log Processing
Paper: Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size
Paper: Ground is an open-source data context service, a system to manage all the information that informs the use of data
Azure Data Lake Store (ADLS) is a fully managed, elastic, scalable, and secure file system that supports #hadoop Distributed File System (HDFS) and Cosmos semantics
An LFU (Least Frequently Used) Cache Eviction Algorithm with O(1) Runtime Complexity
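
The O(1) LFU idea in the last entry can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the class name `LFUCache` and its methods are hypothetical, and it uses a dictionary of `OrderedDict` frequency buckets in place of hand-built linked lists. Ties within the lowest frequency are broken by evicting the key that entered that bucket first, which is one possible policy among several.

```python
from collections import OrderedDict, defaultdict


class LFUCache:
    """Sketch of an LFU cache with O(1) average-time get and put (hypothetical name)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.values = {}                          # key -> cached value
        self.freqs = {}                           # key -> access count
        self.buckets = defaultdict(OrderedDict)   # access count -> keys, oldest first
        self.min_freq = 0                         # lowest access count currently present

    def _touch(self, key) -> None:
        """Move a key from its current frequency bucket to the next one."""
        freq = self.freqs[key]
        del self.buckets[freq][key]
        if not self.buckets[freq]:
            del self.buckets[freq]
            if self.min_freq == freq:
                # The key being moved was the last one at the minimum frequency.
                self.min_freq = freq + 1
        self.freqs[key] = freq + 1
        self.buckets[freq + 1][key] = None

    def get(self, key, default=None):
        if key not in self.values:
            return default
        self._touch(key)
        return self.values[key]

    def put(self, key, value) -> None:
        if self.capacity <= 0:
            return
        if key in self.values:
            # Updating an existing key counts as an access.
            self.values[key] = value
            self._touch(key)
            return
        if len(self.values) >= self.capacity:
            # Evict the oldest key in the lowest-frequency bucket.
            evicted, _ = self.buckets[self.min_freq].popitem(last=False)
            if not self.buckets[self.min_freq]:
                del self.buckets[self.min_freq]
            del self.values[evicted]
            del self.freqs[evicted]
        self.values[key] = value
        self.freqs[key] = 1
        self.buckets[1][key] = None
        self.min_freq = 1  # a new key always has the lowest possible count


cache = LFUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" now has a higher access count than "b"
cache.put("c", 3)    # cache is full, so the least frequently used key is evicted
```

Every operation touches a fixed number of dictionary entries, which is where the constant-time bound comes from; no scan over all keys is needed to find the eviction victim because `min_freq` is maintained incrementally.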

NA

Cloud




DaniHBV/around-dataengineering
