This is a simple aggregation of the "Awesome --Topic name--" GitHub repos I have found to date and consider useful for getting started in the corresponding topic.
The topics cover the data lifecycle, machine learning, deep learning research and some distributed computing.
Note: Not all of these links are actively maintained, but some may still serve as good starting points. Certain topics have multiple lists, which may or may not share links; I have added them under the topic with a serial number, in no particular order.
- Math
- Data augmentation
- Multitask learning
- Diffusion models
- Self supervised learning
- Semi supervised learning
- Weakly Supervised Learning
- Learning with Label noise
- Adversarial ML/DL
- Architecture Search
- Contrastive self supervised learning
- Zero shot learning
- One shot learning
- Few shot learning
- Siamese networks
- Image Classification
- Contrastive learning
- Visual transformers
- Transformers for vision
- Transformers in Medical Imaging
- Transformers
- OpenSetRecognition list
- Incremental learning
- Meta Learning
- Deep learning uncertainty
- Semantic segmentation
- Instance Segmentation
- Box supervised Instance Segmentation
- Referring Segmentation
- Image Delineation
- Long tailed recognition/learning
- Image matting
- Image inpainting
- Image Harmonization
- Face
- Conformal predictions
- Scene understanding
- Panoptic Segmentation
- Object Tracking and Detection
- Image Denoising
- Image Distortion correction
- Image Deblurring
- Continual Learning/Lifelong learning
- Multimodal learning
- Multimodal Conversational AI
- Active learning
- Deep Reinforcement Learning
- Knowledge Distillation
- Anomaly detection
- Local Global Descriptors
- Image Captioning
- Image to Image translation
- Image Registration
- Text To Speech
- Text to Image synthesis
- Deep HDR Image synthesis
- Speech recognition & synthesis
- Speaker Diarization
- Video Analysis
- Video Generation
- Pose Estimation
- Machine translation
- Visual Question Answering (VQA)
- Question Answering (QA)
- Vision and Language Pretrained models
- Vision-Language Navigation
- Explainable Graph Reasoning
- Knowledge Graph Question Answering
- Sentence embeddings
- Embedding Models
- Text Summarization
- Optical Character Recognition (OCR)
- Document understanding
- Graph Neural Networks (GNN)
- Generative Adversarial Networks (GAN)
- AI Art Image Synthesis
- Generative Modelling
- Optical Flow
- 360 vision
- Image Alignment and Stitching
- Energy based models
- Decision Trees
- Gradient Boosting Models
- Metric Learning
- Recommendation Systems
- Point Cloud Analysis
- Pruning
- Neural Ordinary Differential Equations (ODE)
- Autonomous Vehicles (FSD)
- Robotics
- Curriculum Learning
- Causal Inference/ML
- Satellite Image Deep Learning
- Transfer learning, Domain Adaptation & Domain Generalization
- Image Restoration
- Variational Autoencoders
- Bayesian inference/Bayesian DL
- Deep Geometrical Learning
- Drug Discovery
- Representation Learning
- Disentangled Representations
- Time series
- Unsupervised/Weakly supervised learning
- Neural Rendering
- Neural Radiance Fields
- Neural Art & Style Transfer
- Deepfakes
- Makeup transfer
- Audio/Music
- Inverse Graphics
- Model Quantization
- Quantum Machine Learning
- Game AI
- Biomedical ML
- Financial ML/Quantitative Finance
- Embodied Vision
- Embedded and Mobile ML
- 3D Machine Learning
- AutoML
- Chatbot projects
- Conversational AI
- AI for climate change
- Federated Learning
- Relational Extraction
- Attention Mechanisms in Computer Vision
- Masked Image Modelling
- Table Recognition
- Imbalanced Learning
- Music Generation
- 3D Generation
- Small Molecule - Drug Discovery ML
- Low light Image Enhancement
- DehazeZoo
- ML for cybersecurity
- Various Attention implementations in PyTorch
- Transformer Attention
- Radiology Report generation
- Vision and Language
- Machine Learning for Combinatorial Optimization
- Dataset Distillation/Condensation
- Virtual Try on
- Protein Design using DL
- Novel Class Discovery
- Music Informatics
- Discovery of Physical Laws
- CLIP (Contrastive Language-Image Pre-Training)
- Content Based Image Retrieval
- Topic Modeling
- Vector Search
- Clinical NLP
- Medical Coding NLP
- Content Moderation
- Pretrained Chinese NLP models
- Computer Vision in the Wild
- Mixture of Experts
- Reasoning in Large Language Models
- LLM Robotics
- Networks Beyond Attention
- Matching tasks
- Semantic Search
- GAN Inversion
- Dynamic Graph Learning
- Molecular Generation
- Failed ML
- Generative AI Applications
- ChatGPT
- Prompt Engineering
- Generative AI
- Conditional Content Generation
- Large Language Models (LLM)
- LLM Robotics
- LangChain
- Reinforcement Learning from Human Feedback (RLHF)
- LLMOps
- Instruction Dataset
- Totally open ChatGPT
- 6D Object Pose Estimation and Reconstruction
- Awesome Anything
- Pretrained models for Information Retrieval
- AI Tools for Game development
- Human Video Generation
- Medical Vision-Language Models
- Video Diffusion
- Multimodal Large Language Models
- Data collection search engines
- Deep Learning Drizzle
- Deep Vision
- Deep Learning
- Machine Learning
- Computer vision
- Image processing applications with OpenCV
- DL papers
- DL paper reading roadmap
- ML Acronyms
- Applied Deep Learning
- NLP
- Jupyter
- GPT-3
- Made With ML
- Guide to PyTorch Learning Rate Scheduling
- PyTorch
- TensorFlow
- JAX
- 500 AI ML projects
- TensorFlow.js
- HuggingFace
- FastAI
- NLP recipes/Best practices
- Computer vision recipes/Best practices
- Time Series Forecasting - Best practices
- Access and modify layers of pretrained models in PyTorch
- ML in Digital Signal Processing
- Paper and literature search websites, twitter accounts
- Math for ML
- ML Papers explained
- Transfer Learning
- Data Collection
- Public APIs
- Robotics Datasets
- Medical Imaging Datasets
- Robotic Tooling
- Public Datasets
- Web Scraping
- Data Engineering
- Feature Engineering
- SQL
- NoSQL
- DynamoDB
- MongoDB
- Firebase
- Redis
- Apache Spark
- BigQuery
- BigData
- Workflow Engines/Pipelines/DAG schedulers
- Pandas
- Dask
- Graph Databases
- Data Annotation
- Data Visualization
- Tools to Design or Visualize Architecture of Neural Network
- Power BI
- Network analysis
- Software Engineering for Machine Learning
- Scikit Learn
- Production level ML/DL
- ML system design
- Applied ML
- ML Infrastructure
- MLOps
- AIOps
- Explainable AI (XAI)
- ML Interpretability
- Fairness in AI
- Kaggle Solutions
- Ethical AI Guidelines
- SageMaker
- CoreML models
- Terraform
- Databricks
- Company blogs on their ML stacks
- Colab notebooks
- ML Libraries and tooling
- ML Design Patterns
- Distributed ML Patterns
- Deep Learning Design Patterns
- PyTorch Computer Vision Cookbook - 70 recipes
- System Design
- RESTful API
- DevOps Exercises
- Webhooks
- ASGI
- Flask
- Django
- FastAPI
- Distributed systems
- Python
- GraphQL
- Scientific Python
- CUDA
- Parallel Computing
- Docker
- Docker Compose
- Kubernetes
- Ansible
- High Performance Computing (HPC)
- Microservices
- Low Level Design Primer
- System Design Primer
- Design Patterns
- Software Architecture
- Asyncio Python
- Knowledge graphs
- gRPC
- Peer-to-Peer
- RabbitMQ
- Kafka
- ML-tooling: Best of Python
- Amazon Web Services (AWS)
- Google Cloud Platform (GCP) Certifications
- Azure
- DigitalOcean (DO)
Note: These lists cover general concepts in software architecture. They may or may not be directly related to data lifecycle, ML and DL topics.
- Full Text Search
- MapReduce
- Build Your Own X
- Falsehoods
- CTO
- Raspberry Pi
- Creative Coding
- Site Reliability Engineering (SRE)
- Web Security
- Chaos Engineering
- Elasticsearch
- Database Internals
- Cryptography
- Tech Stacks
- Geographic Information System (GIS)
- GeoSpatial Tools
- IoT
- Quantum Computing
- Regex
- CS Video courses
- ML-YouTube Courses
- AI Expert Roadmap
- Hacking
- Concepts and Laws
- CS books and Digests
- Self Hosted SaaS solutions
- Search
- UI
Check CONTRIBUTING.md
- Convert README to a tabular format
- Rearrange the topics in alphabetical order
- Add definitions for tasks if needed