Programming skills are crucial for an MLOps engineer. Python is the most commonly used language in machine learning, making it important for collaboration with machine learning engineers and data scientists.
Start learning Python through books and practice.
- Tutorial Suggestion: https://realpython.com
- Book Suggestion: Python Crash Course, 3rd Edition by Eric Matthes
- Code Practice: LeetCode Python Problems
- Courses: Learn Python 3
- Track Suggestions: Python fundamentals, Python programming
- Important Topics: Installing Python, using virtual environments, IDE usage, Python basics, Pytest, Packaging
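As a small illustration of the Pytest topic listed above, here is a minimal sketch of a test module. The file name `test_metrics.py` and the `accuracy` helper are hypothetical and exist only for this example. Pytest collects functions whose names start with `test_` and reports any failing `assert`.

```python
# test_metrics.py: hypothetical file name; pytest collects files named test_*.py


def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


def test_accuracy_all_correct():
    assert accuracy([1, 0, 1], [1, 0, 1]) == 1.0


def test_accuracy_half_correct():
    assert accuracy([1, 0, 1, 0], [1, 1, 1, 1]) == 0.5


def test_accuracy_rejects_length_mismatch():
    # Plain try/except keeps this sketch free of third-party imports.
    try:
        accuracy([1, 0], [1])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

With Pytest installed in the active virtual environment, running `pytest` in the directory containing this file should discover and run the three tests. In a real project the last test would more commonly use `pytest.raises(ValueError)`.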
Understanding Bash and the command line is essential for MLOps tasks.
- Book Suggestion: The Linux Command Line, 2nd Edition by William E. Shotts
- Course: Bash mastery
- Tutorials: VIM beginners guide, VIM adventures, VIM by Daniel Miessler
Containerization and container orchestration technologies are vital in modern software engineering.
Docker is a popular open-source containerization platform used in MLOps.
- Docker Roadmap: Roadmap.sh Docker
- Tutorial: Full docker tutorial by Techworld by Nana
Kubernetes is essential for machine learning model training and deployment.
- Kubernetes Roadmap: Roadmap.sh Kubernetes
- Tutorial: Kubernetes course by freecodecamp.com
- Course: Kubernetes mastery
- Tool: K9s CLI for Kubernetes
An MLOps engineer should have a basic understanding of machine learning models.
- Courses: MLCourse.ai, Fast.ai
- Book Suggestion: Applied Machine Learning and AI for Engineers by Jeff Prosise
Awareness of MLOps principles and maturity factors is required.
- Books:
  - Designing Machine Learning Systems by Chip Huyen
  - Introducing MLOps by Mark Treveil and Dataiku
- Assessment: MLOps maturity assessment
- Great resource on MLOps: ml-ops.org
MLOps platforms consist of various components, from version control to feature stores.
Version control and CI/CD pipelines are critical for traceable and reproducible ML model deployments.
- Books:
  - Learning GitHub Actions by Brent Laster
  - Learning Git by Anna Skoulikari
- Tutorials & Courses: Git & GitHub for beginners, Python to Production guide, Version Control Missing Semester, https://learngitbranching.js.org/
- Tool: Pre-commit hooks
Orchestration systems like Airflow and Mage are important in ML engineering.
- Course: Introduction to Airflow in Python
- Note: Airflow is also featured in the ML Engineering with Python book and The Full Stack 7-Steps MLOps Framework.
Experiment tracking means logging the metadata, parameters, and artifacts of training runs.
- Tool: MLflow
- Courses: MLflow Udemy course, End-to-end machine learning (MLflow piece)
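To make the idea of logging parameters, metrics, and artifacts for a training run concrete, here is a toy sketch that uses only the Python standard library. It is not MLflow's API. The `log_run` and `load_run` helpers and the on-disk layout are hypothetical and only illustrate the kind of record a tracking tool keeps for each run.

```python
import json
import tempfile
import time
import uuid
from pathlib import Path


def log_run(base_dir, params, metrics, artifact_text):
    """Write one run's parameters, metrics, and a text artifact to disk."""
    run_id = uuid.uuid4().hex  # unique identifier for this run
    run_dir = Path(base_dir) / run_id
    (run_dir / "artifacts").mkdir(parents=True)

    record = {
        "run_id": run_id,
        "logged_at": time.time(),  # metadata: when the run was logged
        "params": params,          # inputs, e.g. hyperparameters
        "metrics": metrics,        # outputs, e.g. validation scores
    }
    (run_dir / "run.json").write_text(json.dumps(record, indent=2))
    # Stand-in for a real artifact such as a serialized model file.
    (run_dir / "artifacts" / "model.txt").write_text(artifact_text)
    return run_dir


def load_run(run_dir):
    """Read back the JSON record written by log_run."""
    return json.loads((Path(run_dir) / "run.json").read_text())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = log_run(
            tmp,
            params={"learning_rate": 0.01, "epochs": 5},
            metrics={"val_accuracy": 0.91},  # made-up value for illustration
            artifact_text="placeholder model weights",
        )
        print(load_run(run_dir)["params"])
```

A dedicated tool such as MLflow adds what this sketch leaves out, for example a UI for comparing runs and a shared storage backend, which is why the resources above focus on it rather than on hand-rolled logging.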
Feature stores are a crucial component of MLOps infrastructure.
- Tutorial: Creating a feature store with Feast (Part 1, Part 2, Part 3)
- Tool: DVC for data tracking
- Course: End-to-end machine learning (DVC piece)
Model deployment and serving decisions depend on the organization's infrastructure.
- Repository Suggestion: ML Deployment k8s Fast API
- Tutorial Suggestions: ML deployment with k8s FastAPI, Building an ML app with FastAPI, Basic Kubeflow pipeline, Building and deploying ML pipelines, KServe tutorial
Monitoring is vital for the health and performance of ML systems.
- Article: ML Monitoring vs Observability
- Courses: Machine learning monitoring concepts, Monitoring ML in Python
- Tools: Prometheus, Grafana
Infrastructure as code is essential for a reproducible MLOps framework.
- Course: Terraform course for beginners
- Video: 8 Terraform best practices by Techworld by Nana
- Book Suggestion: Terraform: Up and Running, 3rd Edition by Yevgeniy Brikman