This is a demo project for the 2023-24 editions of the Advanced Topics in Data Engineering II (TAED-2) and Machine Learning Systems in Production (MLOps) courses at the Universitat Politècnica de Catalunya-BarcelonaTech (UPC), Spain.
This project follows the structure proposed by Lanubile et al. [1].
| Milestone | Practice | Tools | Demo |
|---|---|---|---|
| Milestone 1 - Project inception | Selection of problem and requirements engineering for ML | Model and dataset cards (by Hugging Face) | |
| | Project coordination and communication | Taiga, Trello, Slack | |
| Milestone 2 - Model building: reproducibility | Project structure | Cookiecutter data science template | Project setup guide |
| | Code and data versioning | Git with GitHub Flow, DVC | Git demo, DVC demo |
| | Experiment tracking | MLflow | MLflow demo |
| Milestone 3 - Model building: QA | Energy efficiency awareness | CodeCarbon | CodeCarbon demo |
| | Quality assurance for ML (static analysis + testing data and model) | Pynblint (notebook + repository QA), Pylint or flake8, Pytest and Great Expectations | Pytest demo, Great Expectations demo |
| Milestone 4 - Model deployment: API | ML system design | Cloud platform selected by the students (VMs, Heroku, DigitalOcean, etc.) | Deployment guides |
| | APIs for ML | FastAPI and Pytest (to test the API endpoints) | FastAPI demo |
| Milestone 5 - Model packaging | Containers | Docker, Docker Compose, Kubernetes | Docker demo |
[1] F. Lanubile, S. Martínez-Fernández, and L. Quaranta, "Teaching MLOps in Higher Education through Project-Based Learning." SEET@ICSE 2023: 95-100. doi: 10.1109/ICSE-SEET58685.2023.00015.
[2] F. Lanubile, S. Martínez-Fernández, and L. Quaranta, "Training future ML engineers: a project-based course on MLOps." IEEE Software 2024. doi: 10.1109/MS.2023.3310768.