Implementing automatic data processing, model training and experiment tracking with Airflow and MLflow.

stsibikov/MLOps.Inception


HSE University, spring 2024

Implementing automatic data processing, model training and experiment tracking with Airflow and MLflow.

The goal of the course was to write a DAG that trains three different regression models and stores their code and metrics in an MLflow experiment.
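As a rough illustration of what "train a model and store its metrics in MLflow" can look like, here is a minimal sketch. It is not the repository's code: the function names, the choice of MAE, RMSE and R² as metrics, and the `"model"` artifact path are all assumptions made for the example. MLflow is imported inside the functions, a common pattern for Airflow task callables that keeps DAG parsing cheap.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R^2 for two equal-length sequences of numbers.

    Assumes y_true is not constant (otherwise R^2 is undefined).
    """
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": math.sqrt(ss_res / n),
        "r2": 1.0 - ss_res / ss_tot,
    }


def get_or_create_experiment(name):
    """Return the id of the MLflow experiment called `name`, creating it if needed."""
    import mlflow  # deferred import: only needed when the task actually runs

    experiment = mlflow.get_experiment_by_name(name)
    if experiment is not None:
        return experiment.experiment_id
    return mlflow.create_experiment(name)


def train_and_log(model, run_name, experiment_id, X_train, y_train, X_test, y_test):
    """Fit a scikit-learn style regressor and log it with its test metrics.

    Hypothetical helper: the argument names and logged metrics are assumptions.
    """
    import mlflow
    import mlflow.sklearn

    with mlflow.start_run(experiment_id=experiment_id, run_name=run_name):
        model.fit(X_train, y_train)
        metrics = regression_metrics(list(y_test), list(model.predict(X_test)))
        mlflow.log_metrics(metrics)               # one value per metric name
        mlflow.sklearn.log_model(model, "model")  # stores the fitted model as an artifact
    return metrics
```

A training task could then call, for example, `train_and_log(HistGradientBoostingRegressor(), "HistGB", experiment_id, ...)`, assuming HistGB refers to scikit-learn's `HistGradientBoostingRegressor`.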

The DAG:

  • Retrieves data from a locally running PostgreSQL server and stores it in S3
  • Retrieves the data from S3 and runs preprocessing, storing the results back in S3
  • Initializes the MLflow experiment and provides the experiment ID to the training tasks
  • Runs three model training tasks in parallel, each logging its model and regression metrics to MLflow
  • Saves the timestamps of the tasks to S3
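The task order above can be sketched as an Airflow DAG with a fan-out to the three training tasks and a fan-in to the final step. This is only an outline under stated assumptions: the DAG id, task ids and the two model names other than HistGB are hypothetical, the callables are placeholders, and the `schedule` argument assumes Airflow 2.4 or later. The actual DAG in the repository may be wired differently.

```python
from datetime import datetime

# Hypothetical model names: the README names only HistGB, the others are placeholders.
MODEL_NAMES = ["hist_gb", "model_b", "model_c"]


def train_task_ids(model_names):
    """Return one training task id per model, e.g. 'train_hist_gb'."""
    return [f"train_{name}" for name in model_names]


def build_dag(dag_id="train_regressors"):
    """Build the DAG; Airflow is imported here so the sketch can be read without it."""
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def not_implemented(**context):
        """Placeholder callable; the real task logic lives in the repository."""

    with DAG(
        dag_id=dag_id,                    # hypothetical DAG id
        start_date=datetime(2024, 1, 1),
        schedule=None,                    # no schedule: trigger runs manually
        catchup=False,
    ) as dag:

        def task(task_id):
            return PythonOperator(task_id=task_id, python_callable=not_implemented)

        extract = task("postgres_to_s3")
        preprocess = task("preprocess")
        init_experiment = task("init_experiment")
        train = [task(task_id) for task_id in train_task_ids(MODEL_NAMES)]
        save_timestamps = task("save_timestamps")

        # Linear steps, then fan out to the parallel training tasks and fan back in.
        extract >> preprocess >> init_experiment >> train >> save_timestamps

    return dag
```

In a real DAG file, the result would be assigned at module level (for example `dag = build_dag()`) so that the Airflow scheduler can discover it.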

DAG graph: DAG_graph

MLflow metrics of all models: MLflow_metrics

Metadata of one of the models - HistGB: MLflow_HistGB_artifacts

Code for the DAG is available here.
