# Human Activity Recognition with LSTM model (PyTorch) and MLFlow Tracking

This project follows the idea from LSTMs for Human Activity Recognition Time Series Classification, but my implementation is done with PyTorch instead of Keras. Moreover, a Docker environment and Python notebooks are provided for convenient experimentation and reproducibility.

For experiment tracking, I chose MLFlow because it's an end-to-end MLOps tool that supports a wide range of features, from experiment tracking and project packaging to model deployment. Additionally, it's an open-source project, which allows me to deploy a tracker service on my own server. Note that MLFlow supports auto logging only with PyTorch Lightning, so here I share some code for defining a dataset module and a neural network class following the PyTorch Lightning scheme.
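Below is a minimal sketch of what such a Lightning class can look like for this task. The class name, hidden size, and learning rate are illustrative assumptions, not the repo's exact code; the HAR windows themselves are 128 timesteps of 9 inertial signals with 6 activity classes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import pytorch_lightning as pl

class HARLitModule(pl.LightningModule):
    """LSTM classifier over windows of shape (batch, 128 timesteps, 9 signals)."""

    def __init__(self, n_features=9, n_hidden=16, n_classes=6, lr=1e-3):
        super().__init__()
        self.save_hyperparameters()
        self.lstm = nn.LSTM(n_features, n_hidden, batch_first=True)
        self.classifier = nn.Linear(n_hidden, n_classes)

    def forward(self, x):
        _, (h_n, _) = self.lstm(x)        # h_n: (num_layers, batch, n_hidden)
        return self.classifier(h_n[-1])   # classify from the last hidden state

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = F.cross_entropy(self(x), y)
        self.log("train_loss", loss)
        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        self.log("val_loss", F.cross_entropy(self(x), y))

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.hparams.lr)
```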
NOTE: Please see how to install Docker at Get Docker, and docker-compose at Install Docker Compose.
If you want to use an Nvidia GPU for training a model, follow the instruction steps at Nvidia Container Toolkit. Its runtime dependencies are required for a Docker container to access GPU resources.
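Once the Jupyterlab container is up (a later step), you can quickly verify from a notebook that PyTorch actually sees the GPU. This is just a sanity check, not part of the repo's notebooks:

```python
import torch

# True means PyTorch inside the container can see the GPU exposed by the
# nvidia runtime; False means training will silently fall back to the CPU.
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```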
WARNING: If your machine doesn't have a GPU, or you don't want to use it, you can comment out the lines below (config of the har_lstm_jupyterlab container) in docker-compose.yaml:

```yaml
reservations:
  devices:
    - driver: nvidia
      count: 1
      capabilities: [gpu]
```
- Create `.env` and insert the variables below. `UID` is the ID of the current user (check it by running the command `id`). `JUPYTER_CONTAINER_MEM_LIMIT` limits the memory usage of the Jupyterlab container.
```
UID=1000
JUPYTER_CONTAINER_MEM_LIMIT=8g
JUPYTER_EXPOSED_PORT=8888
MYSQL_DATABASE=mlflow
MYSQL_USER=mlflow
MYSQL_PASSWORD=mlflow
MYSQL_ROOT_PASSWORD=mlflow
MYSQL_DATA_PATH=/path/to/har-lstm/data/mlflow_db
MLFLOW_EXPOSED_PORT=5000
MLFLOW_ARTIFACTS_PATH=/path/to/har-lstm/data/mlflow_artifacts
```
Here, I use a Docker environment to run 3 services as containers: Jupyterlab, the MLFlow tracker service, and a SQL database. Docker Compose is used to conveniently manage those containers, and they are configured by `docker-compose.yaml`.
Let's build the images of each service:

```
$ docker-compose build
```
Credit: Thanks to MLFlow Docker Setup for the base `docker-compose.yaml` and `Dockerfile` for MLFlow.
- Download the dataset to the `./data` folder by running the script `scripts/download_dataset.sh`:

```
$ ./scripts/download_dataset.sh ./data
```
Then, you will have a folder `./data/har_dataset` like below:

```
data/har_dataset
├── activity_labels.txt
├── features_info.txt
├── features.txt
├── README.txt
├── test
└── train
```
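As a quick illustration of how this layout is typically consumed, here is a sketch assuming the standard UCI HAR structure, where each of the nine signal files under `train/Inertial Signals/` holds one 128-step window per row (the repo's notebooks may load it differently):

```python
import numpy as np

SIGNALS = [
    "body_acc_x", "body_acc_y", "body_acc_z",
    "body_gyro_x", "body_gyro_y", "body_gyro_z",
    "total_acc_x", "total_acc_y", "total_acc_z",
]

def load_split(split="train"):
    # Stack the nine per-signal files into an array of shape (n_samples, 128, 9).
    X = np.stack(
        [np.loadtxt(f"data/har_dataset/{split}/Inertial Signals/{s}_{split}.txt")
         for s in SIGNALS],
        axis=-1,
    )
    # Labels are 1..6 (WALKING, ..., LAYING); shift to 0..5 for PyTorch losses.
    y = np.loadtxt(f"data/har_dataset/{split}/y_{split}.txt").astype(int) - 1
    return X, y

X_train, y_train = load_split("train")
print(X_train.shape, y_train.shape)  # expected: (7352, 128, 9) (7352,)
```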
- Create the other required folders for the database and MLFlow services:

```
$ mkdir -p ./data/mlflow_db ./data/mlflow_artifacts
```
- Edit `config/jupyter/jupyter_notebook_config.py` on your own. For example, set a password hash for Jupyterlab access by running this Python code inside the Jupyterlab container:

```python
from notebook.auth import passwd; passwd()
```

After that, place the hash string in the config file:

```python
# config/jupyter/jupyter_notebook_config.py
c.NotebookApp.password = 'sha1:place:yourstring'
```
NOTE: Currently, the default password is `jupyter`.
- Start all containers:

```
$ docker-compose up -d
```
- Check if all services are ready:

```
$ docker-compose ps -a
```

You should see that all containers are up, like below:

```
❯ docker-compose ps -a
NAME                  COMMAND                  SERVICE      STATUS               PORTS
har_lstm_jupyterlab   "/bin/bash -c 'addus…"   jupyterlab   running              0.0.0.0:8888->8888/tcp, :::8888->8888/tcp
mlflow_db             "/entrypoint.sh mysq…"   mlflow_db    running (starting)   33060/tcp
mlflow_tracker        "bash ./wait-for-it.…"   mlflow       running              0.0.0.0:5000->5000/tcp, :::5000->5000/tcp
```
- After that, you can access Jupyterlab at `http://localhost:8888` (or another port, if you've changed it in the configuration steps above) with the default password `jupyter`.
- For the MLFlow tracker, you can access it at `http://localhost:5000`.
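To confirm the tracker is reachable, a minimal smoke test like the following can be run from your host machine (the experiment name is hypothetical, and the URI assumes the default `MLFLOW_EXPOSED_PORT` above; from inside a container you'd use the compose service hostname instead of `localhost`):

```python
import mlflow

mlflow.set_tracking_uri("http://localhost:5000")  # default MLFLOW_EXPOSED_PORT
mlflow.set_experiment("smoke-test")               # hypothetical experiment name

with mlflow.start_run():
    mlflow.log_param("hello", "world")
    mlflow.log_metric("dummy_metric", 0.5)
print("Logged a dummy run; it should now appear in the MLFlow UI.")
```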
You might want to explore and visualize the data. You can do so using the `exploration.ipynb` notebook.
- Run the code in `train_lstm.ipynb` from a browser, or quickly run the command below inside the Jupyterlab container:

```
$ docker exec -it har_lstm_jupyterlab /bin/bash -c "jupyter nbconvert --execute /workspace/train_lstm.ipynb"
```

NOTE: You can use the `--to notebook` option to output a result notebook (`train_lstm.nbconvert.ipynb`), which shows the train/val loss during training.

```
$ docker exec -it har_lstm_jupyterlab /bin/bash -c "jupyter nbconvert --execute --to notebook /workspace/train_lstm.ipynb"
```
- A model file will be saved at `har_lstm_16_ep50_std.pt`.
- Here is the train/val loss chart produced during training with this notebook.
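The saved file can later be reloaded for inference. Here is a minimal sketch, assuming the notebook saved the model's `state_dict` and reusing the hypothetical `HARLitModule` class from the earlier Lightning sketch (both are assumptions; the actual save format is defined in the notebook):

```python
import torch

# HARLitModule is the hypothetical class from the earlier sketch; this assumes
# the notebook saved a state_dict rather than a whole pickled module.
model = HARLitModule(n_features=9, n_hidden=16, n_classes=6)
model.load_state_dict(torch.load("har_lstm_16_ep50_std.pt", map_location="cpu"))
model.eval()

with torch.no_grad():
    window = torch.randn(1, 128, 9)        # one dummy window: 128 steps x 9 signals
    pred = model(window).argmax(dim=1)
print(pred)
```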
Run the code in `train_lstm_mlflow.ipynb`, or quickly run the command below inside the Jupyterlab container. Unlike `train_lstm.ipynb`, this notebook trains an LSTM model and sends metrics during training to the MLFlow tracker service.

```
$ docker exec -it har_lstm_jupyterlab /bin/bash -c "jupyter nbconvert --execute /workspace/train_lstm_mlflow.ipynb"
```

Please note that, in this notebook, the model is trained with the PyTorch Lightning package. Currently, MLFlow auto logging only supports training with PyTorch Lightning.
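The wiring for that auto logging is roughly as follows. This is a sketch under the assumption that the model follows the earlier Lightning sketch; `HARLitModule`, `train_loader`, and `val_loader` are hypothetical names, not the notebook's exact code:

```python
import mlflow
import mlflow.pytorch
import pytorch_lightning as pl

mlflow.set_tracking_uri("http://localhost:5000")  # tracker address; adjust inside the compose network
mlflow.pytorch.autolog()  # hooks into Lightning training to log params, metrics, and the model

model = HARLitModule()                            # hypothetical class from the earlier sketch
trainer = pl.Trainer(max_epochs=50, accelerator="auto")

with mlflow.start_run():
    # train_loader / val_loader are assumed DataLoaders built from the arrays
    # shown in the dataset-loading sketch above.
    trainer.fit(model, train_dataloaders=train_loader, val_dataloaders=val_loader)
```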
- Here are some screenshots from the MLFlow Web UI after training a model.
Thanks to these projects and articles; they helped me understand basic LSTM usage in PyTorch, PyTorch Lightning, and MLFlow integration.