The goal of this project is to deploy a sentiment analysis model following MLOps best practices. The base model is RoBERTa, which is fine-tuned on an Amazon reviews dataset.
You can find the cards here:
- Clone the repository
git clone https://github.com/MLOps-essi-upc/MLOps-SentiBites.git
- Install requirements
pip install -r requirements.txt
- Pull data from DagsHub
dvc pull -r origin
- Launch the tests
pytest tests/
You have two options:
Pull the latest image from our Docker Hub repo (recommended):
- Pull the image
docker pull rudiio/sentibites:latest
- Run the container
docker run -p 5000:5000 -p 8000:8000 rudiio/sentibites:latest
Build a new Docker image on your machine:
- Build the project with the instructions given in the previous section
- Create the Docker image
docker build -t sentibites:x.x .
- Run the container
docker run -p 5000:5000 -p 8000:8000 sentibites:x.x
For training:
python3 src/models/train_model.py --model "roberta-base" --dataset data/processed --output_dir run1 --logging_dir logs --epochs 1 --learning_rate 0.001 --weight_decay 0.005
For inference:
python3 src/models/predict_model.py --model "models/SentiBites" --input "text"
For evaluation:
python3 src/models/evaluate_model.py
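The flags passed to the training command above could be parsed along the following lines. This is only a sketch inferred from the flags shown in that command: the function name `build_parser`, the argument types, and the defaults (copied from the example command) are assumptions, and the actual `src/models/train_model.py` may define them differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags of the training command."""
    parser = argparse.ArgumentParser(description="Fine-tune a sentiment model")
    # Defaults are copied from the example command, not from the real script.
    parser.add_argument("--model", default="roberta-base")
    parser.add_argument("--dataset", default="data/processed")
    parser.add_argument("--output_dir", default="run1")
    parser.add_argument("--logging_dir", default="logs")
    parser.add_argument("--epochs", type=int, default=1)
    parser.add_argument("--learning_rate", type=float, default=0.001)
    parser.add_argument("--weight_decay", type=float, default=0.005)
    return parser


# Parse an explicit argument list instead of sys.argv for illustration.
args = build_parser().parse_args(["--epochs", "3", "--output_dir", "run2"])
print(args.epochs, args.output_dir, args.learning_rate)
```

Any flag that is omitted falls back to the value used in the example command.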
You can run the app locally with the following command:
python3 run_servers
This runs the backend on localhost:8000 and the frontend on localhost:5000.
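To confirm that both servers came up, a small check like the one below can help. It is a sketch using only the standard library; the helper names `is_port_open` and `check_servers` are hypothetical, and the ports are the ones stated above (8000 for the backend, 5000 for the frontend).

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_servers() -> dict:
    """Report whether the backend and frontend ports accept connections."""
    return {
        "backend": is_port_open("localhost", 8000),
        "frontend": is_port_open("localhost", 5000),
    }
```

Calling `check_servers()` after launching the app returns a dictionary with one boolean per server. A successful connection only shows that something is listening on the port, not that the app itself is healthy.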
Here is a preview of our application:
Data processing stage:
dvc stage add -n data-processing \
-d src/data/make_dataset.py \
-d data/raw/Reviews.csv \
-o data/processed/train.csv \
-o data/processed/test.csv \
python3 src/data/make_dataset.py
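The `-d` flags declare the stage's dependencies, and DVC re-runs a stage only when the content hashes recorded for those dependencies no longer match the files on disk. The sketch below illustrates that idea in plain Python. It is not DVC's actual implementation, and `file_md5` and `stage_is_stale` are hypothetical helper names.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable


def file_md5(path: Path) -> str:
    """Hash a file's content in chunks so large files are not loaded at once."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stage_is_stale(deps: Iterable[Path], recorded: Dict[str, str]) -> bool:
    """True if any dependency is absent from the record or its hash differs."""
    return any(recorded.get(str(dep)) != file_md5(dep) for dep in deps)
```

With the data processing stage, `deps` would be `src/data/make_dataset.py` and `data/raw/Reviews.csv`; editing either file changes its hash, so the stage would be considered out of date.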
Training stage:
dvc stage add -n training \
-d src/models/train_model.py \
-d data/processed/train.csv \
-d data/processed/test.csv \
-o models/SentiBites \
-o metrics/emissions.csv \
python3 src/models/train_model.py --model "roberta-base" --dataset data/processed \
--output_dir SentiBites1 --logging_dir logs --epochs 1 --learning_rate 0.001 --weight_decay 0.005
Evaluation stage:
dvc stage add -n evaluation \
-d src/models/evaluate_model.py \
-d data/processed/test.csv \
-d models/SentiBites \
-m metrics/evaluation_scores.csv \
python3 src/models/evaluate_model.py --model models/SentiBites --dataset data/processed \