This project focuses less on training and fine-tuning a model and more on packaging it for deployment to end users. The dataset used is the First GOP Debate Twitter Sentiment dataset from Kaggle. The goal of the deep learning model is to label a tweet as having positive, neutral, or negative sentiment. Most of the model training is performed in DeployML.ipynb and is summarized below, alongside the extra steps needed to package and deploy the model.
In Python 3.9.2, the packages we'll be using are:

- Model Training
  - TensorFlow (version 2.0.0)
    - NOTE: For deployment on Heroku, tensorflow-cpu should be installed instead to reduce the size of the application
  - Pandas
  - NumPy (version 1.19.2)
  - scikit-learn
  - re
  - h5py
- Deployment
  - fastapi
  - uvicorn
  - python-multipart
  - pydantic
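For Heroku, the dependencies above also need to be pinned in a `requirements.txt` at the repository root. A plausible version of that file is sketched below; only the TensorFlow and NumPy versions come from the list above, the rest are left unpinned as assumptions:

```
tensorflow-cpu==2.0.0
numpy==1.19.2
pandas
scikit-learn
h5py
fastapi
uvicorn
python-multipart
pydantic
```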
Explore DeployML.ipynb for an in-depth look at how the tweets are cleaned up, tokenized, and transformed into equal-length sequences, as well as the architecture and training of a basic model.
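As a rough illustration of that preprocessing step (the regexes and the sequence length of 28 are illustrative assumptions, not the notebook's exact values), cleaning a tweet and padding it to a fixed length might look like:

```python
import re

MAX_LEN = 28  # assumed maximum sequence length; the notebook's value may differ

def clean_tweet(text: str) -> str:
    """Lowercase a tweet and strip mentions, URLs, and punctuation."""
    text = text.lower()
    text = re.sub(r"@\w+|https?://\S+", " ", text)  # drop @mentions and links
    text = re.sub(r"[^a-z\s]", " ", text)           # keep only letters and spaces
    return re.sub(r"\s+", " ", text).strip()        # collapse whitespace

def pad_sequence(ids, max_len=MAX_LEN):
    """Left-pad a list of token ids with zeros to a fixed length,
    mirroring the default behavior of Keras's pad_sequences."""
    ids = list(ids)[-max_len:]                      # truncate from the front
    return [0] * (max_len - len(ids)) + ids
```

In a TensorFlow workflow, this step is typically handled by the Keras `Tokenizer` and `pad_sequences` utilities; the helpers above just show the shape of the transformation.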
Because the focus of the project was deployment, the model has not been fine-tuned. Improving its performance is planned but not yet scheduled.
Look in app.py. Using `FastAPI()`, a simple HTML form can be created so that users can input and submit a phrase. The model will then return a prediction as well as the prediction probability (essentially, its confidence in the estimate). The app can be hosted locally on a uvicorn server with `uvicorn app:app --reload`.
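Turning the model's raw softmax output into that label-plus-probability response can be sketched as follows; the three-class label order here is an assumption, so check how the notebook actually encodes sentiments:

```python
import numpy as np

LABELS = ["negative", "neutral", "positive"]  # assumed class order

def interpret(probs):
    """Convert a softmax output vector into (label, confidence)."""
    probs = np.asarray(probs, dtype=float)
    idx = int(np.argmax(probs))            # index of the most likely class
    return LABELS[idx], float(probs[idx])  # label and its probability
```

The returned probability is what the app surfaces to the user as its confidence in the estimate.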
Finally, Heroku can be set up to connect with a GitHub repository in order to deploy the app online.
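For Heroku, the web process is declared in a `Procfile` at the repository root. A minimal one for this setup (assuming the FastAPI instance is named `app` inside `app.py`) would be:

```
web: uvicorn app:app --host 0.0.0.0 --port $PORT
```

Heroku injects the `$PORT` environment variable at runtime, so the port must not be hard-coded.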
You can access my super simple version of it here! Performance is not great at the moment, and will stay that way until I go back and improve the model.