Appolo's Ear is an ensemble of three music genre classification neural networks, an Nginx server, and a simple React frontend. Its backend is glued together with Docker and Docker Compose, and you can easily replicate it by following the steps below. This infrastructure can also be reused for other kinds of online classification systems. The goal of the project was to learn more about TensorFlow and Docker, about integrating a machine learning model with a full-stack application, and about how to set up an HTTPS server with Nginx.
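To make the ensemble idea concrete, here is a minimal sketch of one plausible way to combine the three models' outputs. This is illustrative, not the repo's actual code: the genre list matches GTZAN, and averaging the softmax outputs before taking the argmax is an assumed combination strategy.

```python
import numpy as np

# GTZAN's ten genre labels
GENRES = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]

def combine_predictions(probability_vectors):
    """Average the softmax outputs of the MLP, CNN, and RNN_LSTM
    models, then pick the genre with the highest mean probability."""
    mean_probs = np.mean(np.stack(probability_vectors), axis=0)
    return GENRES[int(np.argmax(mean_probs))]
```

Averaging probabilities (soft voting) tends to be more robust than hard majority voting when one model is confidently wrong.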
- data_acquisition/: data acquisition and audio cleanup utility files
- classifiers.ipynb: Jupyter notebook to implement and train models
- server/: server files
- client.py: simple main file for testing
- Fork the repo
- Download the GTZAN dataset. If you wish to acquire extra data to classify genres other than the ones in the dataset, check data_acquisition/audio_data_collector.py for the needed utility functions. Check this tutorial for how to use Selenium. Note that I personally used Brave Browser and that it works with the Chrome driver
- Open classifiers.ipynb in Google Colab (recommended) or your favorite Jupyter server
- Follow the notebook to train and save the models. Note that if you use Google Colab, you will need to link your Google Drive
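If you want a feel for the notebook's preprocessing before opening it, a common approach for GTZAN is to split each 30-second clip into segments and extract MFCCs per segment, which multiplies the number of training examples. The sketch below is illustrative only; the function name and constants are assumptions, not the notebook's actual code.

```python
import numpy as np

SAMPLE_RATE = 22050    # librosa's default sample rate
TRACK_SECONDS = 30     # length of a GTZAN clip
NUM_SEGMENTS = 10      # more segments -> more training examples

def split_track(signal, num_segments=NUM_SEGMENTS):
    """Split a mono signal into equal-length segments,
    dropping any leftover samples at the end."""
    samples_per_segment = len(signal) // num_segments
    return [signal[i * samples_per_segment:(i + 1) * samples_per_segment]
            for i in range(num_segments)]

# For each segment, the notebook would then compute MFCCs, e.g. with
# librosa.feature.mfcc(y=segment, sr=SAMPLE_RATE, n_mfcc=13), and store
# them alongside the genre label as training data.
```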
- Follow all of A.
- Once you have trained and saved your models, put them somewhere in the server folder.
- Edit `MLP_PATH`, `CNN_PATH`, and `RNN_PATH` in genre_prediction_service.py to point to your MLP, CNN, and RNN_LSTM models respectively. Go to Step 7 if you do not wish to test the system locally
- Install Docker and Docker Compose on your machine. Note that Docker Compose is included with the official installations of Docker on Windows and macOS
- In your terminal, `cd server` and run `docker-compose build`. Then, run `docker-compose up`. An Nginx server proxying a uWSGI server which serves server.py should now be running
- In client.py, make sure `URL = DOCKER_TEST_PATH` and that `TEST_PATH` points to an audio file
- Run client.py. It should hit your server and return a predicted genre if everything went as desired
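For reference, a test client along these lines can be as small as the sketch below. The endpoint path `/predict`, the multipart field name `file`, and the `genre` key in the JSON response are my assumptions for illustration; check client.py and server.py for the actual names.

```python
import requests

SERVER_IP = "127.0.0.1"                        # your EC2 IP in production
DOCKER_TEST_PATH = "http://127.0.0.1/predict"  # assumed endpoint path
PROD_URL = f"http://{SERVER_IP}/predict"
TEST_PATH = "blues.00000.wav"                  # any audio file to classify

URL = DOCKER_TEST_PATH                         # switch to PROD_URL once deployed

def predict_genre(url, audio_path):
    """POST the audio file as multipart form data and return the genre."""
    with open(audio_path, "rb") as f:
        response = requests.post(url, files={"file": f})
    response.raise_for_status()
    return response.json()["genre"]            # assumed response shape

if __name__ == "__main__":
    print(predict_genre(URL, TEST_PATH))
```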
- Spin up a server (an EC2 instance is recommended, as it is what I used) and scp the server/ folder onto it. Then make sure the HTTP port is accessible from anywhere (on EC2, this is done by adding inbound rules)
- ssh onto the server, `cd server`, then `chmod +x init.sh`
- Run `./init.sh`. Make sure to read `init.sh` to ensure I am not making you download malware ;)
- Copy your server IP to `SERVER_IP` in client.py, then make sure `URL = PROD_URL`. Then run the file. It should hit your server and return a prediction if all went well.
By now, you should have three models for music genre classification and a production-ready server to power any frontend application.
PS: TensorFlow hangs on some servers when using the RNN_LSTM model. I have not yet been able to solve this bug. If it happens to you, the best course of action is to use a different model on the server. If you manage to solve it, please let me know.
I would not have been able to complete this project without the content of multiple open-source engineers. I would particularly like to highlight Valerio Velardo from The Sound of AI. Anyone interested in machine learning for audio classification should definitely check out his pages.