- UrbanSound8K dataset
- Analysis of the dataset, such as recording lengths, class distribution, and sampling frequencies.
- Listening to class samples.
- Visualization of the data as a waveform, STFT, mel spectrogram (mel-scaled STFT), and MFCCs.
- Jupyter notebook
- Google Colab
- Attempt to classify sounds based on mel spectrograms.
- Data is augmented with time shifting, time masking, and frequency masking.
- Model is saved after each epoch.
- Confusion matrix to peek at model behavior.
- Jupyter notebook
- Google Colab
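
The three augmentations listed above (time shifting, time masking, frequency masking) can be sketched roughly as follows. This is a minimal NumPy illustration, not the notebook's actual code, which may well use a library such as torchaudio instead. The function names, the default widths, the `(n_mels, n_frames)` spectrogram layout, and the choice of zero as the mask value are all assumptions made for the example.

```python
import numpy as np


def time_shift(waveform, max_fraction=0.2, rng=None):
    """Circularly shift a 1-D waveform by a random offset.

    The offset is drawn uniformly from +/- max_fraction of the signal length
    (hypothetical default; samples rolled off one end wrap to the other).
    """
    if rng is None:
        rng = np.random.default_rng()
    limit = int(len(waveform) * max_fraction)
    shift = int(rng.integers(-limit, limit + 1))
    return np.roll(waveform, shift)


def mask_axis(spec, axis, max_width, rng=None):
    """Zero one random contiguous band of at most max_width bins along axis.

    Returns a copy; the input spectrogram is left untouched.
    """
    if rng is None:
        rng = np.random.default_rng()
    size = spec.shape[axis]
    width = int(rng.integers(0, min(max_width, size) + 1))
    start = int(rng.integers(0, size - width + 1))
    masked = spec.copy()
    index = [slice(None)] * spec.ndim
    index[axis] = slice(start, start + width)
    masked[tuple(index)] = 0.0  # assumption: mask with zeros
    return masked


def freq_mask(spec, max_width=8, rng=None):
    """Mask a band of mel bins (axis 0 of an (n_mels, n_frames) array)."""
    return mask_axis(spec, axis=0, max_width=max_width, rng=rng)


def time_mask(spec, max_width=16, rng=None):
    """Mask a band of time frames (axis 1 of an (n_mels, n_frames) array)."""
    return mask_axis(spec, axis=1, max_width=max_width, rng=rng)
```

Time shifting is applied to the raw waveform before the spectrogram is computed, while the two masks operate on the spectrogram itself. Passing a seeded `np.random.default_rng(seed)` as `rng` makes the augmentations reproducible.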
- Python 3.8 (creating a virtual environment is recommended)
python38 -m venv .venv
source .venv/Scripts/Activate
- All necessary dependencies are listed in requirements.txt
pip install -r requirements.txt