Pattern identification task - Build a methodology that can identify repeated instances of melodic patterns. Students will characterize how the performance of these annotated melodic patterns differs and develop a process that can identify them from audio.
This notebook, `final_notebook_nns.ipynb`, aims to train a model that can identify the nns pattern, one of the most common ones in Raga Ritigowla, by selecting the most significant features. It trains different models and compares their performance against a random prediction. The models are a Gradient Boosting Classifier (GBC) and a Random Forest Classifier (RFC). The notebook includes data preparation, model training, evaluation, and comparison steps.
- Prerequisites
- Setting Up the Environment
- Notebook Structure
- Running the Notebook
- Understanding the Output
Before running the notebook, ensure you have the following installed:
- Python 3.12.2 or later
- Jupyter Notebook or JupyterLab
- Required Python libraries:
  - `numpy`
  - `pandas`
  - `librosa`
  - `math` (part of the Python standard library)
  - `scipy`
  - `scikit-learn`
  - `matplotlib`
  - `IPython.display`
You can install the required libraries using `pip`:
pip install numpy pandas librosa scipy scikit-learn matplotlib IPython
Clone the repository to your local machine or download it as a ZIP file and extract it.
git clone https://github.com/pdpau/Raga-Bhairavi-Pattern-Identification.git
Change to the directory where the notebook is located.
cd code
Launch Jupyter Notebook or JupyterLab in the directory.
jupyter notebook
In the Jupyter interface, open `final_notebook_nns.ipynb` from the list of files.
The notebook is structured as follows:
In this first section, the main libraries are installed and imported; some of them are imported later, in the cell where they are first needed. The notebook uses the following libraries:
- `numpy` for numerical operations
- `pandas` for data manipulation
- `librosa` for audio processing
- `math` for mathematical operations
- `scipy` for scientific computing
- `scikit-learn` for machine learning
- `matplotlib` for plotting graphs
- `IPython.display` for displaying audio files
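As a rough sketch, the import cell might look like the following; the exact imports, aliases, and ordering in the notebook may differ:

```python
import math

import numpy as np
import pandas as pd
import librosa
import scipy
import matplotlib.pyplot as plt
import IPython.display as ipd

from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
```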
The data is loaded from the `data` directory. It contains two audio files, Koti Janmani and Vanajaksha Ninni Kore, along with their corresponding annotations in a text file. Features are extracted from the annotations, the audio, and the pitch of each song. Once the features are extracted, a single dataframe with 17 features and 2 targets is created.
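The sketch below illustrates one way such features could be extracted with `librosa` and `pandas`. The file names, the annotation column layout, and the `mean_f0` feature are illustrative assumptions, not the notebook's exact code:

```python
import numpy as np
import pandas as pd
import librosa

# Hypothetical file names -- adjust them to the actual contents of the data/ directory.
AUDIO_PATH = "data/koti_janmani.wav"
ANNOTATIONS_PATH = "data/koti_janmani_annotations.txt"

# Load the audio at its native sampling rate.
y, sr = librosa.load(AUDIO_PATH, sr=None)

# Read the annotations (assumed here to be tab-separated start/end times and a pattern label).
annotations = pd.read_csv(ANNOTATIONS_PATH, sep="\t", names=["start", "end", "pattern"])

# Estimate the pitch (f0) track with pYIN.
f0, voiced_flag, voiced_probs = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
)
times = librosa.times_like(f0, sr=sr)

# Illustrative feature: mean pitch of each annotated segment.
annotations["mean_f0"] = [
    np.nanmean(f0[(times >= s) & (times < e)])
    for s, e in zip(annotations["start"], annotations["end"])
]
```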
The data is split into training and testing sets. First, a random classifier establishes the baseline results that would be obtained if predictions were made by chance. Then, the Gradient Boosting Classifier (GBC) and Random Forest Classifier (RFC) models are trained on the training data and used to predict the test data. Predictions are evaluated using accuracy scores, confusion matrices, and classification reports.
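A minimal sketch of this workflow with scikit-learn, assuming the combined dataframe is called `features_df` and has a binary `is_nns` target column (both names are illustrative), could look like this:

```python
from sklearn.model_selection import train_test_split
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# `features_df` and the `is_nns` target column are illustrative names for the
# combined dataframe built in the notebook.
X = features_df.drop(columns=["is_nns"])
y = features_df["is_nns"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Chance-level baseline to compare the real models against.
baseline = DummyClassifier(strategy="uniform", random_state=42).fit(X_train, y_train)

# The two models compared in the notebook (default hyperparameters shown here).
gbc = GradientBoostingClassifier(random_state=42).fit(X_train, y_train)
rfc = RandomForestClassifier(random_state=42).fit(X_train, y_train)

for name, model in [("Random", baseline), ("GBC", gbc), ("RFC", rfc)]:
    y_pred = model.predict(X_test)
    print(name, "accuracy:", accuracy_score(y_test, y_pred))
    print(confusion_matrix(y_test, y_pred))
    print(classification_report(y_test, y_pred))
```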
To execute the notebook, follow these steps:
You can run all cells sequentially by selecting Cell > Run All from the Jupyter menu.
Alternatively, you can execute the cells one by one. This allows you to understand the code and outputs at each stage.
Ensure all libraries are installed. If you encounter errors, check if all dependencies are satisfied.
If you wish to experiment, modify the parameters or try with different datasets.
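For example, the model hyperparameters could be changed before retraining; the values below are illustrative, not the notebook's defaults:

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier

# Illustrative hyperparameter values -- not the ones used in the notebook.
gbc = GradientBoostingClassifier(n_estimators=200, learning_rate=0.05, max_depth=3, random_state=42)
rfc = RandomForestClassifier(n_estimators=300, max_depth=None, random_state=42)
```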
Each dataframe generated after each feature-extraction step is available for inspection, along with plots of the pitch that reveal the patterns in the data.
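One possible way to plot the pitch track with the annotated segments overlaid, reusing the variables from the earlier sketch (`times`, `f0`, `annotations`), is:

```python
import matplotlib.pyplot as plt

# Plot the pitch track and shade the annotated pattern segments.
plt.figure(figsize=(14, 4))
plt.plot(times, f0, linewidth=0.8, label="f0 (Hz)")
for _, row in annotations.iterrows():
    plt.axvspan(row["start"], row["end"], color="orange", alpha=0.2)
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.legend()
plt.show()
```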
Visual and statistical summaries, such as metrics and confusion matrices, help assess the performance of each model. The random baseline is also shown so the models can be compared against it.
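Confusion matrices can be rendered per model with scikit-learn's `ConfusionMatrixDisplay`, for example (reusing the fitted models from the sketch above):

```python
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

# One confusion matrix per model, including the random baseline.
for name, model in [("Random", baseline), ("GBC", gbc), ("RFC", rfc)]:
    ConfusionMatrixDisplay.from_estimator(model, X_test, y_test)
    plt.title(name)
    plt.show()
```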
A side-by-side comparison of the GBC and RFC against random predictions is provided. The analysis should show that the GBC and RFC perform better than random chance, indicating that they have learned meaningful patterns in the data.
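A simple way to visualize this comparison, again reusing the fitted models from the earlier sketch, is a bar chart of test accuracies:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score

# Bar chart of test accuracies for the baseline and both models.
accuracies = {
    name: accuracy_score(y_test, model.predict(X_test))
    for name, model in [("Random", baseline), ("GBC", gbc), ("RFC", rfc)]
}
plt.bar(list(accuracies.keys()), list(accuracies.values()))
plt.ylabel("Test accuracy")
plt.show()
```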