Persian Classical Music Instrument Recognition (PCMIR) Using a Novel Persian Music Database
- Paper (IEEE Xplore): https://ieeexplore.ieee.org/abstract/document/8965166
- Dataset (Kaggle): https://www.kaggle.com/datasets/hosseinmousavi/pcmir-database
The database includes recordings of seven Persian classical instruments: Ney, Tar, Santur, Kamancheh, Tonbak, Ud, and Setar.
The classification methodology extracts the following audio features:
- Mel-Frequency Cepstral Coefficients (MFCCs)
- Spectral Roll-off
- Spectral Centroid
- Zero Crossing Rate (ZCR)
- Entropy Energy
These features are refined using Fuzzy Entropy for feature selection and classified using Multi-Layer Neural Networks (MLNN).
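As a rough illustration of four of the features listed above, the sketch below computes Zero Crossing Rate, Spectral Centroid, Spectral Roll-off, and an energy-entropy measure with NumPy alone. The frame length, hop size, 85% roll-off threshold, and sub-block count are assumptions chosen for the example, not values taken from the paper, and MFCCs are left out because they need a mel filterbank that a dedicated audio library handles better.

```python
import numpy as np

def frame_signal(x, frame_len=1024, hop=512):
    """Split a 1-D signal (at least frame_len samples) into overlapping frames."""
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose sign differs, per frame."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def centroid_and_rolloff(frames, sr, roll_percent=0.85):
    """Magnitude-weighted mean frequency and roll-off frequency, per frame."""
    window = np.hanning(frames.shape[1])
    mag = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    total = mag.sum(axis=1) + 1e-12  # guard against silent frames
    centroid = (mag * freqs).sum(axis=1) / total
    # First bin where the cumulative magnitude reaches roll_percent of the total.
    cumulative = np.cumsum(mag, axis=1)
    roll_bin = np.argmax(cumulative >= roll_percent * total[:, None], axis=1)
    return centroid, freqs[roll_bin]

def energy_entropy(frames, n_blocks=8):
    """Shannon entropy (bits) of the energy distribution over sub-blocks."""
    usable = frames.shape[1] - frames.shape[1] % n_blocks
    blocks = frames[:, :usable].reshape(frames.shape[0], n_blocks, -1)
    energy = (blocks ** 2).sum(axis=2)
    p = energy / (energy.sum(axis=1, keepdims=True) + 1e-12)
    return -(p * np.log2(p + 1e-12)).sum(axis=1)

def extract_features(x, sr):
    """Return clip-level features as the mean of each per-frame value."""
    frames = frame_signal(x)
    centroid, rolloff = centroid_and_rolloff(frames, sr)
    return {
        "zcr": float(zero_crossing_rate(frames).mean()),
        "spectral_centroid": float(centroid.mean()),
        "spectral_rolloff": float(rolloff.mean()),
        "energy_entropy": float(energy_entropy(frames).mean()),
    }
```

Averaging per-frame values into one number per clip is only one possible summary; other statistics such as the standard deviation could be appended to the feature vector in the same way.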
- Database Creation: Development of a novel database for Persian classical musical instruments.
- Feature Extraction: Use of time- and frequency-domain features to characterize audio signals.
- Feature Selection: Implementation of Fuzzy Entropy to identify the most relevant features.
- Classification: Leveraging MLNN for instrument classification.
- Performance Evaluation: Reaching 82.57% overall classification accuracy, with educational and artistic use cases in mind.
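The feature-selection step can be sketched as follows. This README does not state which Fuzzy Entropy variant the paper uses, so the code assumes one similarity-based formulation from the feature-selection literature: each sample is compared to per-class mean ("ideal") vectors, the De Luca-Termini fuzzy entropy of those similarities is summed per feature, and the features with the lowest entropy are kept. The function names and the `p` parameter are hypothetical.

```python
import numpy as np

def fuzzy_entropy_scores(X, y, p=1.0):
    """Per-feature fuzzy entropy; lower scores suggest more informative features."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Scale every feature to [0, 1] so similarities are valid membership values.
    lo, hi = X.min(axis=0), X.max(axis=0)
    Xs = (X - lo) / np.where(hi > lo, hi - lo, 1.0)
    scores = np.zeros(X.shape[1])
    for label in np.unique(y):
        ideal = Xs[y == label].mean(axis=0)  # class "ideal vector"
        sim = (1.0 - np.abs(Xs ** p - ideal ** p)) ** (1.0 / p)
        mu = np.clip(sim, 1e-12, 1.0 - 1e-12)  # avoid log(0)
        # De Luca-Termini entropy, summed over all samples for this class.
        scores += -(mu * np.log(mu) + (1.0 - mu) * np.log(1.0 - mu)).sum(axis=0)
    return scores

def select_features(X, y, keep):
    """Indices of the `keep` features with the lowest fuzzy entropy."""
    scores = fuzzy_entropy_scores(X, y)
    return np.sort(np.argsort(scores)[:keep])
```

A feature whose similarities sit near 0 or 1 separates the classes cleanly and scores low, while a feature whose similarities hover around 0.5 carries little class information and scores high.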
A unique Persian music database was created for this research. The dataset includes:
- 7 Instrument Classes: Ney, Tar, Santur, Kamancheh, Tonbak, Ud, Setar.
- Audio Samples: Each class contains 89–110 samples, each 5–10 seconds long.
- Recording Environment: Audio was recorded in various environments such as rooms, music shops, and studios.
- File Format: MP3.
- https://www.kaggle.com/datasets/hosseinmousavi/pcmir-database
Instrument | Samples | Duration (sec) | Recording Environment |
---|---|---|---|
Ney | 102 | 5–10 | Closed and Open Spaces |
Tar | 96 | 5–10 | Closed and Open Spaces |
Santur | 107 | 5–10 | Closed and Open Spaces |
Kamancheh | 101 | 5–10 | Closed and Open Spaces |
Tonbak | 110 | 5–10 | Closed and Open Spaces |
Setar | 89 | 5–10 | Closed and Open Spaces |
Ud | 93 | 5–10 | Closed and Open Spaces |
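The per-class counts in the table can be totalled to gauge dataset size and class balance. The short calculation below uses only the numbers shown above; the duration bounds simply multiply the total by the stated 5 and 10 second limits.

```python
# Sample counts copied from the table above.
samples = {
    "Ney": 102, "Tar": 96, "Santur": 107, "Kamancheh": 101,
    "Tonbak": 110, "Setar": 89, "Ud": 93,
}

total = sum(samples.values())
smallest = min(samples, key=samples.get)
largest = max(samples, key=samples.get)

print(f"Total samples: {total}")  # Total samples: 698
print(f"Smallest class: {smallest} ({samples[smallest]})")
print(f"Largest class: {largest} ({samples[largest]})")
print(f"Imbalance ratio: {samples[largest] / samples[smallest]:.2f}")
print(f"Total audio: {total * 5 / 60:.0f} to {total * 10 / 60:.0f} minutes")
```

The classes are fairly balanced, so plain accuracy is a reasonable headline metric for this dataset.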
- Audio Preprocessing:
- Normalize audio signals to ensure uniformity.
- Feature Extraction:
- Extracted five key features from each audio sample:
- MFCCs
- Spectral Roll-off
- Spectral Centroid
- Zero Crossing Rate
- Entropy Energy
- Feature Selection:
- Applied Fuzzy Entropy to identify and retain the most significant features for classification.
- Classification:
- Trained a Multi-Layer Neural Network (MLNN) for instrument classification.
- Evaluation:
- Validated the classifier using a confusion matrix and classification accuracy.
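The preprocessing and evaluation steps above can be sketched with NumPy. The source says only that signals are normalized, so peak normalization is an assumption here, and the helper names are hypothetical. The confusion matrix uses rows for true classes and columns for predicted classes, and assumes every class appears at least once among the true labels.

```python
import numpy as np

def peak_normalize(x, peak=1.0):
    """Scale a signal so its largest absolute sample equals `peak`."""
    x = np.asarray(x, dtype=float)
    max_abs = np.max(np.abs(x))
    return x if max_abs == 0 else x * (peak / max_abs)

def confusion_matrix(y_true, y_pred, n_classes):
    """Count matrix with true classes on rows and predictions on columns."""
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)  # accumulates repeated index pairs
    return cm

def accuracies(cm):
    """Per-class accuracy (diagonal over row sums) and overall accuracy."""
    per_class = np.diag(cm) / cm.sum(axis=1)
    overall = np.trace(cm) / cm.sum()
    return per_class, overall
```

For example, `accuracies(confusion_matrix(y_true, y_pred, 7))` would give one accuracy per instrument plus the overall figure for the seven-class problem.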
The proposed method demonstrated promising results:
- Overall Accuracy: 82.57%
- Highest Accuracy: 95% for Tonbak (a percussion instrument)
- Lowest Accuracy: 70% for Ud
Instrument | Accuracy |
---|---|
Ney | 89% |
Tar | 79% |
Santur | 82% |
Kamancheh | 87% |
Tonbak | 95% |
Setar | 76% |
Ud | 70% |
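As a quick consistency check, the reported overall accuracy matches the unweighted mean of the seven per-class accuracies in the table:

```python
# Per-class accuracies (percent) copied from the table above.
per_class = {
    "Ney": 89, "Tar": 79, "Santur": 82, "Kamancheh": 87,
    "Tonbak": 95, "Setar": 76, "Ud": 70,
}

mean_accuracy = sum(per_class.values()) / len(per_class)
print(f"{mean_accuracy:.2f}%")  # 82.57%
```

Because the classes differ slightly in size, an average weighted by sample count would give a slightly different number.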
The research suggests the following improvements:
- Incorporating additional features such as Spectral Flux, Spectral Spread, and Short Time Energy.
- Expanding the database to include all 11 Persian classical instruments.
- Enhancing classification accuracy using Deep Learning models such as Autoencoders.
- Applying the system in real-world settings for Persian music education and preservation.
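The three additional features named above could be computed along the following lines. This is a minimal NumPy sketch under assumed settings (Hann window, 1024-sample frames, 512-sample hop), using common textbook definitions rather than anything specified by the paper.

```python
import numpy as np

def frame_signal(x, frame_len=1024, hop=512):
    """Split a 1-D signal (at least frame_len samples) into overlapping frames."""
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)

def magnitude_spectrum(frames, sr):
    """Hann-windowed magnitude spectrum and the matching frequency axis (Hz)."""
    mag = np.abs(np.fft.rfft(frames * np.hanning(frames.shape[1]), axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    return mag, freqs

def spectral_spread(mag, freqs):
    """Magnitude-weighted standard deviation of frequency around the centroid."""
    total = mag.sum(axis=1) + 1e-12
    centroid = (mag * freqs).sum(axis=1) / total
    variance = (mag * (freqs[None, :] - centroid[:, None]) ** 2).sum(axis=1) / total
    return np.sqrt(variance)

def spectral_flux(mag):
    """Squared change between consecutive unit-sum spectra (n_frames - 1 values)."""
    norm = mag / (mag.sum(axis=1, keepdims=True) + 1e-12)
    return np.sum(np.diff(norm, axis=0) ** 2, axis=1)
```

A steady tone should give near-zero flux and a narrow spread, while noisy or percussive material should score higher on both, which is why these features may help separate instruments such as Tonbak from sustained ones such as Ney.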
If you use this project or find it helpful, please cite the original paper:
@inproceedings{mousavi2019persian,
title={Persian classical music instrument recognition (PCMIR) using a novel Persian music database},
author={Mousavi, Seyed Muhammad Hossein and Prasath, VB Surya and Mousavi, Seyed Muhammad Hassan},
booktitle={2019 9th International Conference on Computer and Knowledge Engineering (ICCKE)},
pages={122--130},
year={2019},
organization={IEEE}
}
- DOI: https://doi.org/10.1109/ICCKE48569.2019.8965166