This is an audio emotion classifier. It classifies audio files by emotion and gender, using the following datasets:
CREMA-D: https://www.kaggle.com/ejlok1/cremad/code
TESS: https://www.kaggle.com/ejlok1/toronto-emotional-speech-set-tess/kernels
RAVDESS: https://www.kaggle.com/uwrfkaggler/ravdess-emotional-speech-audio
SAVEE: https://www.kaggle.com/ejlok1/surrey-audiovisual-expressed-emotion-savee
The model was first trained as a 1D CNN, then as a 2D CNN. The 1D CNN reached about 81% accuracy for gender classification, 54% for emotion classification, and 48% for combined emotion and gender classification.
Because the 1D CNN accuracy was low, I switched to a 2D CNN, which in the end reached 66% accuracy for combined gender and emotion classification.
Gender classes:
Male, female.
Emotions classes:
Angry, fear, happy, sad, surprise, neutral, disgust.
Combined emotion and gender classes:
female_angry, female_disgust, female_fear, female_happy, female_sad, female_surprise, female_neutral,
male_angry, male_fear, male_happy, male_sad, male_surprise, male_neutral, male_disgust.
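The 14 combined classes above are the cross product of the 2 gender classes and the 7 emotion classes. A minimal sketch of how such a label set could be built and mapped to integer indices for training is shown below. The names GENDERS, EMOTIONS, COMBINED, LABEL_TO_INDEX and split_label are hypothetical and chosen for illustration; they are not taken from the project code, and the index order is an assumption.

```python
from itertools import product

# Class names as listed in this README (the ordering here is an assumption).
GENDERS = ["female", "male"]
EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

# 2 genders x 7 emotions = 14 combined classes, e.g. "female_angry".
COMBINED = [f"{gender}_{emotion}" for gender, emotion in product(GENDERS, EMOTIONS)]

# Integer index for each combined label, as a classifier output layer would need.
LABEL_TO_INDEX = {label: index for index, label in enumerate(COMBINED)}


def split_label(label):
    """Split a combined label such as 'male_happy' into (gender, emotion)."""
    gender, emotion = label.split("_", 1)
    return gender, emotion


if __name__ == "__main__":
    print(len(COMBINED))
    print(split_label("male_happy"))
```

Keeping the combined label as a single string makes it possible to recover the gender-only and emotion-only predictions from one combined prediction by splitting on the underscore.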
1D CNN genders classification result:
1D CNN emotions classification result:
1D CNN combined genders and emotions classification result:
Final result, 2D CNN combined genders and emotions classification:
Sources for this project: https://www.kaggle.com/ejlok1/audio-emotion-part-1-explore-data