- Divided each .wav file into 5 segments to increase the training data.
- Extracted MFCCs for each segment.
- Stored the complete dataset in the following format:
{ "mapping" : ["classical","blues",...], "mfcc": [[[...],[...],...,[...]],...], "labels": [0,2,...] }
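
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `split_into_segments`, `build_dataset`, and the `extract_features` callback are hypothetical names, and `toy_features` is only a stand-in for a real MFCC extractor (for example `librosa.feature.mfcc`), which these notes do not name.

```python
import json

NUM_SEGMENTS = 5  # each track is cut into 5 equal pieces, as described above


def split_into_segments(signal, num_segments=NUM_SEGMENTS):
    """Split a 1-D sequence of samples into equal-length segments.

    Trailing samples that do not fill a whole segment are dropped.
    """
    seg_len = len(signal) // num_segments
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(num_segments)]


def build_dataset(tracks, extract_features):
    """Build the {"mapping", "mfcc", "labels"} dictionary.

    tracks: list of (genre_name, samples) pairs (hypothetical input shape).
    extract_features: callable turning one segment into a 2-D list of
        features; a real MFCC extractor would be plugged in here.
    """
    data = {"mapping": [], "mfcc": [], "labels": []}
    for genre, samples in tracks:
        if genre not in data["mapping"]:
            data["mapping"].append(genre)
        label = data["mapping"].index(genre)
        for segment in split_into_segments(samples):
            data["mfcc"].append(extract_features(segment))
            data["labels"].append(label)  # every segment inherits the track label
    return data


def toy_features(segment):
    """Stand-in for MFCC extraction: one frame holding the segment mean."""
    return [[sum(segment) / len(segment)]]


# Two made-up "tracks" of 10 samples each, just to exercise the code.
tracks = [("classical", list(range(10))), ("blues", list(range(10, 20)))]
dataset = build_dataset(tracks, toy_features)
serialized = json.dumps(dataset)  # what would be written to the dataset file
```

Because every segment becomes its own training example with the label of its parent track, splitting into 5 segments multiplies the number of examples by 5.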
- Architecture
- Result
- Comments:
- CNN performs very well.
- The training data had to be reshaped because the CNN requires 3-dimensional input.
- Fastest
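
The reshaping mentioned in the comments can be illustrated as below. This sketch assumes the MFCCs are loaded into a NumPy array of shape (segments, frames, coefficients) and that the CNN expects a trailing channel axis, as 2-D convolution layers in Keras do by default. The array sizes are made up for illustration.

```python
import numpy as np

# Hypothetical sizes: 4 segments, 130 frames each, 13 MFCC coefficients per frame.
X = np.zeros((4, 130, 13), dtype=np.float32)

# Add a trailing channel axis so every sample becomes 3-dimensional:
# (frames, coefficients, channels=1), like a single-channel image.
X_cnn = X[..., np.newaxis]

# Per-sample input shape that would be passed to the first layer of the network.
sample_shape = X_cnn.shape[1:]
```

Adding the axis only changes the array's shape; the MFCC values themselves are unchanged.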