NCMMSC2021

Introduction

This is the repo for the NCMMSC2021 competition.

Note that the master branch is for development. For a stable release, switch to the stable branch.

Project Structure

NCMMSC2021
├─bin               # Contains the runnable scripts
├─configs           # Contains the configurations
├─dataset           # Contains the dataset
│  ├─merge          # Concat all the audios from one person
│  │  ├─AD
│  │  ├─HC
│  │  └─MCI
│  ├─raw            # Raw audios 
│  │  ├─AD
│  │  ├─HC
│  │  └─MCI
│  ├─merge_vad      # Perform unsupervised VAD on the separated audios and concat the results
│  │  ├─AD
│  │  ├─HC
│  │  └─MCI
│  └─raw_vad        # Perform unsupervised VAD on raw audios
│      ├─AD
│      ├─HC
│      └─MCI
├─log               # Contains the log files
├─model             # Contains the main model
│  ├─models         # Contains all the models
│  └─modules        # Contains all the modules
├─weight            # Contains the weight files
└─util              # Contains the util files
   ├─log_util       # Utils for log
   ├─tool           # Useful tools for drawing and files
   ├─train_util     # Dataloader and trainer
   └─model_util     # Utils for networks

Target Approach

There are two tasks: prediction on 5-second audio and prediction on 30-second audio.

For both tasks, features (MFCC, Spectrogram, and MelSpectrogram) are extracted from the audio and handled with image-based classification methods.
An LSTM was also introduced into the model, but it did not perform well.
Other fusion methods, such as feature fusion, were also tested, but they did not work as well as simple concatenation.
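
To make the feature-as-image idea concrete, the sketch below computes a magnitude spectrogram using only the Python standard library. It is an illustration, not this repo's code: the function name `spectrogram`, the frame length, the hop size, and the toy sine input are all assumptions, and a real pipeline would more likely rely on an audio library for the MFCC and mel features.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return a magnitude spectrogram: one row per frame, one column per frequency bin.

    Hypothetical helper for illustration; uses a naive DFT instead of an FFT.
    """
    # Hann window to reduce spectral leakage at the frame edges
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        # Keep only the non-negative frequencies (bins 0 .. frame_len / 2)
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(abs(acc))
        rows.append(row)
    return rows


# Toy input (assumed values): a 128 Hz tone sampled at 1024 Hz
sample_rate = 1024
tone = [math.sin(2 * math.pi * 128 * t / sample_rate) for t in range(512)]
image = spectrogram(tone)

# The energy of each frame concentrates in bin 128 * 64 / 1024 = 8
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in image]
print(len(image), len(image[0]), peak_bins[0])
```

The result is a grid of time frames by frequency bins, which is the 2-D layout an image classifier consumes.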

Model Performance

ID	Sample Length	Model	Feature	K-fold	Accuracy (per fold)	Train Average Acc	Remark	Evaluation
20210903_230628	5s	SpecificTrainModel	MFCC	4	75.91%,63.10%,76.21%,68.23%	68.36%
20210903_230628	5s	SpecificTrainModel	SPECS	4	71.47%,59.78%,77.42%,62.50%	67.79%
20210903_230628	5s	SpecificTrainModel	MELSPEC	4	71.77%,54.74%,78.73%,64.69%	67.48%
20210904_141710	5s	MSMJointConcatFineTuneModel	General	4	75.60%,69.15%,77.22%,73.96%	71.48%	MFCC,SPECS,MELSPEC for training
20210904_141710	5s	MSMJointConcatFineTuneModel	Fine-tune	4	78.53%,68.25%,78.63%,75.00%	75.10%	MFCC,SPECS,MELSPEC for training
20210904_150739	5s	SpecificTrainResNetModel	MELSPEC	4	67.64%,70.06%,72.18%,68.23%	69.53%
20210915_093218	5s	CompetitionSpecificTrainVggNet19BNBackboneModel	SPEC	4	70.36%,80.85%,83.67%,68.85%	75.93%
20210915_012356	5s	CompetitionSpecificTrainVggNet19BNBackboneModel	MFCC	4	75.50%,63.41%,81.15%,74.90%	73.74%
20210914_221835	5s	CompetitionSpecificTrainVggNet19BNBackboneModel	MELSPEC	4	79.23%,75.40%,85.69%,62.81%	75.78%
20210916_144512	5s	CompetitionSpecificTrainResNet18BackboneModel	MFCC	4	69.96%,72.08%,76.71%,61.04%	69.92%
20210917_154750	5s	CompetitionSpecificTrainWideResNet	MELSPEC	4	77.52%,74.80%,78.02%,55.73%	71.51%
20210917_154750	5s	CompetitionSpecificTrainVggNet16BNBackboneModel	MELSPEC	4	76.81%,79.94%,79.64%,63.12%	74.87%
20210917_184756	5s	CompetitionSpecificTrainVggNet16BNBackboneModel	SPEC	4	76.92%,78.63%,78.93%,61.77%	74.06%
20210917_184859	5s	CompetitionSpecificTrainVggNet16BNBackboneModel	MFCC	4	72.48%,71.17%,80.54%,64.90%	72.27%
20210904_215820	25s	SpecificTrainResNetLongLSTMModel	MELSPEC	4	65.32%,57.46%,65.73%,72.29%	65.20%		Detail General
20210904_234029	25s	SpecificTrainResNetLongModel	MELSPEC	4	77.62%,59.07%,64.52%,72.50%	68.43%		Detail General
20210905_151007	25s	SpecificTrainLongLSTMModel	MELSPEC	4	73.49%,61.09%,75.40%,65.10%	68.77%		Detail General
20210905_130825	25s	SpecificTrainLongModel	MELSPEC	4	78.23%,59.98%,78.63%,66.35%	70.79%		Detail General
20210905_133648	25s	SpecificTrainLongModel	SPECS	4	70.97%,58.17%,76.41%,66.88%	68.11%		Detail General
20210905_133648	25s	SpecificTrainLongModel	MFCC	4	73.19%,66.94%,76.41%,70.21%	71.68%		Detail General
20210905_133648	25s	SpecificTrainLongModel	MELSPEC	4	78.23%,59.17%,75.60%,63.75%	68.19%		Detail General
20210905_133648	25s	MSMJointConcatFineTuneLongModel	General	4	71.27%,72.38%,79.64%,72.40%	73.92%	MFCC,SPECS,MELSPEC for training	Detail General
20210905_133648	25s	MSMJointConcatFineTuneLongModel	Fine-tune	4	73.29%,64.21%,79.94%,74.79%	73.06%	MFCC,SPECS,MELSPEC for training	Detail General
20210906_215527	25s	SpecificTrainLongModel	MELSPEC_VAD	4	68.45%,66.13%,68.85%,73.12%	69.14%		Detail General
20210906_185221	25s	SpecificTrainLongTransformerEncoderModel	MELSPEC	4	67.94%,65.02%,74.40%,69.06%	69.11%		Detail General
20210908_121607	25s	SpecificTrainResNet18BackboneLongModel	MELSPEC_VAD	4	70.46%,65.83%,79.54%,64.79%	73.77%		Detail General
20210907_230640	25s	MSMJointConcatFineTuneLongModel	General	4	80.04%,63.61%,76.51%,74.90%	73.92%	MFCC,SPECS,MELSPEC for training	Detail General
20210907_230640	25s	MSMJointConcatFineTuneLongModel	Fine-tune	4	77.42%,65.12%,76.11%,74.79%	73.36%	MFCC,SPECS,MELSPEC for training	Detail General
20210907_230704	25s	SpecificTrainLongModel	MELSPEC_VAD	4	68.15%,64.01%,69.15%,70.21%	67.88%		Detail General
20210907_230704	25s	SpecificTrainLongModel	SPECS_VAD	4	70.87%,68.65%,64.82%,71.25%	68.90%		Detail General
20210907_230704	25s	SpecificTrainLongModel	MFCC_VAD	4	67.94%,63.00%,69.15%,64.27%	66.09%		Detail General
20210907_230704	25s	MSMJointConcatFineTuneLongModel	General	4	71.37%,62.50%,67.04%,64.90%	66.45%	MFCC_VAD, SPECS_VAD and MELSPEC_VAD for training	Detail General
20210907_230704	25s	MSMJointConcatFineTuneLongModel	Fine-tune	4	67.04%,66.73%,69.15%,66.77%	67.42%	MFCC_VAD, SPECS_VAD and MELSPEC_VAD for training	Detail General
20210917_134347	25s	CompetitionSpecificTrainWideResNet	MELSPEC	4	78.73%,74.29%,84.48%,55.10%	73.15%
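
For reading the table, the average column appears to be the arithmetic mean of the four per-fold accuracies. This is an inference from the numbers rather than something stated here, and the helper name `mean_fold_accuracy` is hypothetical. A minimal sketch, using the Fine-tune row of run 20210904_141710:

```python
def mean_fold_accuracy(cell):
    """Average a comma-separated cell of per-fold percentages, e.g. '78.53%,68.25%'."""
    folds = [float(part.strip().rstrip("%")) for part in cell.split(",")]
    return sum(folds) / len(folds)


# Per-fold accuracies copied from the Fine-tune row of run 20210904_141710
fine_tune = mean_fold_accuracy("78.53%,68.25%,78.63%,75.00%")
print(f"{fine_tune:.2f}%")  # 75.10%
```

The same helper can be applied to any other row to check a reported average against its per-fold values.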