Mel-ResNet

ResNet for classifying Mel-spectrograms.

(Figure: Mel-ResNet)
  • We train a GAN that takes EEG recorded during Inner Speech (Imagined Speech) as input, with the mel spectrogram of the corresponding Spoken Speech as the target.
  • The mel spectrogram generated from the EEG is then fed into a ResNet trained on real speech mel spectrograms to predict the imagined word (a minimal sketch of this inference path follows below).
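
A minimal sketch of the inference path, assuming a trained GAN generator and a trained ResNet classifier; the function and variable names here are hypothetical, not this repository's API:

```python
import torch

def predict_word(eeg_segment, generator, resnet, class_names):
    """Hypothetical end-to-end inference: EEG -> mel spectrogram -> word label."""
    with torch.no_grad():
        mel = generator(eeg_segment)    # (1, 1, n_mels, n_frames) mel spectrogram generated from EEG
        logits = resnet(mel)            # (1, num_classes) class scores
        return class_names[logits.argmax(dim=1).item()]
```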

Requirements

`Python >= 3.7`

All code is written in Python 3.7.

You can install the libraries used in our project by running the following command:

```
pip install -r requirements.txt
```

Dataset

We collected word utterance recordings for a total of 13 classes using the voices of 5 contributors and TTS.
To mitigate data scarcity, we applied augmentation techniques such as time stretching, pitch shifting, and adding noise.
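
For illustration, a hedged augmentation sketch using librosa; the parameter values below are assumptions, not the settings used in this repository:

```python
import numpy as np
import librosa

def augment(path, sr=16000, noise_level=0.005):
    """Load one word utterance and return simple augmented variants (illustrative values)."""
    y, sr = librosa.load(path, sr=sr)
    stretched = librosa.effects.time_stretch(y, rate=1.1)       # time stretching
    shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=2)  # pitch shifting
    noisy = y + noise_level * np.random.randn(len(y))           # additive noise
    return stretched, shifted, noisy
```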

The 13 recorded word classes are as follows:

  • Call
  • Camera
  • Down
  • Left
  • Message
  • Music
  • Off
  • On
  • Receive
  • Right
  • Turn
  • Up
  • Volume

Model & Training (ongoing)

ResNet-50

(Data collection is ongoing; hyperparameter tuning will follow once training on the full dataset is complete.)
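
A minimal sketch of the classifier, assuming a torchvision ResNet-50 adapted to single-channel mel spectrogram inputs and the 13 word classes; the input size and layer changes are assumptions, not the repository's exact configuration:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

NUM_CLASSES = 13  # the word classes listed above

model = resnet50(weights=None)
# Mel spectrograms have a single channel, so replace the 3-channel stem convolution.
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
# Replace the final fully connected layer with a 13-way classifier head.
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

# Example forward pass: a batch of 8 spectrograms with 128 mel bins and 128 frames.
dummy = torch.randn(8, 1, 128, 128)
logits = model(dummy)  # shape: (8, 13)
```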


Results (ongoing)

Performance metrics: Accuracy, F1 score

(Learning curves and metric values will be added here.)
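
For reference, a sketch of how these metrics could be computed with scikit-learn; the label arrays below are placeholders, not results from this project:

```python
from sklearn.metrics import accuracy_score, f1_score

# Placeholder labels for illustration only.
y_true = [0, 3, 7, 12, 5]
y_pred = [0, 3, 6, 12, 5]

accuracy = accuracy_score(y_true, y_pred)
macro_f1 = f1_score(y_true, y_pred, average="macro")  # macro-averaged F1 across the 13 classes
print(f"Accuracy: {accuracy:.3f}, F1: {macro_f1:.3f}")
```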


References
