- We developed a generative model that takes EEG recorded during inner speech (imagined speech) as input and trains a GAN against the mel spectrogram of the corresponding spoken speech (the target).
- The mel spectrogram generated from EEG is then fed into a ResNet trained on mel spectrograms of actual speech to predict the imagined word.
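A minimal PyTorch sketch of this generator/discriminator pairing is shown below. The dimensions (`EEG_DIM`, `MEL_BINS`, `MEL_FRAMES`) and the MLP architectures are placeholders, not the repo's actual networks, which depend on the EEG preprocessing and spectrogram settings:

```python
import torch
import torch.nn as nn

# Hypothetical dimensions -- the real sizes depend on the EEG
# preprocessing and mel-spectrogram settings (assumptions).
EEG_DIM = 128      # flattened EEG feature vector per window
MEL_BINS = 80      # mel-frequency bins
MEL_FRAMES = 32    # spectrogram frames per window

class Generator(nn.Module):
    """Maps an EEG feature vector to a mel spectrogram."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(EEG_DIM, 512), nn.ReLU(),
            nn.Linear(512, MEL_BINS * MEL_FRAMES), nn.Tanh(),
        )

    def forward(self, eeg):
        return self.net(eeg).view(-1, 1, MEL_BINS, MEL_FRAMES)

class Discriminator(nn.Module):
    """Scores whether a mel spectrogram is real (spoken) or generated."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(MEL_BINS * MEL_FRAMES, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
        )

    def forward(self, mel):
        return self.net(mel)

G, D = Generator(), Discriminator()
fake_mel = G(torch.randn(4, EEG_DIM))   # batch of 4 EEG windows
score = D(fake_mel)
print(fake_mel.shape, score.shape)      # (4, 1, 80, 32) and (4, 1)
```

In the full pipeline the discriminator loss pushes the generator's spectrograms toward the distribution of real spoken-speech spectrograms, so that the downstream ResNet (trained on real speech) can classify them.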
`Python >= 3.7`
All code is written in Python 3.7.
You can install the libraries used in this project by running:

```
pip install -r requirements.txt
```
We recorded word utterances for a total of 13 classes using the voices of 5 contributors together with TTS.
To mitigate data scarcity, we applied augmentation techniques such as time stretching, pitch shifting, and additive noise.
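These augmentations are commonly done with `librosa.effects.time_stretch` and `librosa.effects.pitch_shift`; the dependency-free sketch below illustrates two of them with NumPy only. The SNR value and the interpolation-based stretch are simplifications (a phase-vocoder stretch preserves pitch; this one does not):

```python
import numpy as np

def add_noise(y, snr_db=20.0, seed=0):
    """Add white noise at a target signal-to-noise ratio (in dB)."""
    rng = np.random.default_rng(seed)
    signal_power = np.mean(y ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    return y + rng.normal(0.0, np.sqrt(noise_power), size=y.shape)

def time_stretch(y, rate):
    """Naive time stretch by linear interpolation (rate > 1 shortens).
    Unlike librosa.effects.time_stretch, this also shifts pitch."""
    n_out = int(len(y) / rate)
    idx = np.linspace(0, len(y) - 1, n_out)
    return np.interp(idx, np.arange(len(y)), y)

sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
y = np.sin(2 * np.pi * 440 * t)      # 1 s, 440 Hz test tone
print(len(time_stretch(y, 0.8)))     # 20000 samples (slower)
print(len(time_stretch(y, 1.25)))    # 12800 samples (faster)
```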
- Call
- Camera
- Down
- Left
- Message
- Music
- Off
- On
- Receive
- Right
- Turn
- Up
- Volume
ResNet-50
(Data collection is ongoing; hyperparameter tuning will follow once training on the full dataset is complete.)
Performance metrics: Accuracy, F1 score
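Both metrics can be computed with scikit-learn; the labels below are toy values over 3 of the 13 classes, purely for illustration:

```python
from sklearn.metrics import accuracy_score, f1_score

# Toy ground-truth and predicted class indices (illustration only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

acc = accuracy_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred, average="macro")  # mean F1 over classes
print(f"accuracy={acc:.3f}  macro-F1={f1:.3f}")
# accuracy=0.667  macro-F1=0.656
```

Macro averaging weights each of the 13 word classes equally, which is appropriate when the classes are roughly balanced, as in this dataset.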
(Learning curves and metric results will be added here.)