- Python>=3.7.1
- tensorflow>=2.1.0
- opencv-python>=4.2.0
- dlib
- moviepy>=1.0.1
- numpy>=1.18.1
- Pillow
- matplotlib
- tqdm
- pyDot
- seaborn
- scikit-learn
- imutils>=0.5.3
Note: All dependencies can be found inside 'setup.py'
- GP DataSet/
  | --> align/
  | --> video/
        | --> Videos-After-Extraction/
        |     | --> S1/
        |     | --> ....
        |     | --> S20/
        | --> New-DataSet-Videos/
              | --> S1/
              |     | --> Adverb/
              |     | --> Alphabet/
              |     | --> Colors/
              |     | --> Commands/
              |     | --> Numbers/
              |     | --> Prepositions/
              | --> ....
              | --> S20/
We use the GRID Corpus dataset, which is publicly available at this link.
You can download the dataset using our script GridCorpus-Downloader.sh, which was adapted from the code provided here.
To download it, run the following command in your terminal:
bash GridCorpus-Downloader.sh FirstSpeaker SecondSpeaker
where FirstSpeaker and SecondSpeaker are integers giving the range of speakers to download (e.g., bash GridCorpus-Downloader.sh 1 20 downloads speakers 1 through 20).
- NOTE: Speaker 21 is missing from the GRID Corpus dataset due to technical issues.
- Run DatasetSegmentation.py
- Run Pre-Processing/frameManipulator.py
* After running the scripts above, all resulting videos are 1 second long at 30 FPS; a sketch of this normalization step is shown below.
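The exact logic lives in Pre-Processing/frameManipulator.py; below is only a minimal sketch, assuming moviepy (listed in the dependencies) is used to trim each clip to 1 second and re-encode it at 30 FPS. The function name and file paths are placeholders:

```python
from moviepy.editor import VideoFileClip

def normalize_clip(in_path, out_path, target_duration=1.0, target_fps=30):
    """Trim a clip to target_duration seconds and re-encode it at target_fps."""
    clip = VideoFileClip(in_path)
    # Clamp the end time so clips shorter than 1 second are not over-trimmed.
    end = min(target_duration, clip.duration)
    trimmed = clip.subclip(0, end)
    # write_videofile resamples the output at the requested frame rate.
    trimmed.write_videofile(out_path, fps=target_fps, audio=False)
    clip.close()

# Example call (paths are placeholders):
# normalize_clip("GP DataSet/video/S1/bbaf2n.mpg", "output/bbaf2n.mp4")
```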
- Model code can be found in the "NN-Models" directory.
- First, change the common path value to the directory containing your training and test data.
- Run each network to start training; a minimal sketch of this setup is shown after this list.
- Early stopping was used to halt training at the model's best validation accuracy.
- The resulting accuracies after training can be found in Project Accuracies, or in the illustration below.
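A minimal sketch of the training setup described above, assuming the models are standard tf.keras models; the common path value, the patience, and the monitored metric are placeholders rather than the project's exact settings:

```python
import tensorflow as tf

# Placeholder: point this at the directory that holds your training and test data.
COMMON_PATH = "GP DataSet/video/New-DataSet-Videos/"

# Early stopping halts training once validation accuracy stops improving
# and restores the weights from the best epoch.
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_accuracy",
    patience=5,                 # hypothetical patience value
    restore_best_weights=True,
)

# model.fit(train_data,
#           validation_data=val_data,
#           epochs=100,
#           callbacks=[early_stopping])
```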
All of our networks share the same architecture; the only difference between them is the output layer, as shown in the following illustration.
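As a rough illustration of this point, the sketch below keeps one builder function and changes only the size of the final layer; the backbone layers and the input shape (30 frames of 64x64 grayscale) are assumptions, not the project's exact architecture:

```python
import tensorflow as tf

def build_network(num_classes, input_shape=(30, 64, 64, 1)):
    """Shared architecture; only the output layer differs between networks."""
    model = tf.keras.Sequential([
        # Assumed backbone: a small 3D CNN over 30 frames of 64x64 grayscale crops.
        tf.keras.layers.Conv3D(32, kernel_size=3, activation="relu",
                               input_shape=input_shape),
        tf.keras.layers.MaxPooling3D(pool_size=2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        # The only layer that changes from one word category to another.
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# e.g. one network per word category, differing only in num_classes:
# colors_model  = build_network(num_classes=4)    # hypothetical class counts
# numbers_model = build_network(num_classes=10)
```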
- Dataset preprocessing module
- Initial Convolutional Neural Networks' architecture
- Facial detection algorithm
- Optimization of the networks' architectures
- Unit testing of project files
- Proper documentation for the whole project