Speech to Text Translator

Speech to Text Translator is an application developed in python and trained using AWS Sagemaker in a ml.t3.2xlarge instance. The goal of this application is to translate from english to spanish language using the voice as input. The model used was the mT5 trained with data from https://www.kaggle.com/datasets/lonnieqin/englishspanish-translation-dataset.

Repository Structure

README.md: The file contais the description of the project.
main.py: Execute the Speech Translator.
recording_audio.py: The file records the audio using the computer's microphone. Then it saves the recorded audio as a .wav file.
speech_to_text.py: The file takes a recorded audio in .wav format and brings its text representation. The speech to text recognition is done using the Google Speech Recognition API.
load_finetuned_mt5_example.ipynb: The notebook ilustrates how to load the finetuned mT5 model and some examples of the translation from english to spanish.
training: The folder contains the notebook trained in AWS Sagemaker instance and the dataset
- flan_mt5_en_sp.ipynb: Notebook trained in AWS Sagemaker using a ml.t3.2xlarge instance. 10 000 samples where used for the training.
- data/english_to_spanish/dataset: The folder contains the dataset.

Note

The mT5 model was finetuned using 10 000 samples and it took 3h 49min in a ml.t3.2xlarge instance.

Future work

Retrain the model with more samples in order to increase the performance.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Speech to Text Translator

Repository Structure

Note

Future work

About

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
__pycache__		__pycache__
training		training
.gitignore		.gitignore
README.md		README.md
load_finetuned_mt5_example.ipynb		load_finetuned_mt5_example.ipynb
main.py		main.py
recording_audio.py		recording_audio.py
speech_to_text.py		speech_to_text.py

JorgeV20/Speech_to_Text_Translator

Folders and files

Latest commit

History

Repository files navigation

Speech to Text Translator

Repository Structure

Note

Future work

About

Topics

Resources

Stars

Watchers

Forks

Languages