-Prof. Shailesh Shrivastav
Name | Roll no. | Email Id |
---|---|---|
Priyanshu Patel | 30492 | priyanshu24ai042@satiengg.in |
Chandrabhan Bahetwar | 30484 | chandrabhan24ai013@satiengg.in |
Godavari Patle | 29978 | godavari24ai014@satiengg.in |
Shalinee Bakoriya | 30483 | shalinee24ai052@satiengg.in |
Savita Vamankar | 30512 | savita24ai051@satiengg.in |
Develop a generative model that can create synthetic speech samples for underrepresented languages and dialects. The model should be able to generate speech samples in a variety of languages and dialects, with a focus on those that are underrepresented in existing speech datasets. The generated samples can then be used to train and improve speech recognition models for these languages and dialects, promoting linguistic diversity and reducing language barriers in speech technology. If the language of the speaker is English, predict the native language of the speaker based on the accent.
Linguistic Accent is a web app that helps speakers of different languages translate and understand other accents and languages. It is a user-friendly, easily accessible question-answer application intended to save time for people who communicate across languages. In short, it is an accent and translation website for the linguistic community. Users can ask questions and translate English sentences into their own languages. People who want to speak in a different accent can also use it to communicate more easily with speakers of other languages.
This project will develop a generative model that can create synthetic speech samples for underrepresented languages and dialects. The model will be trained on a dataset of speech samples from a variety of languages and dialects. The generated samples will then be used to train and improve speech recognition models for these languages and dialects. This will help to promote linguistic diversity and reduce language barriers in speech technology.
Our project, "Generative Audio/Linguistics Accents," aims to develop a generative model together with a web interface built with HTML, CSS, JavaScript, and Bootstrap. The model will be deployed on Hugging Face. To achieve our objectives, we will use the gTTS (Google Text-to-Speech) and Translate libraries, and Gradio will provide the interactive interface to the model. Here are the key points of our project:
Develop a generative model for creating synthetic speech samples of underrepresented languages and dialects.
Generate speech samples in various languages and dialects, focusing on those lacking representation in existing speech datasets.
Improve speech recognition models for underrepresented languages, promoting linguistic diversity and reducing language barriers in speech technology.
Predict the native language of English speakers based on their accent.
Importance: The project contributes to language technology sustainability and preserves linguistic diversity.
Inclusivity: The model will generate speech samples for a wide range of languages and dialects, not limited to a few select ones.
As a team member in the "Generative Audio/Linguistics Accents" project, my role is Machine Learning Engineer. I will be responsible for developing the generative model in Python, using the gTTS and Translate libraries to handle speech synthesis and translation tasks. I will also deploy the model on Hugging Face. Collaborating with the frontend developers, I will help ensure a user-friendly interface built with HTML, CSS, JavaScript, and Bootstrap, and I will use Gradio to create an interactive web interface that lets users try out the model with ease. Together with the team, I will work toward the objective of promoting linguistic diversity by generating synthetic speech samples for underrepresented languages and dialects. By using this data to improve speech recognition models, our project aims to help bridge language barriers and foster sustainable language technology.
This repository contains an implementation/LinguisticAccent folder that holds all the code for our web app.
The project is divided into two small apps:
- Speech Synthesis
  - Text - where you write the sentence
  - Language Available - lists the supported languages from around the world
  - Accent - lists the accents of different countries
  - Output - shows the result for what the user asked
- Sentence Translation
  - Write the Sentence - the input sentence, in English only
  - Notifications - implements the notification functionality
  - Dest - lists the destination languages
  - Outer - shows the result for what the user asked
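The Sentence Translation fields listed above could map onto a small Python function. The following is only a minimal sketch, assuming the `translate` package from PyPI and its `Translator` class. The `DEST_LANGUAGES` table and the function names are hypothetical and not taken from the repository.

```python
# Hypothetical subset of the "Dest" choices: display name -> ISO 639-1 code.
DEST_LANGUAGES = {
    "Hindi": "hi",
    "Marathi": "mr",
    "Tamil": "ta",
    "French": "fr",
}


def dest_code(dest: str) -> str:
    """Return the language code for a display name, or raise ValueError."""
    try:
        return DEST_LANGUAGES[dest]
    except KeyError:
        raise ValueError(f"Unsupported destination language: {dest!r}") from None


def translate_sentence(sentence: str, dest: str) -> str:
    """Translate an English sentence into the chosen destination language."""
    # Imported lazily so dest_code() can be used without the package installed.
    from translate import Translator  # pip install translate

    translator = Translator(from_lang="en", to_lang=dest_code(dest))
    return translator.translate(sentence)
```

Calling `translate_sentence("How are you?", "Hindi")` contacts an online translation provider, so it needs network access. Validating the destination first keeps unsupported choices from reaching the provider.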
Some important files/folders in each of the apps mentioned above:
- app.py - describes each object's class.
- index.html - contains the web interface markup and its links
- style.css - contains the styles for the web interface
- main.js - contains the form used to create a new object of the given model/class or to edit an existing one.
Two more folders are present:
- final3 - contains the HTML files for the pages
- Assets - stores static files such as CSS and JavaScript files.
The core of the project is developing a generative model to create synthetic speech samples. Machine learning techniques will be used to train and optimize this model, enabling it to generate speech in different languages and dialects.
Data science plays a crucial role in the project, as it involves working with speech datasets to train the generative model. Data preprocessing, cleaning, and analysis are essential steps in preparing the data for the model.
Software development skills are needed to deploy the generative model and integrate it into a user-friendly web application. We will use technologies like HTML, CSS, JavaScript, and Bootstrap to build the web interface.
NLP techniques may be employed to enhance the model's ability to understand and generate speech samples in various languages and dialects.
HTML, CSS, JavaScript, and Bootstrap will be used to create the web interface for the project.
Hugging Face will be used to deploy the generative model.
Gradio is a framework for building web interfaces for machine learning models. It will provide an interactive and accessible way for users to interact with the generative model.
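As an illustration of how Gradio wires inputs to a Python function, here is a minimal sketch. The callback `handle_request` is a hypothetical placeholder that only echoes the request, and the dropdown choices are assumptions rather than the project's actual options.

```python
def handle_request(text: str, language: str, accent: str) -> str:
    """Placeholder callback: echoes the request instead of synthesizing audio."""
    if not text.strip():
        return "Please enter a sentence."
    return f"text={text!r}, language={language}, accent={accent}"


def build_demo():
    """Build a Gradio interface with text, language, and accent inputs."""
    # Imported lazily so handle_request() can be used without Gradio installed.
    import gradio as gr  # pip install gradio

    return gr.Interface(
        fn=handle_request,
        inputs=[
            gr.Textbox(label="Text"),
            # Assumed example choices; the real app may list many more.
            gr.Dropdown(choices=["en", "hi", "fr"], value="en",
                        label="Language Available"),
            gr.Dropdown(choices=["India", "Australia", "United Kingdom"],
                        value="India", label="Accent"),
        ],
        outputs=gr.Textbox(label="Output"),
        title="Linguistic Accent",
    )
```

Running `build_demo().launch()` starts a local server and prints its local URL in the terminal. In a real app the callback would return an audio file path and the output component would be `gr.Audio` instead of a text box.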
The gTTS library will be used to generate the speech samples.
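As a rough illustration of speech generation with gTTS, the sketch below saves a sentence as an MP3 file. gTTS is a client for Google Translate's text-to-speech service, and it selects regional English accents through the `tld` (Google domain) argument. The `ACCENT_TLDS` table and the function names are hypothetical helpers, not part of the repository.

```python
# Assumed subset of English accents, keyed by display name.
# Values are Google domains passed to gTTS as `tld`.
ACCENT_TLDS = {
    "Australia": "com.au",
    "United Kingdom": "co.uk",
    "India": "co.in",
    "Canada": "ca",
}


def accent_tld(accent: str) -> str:
    """Return the Google domain for an accent, falling back to gTTS's default."""
    return ACCENT_TLDS.get(accent, "com")


def synthesize(text: str, lang: str = "en", accent: str = "India",
               out_path: str = "speech.mp3") -> str:
    """Generate speech for `text` and save it as an MP3 file."""
    # Imported lazily so accent_tld() can be used without gTTS installed.
    from gtts import gTTS  # pip install gTTS

    gTTS(text=text, lang=lang, tld=accent_tld(accent)).save(out_path)
    return out_path
```

For example, `synthesize("Hello, how are you?", accent="Australia")` would write `speech.mp3` using the Australian English voice. The call needs network access because the audio is produced by the remote service.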
This library will be used to predict the native language of the speaker based on the accent.
Open the localhost web address that is printed in the terminal.