VoiceCloner

The Voice Cloner is a Python-based project that leverages Tacotron 2 and WaveGlow models for text-to-speech (TTS) synthesis and basic voice cloning. This project supports 22 official Indian languages, including Sanskrit, making it versatile for multilingual text input.

Voice Cloner Using Audio Synthesis Models

The project uses pre-trained models and is optimized to run on the CPU, so developers and enthusiasts can synthesize speech quickly without training a model first.

Features

1. Text-to-Speech Synthesis

Generate speech audio for the provided text.
Supports English and Indian languages such as Hindi, Bengali, Tamil, Telugu, and Sanskrit.
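
Because the input may arrive in several scripts, some text cleanup before synthesis is usually helpful. The following is a minimal sketch of such a step, not the project's actual `text_processing.py`; the function name `clean_text` and the choice of NFC normalization are assumptions made for illustration.

```python
import re
import unicodedata


def clean_text(text: str) -> str:
    """Hypothetical cleaner: normalize Unicode and tidy whitespace."""
    # NFC puts base characters and combining marks into one canonical form,
    # so visually identical strings compare equal.
    normalized = unicodedata.normalize("NFC", text)
    # Collapse any run of whitespace (spaces, tabs, newlines) into one space.
    return re.sub(r"\s+", " ", normalized).strip()


print(clean_text("  नमस्ते    दुनिया \n"))
```

A real preprocessing step would likely also handle numbers, abbreviations, and language-specific symbols, which this sketch leaves out.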

2. Basic Voice Cloning

Mimics voice patterns and generates audio with similar speech characteristics.

3. CLI-Based Interface

User-friendly command-line interface for generating and saving audio files.
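
As an illustration of what a command-line front end for this workflow could look like, here is a small sketch built on the standard `argparse` module. The flag names `--text`, `--lang`, and `--out` are hypothetical; the project's own `main.py` may use different options or prompt for input interactively instead.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build an illustrative parser; all flags here are assumptions."""
    parser = argparse.ArgumentParser(
        description="Synthesize speech from text (illustrative sketch)."
    )
    parser.add_argument("--text", required=True, help="Text to synthesize")
    parser.add_argument("--lang", default="English", help="Language name")
    parser.add_argument(
        "--out",
        type=Path,
        default=Path("data/synthesized_audio/output.wav"),
        help="Where to save the generated .wav file",
    )
    return parser


# Parse an explicit argument list so the example runs without a shell.
args = build_parser().parse_args(["--text", "Hello", "--lang", "Hindi"])
print(f"Would synthesize {args.text!r} in {args.lang} to {args.out}")
```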

Project Folder Structure

Below is the organized structure of the project:

VoiceCloner/
├── data/
│   ├── samples/                  # Sample audio clips for cloning
│   └── synthesized_audio/        # Directory for storing generated audio
├── models/
│   ├── tacotron2/                # Pre-trained Tacotron 2 model for text-to-mel
│   └── waveglow/                 # Pre-trained WaveGlow model for audio synthesis
├── utils/
│   ├── __init__.py               # Initializes the utils module
│   ├── text_processing.py        # Cleans and preprocesses input text
│   ├── voice_cloning.py          # Core logic for voice synthesis
│   └── language_support.py       # Provides support for multiple languages
├── main.py                       # Entry point to run the project
├── requirements.txt              # Project dependencies
└── README.md                     # Detailed documentation (this file)

Installation Instructions

Follow these steps to set up and run the project on your local machine:

1. Clone the Repository

git clone https://github.com/thekartikeyamishra/VoiceCloner.git
cd VoiceCloner

2. Set Up the Environment

Create and activate a virtual environment:

python -m venv .venv
source .venv/bin/activate    # On Windows: .venv\Scripts\activate
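
To confirm that the virtual environment is active before installing anything, a quick check like the one below can help. It is not part of the project; the helper name `in_virtualenv` is made up for this example.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
```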

3. Install Dependencies

Install the required libraries:

pip install -r requirements.txt

4. Run the Application

Execute the CLI application:

python main.py

How It Works

Text Input: Enter any text and choose the desired language (supports 22 languages).
Speech Synthesis: Tacotron 2 converts the processed text into a mel spectrogram.
Audio Generation: The WaveGlow vocoder generates high-quality audio from the mel spectrogram.
Save Output: The generated audio is saved in the data/synthesized_audio/ directory as a .wav file.
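
The steps above can be sketched as a small pipeline. This is only an outline of the data flow under stated assumptions: `text_to_mel` and `mel_to_audio` are toy stand-ins, not Tacotron 2 or WaveGlow, and the sample rate, hop length, and `output.wav` file name are illustrative choices rather than values taken from the project.

```python
import math
import struct
import wave
from pathlib import Path

SAMPLE_RATE = 22050  # assumption: a common rate for public Tacotron 2 / WaveGlow checkpoints
HOP_LENGTH = 256     # assumption: audio samples produced per spectrogram frame


def text_to_mel(text):
    """Stand-in for Tacotron 2: one fake single-value 'frame' per character."""
    return [float(ord(ch) % 80) for ch in text]


def mel_to_audio(mel):
    """Stand-in for WaveGlow: a short sine tone per frame, samples in [-1, 1]."""
    samples = []
    for value in mel:
        freq = 200.0 + 10.0 * value
        for _ in range(HOP_LENGTH):
            t = len(samples) / SAMPLE_RATE
            samples.append(0.3 * math.sin(2 * math.pi * freq * t))
    return samples


def save_wav(samples, path):
    """Write mono 16-bit PCM audio to a .wav file, creating parent folders."""
    path.parent.mkdir(parents=True, exist_ok=True)
    pcm = b"".join(struct.pack("<h", int(s * 32767)) for s in samples)
    with wave.open(str(path), "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(pcm)


def synthesize(text, out_dir=Path("data/synthesized_audio")):
    """Run text -> mel -> audio -> .wav and return the output path."""
    out_path = Path(out_dir) / "output.wav"
    save_wav(mel_to_audio(text_to_mel(text)), out_path)
    return out_path
```

Calling `synthesize("Hello")` would write `data/synthesized_audio/output.wav` relative to the current directory. In a real run, the two stand-in functions would be replaced by calls into the loaded models.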

Dependencies

The project requires the following Python libraries:

torch
numpy
librosa

Install these dependencies using the provided requirements.txt.
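
After installation, a short check like the following can report whether the three listed libraries are importable. It is a convenience sketch, not a script shipped with the project; `missing_packages` is a hypothetical helper name.

```python
from importlib.util import find_spec

REQUIRED = ("torch", "numpy", "librosa")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
print("missing packages:", ", ".join(missing) if missing else "none")
```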

Supported Languages

The following languages are currently supported:

English
Hindi
Bengali
Tamil
Telugu
Gujarati
Malayalam
Marathi
Kannada
Punjabi
Odia
Assamese
Urdu
Sindhi
Sanskrit

Additional languages can be added in the future with phonetic support.
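
One simple way to represent the list above in code is a lookup table from language name to a short code. The sketch below uses ISO 639-1 codes, which is an assumption; the project's `language_support.py` may identify languages differently, and `resolve_language` is a hypothetical helper.

```python
# Language names from the list above, mapped to ISO 639-1 codes (an assumption).
LANGUAGE_CODES = {
    "English": "en",
    "Hindi": "hi",
    "Bengali": "bn",
    "Tamil": "ta",
    "Telugu": "te",
    "Gujarati": "gu",
    "Malayalam": "ml",
    "Marathi": "mr",
    "Kannada": "kn",
    "Punjabi": "pa",
    "Odia": "or",
    "Assamese": "as",
    "Urdu": "ur",
    "Sindhi": "sd",
    "Sanskrit": "sa",
}


def resolve_language(name: str) -> str:
    """Return the code for a language name, ignoring case and outer spaces."""
    key = name.strip().title()
    if key not in LANGUAGE_CODES:
        raise ValueError(f"Unsupported language: {name!r}")
    return LANGUAGE_CODES[key]


print(resolve_language(" hindi "))
```

Adding a language then amounts to adding one entry to the table, plus whatever phonetic support the models need for it.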

Future Enhancements

This is the basic version of the Voice Cloner. Future plans include:

GUI Support: A graphical interface for ease of use.
Advanced Voice Cloning: Speaker embedding for personalized voice synthesis.
Support for Additional Models: Integration with FastSpeech and other synthesis models.
Multi-Language Extensions: Support for more global languages.

Contributions

Contributions are welcome! Feel free to:

Fork the Repository.
Create a Feature Branch.
Submit a Pull Request with your improvements.

Contact

If you have any questions, feedback, or suggestions, feel free to reach out!

Let’s bring multilingual speech synthesis to the next level. 🚀
Star ⭐ the project if you find it useful!
