Breaking Language Barriers: Real-time Translation and Transcription with Voicenator
- Introduction
- Features
- Installation
- Usage
- Configuration
- Technologies Used
- Contributing
- License
- Contact
Voicenator is an AI-powered application that aims to break down language barriers through real-time translation and transcription. It combines the Web Speech API, Deepgram, and WebSockets to provide speech-to-text and text-to-speech, so that people who do not share a language can communicate more easily.
- Real-time Translation: Translate speech or text in real-time across multiple languages.
- Speech-to-Text: Accurate transcription of spoken words into text using Deepgram.
- Text-to-Speech: Convert written text into natural-sounding speech with SpeechSynthesis.
- Voice Cloning: Create high-quality AI clones of human voices.
- AI Dubbing: Automatically translate and dub videos in multiple languages.
- Transcription: Transcribe videos with high accuracy in over 20 languages.
- AI Avatar: Generate AI-driven video content.
- Text-to-Speech API: Utilize our API for natural-sounding text-to-speech conversions.
Prerequisites:
- Node.js (v14 or higher)
- npm (v6 or higher) or yarn
- Clone the repository:
git clone https://github.com/your-username/voicenator.git
cd voicenator
- Install dependencies:
npm install # or yarn install
- Create a .env file in the root directory and add the following variables:
REACT_APP_DEEPGRAM_API_KEY=your_deepgram_api_key
REACT_APP_WEB_SOCKET_URL=your_websocket_url
- Start the development server:
npm start # or yarn start
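A missing or mistyped variable is easier to diagnose if it fails at startup instead of at the first request. The sketch below shows one way to validate the two .env values. `loadConfig` and `VoicenatorConfig` are hypothetical names, not part of the Voicenator codebase, and accepting `http(s)://` alongside `ws(s)://` is an assumption made because Socket.io clients are commonly given HTTP URLs.

```typescript
// Hypothetical config loader; only the two variable names come from the .env step.
interface VoicenatorConfig {
  deepgramApiKey: string;
  webSocketUrl: string;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): VoicenatorConfig {
  const deepgramApiKey = env.REACT_APP_DEEPGRAM_API_KEY?.trim();
  const webSocketUrl = env.REACT_APP_WEB_SOCKET_URL?.trim();

  if (!deepgramApiKey) {
    throw new Error("REACT_APP_DEEPGRAM_API_KEY is missing from .env");
  }
  // Assumption: ws(s):// and http(s):// are both acceptable URL schemes here.
  if (!webSocketUrl || !/^(wss?|https?):\/\//i.test(webSocketUrl)) {
    throw new Error("REACT_APP_WEB_SOCKET_URL must be a ws(s):// or http(s):// URL");
  }
  return { deepgramApiKey, webSocketUrl };
}
```

The `REACT_APP_` prefix suggests a Create React App-style build, where the call would be `loadConfig(process.env)`. Under Vite's defaults only `VITE_`-prefixed variables are exposed on `import.meta.env`, so either the prefix or Vite's `envPrefix` option would need adjusting.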
- Real-time Translation: Select your source and target languages, then start speaking or typing to see instant translations.
- Speech-to-Text: Use the microphone button to start speaking and see your words transcribed in real-time.
- Text-to-Speech: Enter text into the input field and press the play button to hear the speech output.
- Voice Cloning, AI Dubbing, and other advanced features: Navigate through the application menu to access and utilize these functionalities.
Voicenator utilizes the Web Speech API for speech recognition and synthesis. Ensure your browser supports this API.
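Browser support can be checked at runtime before enabling the microphone or playback controls. The following is a minimal sketch; `detectSpeechSupport` and its types are hypothetical names, while `speechSynthesis`, `SpeechRecognition`, and `webkitSpeechRecognition` are the globals browsers expose for the Web Speech API.

```typescript
// Minimal shape of the global object we inspect; a browser `window` satisfies it.
interface SpeechGlobals {
  speechSynthesis?: unknown;
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
}

interface SpeechSupport {
  synthesis: boolean;
  recognition: boolean;
}

function detectSpeechSupport(globals: SpeechGlobals): SpeechSupport {
  return {
    synthesis: globals.speechSynthesis !== undefined,
    // Some browsers only expose the vendor-prefixed recognition constructor.
    recognition:
      globals.SpeechRecognition !== undefined ||
      globals.webkitSpeechRecognition !== undefined,
  };
}
```

In the browser this would be called as `detectSpeechSupport(window)`. Taking the global object as a parameter keeps the check testable outside a browser.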
Deepgram provides the backend for speech-to-text functionality. Sign up on Deepgram's website to get your API key and add it to your .env file.
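To illustrate what the key is used for, here is a sketch of two helpers for Deepgram's live transcription: one builds the streaming URL, the other extracts text from a result message. It reflects the streaming endpoint and response shape as commonly documented by Deepgram and should be checked against the current API reference. The function and type names are hypothetical, and this is not necessarily how Voicenator itself talks to Deepgram.

```typescript
// Assumed options, mapped onto Deepgram's streaming query parameters.
interface ListenOptions {
  language: string; // e.g. "en-US"
  punctuate?: boolean;
  interimResults?: boolean;
}

function buildListenUrl(options: ListenOptions): string {
  const params: string[] = [`language=${encodeURIComponent(options.language)}`];
  if (options.punctuate !== undefined) {
    params.push(`punctuate=${String(options.punctuate)}`);
  }
  if (options.interimResults !== undefined) {
    params.push(`interim_results=${String(options.interimResults)}`);
  }
  return `wss://api.deepgram.com/v1/listen?${params.join("&")}`;
}

interface Transcript {
  text: string;
  isFinal: boolean;
}

// Returns null for malformed JSON, non-result messages, and empty transcripts.
function parseTranscript(raw: string): Transcript | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  const message = data as {
    is_final?: boolean;
    channel?: { alternatives?: { transcript?: string }[] };
  } | null;
  const text = message?.channel?.alternatives?.[0]?.transcript;
  if (typeof text !== "string" || text.length === 0) return null;
  return { text, isFinal: message?.is_final === true };
}
```

A browser client would open a WebSocket to the URL from `buildListenUrl`, send microphone audio chunks, and pass each incoming message's data to `parseTranscript`. To my knowledge, Deepgram's browser examples authenticate by passing the key as a WebSocket subprotocol (`new WebSocket(url, ["token", apiKey])`); treat that detail as an assumption to verify.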
WebSockets are used for real-time data transmission. Configure the WebSocket URL in your .env file.
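Real-time connections drop from time to time, so a client usually retries with increasing delays. The sketch below computes a capped exponential backoff. `reconnectDelay` and `BackoffOptions` are hypothetical names, and nothing in this README confirms that Voicenator reconnects this way. Socket.io's client already ships its own reconnection logic (configurable through options such as `reconnectionDelay` and `reconnectionDelayMax`), so a helper like this mainly applies when using a plain WebSocket.

```typescript
interface BackoffOptions {
  baseMs: number; // delay before the first retry
  maxMs: number; // upper bound for any single delay
}

// attempt is zero-based: 0 for the first retry, 1 for the second, and so on.
function reconnectDelay(attempt: number, options: BackoffOptions): number {
  const exponent = Math.max(0, Math.floor(attempt));
  const delay = options.baseMs * Math.pow(2, exponent);
  return Math.min(options.maxMs, delay);
}
```

With `{ baseMs: 500, maxMs: 10000 }`, the delays grow as 500, 1000, 2000, 4000, 8000 and then stay at 10000 milliseconds.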
- React: Frontend framework
- TypeScript: Static typing for JavaScript
- Redux: State management
- Web Speech API: Browser API for speech recognition and synthesis
- Deepgram: Speech-to-text API
- Socket.io: WebSocket library for real-time communication
- Vite: Build tool for frontend development
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create your feature branch:
git checkout -b feature/YourFeature
- Commit your changes:
git commit -m 'Add YourFeature'
- Push to the branch:
git push origin feature/YourFeature
- Open a pull request
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
For questions or suggestions, please reach out to us:
- Email: ayushsoni1010.work@gmail.com
- Website: https://ayushsoni1010.com
Thank you for using Voicenator! We hope this tool makes your communication more effective and breaks down language barriers effortlessly.