layout | background-class | body-class | category | title | summary | image | author | tags | github-link | github-id | featured_image_1 | accelerator | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
hub_detail |
hub-background |
hub |
researchers |
Silero Text-To-Speech Models |
A set of compact enterprise-grade pre-trained TTS Models for multiple languages |
silero_logo.jpg |
Silero AI Team |
|
snakers4/silero-models |
no-image |
cuda-optional |
```bash
# this assumes that you have a proper version of PyTorch already installed
pip install -q torchaudio omegaconf
```
```python
import torch

language = 'en'
speaker = 'lj_16khz'
device = torch.device('cpu')

# Download the model and its helper objects from the repo via PyTorch Hub
model, symbols, sample_rate, example_text, apply_tts = torch.hub.load(repo_or_dir='snakers4/silero-models',
                                                                      model='silero_tts',
                                                                      language=language,
                                                                      speaker=speaker)
model = model.to(device)  # gpu or cpu

# Synthesize speech for a batch of texts (here, the bundled example sentence)
audio = apply_tts(texts=[example_text],
                  model=model,
                  sample_rate=sample_rate,
                  symbols=symbols,
                  device=device)
```
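To keep the synthesized speech, the samples can be written to a WAV file. The sketch below uses only the Python standard library and is not part of the Silero API: `save_wav` is a hypothetical helper, and it assumes that each element of `audio` is a one-dimensional float waveform with values in [-1, 1]. A generated tone stands in for the model output so the snippet runs on its own.

```python
import math
import os
import struct
import tempfile
import wave


def save_wav(samples, sample_rate, path):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip so the int16 conversion cannot overflow
        frames += struct.pack('<h', int(round(s * 32767)))
    with wave.open(path, 'wb') as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))


# Stand-in for the model output: one second of a 440 Hz tone at 16 kHz.
# With the real model, `audio[0].tolist()` would take its place (assumption).
demo_rate = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / demo_rate) for n in range(demo_rate)]
wav_path = os.path.join(tempfile.gettempdir(), 'silero_demo_tone.wav')
save_wav(tone, demo_rate, wav_path)
```

With the example above, a call such as `save_wav(audio[0].tolist(), sample_rate, 'speech.wav')` would be the likely usage, provided the assumption about the output shape holds.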
Silero Text-To-Speech models provide enterprise-grade TTS in a compact form factor for several commonly spoken languages:
- One-line usage
- Natural-sounding speech
- No GPU or training required
- Minimalistic, with few dependencies
- A library of voices in many languages
- Support for 16 kHz and 8 kHz out of the box
- High throughput on slow hardware. Decent performance on one CPU thread
As of this page update, speakers of the following languages are supported at both 8 kHz and 16 kHz:
- Russian (6 speakers)
- English (1 speaker)
- German (1 speaker)
- Spanish (1 speaker)
- French (1 speaker)
For the up-to-date language list, please visit our repo and see the yml file, which lists all available checkpoints.
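Once that yml file has been loaded into a nested mapping, the available language and speaker pairs can be enumerated in a few lines. This is a rough sketch, not the repo's own tooling: `list_tts_checkpoints` is a hypothetical helper, the top-level `tts_models` key and the language-then-speaker nesting are assumptions about the file layout, and the sample data (including the `lj_8khz` and `example_speaker_16khz` names) is made up for illustration.

```python
def list_tts_checkpoints(config):
    """Return sorted (language, speaker) pairs from a models.yml-style mapping.

    Assumes a layout of {'tts_models': {language: {speaker: ...}}}.
    """
    pairs = []
    for language, speakers in config.get('tts_models', {}).items():
        for speaker in speakers:
            pairs.append((language, speaker))
    return sorted(pairs)


# Hypothetical excerpt mirroring the assumed layout; not the real file contents.
sample_config = {
    'tts_models': {
        'en': {'lj_16khz': {}, 'lj_8khz': {}},
        'de': {'example_speaker_16khz': {}},
    }
}

for language, speaker in list_tts_checkpoints(sample_config):
    print(language, speaker)
```

Since `omegaconf` is already installed above, a local copy of the file could be turned into a plain dictionary with `OmegaConf.to_container(OmegaConf.load(path))` before passing it to the helper.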
For additional examples and other model formats, please visit this link. For quality and performance benchmarks, please see the wiki. These resources will be updated from time to time.