Text-to-speech in PyTorch. Implementation of Tacotron 2 and WaveNet. Based on dl-start-pack.
Tacotron-2 core features:
- PyTorch Lightning training driven by configs in src/tacotron2/configs
- Wandb Logging
- Guided attention
- Monotonic attention
- Configurable teacher forcing ratio (default 1.0, i.e. fully teacher-forced)
- Training on the LJSpeech dataset
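
As an illustration of the guided attention feature listed above, here is a minimal sketch of the commonly used guided attention penalty, W[t][n] = 1 - exp(-((n/N - t/T)^2) / (2 g^2)), which is near zero along the diagonal and grows away from it. It is written in plain Python for readability and is not taken from this repository: the function names, the default `g=0.2`, and the `[frames][text]` layout are assumptions, and the actual training code presumably operates on batched tensors.

```python
import math


def guided_attention_weights(n_text, n_frames, g=0.2):
    """Build the penalty matrix W[t][n] = 1 - exp(-((n/N - t/T)^2) / (2 g^2)).

    n_text (N) is the number of encoder steps, n_frames (T) the number of
    decoder steps. g controls how wide the low-penalty diagonal band is
    (0.2 is an assumed default, not a value read from this repo).
    """
    return [
        [
            1.0 - math.exp(-((n / n_text - t / n_frames) ** 2) / (2.0 * g * g))
            for n in range(n_text)
        ]
        for t in range(n_frames)
    ]


def guided_attention_loss(attention, g=0.2):
    """Mean of attention * penalty for one alignment given as [n_frames][n_text]."""
    n_frames = len(attention)
    n_text = len(attention[0])
    weights = guided_attention_weights(n_text, n_frames, g)
    total = sum(
        a * w
        for a_row, w_row in zip(attention, weights)
        for a, w in zip(a_row, w_row)
    )
    return total / (n_frames * n_text)


# A perfectly diagonal alignment is not penalised; a reversed one is.
diagonal = [[1.0 if n == t else 0.0 for n in range(4)] for t in range(4)]
reversed_alignment = [row[::-1] for row in diagonal]
```

In this sketch the term would be added to the main spectrogram loss with some weight, nudging the attention toward a roughly monotonic, diagonal alignment early in training.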
WaveNet:
- PyTorch Lightning training driven by configs in src/wavenet/configs
- Wandb Logging
- Additional input for mel spectrograms
- Fast inference (in progress)
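
To make the mel-spectrogram input above more concrete: a vocoder that predicts one audio sample at a time needs one conditioning vector per sample, while a mel spectrogram has one frame per hop. A simple way to bridge the two is to repeat each frame `hop_length` times. The sketch below shows only that idea in plain Python; `upsample_mel` and `hop_length` are hypothetical names, and this repository may upsample differently (for example with learned transposed convolutions).

```python
def upsample_mel(mel_frames, hop_length):
    """Repeat every mel frame hop_length times.

    mel_frames is a list of frames (each a list of mel-bin values).
    The result has len(mel_frames) * hop_length entries, i.e. one
    conditioning vector per audio sample.
    """
    upsampled = []
    for frame in mel_frames:
        # Each repeated entry refers to the same frame values.
        upsampled.extend([frame] * hop_length)
    return upsampled


# Two frames with two mel bins each, hop of 3 samples.
mel = [[1.0, 2.0], [3.0, 4.0]]
conditioning = upsample_mel(mel, hop_length=3)
```

Here `conditioning` has six entries: the first three are the first frame and the last three are the second.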
Some results and weights (debug runs in Colab only; full training requires much more time):