Best practice: whisper + bark + Retrieval-based-Voice-Conversion-WebUI
Two options: whisper first, sherpa-ncnn as the fallback
- whisper: https://github.com/openai/whisper.git
- Next-generation Kaldi
- wenet:https://github.com/wenet-e2e/wenet.git
- Retalker
- paddlespeech
- kaldi
- funasr
- speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
- SummerAsr: based on DeepSpeech2
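
The "whisper first, sherpa-ncnn as the fallback" preference noted above can be sketched as a small backend selector. This is only a sketch under stated assumptions: the import names `whisper` (from the `openai-whisper` package) and `sherpa_ncnn` are assumed to match how the packages are installed, and the helper names `installed_backends`, `pick_backend`, and `transcribe_with_whisper` are hypothetical, not part of either project.

```python
import importlib.util

# Preference order taken from the note above: whisper first, sherpa-ncnn second.
# The import names are assumptions about how each package is installed.
PREFERRED_BACKENDS = ("whisper", "sherpa_ncnn")


def installed_backends(candidates=PREFERRED_BACKENDS):
    """Return the candidate modules that can be imported in this environment."""
    return {name for name in candidates if importlib.util.find_spec(name) is not None}


def pick_backend(available, preferred=PREFERRED_BACKENDS):
    """Return the first preferred backend found in `available`, or None."""
    for name in preferred:
        if name in available:
            return name
    return None


def transcribe_with_whisper(audio_path, model_name="base"):
    """Transcribe one audio file with openai-whisper (imported lazily)."""
    import whisper  # needs `pip install openai-whisper` and ffmpeg on PATH

    model = whisper.load_model(model_name)
    return model.transcribe(audio_path)["text"]


if __name__ == "__main__":
    # Prints "whisper", "sherpa_ncnn", or None depending on what is installed.
    print(pick_backend(installed_backends()))
```

The whisper import is kept inside the function so the selector still runs on a machine where only sherpa-ncnn is installed.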
Pick one of three: edge-tts, libvits-ncnn/SummerTTS, bark.cpp
- AudioCraft
- bark
- StyleTTS
- StyleTTS performance is excellent: 8 s of audio can be synthesized in 100 ms
- https://github.com/yl4579/StyleTTS
- https://styletts.github.io/
- https://huggingface.co/spaces/yl4579/StyleTTS
- vits
- coqui-tts
- https://github.com/coqui-ai/TTS
- Integrates the mainstream TTS projects
- voicebox
- Microsoft Azure: paid service
- edge-tts:https://github.com/rany2/edge-tts
- Mega-TTS 2
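- Mega-TTS 2, released jointly by Zhejiang University and ByteDance, is described as the strongest Chinese speech AI at the time of writing, with excellent handling of timbre and prosody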
- fastspeech2+hifigan
- Tacotron 2
- https://github.com/Plachtaa/VALL-E-X
- so-vits-svc
- https://github.com/svc-develop-team/so-vits-svc
- Best results, but requires training
- AI singing: Singing Voice Conversion
- Voice cloning: Real-Time-Voice-Cloning
- MockingBird
- https://github.com/babysor/MockingBird
- Based on Real-Time-Voice-Cloning
- Mediocre results
- bark can also do this
- Retrieval-based-Voice-Conversion-WebUI
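- Train an AI voice changer in 10 minutes, lowering the barrier to timbre cloning even further. A one-click training package has been released.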
- https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI
- FreeVC
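
To put the StyleTTS figure quoted above (8 s of audio in 100 ms) on a comparable scale, it can be converted into a real-time factor (RTF), the ratio of synthesis time to audio duration. The snippet is a rough sketch: the two numbers come straight from the note, the hardware behind them is not stated, and `real_time_factor` is a hypothetical helper name.

```python
def real_time_factor(synthesis_seconds, audio_seconds):
    """RTF = time spent synthesizing / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


# Figures as quoted in the note: 100 ms to synthesize 8 s of audio.
rtf = real_time_factor(synthesis_seconds=0.1, audio_seconds=8.0)
speedup = 1.0 / rtf
print(f"RTF = {rtf:.4f}, about {speedup:.0f}x faster than real time")
```

An RTF below 1 means synthesis is faster than playback. The same helper can be applied to timings measured for the other TTS candidates in the list.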