Skip to content

Latest commit

 

History

History
130 lines (96 loc) · 3.92 KB

README.md

File metadata and controls

130 lines (96 loc) · 3.92 KB

Pretrained Models Dependency

The models dependency of Amphion are as follows (sort alphabetically):

The instructions about how to download them is displayed as follows.

Amphion Singing BigVGAN

We fine-tune the official BigVGAN pretrained model with over 120 hours singing voice data. The fine-tuned checkpoint can be downloaded here. You need to download the 400000.pt and args.json files into Amphion/pretrained/bigvgan:

Amphion
 ┣ pretrained
 ┃ ┣ bivgan
 ┃ ┃ ┣ 400000.pt
 ┃ ┃ ┣ args.json

Amphion Speech HiFi-GAN

We trained our HiFi-GAN pretrained model with 685 hours speech data. Which can be downloaded here. You need to download the whole folder of hifigan_speech into Amphion/pretrained/hifigan.

Amphion
 ┣ pretrained
 ┃ ┣ hifigan
 ┃ ┃ ┣ hifigan_speech
 ┃ ┃ ┃ ┣ log
 ┃ ┃ ┃ ┣ result
 ┃ ┃ ┃ ┣ checkpoint
 ┃ ┃ ┃ ┣ args.json

Amphion DiffWave

We trained our DiffWave pretrained model with 125 hours speech data and around 80 hours of singing voice data. Which can be downloaded here. You need to download the whole folder of diffwave into Amphion/pretrained/diffwave.

Amphion
 ┣ pretrained
 ┃ ┣ diffwave
 ┃ ┃ ┣ diffwave_speech
 ┃ ┃ ┃ ┣ samples
 ┃ ┃ ┃ ┣ checkpoint
 ┃ ┃ ┃ ┣ args.json

ContentVec

You can download the pretrained ContentVec model here. Note that we use the ContentVec_legacy-500 classes checkpoint. Assume that you download the checkpoint_best_legacy_500.pt into the Amphion/pretrained/contentvec.

Amphion
 ┣ pretrained
 ┃ ┣ contentvec
 ┃ ┃ ┣ checkpoint_best_legacy_500.pt

WeNet

You can download the pretrained WeNet model here. Take the wenetspeech pretrained checkpoint as an example, assume you download the wenetspeech_u2pp_conformer_exp.tar into the Amphion/pretrained/wenet. Unzip it and modify its configuration file as follows:

cd Amphion/pretrained/wenet

### Unzip the expt dir
tar -xvf wenetspeech_u2pp_conformer_exp.tar.gz

### Specify the updated path in train.yaml
cd 20220506_u2pp_conformer_exp
vim train.yaml
# TODO: Change the value of "cmvn_file" (Line 2) to the absolute path of the `global_cmvn` file. (Eg: [YourPath]/Amphion/pretrained/wenet/20220506_u2pp_conformer_exp/global_cmvn)

The final file struture tree is like:

Amphion
 ┣ pretrained
 ┃ ┣ wenet
 ┃ ┃ ┣ 20220506_u2pp_conformer_exp
 ┃ ┃ ┃ ┣ final.pt
 ┃ ┃ ┃ ┣ global_cmvn
 ┃ ┃ ┃ ┣ train.yaml
 ┃ ┃ ┃ ┣ units.txt

Whisper

The official pretrained whisper checkpoints can be available here. In Amphion, we use the medium whisper model by default. You can download it as follows:

cd Amphion/pretrained
mkdir whisper
cd whisper

wget https://openaipublic.azureedge.net/main/whisper/models/345ae4da62f9b3d59415adc60127b97c714f32e89e936602e85993674d08dcb1/medium.pt

The final file structure tree is like:

Amphion
 ┣ pretrained
 ┃ ┣ whisper
 ┃ ┃ ┣ medium.pt

RawNet3

The official pretrained RawNet3 checkpoints can be available here. You need to download the model.pt file and put it in the folder.

The final file structure tree is like:

Amphion
 ┣ pretrained
 ┃ ┣ rawnet3
 ┃ ┃ ┣ model.pt