Release Models for using prosody cloning and GAN-generated speaker embeddings · DigitalPhonetics/speaker-anonymization

This release contains all models for our latest pipeline version, which supports generating artificial speaker embeddings with a GAN, prosody cloning, and prosody modification using offsets.

Place the unzipped folders in a models directory located directly under the repository root. The structure should look as follows:

speaker-anonymization
   └─ models
        └─ anonymization
            └─ gan_style-embed
                └─ settings.json
                └─ style-embed_wgan.pt
        └─ asr
            └─ asr_branchformer_tts-phn_en.zip
        └─ tts
            └─ Aligner
                └─ aligner.pt
            └─ Embedding
                └─ embedding_function.pt
            └─ FastSpeech2_Multi
                └─ prosody_cloning.pt
            └─ HiFiGAN_combined
                └─ best.pt

Note: Do not unzip the ASR models; keep them as zip archives. They will be unzipped at runtime.
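
To check the layout before running the pipeline, a small script along these lines may help. It is only a sketch and not part of the repository: the function name find_missing_models is hypothetical, the file list is copied from the tree above, and it assumes the script is run from the repository root so that the models directory is at ./models.

```python
from pathlib import Path

# Expected files relative to the models directory, copied from the tree above.
EXPECTED_FILES = [
    "anonymization/gan_style-embed/settings.json",
    "anonymization/gan_style-embed/style-embed_wgan.pt",
    "asr/asr_branchformer_tts-phn_en.zip",  # must stay zipped
    "tts/Aligner/aligner.pt",
    "tts/Embedding/embedding_function.pt",
    "tts/FastSpeech2_Multi/prosody_cloning.pt",
    "tts/HiFiGAN_combined/best.pt",
]


def find_missing_models(models_dir):
    """Return the expected files that are not present under models_dir.

    Hypothetical helper for illustration; the pipeline itself does not
    provide this function.
    """
    root = Path(models_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumes the current working directory is the repository root.
    missing = find_missing_models("models")
    if missing:
        print("Missing model files:")
        for rel in missing:
            print("  models/" + rel)
    else:
        print("All expected model files are in place.")
```

If the ASR archive was unzipped by accident, the check reports asr/asr_branchformer_tts-phn_en.zip as missing, since it looks for the zip file itself.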
