
S-Align (Soft alignment for E2E Speech Translation)

The code is forked from Fairseq v0.12.3. For installation details, please refer to Fairseq.

Usage

Training scripts and configurations for the MuST-C dataset are as follows:

egs
|---machine_translation
|    |---train.sh
|    |---decode.sh
|    |---load_embedding.py
|---pretrain-all
|    |---joint_train_merge.sh
|    |---decode.sh
|    |---device_run.sh
|    |---conf

Step 1. MT Pretraining

• Prepare the MT training data.

• Modify the necessary paths in machine_translation/train.sh, then run machine_translation/train.sh to pretrain the MT model.

• Adjust the paths in machine_translation/decode.sh to match those in machine_translation/train.sh, then run machine_translation/decode.sh to run inference with your pretrained MT model.

• Use machine_translation/load_embedding.py to extract the required word embeddings from the pretrained MT model, as sketched below.
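
The embedding export is essentially a matter of reading the decoder's word-embedding matrix out of the MT checkpoint. The following is only a minimal sketch of that step, not the repository's load_embedding.py: the checkpoint filename, the parameter name, and the output format are assumptions, so check your checkpoint's state dict and prefer the provided script.

import torch

# Hypothetical paths; point these at your own pretrained MT checkpoint and output file.
mt_ckpt_path = "/your/path/to/mt/pretrain/model/checkpoint_best.pt"
embed_out_path = "/your/path/to/mt/word/embedding/decoder_embed.pt"

# Fairseq checkpoints keep the model weights under the "model" key.
state = torch.load(mt_ckpt_path, map_location="cpu")
model_state = state["model"]

# Standard Fairseq Transformer decoders store the target-side word embeddings here.
embed_weight = model_state["decoder.embed_tokens.weight"]
print(f"decoder embedding matrix: {tuple(embed_weight.shape)}")

# Save just the embedding tensor so the ST fine-tuning stage can load it.
torch.save(embed_weight, embed_out_path)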

Step 2. Multi-Task Fine-tuning

• Download the HuBERT-Base pretrained model (the version without fine-tuning).

• Prepare the MuST-C ST training data; please follow the instructions here.

• Modify the necessary paths in pretrain-all/conf/train_soft_alignment.yaml (a quick sanity check for these paths is sketched after this step list), such as:

w2v-path=/your/path/to/hubert
mt-model-path=/your/path/to/mt/pretrain/model
decoder-embed-path=/your/path/to/mt/word/embedding

• Set the data path and other required paths in pretrain-all/joint_train_merge.sh, then run pretrain-all/joint_train_merge.sh to fine-tune your model.

• Use pretrain-all/decode.sh to run inference with your fine-tuned model.
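
Before launching the fine-tuning run, it can help to confirm that the three paths set in the config actually exist. Below is a small sketch (not part of this repository) that scans train_soft_alignment.yaml for the keys shown above; the exact config format is an assumption, so adapt the parsing to your file.

import os
import re

conf_path = "pretrain-all/conf/train_soft_alignment.yaml"
wanted = ("w2v-path", "mt-model-path", "decoder-embed-path")

# Collect "key=value" or "key: value" lines for the paths we care about.
found = {}
with open(conf_path) as f:
    for line in f:
        m = re.match(r"\s*([\w-]+)\s*[:=]\s*(\S+)", line)
        if m and m.group(1) in wanted:
            found[m.group(1)] = m.group(2)

# Report whether each required checkpoint/embedding path is reachable.
for key in wanted:
    path = found.get(key)
    status = "ok" if path and os.path.exists(path) else "MISSING"
    print(f"{key}: {path} [{status}]")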

Citation
