This repo implements {FOMAML, Reptile, multi-task} pretraining interfaces for end-to-end ASR and provides a well-organized experiment flow.

Note: this repo is aligned with the paper, but applies meta learning to a cross-accent setting instead, since the corpus used in the paper's cross-language setting (IARPA-BABEL) is not free.
## Requirements

- torch==1.4.1
- numpy==1.18.3
- sentencepiece==0.1.85 (for subword units)
- comet-ml==3.1.6 (you should register using an NTU email)
- editdistance==0.5.3
- tqdm-logger==0.3.0 (repo)
- torchexp==0.1.0 (repo)
- torch_optimizer==0.0.1a11 (repo)
## Data

Download the corpus:
- If you are a member of the NTU speech lab, ask me.
- If not, please download the corpus from the Mozilla Common Voice project. I'll add the pre-processing steps after I finish my master's thesis lol. Basically, I used the espnet recipe for pre-processing (modified a little bit to add the ACCENT label), then used a script to pack the kaldi-format files into one numpy file (with memmap, to save RAM usage); a sketch of that packing step is below.
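For reference, here is a minimal sketch of that packing idea, assuming kaldiio (an espnet dependency) for reading the kaldi-format files; the function name and output layout are hypothetical, not the repo's actual script:

```python
# Hypothetical packing sketch (not the exact script used in this repo): read
# kaldi-format features with kaldiio and pack them into a single float32
# memmap, plus an index of (offset, n_frames) per utterance.
import numpy as np
import kaldiio

def pack_feats(scp_path, out_prefix):
    feats = kaldiio.load_scp(scp_path)                  # lazy dict: utt_id -> ndarray
    shapes = {k: feats[k].shape for k in feats.keys()}  # first pass: sizes only
    total = sum(s[0] for s in shapes.values())
    dim = next(iter(shapes.values()))[1]
    mm = np.memmap(f"{out_prefix}.dat", dtype="float32",
                   mode="w+", shape=(total, dim))
    index, offset = {}, 0
    for k, (n, _) in shapes.items():                    # second pass: copy frames
        mm[offset:offset + n] = feats[k]
        index[k] = (offset, n)
        offset += n
    mm.flush()
    np.save(f"{out_prefix}_index.npy", index)           # utt_id -> (offset, n_frames)
```

At training time the memmap can be opened read-only and sliced per utterance via the saved index, so only the needed frames are paged into RAM.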
## Setup

- Modify `COMET_PROJECT_NAME` and `COMET_WORKSPACE` in `src/marcos.py` to your own setting, e.g.:
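A minimal example, assuming the two names are plain module-level constants in `src/marcos.py` (the placeholder values below are mine):

```python
# src/marcos.py (excerpt; illustrative values, replace with your own)
COMET_PROJECT_NAME = "my-asr-project"   # your comet.ml project name
COMET_WORKSPACE = "my-workspace"        # your comet.ml workspace
```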
## Usage

- One experiment: (pretraining) -> training (`train.py`) -> testing/decoding (`train.py --test`) -> scoring (`score.sh`)
- Each trainer is instantiated with a different interface, which decides its training/pretraining behavior; the sketch below shows the general pattern.
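The idea in one minimal sketch, with hypothetical names (not the repo's exact classes):

```python
# Illustrative sketch of the pattern (hypothetical names): the trainer
# delegates its per-batch behavior to the interface it was constructed with,
# so mono training and pretraining can share one trainer.
import torch.nn.functional as F

class MonoInterface:
    def run(self, trainer, batch):               # plain supervised step
        loss = trainer.run_batch(batch)
        trainer.opt.zero_grad()
        loss.backward()
        trainer.opt.step()
        return loss.item()

class Trainer:
    def __init__(self, model, opt, interface):
        self.model, self.opt, self.interface = model, opt, interface

    def run_batch(self, batch):                  # model-specific forward/loss
        x, y = batch
        return F.cross_entropy(self.model(x), y)

    def step(self, batch):                       # behavior depends on interface
        return self.interface.run(self, batch)
```

A pretraining interface would implement the same `run` hook with a meta or multi-task update instead of a plain supervised step.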
- `train.py`:
  - mono-accent training (includes training from scratch or fine-tuning a pretrained model)
  - testing (aka decoding): add the `--test` flag
- `pretrain.py`: multi-accent pretraining (through meta learning or multi-task learning)
- Execute `run_foo.sh` (it will call `foo_full_exp.sh` to conduct one complete experiment) to conduct large-scale experiments on battleship.
- `score.sh` will call `translate.py` with the proper environment variables to execute sctk and evaluate the error rate.

## Directory structure

- `data/`: soft link to the data
- `config/`: config yamls
- `testing-logs/`: (you can also modify the name in `src/marcos.py`)
  - `pretrain`: stores info when executing `pretrain.py`
  - `evaluation`: stores info when executing `train.py`
  - `tensorboard`: as the title says (but NOTE that we use comet.ml to track the experiments, so most of the time we don't need this)
- main files:
  - `foo_trainer.py`: defines how to run one batch and some model-specific stuff; an `asr_model` is included inside
  - `tester.py`: defines how to decode (all models use the same tester)
  - `train_interface.py` and `mono_interface.py`: used in mono-accent training and fine-tuning
  - `pretrain_interface.py` and `fo_meta_interface.py`, `multi_interface.py`, ...: used in pretraining; see the sketch after this list
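A minimal sketch of what a first-order meta interface does conceptually, with hypothetical names (`fo_meta_step`, `loss_fn`, ...) rather than this repo's exact API: adapt a copy of the model on a support batch, then apply the query-loss gradient of the adapted copy to the shared initialization (first-order, so no second derivatives):

```python
# Hypothetical first-order meta step (FOMAML-style); names and details are
# illustrative, not this repo's exact interface.
import copy
import torch

def fo_meta_step(model, meta_opt, task_batches, loss_fn,
                 inner_lr=1e-3, inner_steps=1):
    meta_opt.zero_grad()
    for support, query in task_batches:          # one (support, query) pair per accent
        fast = copy.deepcopy(model)              # task-specific copy of the init
        inner_opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)
        for _ in range(inner_steps):             # inner-loop adaptation
            inner_opt.zero_grad()
            loss_fn(fast, support).backward()
            inner_opt.step()
        inner_opt.zero_grad()                    # clear inner grads before the outer pass
        loss_fn(fast, query).backward()          # first-order: grads at the adapted weights
        for p, fp in zip(model.parameters(), fast.parameters()):
            p.grad = fp.grad.clone() if p.grad is None else p.grad + fp.grad
    meta_opt.step()                              # apply summed task gradients to the init
```

Reptile replaces the query-loss gradient with the parameter difference between the initialization and the adapted copy, and a multi-task interface skips the inner loop entirely.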
- `modules/`: defines some components of the model (but a little bit deprecated now 😅)
- `io/`:
  - `data_loader`: used in mono-accent training
  - `data_container`: used in multi-accent training; see the sketch after this item
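A minimal sketch of the idea behind a multi-accent container, with hypothetical names (the actual class in `io/` differs): keep one loader per accent so the pretraining interfaces can sample a task batch on demand.

```python
# Hypothetical multi-accent data container (names are illustrative).
import random

class DataContainer:
    """Holds one DataLoader per accent and samples task batches for pretraining."""
    def __init__(self, loaders):                 # loaders: dict accent -> DataLoader
        self.loaders = loaders
        self.iters = {a: iter(dl) for a, dl in loaders.items()}

    def sample(self, accent=None):
        a = accent if accent is not None else random.choice(list(self.loaders))
        try:
            return a, next(self.iters[a])
        except StopIteration:                    # restart an exhausted accent
            self.iters[a] = iter(self.loaders[a])
            return a, next(self.iters[a])
```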
- `model/`:
  - transformer (NOTE: `transformer_pytorch` is what we use)
  - blstm (vgg-blstm fully-connected ctc model used in MetaASR)
- `monitor/`:
  - `dashboard.py` (comet.ml), `tb_dashboard` (tensorboard)
  - `logger.py`: tqdm_logger logging
  - `metric.py`: how to calculate the error rate; see the sketch after this list
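A minimal sketch of the usual computation, using the `editdistance` package listed in the requirements; the function name and normalization here are illustrative, not necessarily what `metric.py` does:

```python
# Illustrative error-rate computation with the editdistance package; the real
# metric.py may differ in details (e.g. token handling).
import editdistance

def error_rate(hyp_tokens, ref_tokens):
    """WER for word lists, CER for character lists: edits / reference length."""
    return editdistance.eval(hyp_tokens, ref_tokens) / max(len(ref_tokens), 1)

print(error_rate("hello word".split(), "hello world".split()))  # 0.5
```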
Feel free to use/modify the code; any bug report or improvement suggestion will be appreciated. If you find this project helpful for your research, please consider citing our paper, thanks!

## Citation
```
@inproceedings{hsu2020meta,
  title={Meta learning for end-to-end low-resource speech recognition},
  author={Hsu, Jui-Yang and Chen, Yuan-Jui and Lee, Hung-yi},
  booktitle={ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={7844--7848},
  year={2020},
  organization={IEEE}
}
```