The primary motivation of NeurST is to help NLP researchers get started with end-to-end speech translation (ST) and build advanced neural machine translation (NMT) models.
See here for a full list of NeurST examples. We also present recent progress in end-to-end ST technology at https://st-benchmark.github.io/.
NeurST is based on TensorFlow 2, and we are working on a PyTorch version.
March 29, 2022: Release of GigaST dataset: a large-scale speech translation corpus.
Aug 16, 2021: Release of models and results for the IWSLT 2021 offline ST and simultaneous translation tasks.
June 15, 2021: Integration of LightSeq for training speedup; see the experimental branch.
March 28, 2021: The v0.1.1 release includes instructions for weight pruning and quantization-aware training of transformer models, and several more features. See the release note for more details.
Dec. 25, 2020: The v0.1.0 release includes the overall design of the code structure and recipes for training end-to-end ST models. See the release note for more details.
- Production ready: Models trained with NeurST can be exported directly in the TensorFlow SavedModel format and served with TensorFlow Serving, so there is no gap between the research model and the production model. Additionally, LightSeq can be used to serve NeurST models with much lower latency.
- Light weight: NeurST is designed specifically for end-to-end ST and NMT models, with clean and simple code. It has no dependency on Kaldi, which simplifies installation and usage.
- Extensibility and scalability: NeurST is carefully designed for extensibility and scalability. It allows users to customize Model, Task, Dataset, etc. and combine them with each other.
- High computation efficiency: NeurST has high computation efficiency and can be further optimized by enabling mixed precision and XLA. Fast distributed training using BytePS/Horovod is also supported for large-scale scenarios.
- Reliable and reproducible benchmarks: NeurST reports strong baselines with well-designed hyper-parameters on several benchmark datasets (MT & ST). It provides a series of recipes to reproduce them.
NeurST provides reference implementations of various models and benchmarks. Please see the examples for model links and NeurST benchmarks on different datasets.
- Text Translation
- Speech-to-Text Translation
- Python version >= 3.6
- TensorFlow >= 2.3.0
Install NeurST from source:
git clone https://github.com/bytedance/neurst.git
cd neurst/
pip3 install -e .
If an ImportError occurs at runtime, manually install the missing package.
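Before installing, it can help to confirm that the environment meets the requirements listed above (Python >= 3.6, TensorFlow >= 2.3.0). The following is a minimal sketch using only the Python standard library. The helper names `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical and are not part of NeurST.

```python
import importlib
import re
import sys


def version_tuple(version):
    """Parse the leading numeric part of a version string into a 3-int tuple.

    For example, "2.3.0" and "2.3.0rc1" both become (2, 3, 0).
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("Unrecognized version string: {!r}".format(version))
    parts = [int(p) for p in match.group(0).split(".")][:3]
    parts += [0] * (3 - len(parts))  # pad "3.6" to (3, 6, 0)
    return tuple(parts)


def meets_minimum(version, minimum):
    """Return True if `version` is at least `minimum`."""
    return version_tuple(version) >= version_tuple(minimum)


def check_environment(min_python="3.6", min_tensorflow="2.3.0"):
    """Return a list of human-readable problems; an empty list means OK.

    The default minimums are taken from the requirements in this README.
    """
    problems = []
    python_version = "{}.{}.{}".format(*sys.version_info[:3])
    if not meets_minimum(python_version, min_python):
        problems.append(
            "Python {} found, but >= {} is required".format(python_version, min_python)
        )
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        problems.append("TensorFlow is not installed")
    else:
        if not meets_minimum(tf.__version__, min_tensorflow):
            problems.append(
                "TensorFlow {} found, but >= {} is required".format(
                    tf.__version__, min_tensorflow
                )
            )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Running the script prints one line per unmet requirement and prints nothing when both requirements are satisfied. It only checks the two requirements named in this README; other packages that raise ImportError still need to be installed manually.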
@InProceedings{zhao2021neurst,
author = {Chengqi Zhao and Mingxuan Wang and Qianqian Dong and Rong Ye and Lei Li},
booktitle = {the 59th Annual Meeting of the Association for Computational Linguistics (ACL): System Demonstrations},
title = {{NeurST}: Neural Speech Translation Toolkit},
year = {2021},
month = aug,
}
For any questions or suggestions, please feel free to contact us: zhaochengqi.d@bytedance.com, wangmingxuan.89@bytedance.com.
We thank Bairen Yi, Zherui Liu, Yulu Jia, Yibo Zhu, Jiaze Chen, Jiangtao Feng, and Zewei Sun for their kind help.