WeNet 1.0.0
Model
- Propose and support U2++, which uses both forward (left-to-right) and backward (right-to-left) information at training and decoding time.
- Support dynamic left chunk training and decoding, so the number of history chunks can be limited at decoding time to save memory and computation.
- Support distributed training.
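To illustrate the dynamic left chunk idea, here is a minimal, hypothetical sketch (not WeNet's actual implementation): each frame may only attend to frames in its own chunk plus a bounded number of history chunks, which is what keeps decoding memory and computation bounded. The helper name and pure-Python mask representation are assumptions for illustration.

```python
def chunk_attention_mask(size, chunk_size, num_left_chunks):
    """Hypothetical sketch of chunk-limited attention masking.

    Frame i may attend to frames in its own chunk and in at most
    `num_left_chunks` preceding chunks; everything else is masked out.
    Returns a size x size boolean matrix (True = attention allowed).
    """
    mask = [[False] * size for _ in range(size)]
    for i in range(size):
        chunk = i // chunk_size
        # Left boundary: keep only a limited number of history chunks.
        start = max((chunk - num_left_chunks) * chunk_size, 0)
        # Right boundary: the end of the current chunk (causal by chunk).
        end = min((chunk + 1) * chunk_size, size)
        for j in range(start, end):
            mask[i][j] = True
    return mask

# 6 frames, chunk size 2, at most 1 history chunk:
# frame 4 can see frames 2..5, but not frames 0..1.
mask = chunk_attention_mask(6, 2, 1)
```

With `num_left_chunks` set to a small value, per-frame attention cost stays constant regardless of utterance length; setting it large approximates full left context.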
Dataset
WeNet now supports the following five standard speech datasets, achieving SOTA or near-SOTA results.
Dataset | Language | Hours | Test Set | CER/WER | SOTA |
---|---|---|---|---|---|
aishell-1 | Chinese | 200 | test | 4.36 | 4.36 (WeNet) |
aishell-2 | Chinese | 1000 | test_ios | 5.39 | 5.39 (WeNet) |
multi-cn | Chinese | 2385 | / | / | / |
librispeech | English | 1000 | test_clean | 2.66 | 2.10 (ESPnet) |
gigaspeech | English | 10000 | test | 11.0 | 10.80 (ESPnet) |
Productivity
Here are some features related to productivity.
- LM support. WeNet can work with or without an LM, depending on your application/scenario.
- Timestamp support.
- N-best support.
- Endpoint support.
- gRPC support.
- Further refined x86 server and on-device Android recipes.
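As a rough illustration of what endpoint support means, here is a minimal, hypothetical sketch (not WeNet's actual endpointing rules): once speech has been observed, an endpoint is declared when the trailing run of silence frames exceeds a threshold. The function name, the frame-level silence input, and the threshold value are assumptions for illustration.

```python
def detect_endpoint(frame_is_silence, min_trailing_silence_frames=50):
    """Hypothetical endpointing sketch.

    `frame_is_silence` is a per-frame boolean sequence (True = silence).
    Declare an endpoint only after some speech has been seen and the
    current trailing silence run reaches the threshold.
    """
    seen_speech = False
    trailing_silence = 0
    for is_sil in frame_is_silence:
        if is_sil:
            trailing_silence += 1
        else:
            seen_speech = True
            trailing_silence = 0
        if seen_speech and trailing_silence >= min_trailing_silence_frames:
            return True
    return False

# 10 speech frames followed by 5 silence frames, threshold 5: endpoint fires.
fired = detect_endpoint([False] * 10 + [True] * 5, min_trailing_silence_frames=5)
```

Real endpointers typically combine several such rules (e.g. different silence thresholds before and after first speech, plus a maximum utterance length), but the trailing-silence rule above captures the core idea.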