Graphcore

Conformer for Speech Recognition

This PopART application implements a Speech Recognition model using Conformer blocks as described in this paper:

Conformer: Convolution-augmented Transformer for Speech Recognition.

Training currently uses the Connectionist Temporal Classification (CTC) loss. The CTC loss implementation is in custom_operators/ctc_loss.
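As a reference for what a CTC loss computes, here is a minimal pure-Python sketch of the standard CTC forward recursion. It is not the custom operator in custom_operators/ctc_loss; the function name `ctc_neg_log_likelihood` and its arguments are illustrative, and it works on plain probabilities for readability, whereas practical implementations usually work in log space for numerical stability.

```python
import math


def ctc_neg_log_likelihood(probs, target, blank=0):
    """Negative log-likelihood of `target` under CTC (illustrative sketch).

    probs:  list of T rows; probs[t][k] is the probability of symbol k at frame t.
    target: list of label ids, without blanks.
    """
    # Extended label sequence with blanks interleaved: [b, y1, b, y2, ..., b].
    ext = [blank]
    for label in target:
        ext += [label, blank]
    num_states = len(ext)

    # alpha[s]: total probability of all partial alignments ending in ext[s].
    alpha = [0.0] * num_states
    alpha[0] = probs[0][blank]
    if num_states > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new_alpha = [0.0] * num_states
        for s in range(num_states):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            # Skipping the blank is only allowed between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * probs[t][ext[s]]
        alpha = new_alpha

    # Valid alignments end on the last label or on the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if num_states > 1 else 0.0)
    return -math.log(likelihood) if likelihood > 0 else math.inf
```

For example, with two frames, a uniform distribution over {blank, label 1} and target `[1]`, three alignments are valid (`1 1`, `blank 1`, `1 blank`), so the likelihood is 3 × 0.25 = 0.75.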

How to train a conformer model

Prepare the environment.

Install the poplar-sdk following the README provided. Make sure to source the enable.sh scripts for poplar and popart.
Set up a virtual environment:

virtualenv venv -p python3.6
source venv/bin/activate

Install the required packages, such as torchaudio and librosa:

pip install -r requirements.txt

Dataset: We currently use the LibriSpeech dataset, a multi-speaker dataset of approximately 1000 hours of 16 kHz English speech. For more details, see http://www.openslr.org/12

To download the default training and test sets, change to a suitable directory and run:

wget http://www.openslr.org/resources/12/train-clean-100.tar.gz
tar -zxvf train-clean-100.tar.gz
wget http://www.openslr.org/resources/12/test-clean.tar.gz
tar -zxvf test-clean.tar.gz

Additional training, development and test sets are listed at http://www.openslr.org/12
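Once extracted, each LibriSpeech subset stores audio as `.flac` files next to a per-chapter `*.trans.txt` file whose lines have the form `<utterance-id> <TRANSCRIPT>`. The sketch below pairs each audio file with its transcript. The helper name `index_librispeech` is hypothetical and is not taken from librispeech_data.py, which may load the data differently.

```python
from pathlib import Path


def index_librispeech(subset_dir):
    """Map utterance id -> (flac path, transcript) for one extracted subset.

    subset_dir: for example /path/to/librispeech/LibriSpeech/train-clean-100
    """
    subset_dir = Path(subset_dir)
    index = {}
    # One transcript file per chapter directory: <speaker>-<chapter>.trans.txt
    for trans_file in sorted(subset_dir.rglob("*.trans.txt")):
        with open(trans_file, encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue
                # The first token is the utterance id, the rest is the text.
                utt_id, _, text = line.partition(" ")
                flac_path = trans_file.parent / (utt_id + ".flac")
                # Skip transcript entries whose audio file is missing.
                if flac_path.exists():
                    index[utt_id] = (flac_path, text)
    return index
```

Calling `index_librispeech` on the extracted `train-clean-100` directory and checking the size of the result is a quick way to confirm the download and extraction completed.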

Build the CTC custom op for training. Go to custom_operators/ctc_loss and run:

make all

Run the training program. Use the --model-conf-file option to select the model configuration, the --data-dir option to give the path to the data, and the --model-dir option to give the path where the trained model is saved. For example, to train the small model configuration, run:

python3 conformer_train.py --model-conf-file model_configs/small_model_conf_bs4.json --model-dir /path/to/trained/model --data-dir /path/to/librispeech

How to run inference with the conformer model

Inference requires the ctcdecode library for CTC beam search decoding. Install it with:

./install_ctcdecode.sh

Run the inference program. Use the --model-file option to select the trained model checkpoint and the --results-dir option to choose where the inference results are saved. For example, to test a trained model with the small configuration, run:

python3 conformer_inference.py --model-conf-file model_configs/small_model_conf_bs4.json --model-file /path/to/trained/model.onnx --data-dir /path/to/librispeech --results-dir /path/to/inference_results/

The ground-truth transcripts and the model predictions are saved in a .txt file in /path/to/inference_results/.
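To illustrate what CTC decoding does to the per-frame model outputs, here is a minimal sketch of greedy (best-path) decoding. This is a simpler technique than the beam search that ctcdecode performs, and it does not use the ctcdecode API. The function name, the `id_to_char` mapping and the choice of index 0 for the blank symbol are assumptions made for the example.

```python
def ctc_greedy_decode(frame_scores, id_to_char, blank=0):
    """Greedy CTC decoding: best symbol per frame, merge repeats, drop blanks.

    frame_scores: list of T rows; frame_scores[t][k] is the score of symbol k.
    id_to_char:   mapping from non-blank symbol id to its character.
    """
    # Pick the highest-scoring symbol independently at every frame.
    best_path = [max(range(len(row)), key=row.__getitem__) for row in frame_scores]

    decoded = []
    previous = None
    for label in best_path:
        # Emit a symbol only when it starts a new run and is not the blank.
        if label != previous and label != blank:
            decoded.append(id_to_char[label])
        previous = label
    return "".join(decoded)
```

A blank between two identical labels keeps them separate, so the best path `a a blank a b b blank` decodes to `aab`. Beam search keeps several candidate transcripts instead of committing to the single best symbol per frame.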

Run unit-tests

To run the unit tests, run:

pytest

Options

Use --help to show the available options. Here are a few other options:

--dataset the dataset to use. For training, this can be one of train-clean-100, train-clean-360 or train-other-500. For testing, this can be test-clean or test-other.

--num-epochs the number of epochs to run for training.

--select-ipu specifies the ID of the IPU or MultiIPU to use for the session.
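When launching several training runs from a script, the options above can be assembled programmatically. The following is a minimal sketch: `build_train_command` is a hypothetical helper, not part of this application, and it only uses flags named in this README. That --num-epochs takes an integer is an assumption; check `--help` for the exact argument formats.

```python
import shlex

# Training subsets listed for the --dataset option.
TRAIN_DATASETS = ("train-clean-100", "train-clean-360", "train-other-500")


def build_train_command(model_conf_file, model_dir, data_dir,
                        dataset=None, num_epochs=None):
    """Return the conformer_train.py command line as a list of arguments."""
    cmd = [
        "python3", "conformer_train.py",
        "--model-conf-file", str(model_conf_file),
        "--model-dir", str(model_dir),
        "--data-dir", str(data_dir),
    ]
    if dataset is not None:
        # Reject test subsets and typos before a long run starts.
        if dataset not in TRAIN_DATASETS:
            raise ValueError("unknown training dataset: %s" % dataset)
        cmd += ["--dataset", dataset]
    if num_epochs is not None:
        # Assumption: the option takes a plain integer.
        cmd += ["--num-epochs", str(int(num_epochs))]
    return cmd


def to_shell(cmd):
    """Render the argument list as a single shell-safe string for logging."""
    return " ".join(shlex.quote(arg) for arg in cmd)
```

The list form can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with paths that contain spaces.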

License

This application is licensed under the MIT license; see the LICENSE file at the top level of this repository.

The LibriSpeech dataset used here is licensed under the Creative Commons Attribution 4.0 International License. See http://www.openslr.org/12

The code for this application uses the ctcdecode library which is licensed under the MIT license. See https://github.com/parlance/ctcdecode/blob/v1.0/LICENSE
