Voice Activity Detection with `pyannote.audio` (ONNX version)

ONNX Runtime support for pyannote.audio

Installation

Only Python 3.8+ is supported.

# for CPU
pip install onnxruntime
# for GPU, check version on: https://onnxruntime.ai/docs/build/eps.html#cuda
pip install onnxruntime-gpu
# install pyannote.audio from this repository (run from the repository root)
pip install -e .
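
After installing, it can be useful to check which execution providers the installed ONNX Runtime build offers. The following is a minimal sketch and not part of this repository: `pick_providers` is a hypothetical helper, while `onnxruntime.get_available_providers()` is the ONNX Runtime call that lists the providers.

```python
def pick_providers(available):
    """Return providers in preference order: CUDA first (if offered), then CPU."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]


try:
    import onnxruntime as ort
except ImportError:
    # Neither onnxruntime nor onnxruntime-gpu is installed in this environment.
    print("onnxruntime is not installed")
else:
    available = ort.get_available_providers()
    print("available providers:", available)
    print("providers to request:", pick_providers(available))
```

The resulting list can be passed as the `providers` argument of `onnxruntime.InferenceSession`.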

1. Export ONNX from PyTorch model

# 1. Download the PyTorch model (.bin) from https://huggingface.co/pyannote/segmentation/blob/main/pytorch_model.bin
# (wget needs the /resolve/ URL; the /blob/ URL returns the HTML page, not the file)
wget https://huggingface.co/pyannote/segmentation/resolve/main/pytorch_model.bin -O pytorch_model/vad_model.bin
# 2. Export
python onnx/export_onnx.py -i pytorch_model/vad_model.bin -o onnx_model/
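
The contents of `onnx/export_onnx.py` are not shown here, but an export of this kind is commonly done with `torch.onnx.export`. The sketch below is an assumption about how that could look, not the script's actual code: the helper names, the tensor names `waveform` and `scores`, the opset, and the dummy input shape (one mono 5-second chunk at 16 kHz) are all illustrative. The output path mirrors the commands in this README, where `pytorch_model/vad_model.bin` exported to `onnx_model/` yields `onnx_model/vad_model.onnx`.

```python
from pathlib import Path


def onnx_path_for(checkpoint, out_dir):
    """Map pytorch_model/vad_model.bin + onnx_model/ -> onnx_model/vad_model.onnx."""
    return (Path(out_dir) / (Path(checkpoint).stem + ".onnx")).as_posix()


def export_to_onnx(model, checkpoint, out_dir, num_samples=80000):
    """Export an already-loaded torch.nn.Module to ONNX (hypothetical helper)."""
    import torch  # imported lazily so the path helper works without PyTorch

    model.eval()
    # Assumed input layout: (batch, channel, samples); 80000 samples = 5 s at 16 kHz.
    dummy = torch.zeros(1, 1, num_samples)
    out_path = onnx_path_for(checkpoint, out_dir)
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    torch.onnx.export(
        model,
        dummy,
        out_path,
        input_names=["waveform"],  # illustrative tensor names
        output_names=["scores"],
        dynamic_axes={"waveform": {0: "batch"}, "scores": {0: "batch"}},
        opset_version=13,  # assumed opset; adjust to your ONNX Runtime version
    )
    return out_path


print(onnx_path_for("pytorch_model/vad_model.bin", "onnx_model/"))
```

Marking the batch dimension as dynamic lets the exported model run with different batch sizes, such as the 32 and 64 used in the benchmark.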

Run VAD

# use the ONNX model (about 1.8x faster than the PyTorch model in the runs shown here)
python vad.py -m onnx_model/vad_model.onnx -i tests/data/test_vad.wav
# mean time cost = 5.32921104

# use the PyTorch model
python vad.py -m pytorch_model/vad_model.bin -i tests/data/test_vad.wav
# mean time cost = 9.56711404
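
The `mean time cost` values above are presumably averages over repeated runs. How `vad.py` measures them is not shown; a minimal sketch of such a measurement, using only the standard library, could look like this. `mean_time_cost` and the `work` stand-in are hypothetical names.

```python
import time


def mean_time_cost(fn, repeats=5):
    """Call fn() `repeats` times and return the mean wall-clock time in seconds."""
    if repeats < 1:
        raise ValueError("repeats must be >= 1")
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations)


def work():
    """Stand-in for one VAD pass over the test file."""
    time.sleep(0.01)


print(f"mean time cost = {mean_time_cost(work, repeats=3):.8f}")
```

In a real comparison, `work` would be replaced by a call that runs the model once on `tests/data/test_vad.wav`.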

Benchmark

Test file: tests/data/test_vad.wav (duration 6m15s)

CPU: Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz
GPU: Nvidia GTX 1080Ti

Batch size 32

| Backend | CPU time (s) | GPU time (s) |
|---|---|---|
| PyTorch | 12.0 | 1.5 |
| ONNX | 4.33 | NA |

Batch size 64

| Backend | CPU time (s) | GPU time (s) |
|---|---|---|
| PyTorch | inf | 1.99 |
| ONNX | 4.02 | NA |
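
To make the tables easier to compare, the numbers can be turned into a speedup (PyTorch time divided by ONNX time) and a real-time factor (processing time divided by the 375 s audio duration). The snippet below only does arithmetic on the CPU figures reported above; the variable and function names are illustrative, and `inf` is kept as reported.

```python
AUDIO_SECONDS = 6 * 60 + 15  # test file duration: 6m15s = 375 s

# CPU times in seconds, copied from the benchmark tables (batch size -> backend -> time).
cpu_times = {
    32: {"PyTorch": 12.0, "ONNX": 4.33},
    64: {"PyTorch": float("inf"), "ONNX": 4.02},  # "inf" as reported
}


def speedup(baseline, candidate):
    """How many times faster `candidate` is than `baseline`."""
    return baseline / candidate


def real_time_factor(seconds, audio_seconds=AUDIO_SECONDS):
    """Processing time per second of audio (lower is faster)."""
    return seconds / audio_seconds


for batch, times in cpu_times.items():
    s = speedup(times["PyTorch"], times["ONNX"])
    rtf = real_time_factor(times["ONNX"])
    print(f"batch {batch}: ONNX speedup on CPU = {s:.2f}x, ONNX RTF = {rtf:.4f}")
```

For batch size 32 this works out to a CPU speedup of about 2.77x for ONNX over PyTorch.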

Citations

If you use pyannote.audio, please cite the following:

@inproceedings{Bredin2020,
  Title = {{pyannote.audio: neural building blocks for speaker diarization}},
  Author = {{Bredin}, Herv{\'e} and {Yin}, Ruiqing and {Coria}, Juan Manuel and {Gelly}, Gregory and {Korshunov}, Pavel and {Lavechin}, Marvin and {Fustes}, Diego and {Titeux}, Hadrien and {Bouaziz}, Wassim and {Gill}, Marie-Philippe},
  Booktitle = {ICASSP 2020, IEEE International Conference on Acoustics, Speech, and Signal Processing},
  Year = {2020},
}

@inproceedings{Bredin2021,
  Title = {{End-to-end speaker segmentation for overlap-aware resegmentation}},
  Author = {{Bredin}, Herv{\'e} and {Laurent}, Antoine},
  Booktitle = {Proc. Interspeech 2021},
  Year = {2021},
}
