Upgrade to Silero-Vad V5 #884

hoonlight · 2024-06-28T07:10:53Z

https://github.com/snakers4/silero-vad/releases/tag/v5.0

The V5 model only works with a fixed-size window, so the window_size_samples parameter is removed and its value is fixed at 512.
Use a single state variable in place of the existing h and c variables.
The internal logic has changed slightly: some context (part of the previous chunk) is now passed along with the current chunk.
Change the dimension of the state variable from 64 to 128.
Replace the ONNX file with the V5 version.
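
For illustration only, here is a minimal sketch of the framing described above: the audio is split into fixed 512-sample windows, and the tail of the previous model input is prepended to each window as context. The context length of 64 samples, the zero-padding of the last window, and the helper name `frame_with_context` are assumptions made for this sketch, not taken from this PR's code. Plain Python lists stand in for the arrays a real implementation would use, and the ONNX model call is left out.

```python
WINDOW_SIZE = 512   # fixed window size stated in the PR description
CONTEXT_SIZE = 64   # assumption: number of samples carried over from the previous input


def frame_with_context(samples):
    """Split samples into fixed windows, each prefixed with the previous context.

    Hypothetical helper: returns a list of frames, each of length
    CONTEXT_SIZE + WINDOW_SIZE.
    """
    context = [0.0] * CONTEXT_SIZE  # no previous chunk yet, so start from silence
    frames = []
    for start in range(0, len(samples), WINDOW_SIZE):
        chunk = list(samples[start:start + WINDOW_SIZE])
        # Assumption: a short final chunk is zero-padded to the fixed window size.
        chunk += [0.0] * (WINDOW_SIZE - len(chunk))
        frame = context + chunk
        frames.append(frame)
        # The tail of this input becomes the context for the next chunk.
        context = frame[-CONTEXT_SIZE:]
    return frames


if __name__ == "__main__":
    audio = [float(i) for i in range(1200)]
    frames = frame_with_context(audio)
    print(len(frames), len(frames[0]))
```

Each frame would then be passed to the model together with the recurrent state, and the updated state returned by the model would be fed back in on the next call.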


trungkienbkhn · 2024-06-28T11:36:45Z

@hoonlight , thanks for quickly adapting Silero-Vad V5 for fw after this model was released. Have you run benchmarks for it yet?

hoonlight · 2024-06-28T12:08:11Z

> @hoonlight , thanks for quickly adapting Silero-Vad V5 for fw after this model was released. Have you run benchmarks for it yet?

No, I haven't run the benchmarks yet.
I found it a little while ago, thanks.

However, I won't have access to a GPU for a while, so I should be able to run the benchmarks in a couple of weeks.

Purfview · 2024-06-28T18:14:51Z

Thanks for adapting v5. I was just thinking of doing it myself, so it's good that I noticed your PR. 😄

Btw, about the GPU, take a look at #499 (comment)

trungkienbkhn · 2024-07-01T08:39:15Z

For information, I ran benchmarks on an H100 GPU with the large-v3 model.
Below are the results:

1. Speed benchmark:
Processing audio with duration 13:19.231s
Detected language 'fr' with probability 1.00

System	Min execution time
Faster-Whisper	41.413s
FW with SILERO VAD V5	39.529s

2. WER benchmark:
Dataset: librispeech_asr
Number of audio files used for evaluation: 500

System	WER
Faster-Whisper	3.139
FW with SILERO VAD V5	2.815

3. Memory benchmark:
GPU name: NVIDIA H100 PCIe
GPU device index: 0

System	Maximum increase of RAM	Maximum GPU memory usage	Maximum GPU power usage
Faster-Whisper	1222 MiB	5107 MiB / 81559 MiB	145 W / 350 W
FW with SILERO VAD V5	1225 MiB	5107 MiB / 81559 MiB	149 W / 350 W

The results look good to me. Speed has improved a bit, as described in the VAD V5 model release:

3x faster inference for TorchScript, 10% faster inference for ONNX;
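
To put numbers on the differences in the tables above, here is a small sketch that computes the relative change for the speed and WER rows. The figures are copied from the tables; the helper name `relative_reduction` is made up for this sketch.

```python
def relative_reduction(baseline, new):
    """Return the reduction from baseline to new, as a percentage of baseline."""
    return (baseline - new) / baseline * 100.0


# Values copied from the speed and WER tables in this comment.
speed_gain = relative_reduction(41.413, 39.529)  # min execution time, seconds
wer_gain = relative_reduction(3.139, 2.815)      # WER on 500 librispeech_asr files

print(f"Execution time reduced by {speed_gain:.2f}%")
print(f"WER reduced by {wer_gain:.2f}% (relative)")
```

This comes out to roughly a 4.5% shorter execution time and roughly a 10% relative WER reduction for this single run.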


hoonlight added 11 commits June 28, 2024 15:55

Fix window_size_samples to 512

e8f2666

Update SileroVADModel

ece631c

Replace ONNX file with V5 version

416125d

chore: remove unused import warnings

7d423ce

Update SileroVADModel to include context

91c3890

Fix dtype for context array

ed87016

Fix missing assignment of 'context' variable in get_speech_timestamps function

f2ffb8e

Chore: Delete Whitespace

8b61d96

Update SileroVADModel to fix 'context' variable assignment

be1bce0

Update get_initial_state to fix context initialization

faf5c4f

Rename function to get_initial_states for clarity

82e05fa

trungkienbkhn merged commit 8d400e9 into SYSTRAN:master Jul 1, 2024
3 checks passed

hoonlight deleted the silero-vad-v5 branch July 1, 2024 10:45

188198 mentioned this pull request Jul 11, 2024

faster-whisper 1.0.3 CheshireCC/faster-whisper-GUI#184

Closed

Petemir mentioned this pull request Jul 25, 2024

Silero-VAD Meta Hallucinations #843

Open

shinlw added a commit to shinlw/faster-whisper that referenced this pull request Sep 6, 2024

Revert "Upgrade to Silero-Vad V5 (SYSTRAN#884)"

7af0b45

This reverts commit 8d400e9.
