
GPU Memory Leak when Concurrent Process #1510

Closed
zanjabil2502 opened this issue Oct 21, 2023 · 5 comments

Comments

@zanjabil2502

zanjabil2502 commented Oct 21, 2023

First, thank you for updating pyannote-audio, especially for the v3.0 model. But I want to report something I found. I understand that the new speaker embedding model runs on ONNX Runtime, which you chose to make it lighter than the torch model. However, there is a bug in ONNX Runtime, specifically when using the GPU: when I run concurrent processes, GPU memory does not decrease while the model is idle, until I stop my program.

I read some threads in the onnx-runtime repo, and many users have the same problem with ONNX Runtime. Do you have a solution for this problem?

@github-actions

Thank you for your issue. You might want to check the FAQ if you haven't done so already.

Feel free to close this issue if you found an answer in the FAQ.

If your issue is a feature request, please read this first and update your request accordingly, if needed.

If your issue is a bug report, please provide a minimum reproducible example as a link to a self-contained Google Colab notebook containing everything needed to reproduce the bug:

  • installation
  • data preparation
  • model download
  • etc.

Providing an MRE will increase your chance of getting an answer from the community (either maintainers or other power users).

Companies relying on pyannote.audio in production may contact me via email regarding:

  • paid scientific consulting around speaker diarization and speech processing in general;
  • custom models and tailored features (via the local tech transfer office).

This is an automated reply, generated by FAQtory

@hbredin
Member

hbredin commented Oct 22, 2023

I am afraid I won't be able to help you, as I have very little experience with actual deployment and even less with concurrent processing.

📣 To people that have successfully deployed pyannote pipelines in production (and version 3.0 in particular), now would be the right time to chime in and help @zanjabil2502

@zanjabil2502
Author

zanjabil2502 commented Oct 22, 2023

Thanks for your response. I do have a tip: to reduce GPU usage, you can change the batch size of self._segmentation to 16 or 8. This reduces GPU usage to under 1 GB (700 MB to 800 MB), and the real-time factor stays at 0.025.

I added this code to speaker_verification.py (in WeSpeakerPretrainedSpeakerEmbedding):

import onnxruntime as ort

# Create the session options first, then disable the memory-arena features
# that keep allocations alive between runs.
sess_options = ort.SessionOptions()
sess_options.enable_mem_pattern = False
sess_options.enable_cpu_mem_arena = False
sess_options.enable_mem_reuse = False

self.session_ = ort.InferenceSession(
    self.embedding, sess_options=sess_options, providers=providers
)
self.session_.disable_fallback()

According to some forum threads, this code prevents the memory leak on CPU when using ONNX Runtime.

Also, in my experience, GPU usage is smaller with onnxruntime-gpu <= 1.12.1 than with onnxruntime-gpu == 1.16.1.
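Beyond session options, a common mitigation for GPU memory growth under concurrency is to share a single inference session across all requests instead of letting each worker create its own. This is a minimal sketch of that pattern, not code from pyannote.audio: `SharedSessionRunner` is a hypothetical name, and the lambda stands in for an `ort.InferenceSession(...).run` call so the example is self-contained.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class SharedSessionRunner:
    """Serialize inference over one shared session so concurrent
    requests do not each allocate their own GPU workspace."""

    def __init__(self, session_run):
        # session_run stands in for ort.InferenceSession.run in this sketch
        self._run = session_run
        self._lock = threading.Lock()

    def infer(self, x):
        # One request on the (GPU-backed) session at a time.
        with self._lock:
            return self._run(x)

# Usage with a dummy "session" in place of a real ONNX model:
runner = SharedSessionRunner(lambda x: x * 2)
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(runner.infer, range(100)))
print(results[:5])  # → [0, 2, 4, 6, 8]
```

The lock trades some throughput for a bounded memory footprint; whether that trade-off is acceptable depends on your deployment.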

@hbredin
Member

hbredin commented Nov 9, 2023

FYI: #1537

@hbredin
Member

hbredin commented Nov 16, 2023

Closing, as the latest version no longer relies on ONNX Runtime.
Please update to pyannote.audio 3.1 and pyannote/speaker-diarization-3.1 (and open new issues if needed).
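For reference, the upgrade path described above looks roughly like this (a sketch assuming a Hugging Face access token; `"HF_TOKEN"` and `"audio.wav"` are placeholders, not values from this thread):

```python
# pip install -U "pyannote.audio>=3.1"
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token="HF_TOKEN",  # placeholder: your Hugging Face token
)
diarization = pipeline("audio.wav")  # placeholder audio file
```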

@hbredin hbredin closed this as completed Nov 16, 2023