
chore(gpu): rework select to avoid using local streams #1867

Merged
merged 1 commit into from
Dec 16, 2024

Conversation

agnesLeroy
Contributor

closes: please link all relevant issues

PR content/description

Check-list:

  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)
  • Relevant issues are marked as resolved/closed, related issues are linked in the description
  • Check for breaking changes (including serialization changes) and add them to commit message following the conventional commit specification
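The PR title describes reworking the GPU `select` (if_then_else) path so it no longer creates local streams. As a rough illustration of that pattern (a pure-Rust simulation with hypothetical names, not the actual tfhe-rs API), the change amounts to enqueuing work on a stream the caller already owns instead of creating and tearing down a local stream inside every call:

```rust
use std::cell::Cell;

// Hypothetical sketch: counts stream creations to contrast the two patterns.
// None of these names come from tfhe-rs; they only model the idea.
struct StreamPool {
    created: Cell<u32>,
}

impl StreamPool {
    fn create_stream(&self) -> u32 {
        self.created.set(self.created.get() + 1);
        self.created.get()
    }
}

// Before: each call spins up its own local stream (creation cost plus the
// extra synchronization needed before destroying it).
fn select_with_local_stream(pool: &StreamPool) {
    let _local = pool.create_stream();
    // ... launch kernels on `_local`, sync, destroy ...
}

// After: kernels go on the stream the caller passes in; no local stream,
// no extra sync point per call.
fn select_on_caller_stream(_stream: u32) {
    // ... launch kernels on `_stream` ...
}

fn main() {
    let pool = StreamPool { created: Cell::new(0) };
    let main_stream = pool.create_stream(); // the caller's long-lived stream
    for _ in 0..3 {
        select_with_local_stream(&pool); // old pattern: one stream per call
    }
    for _ in 0..3 {
        select_on_caller_stream(main_stream); // new pattern: zero extra streams
    }
    println!("streams created: {}", pool.created.get()); // prints "streams created: 4"
}
```

With the caller-stream pattern, the stream count stays constant no matter how many `select` calls run, which is consistent with the small latency improvement reported below.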

@cla-bot cla-bot bot added the cla-signed label Dec 13, 2024
@agnesLeroy agnesLeroy requested a review from pdroalves December 13, 2024 12:53
@agnesLeroy
Contributor Author

Latency on H100 is slightly improved. I need to check the effect on the whitepaper ERC20 transfer throughput.

@agnesLeroy agnesLeroy force-pushed the al/rework_if_then_else branch 3 times, most recently from 8e06cac to a4a5e0e Compare December 13, 2024 17:12
@agnesLeroy agnesLeroy requested review from guillermo-oyarzun and removed request for pdroalves December 16, 2024 09:01
@agnesLeroy agnesLeroy force-pushed the al/rework_if_then_else branch from a4a5e0e to e490498 Compare December 16, 2024 09:02
Member

@guillermo-oyarzun guillermo-oyarzun left a comment


Does this improve the multi-GPU issues?

@agnesLeroy
Contributor Author

Nope, it doesn't 😞

@agnesLeroy
Contributor Author

agnesLeroy commented Dec 16, 2024

I think the multi-GPU throughput issue is related to the use of cudaDeviceSynchronize in drop for CudaVec, I haven't tried to check though. Disabling that synchronization may lead to memory errors... I could try nevertheless to confirm it has an effect 🤔
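The hypothesis above is that a device-wide sync inside `Drop` stalls unrelated work: `cudaDeviceSynchronize` waits on every stream on the device, so freeing one buffer blocks until all in-flight kernels finish. A minimal pure-Rust simulation of that effect (all names hypothetical, not the `CudaVec` implementation):

```rust
use std::cell::Cell;

// Hypothetical model of a GPU device: `pending_ops` is work queued across
// all streams, `syncs` counts device-wide synchronizations.
struct DeviceSim {
    pending_ops: Cell<u32>,
    syncs: Cell<u32>,
}

impl DeviceSim {
    fn new() -> Self {
        DeviceSim { pending_ops: Cell::new(0), syncs: Cell::new(0) }
    }
    fn launch(&self, n: u32) {
        self.pending_ops.set(self.pending_ops.get() + n);
    }
    // Models cudaDeviceSynchronize: blocks until work queued by *every*
    // stream on the device has completed, not just this buffer's stream.
    fn synchronize(&self) {
        self.pending_ops.set(0);
        self.syncs.set(self.syncs.get() + 1);
    }
}

struct GpuVec<'a> {
    device: &'a DeviceSim,
}

impl Drop for GpuVec<'_> {
    fn drop(&mut self) {
        // The pattern under discussion: a blocking device-wide sync on every
        // deallocation. Memory-safe, but it serializes otherwise independent
        // work, which would show up as a multi-GPU throughput ceiling.
        self.device.synchronize();
    }
}

fn main() {
    let dev = DeviceSim::new();
    dev.launch(8); // unrelated work in flight on other streams
    {
        let _v = GpuVec { device: &dev };
    } // `_v` dropped here: full device sync waits on all 8 unrelated ops
    println!("syncs = {}, pending = {}", dev.syncs.get(), dev.pending_ops.get());
    // prints "syncs = 1, pending = 0"
}
```

In real CUDA code, a stream-scoped wait (or a stream-ordered free) would only block on the buffer's own stream, which is why removing the device-wide sync is the natural candidate fix, safety permitting.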

@guillermo-oyarzun
Member

I think the multi-GPU throughput issue is related to the use of cudaDeviceSynchronize in drop for CudaVec, I haven't tried to check though. Disabling that synchronization may lead to memory errors... I could try nevertheless to confirm it has an effect 🤔

Confirmation would be nice, even though changing that safely might involve a big refactor.

@agnesLeroy
Contributor Author

Hmm, looks like it's not that: https://github.com/zama-ai/tfhe-rs/actions/runs/12349715463/job/34461608896. I disabled cudaDeviceSynchronize in drop there, but we still see the same effect. We need to investigate further.

@agnesLeroy agnesLeroy merged commit e9c901b into main Dec 16, 2024
100 of 106 checks passed
@agnesLeroy agnesLeroy deleted the al/rework_if_then_else branch December 16, 2024 14:26