chore(gpu): fix multi-gpu div performance
agnesLeroy committed Sep 19, 2024
1 parent 00fc281 commit d0624d6
Showing 1 changed file with 1 addition and 0 deletions.
@@ -443,6 +443,7 @@ uint32_t get_lwe_chunk_size(uint32_t gpu_index, uint32_t max_num_pbs,

int max_blocks_per_sm;
int max_shared_memory = cuda_get_max_shared_memory(0);
cudaSetDevice(gpu_index);
if (max_shared_memory < full_sm_keybundle)
cudaOccupancyMaxActiveBlocksPerMultiprocessor(
&max_blocks_per_sm,
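
The hunk above calls cudaSetDevice(gpu_index) before the occupancy query. Occupancy results are computed for whichever device is current, so on a multi-GPU machine the query should run with the intended GPU selected. The following is a minimal standalone sketch of that pattern, not the project's code: `dummy_kernel` and `max_blocks_per_sm_on` are hypothetical names, and restoring the previously selected device is an extra step that the commit itself does not perform.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical placeholder kernel; it stands in for the real kernel whose
// occupancy is being queried. It uses dynamic shared memory so the
// shared-memory argument of the occupancy query is meaningful.
__global__ void dummy_kernel(float *data) {
  extern __shared__ float scratch[];
  scratch[threadIdx.x] = data[threadIdx.x];
  __syncthreads();
  data[threadIdx.x] = scratch[threadIdx.x] * 2.0f;
}

// Query the maximum number of active blocks per SM for dummy_kernel on the
// GPU identified by gpu_index. Returns -1 if any CUDA call fails.
int max_blocks_per_sm_on(int gpu_index, int threads, size_t shared_bytes) {
  int previous = 0;
  if (cudaGetDevice(&previous) != cudaSuccess)
    return -1;
  // Select the target GPU first, so the query is evaluated for that device.
  if (cudaSetDevice(gpu_index) != cudaSuccess)
    return -1;

  int max_blocks = 0;
  cudaError_t err = cudaOccupancyMaxActiveBlocksPerMultiprocessor(
      &max_blocks, dummy_kernel, threads, shared_bytes);

  // Assumption of this sketch: put the caller's device back afterwards.
  cudaSetDevice(previous);
  return err == cudaSuccess ? max_blocks : -1;
}

int main() {
  int count = 0;
  if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
    std::printf("no CUDA device available\n");
    return 0;
  }
  // Report occupancy per GPU; values may differ between devices.
  for (int gpu = 0; gpu < count; ++gpu) {
    int blocks = max_blocks_per_sm_on(gpu, 256, 256 * sizeof(float));
    std::printf("gpu %d: %d blocks per SM\n", gpu, blocks);
  }
  return 0;
}
```

Saving and restoring the current device keeps the helper free of side effects for its caller. Whether that is desirable in get_lwe_chunk_size depends on what its callers expect, which this diff does not show.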
