fix(gpu): fix memory error in mul #1417

Merged: 1 commit, Jul 26, 2024
26 changes: 13 additions & 13 deletions backends/tfhe-cuda-backend/cuda/src/integer/multiplication.cuh
@@ -255,17 +255,19 @@ __host__ void host_integer_sum_ciphertexts_vec_kb(
   // create lut object for message and carry
   // we allocate luts_message_carry in the host function (instead of scratch)
   // to reduce average memory consumption
-  bool release_reused_lut = false;
+  int_radix_lut<Torus> *luts_message_carry;
+  size_t ch_amount = r / chunk_size;
+  if (!ch_amount)
+    ch_amount++;
   if (reused_lut == nullptr) {
-    release_reused_lut = true;
-    size_t ch_amount = r / chunk_size;
-    if (!ch_amount)
-      ch_amount++;
-    reused_lut = new int_radix_lut<Torus>(streams, gpu_indexes, gpu_count,
-                                          mem_ptr->params, 2,
-                                          2 * ch_amount * num_blocks, true);
+    luts_message_carry = new int_radix_lut<Torus>(
+        streams, gpu_indexes, gpu_count, mem_ptr->params, 2,
+        2 * ch_amount * num_blocks, true);
+  } else {
+    luts_message_carry = new int_radix_lut<Torus>(
+        streams, gpu_indexes, gpu_count, mem_ptr->params, 2,
+        2 * ch_amount * num_blocks, reused_lut);
   }
-  int_radix_lut<Torus> *luts_message_carry = reused_lut;
   auto message_acc = luts_message_carry->get_lut(gpu_indexes[0], 0);
   auto carry_acc = luts_message_carry->get_lut(gpu_indexes[0], 1);

@@ -442,10 +444,8 @@ __host__ void host_integer_sum_ciphertexts_vec_kb(
     std::swap(new_blocks, old_blocks);
     r = (new_blocks_created + rem_blocks) / num_blocks;
   }
-  if (release_reused_lut) {
-    reused_lut->release(streams, gpu_indexes, gpu_count);
-    delete (reused_lut);
-  }
+  luts_message_carry->release(streams, gpu_indexes, gpu_count);
+  delete (luts_message_carry);

   host_addition(streams[0], gpu_indexes[0], radix_lwe_out, old_blocks,
                 &old_blocks[num_blocks * big_lwe_size], big_lwe_dimension,
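
The ownership change in the two hunks above can be sketched in plain host-side C++. This is an illustration only, not the backend's code: `Lut`, `chunk_amount` and `sum_like` are hypothetical names, and the sketch assumes that the `int_radix_lut` constructor taking an existing LUT shares that LUT's buffers rather than copying them, which the diff itself does not show. The point is that the function always creates its own local object, so the release at the end is unconditional and the caller's object is never freed here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for int_radix_lut<Torus>; not the tfhe-cuda-backend type.
struct Lut {
  std::vector<int> *table; // accumulator storage
  bool owns_table;         // false when the table is borrowed from another Lut
  static int live;         // number of Lut objects not yet released

  // Fresh allocation: this object owns its table.
  explicit Lut(std::size_t n)
      : table(new std::vector<int>(n, 0)), owns_table(true) {
    ++live;
  }
  // Reuse: borrow the table of an existing Lut instead of allocating a new one.
  explicit Lut(Lut *base) : table(base->table), owns_table(false) { ++live; }

  // Free only what this object owns; a borrowed table is left to its owner.
  void release() {
    if (owns_table)
      delete table;
    table = nullptr;
    --live;
  }
};
int Lut::live = 0;

// Same clamp as in the hunk: at least one chunk even when r < chunk_size.
std::size_t chunk_amount(std::size_t r, std::size_t chunk_size) {
  assert(chunk_size > 0);
  std::size_t ch_amount = r / chunk_size;
  if (!ch_amount)
    ch_amount++;
  return ch_amount;
}

// The function always creates its own Lut, so cleanup needs no flag and
// the caller's `reused` object is never released or deleted here.
std::size_t sum_like(Lut *reused, std::size_t r, std::size_t chunk_size,
                     std::size_t num_blocks) {
  std::size_t n = 2 * chunk_amount(r, chunk_size) * num_blocks;
  Lut *local = (reused == nullptr) ? new Lut(n) : new Lut(reused);
  std::size_t used = local->table->size(); // placeholder for the real work
  local->release();
  delete local;
  return used;
}
```

Calling `sum_like(nullptr, ...)` allocates and frees a private table, while `sum_like(&shared, ...)` borrows the table of `shared` and leaves it intact for the caller to release later.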