diff --git a/tfhe/docs/getting_started/benchmarks/cpu_benchmarks.md b/tfhe/docs/getting_started/benchmarks/cpu_benchmarks.md
index 84cf4c70ca..3705da3e28 100644
--- a/tfhe/docs/getting_started/benchmarks/cpu_benchmarks.md
+++ b/tfhe/docs/getting_started/benchmarks/cpu_benchmarks.md
@@ -14,11 +14,11 @@ The following tables benchmark the execution time of some operation sets using `
 
 The next table shows the operation timings on CPU when all inputs are encrypted
 
-{% embed url="https://docs.google.com/spreadsheets/d/1Z2NZvWEkDnbHPYE4Su0Oh2Zz1VBnT9dWbo3E29-LcDg/edit?usp=sharing" %}
+{% embed url="https://docs.google.com/spreadsheets/d/1b_-72ArnSdaqfr-gJOnMmVdcBokYZohnylO4LUj2PMw/edit?usp=sharing" %}
 
 The next table shows the operation timings on CPU when the left input is encrypted and the right is a clear scalar of the same size:
 
-{% embed url="https://docs.google.com/spreadsheets/d/1NGPnuBhRasES9Ghaij4ixJJTpXVMqDzbqMniX-qIMGc/edit?usp=sharing" %}
+{% embed url="https://docs.google.com/spreadsheets/d/1m3tjCi_2GSIHop2zZLAtVbhdDn5wqTGd2lOA3CcJe-U/edit?usp=sharing" %}
 
 All timings are based on parallelized Radix-based integer operations where each block is encrypted using the default parameters `PARAM_MESSAGE_2_CARRY_2_KS_PBS`. To ensure predictable timings, we perform operations in the `default` mode, which ensures that the input and output encoding are similar (i.e., the carries are always emptied).
 
@@ -28,7 +28,7 @@ You can minimize operational costs by selecting from 'unchecked', 'checked', or
 
 The next table shows the execution time of a keyswitch followed by a programmable bootstrapping depending on the precision of the input message. The associated parameter set is given. The configuration is Concrete FFT + AVX-512.
 
-{% embed url="https://docs.google.com/spreadsheets/d/1OdZrsk0dHTWSLLvstkpiv0u5G5tE0mCqItTb7WixGdg/edit?usp=sharing" %}
+{% embed url="https://docs.google.com/spreadsheets/d/1o6MWpbzbYhDs3Pnoq-2hlNEgO9G8wGR5niW-OOZ6c_4/edit?usp=sharing" %}
 
 ## Reproducing TFHE-rs benchmarks
 
diff --git a/tfhe/docs/getting_started/benchmarks/gpu_benchmarks.md b/tfhe/docs/getting_started/benchmarks/gpu_benchmarks.md
index a271129bec..4248ccc23a 100644
--- a/tfhe/docs/getting_started/benchmarks/gpu_benchmarks.md
+++ b/tfhe/docs/getting_started/benchmarks/gpu_benchmarks.md
@@ -8,26 +8,26 @@ All GPU benchmarks presented here were obtained on H100 GPUs, and rely on the mu
 Below come the results for the execution on a single H100.
 The following table shows the performance when the inputs of the benchmarked operation are encrypted:
 
-{% embed url="https://docs.google.com/spreadsheets/d/1dhNYXm7oY0l2qjX3dNpSZKjIBJElkEZtPDIWHZ4FA_A/edit?usp=sharing" %}
+{% embed url="https://docs.google.com/spreadsheets/d/1xGWykMa8fZ7RWUjkCl-52FJ-BNge8cB-5CSHrVZ6XRo/edit?usp=sharing" %}
 
 The following table shows the performance when the left input of the benchmarked operation is encrypted and the other is a clear scalar of the same size:
 
-{% embed url="https://docs.google.com/spreadsheets/d/1wtnFnOwHrSOvfTWluUEaDoTULyveseVl1ZsYo3AOFKk/edit?usp=sharing" %}
+{% embed url="https://docs.google.com/spreadsheets/d/1MZfE9c-cQw3yAP55tu0i8uLl4lTAiH9zW3gRFp0ve7s/edit?usp=sharing" %}
 
 ## 2xH100
 
 Below come the results for the execution on two H100's.
 The following table shows the performance when the inputs of the benchmarked operation are encrypted:
 
-{% embed url="https://docs.google.com/spreadsheets/d/1_2AUeu3ua8_PXxMfeJCh-pp6b9e529PGVEYUuZRAThg/edit?usp=sharing" %}
+{% embed url="https://docs.google.com/spreadsheets/d/1bcL0wgFk-cfR4asGSCWDFt7JaqDYJT-l4pH58A-yBkc/edit?usp=sharing" %}
 
 The following table shows the performance when the left input of the benchmarked operation is encrypted and the other is a clear scalar of the same size:
 
-{% embed url="https://docs.google.com/spreadsheets/d/1nLPt_m1MbkSdhMop0iKDnSN_c605l_JdMpK5JC90N_Q/edit?usp=sharing" %}
+{% embed url="https://docs.google.com/spreadsheets/d/1_8VIoStixns22lQq_RBSjVm-0iFHjJpntQTrvEHZpSg/edit?usp=sharing" %}
 
 ## Programmable bootstrapping
 
 The next table shows the execution time of a keyswitch followed by a programmable bootstrapping depending on the precision of the input message. The associated parameter set is given.
 
-{% embed url="https://docs.google.com/spreadsheets/d/11JfbPxJ8XMMfob4AZIWhDSglTO9X_YX8R7dNQK73uuk/edit?usp=sharing" %}
+{% embed url="https://docs.google.com/spreadsheets/d/1KhElQ7sIsShUSVQw5bKFoP-x5BgMaWh1pZtrVAdC3T4/edit?usp=sharing" %}
diff --git a/tfhe/docs/getting_started/benchmarks/summary.md b/tfhe/docs/getting_started/benchmarks/summary.md
index d980f7ee92..74b7f74263 100644
--- a/tfhe/docs/getting_started/benchmarks/summary.md
+++ b/tfhe/docs/getting_started/benchmarks/summary.md
@@ -2,6 +2,7 @@
 
 This document summarizes the timings of some homomorphic operations over 64-bit encrypted integers, depending on the hardware. More details are given for [the CPU](cpu\_benchmarks.md), [the GPU](gpu\_benchmarks.md), or [zeros-knowledge proofs](zk\_proof\_benchmarks.md).
 
+Beware that the noise used in the cryptographic parameters follows a tweaked uniform (TUniform) distribution, instead of a Gaussian.
 You can get the parameters used for benchmarks by cloning the repository and checking out the commit you want to use (starting with the v0.11.0 release) and run the following make command:
 
 ```console
@@ -10,4 +11,4 @@ make print_doc_bench_parameters
 
 ### Operation time (ms) over FheUint 64
 
-{% embed url="https://docs.google.com/spreadsheets/d/1ZbgsKnFH8eKrFjy9khFeaLYnUhbSV8Xu4H6rwulo0o8/edit?usp=sharing" %}
+{% embed url="https://docs.google.com/spreadsheets/d/1OMdGSakEUbIFSEQKhAinTolJjvmPBbafi3DEe3UfzsQ/edit?usp=sharing" %}
diff --git a/tfhe/docs/getting_started/benchmarks/zk_proof_benchmarks.md b/tfhe/docs/getting_started/benchmarks/zk_proof_benchmarks.md
index 2cb3a298b6..d6836b20e3 100644
--- a/tfhe/docs/getting_started/benchmarks/zk_proof_benchmarks.md
+++ b/tfhe/docs/getting_started/benchmarks/zk_proof_benchmarks.md
@@ -4,4 +4,4 @@ This document details the performance benchmarks of [zero-knowledge proofs](../.
 
 Benchmarks for the zero-knowledge proofs have been run on a `m6i.4xlarge` with 16 cores to simulate an usual client configuration. The verification are done on a `hpc7a.96xlarge` AWS instances to mimic a powerful server.
 
-{% embed url="https://docs.google.com/spreadsheets/d/1llCYHCz2CyLdTwXkiqhjVzJLzxW_RqdjHxmk72m1jm4/edit?usp=sharing" %}
+{% embed url="https://docs.google.com/spreadsheets/d/1x12I7Tkdx63Q6sNllygg6urSd5KC1sj1wj4L9jWiET4/edit?usp=sharing" %}