diff --git a/tfhe/docs/SUMMARY.md b/tfhe/docs/SUMMARY.md index 7e4667ea46..aa9d283757 100644 --- a/tfhe/docs/SUMMARY.md +++ b/tfhe/docs/SUMMARY.md @@ -8,7 +8,10 @@ * [Installation](getting\_started/installation.md) * [Quick start](getting\_started/quick\_start.md) * [Types & Operations](getting\_started/operations.md) -* [Benchmarks](getting\_started/benchmarks.md) +* [Benchmarks](getting\_started/benchmarks/summary.md) + * [CPU Benchmarks](getting\_started/benchmarks/cpu\_benchmarks.md) + * [GPU Benchmarks](getting\_started/benchmarks/gpu\_benchmarks.md) + * [Zero-knowledge proof benchmarks](getting_started/benchmarks/zk_proof_benchmarks.md) * [Security and cryptography](getting\_started/security\_and\_cryptography.md) ## Fundamentals diff --git a/tfhe/docs/getting_started/benchmarks.md b/tfhe/docs/getting_started/benchmarks.md deleted file mode 100644 index b48e97a20f..0000000000 --- a/tfhe/docs/getting_started/benchmarks.md +++ /dev/null @@ -1,126 +0,0 @@ -# Benchmarks - -This document details the performance benchmarks of homomorphic operations using **TFHE-rs**. - -By their nature, homomorphic operations run slower than their cleartext equivalents. The following are the timings for basic operations, including benchmarks from other libraries for comparison. - -{% hint style="info" %} -All CPU benchmarks were launched on an `AWS hpc7a.96xlarge` instance equipped with an `AMD EPYC 9R14 CPU @ 2.60GHz` and 740GB of RAM. -{% endhint %} - -## Integer operations - -The following tables benchmark the execution time of some operation sets using `FheUint` (unsigned integers). The `FheInt` (signed integers) performs similarly. 
- -The next table shows the operation timings on CPU when all inputs are encrypted: - -| Operation \ Size | `FheUint8` | `FheUint16` | `FheUint32` | `FheUint64` | `FheUint128` | `FheUint256` | -| ------------------------------------------------------ | ---------- | ----------- | ----------- | ----------- | ------------ | ------------ | -| Negation (`-`) | 65.1 ms | 97.0 ms | 116 ms | 141 ms | 186 ms | 227 ms | -| Add / Sub (`+`,`-`) | 75.8 ms | 96.7 ms | 118 ms | 150 ms | 186 ms | 230 ms | -| Mul (`x`) | 96.1 ms | 180 ms | 251 ms | 425 ms | 1.1 s | 3.66 s | -| Equal / Not Equal (`eq`, `ne`) | 32.2 ms | 35.0 ms | 55.4 ms | 56.0 ms | 59.5 ms | 60.7 ms | -| Comparisons (`ge`, `gt`, `le`, `lt`) | 57.1 ms | 72.9 ms | 93.0 ms | 116 ms | 138 ms | 164 ms | -| Max / Min (`max`,`min`) | 94.3 ms | 114 ms | 138 ms | 159 ms | 189 ms | 233 ms | -| Bitwise operations (`&`, `\|`, `^`) | 19.6 ms | 20.1 ms | 20.2 ms | 21.7 ms | 23.9 ms | 25.7 ms | -| Div / Rem (`/`, `%`) | 711 ms | 1.81 s | 4.43 s | 10.5 s | 25.1 s | 63.2 s | -| Left / Right Shifts (`<<`, `>>`) | 99.5 ms | 125 ms | 155 ms | 190 ms | 234 ms | 434 ms | -| Left / Right Rotations (`left_rotate`, `right_rotate`) | 101 ms | 125 ms | 154 ms | 188 ms | 234 ms | 430 ms | -| Leading / Trailing zeros/ones | 96.7 ms | 155 ms | 181 ms | 241 ms | 307 ms | 367 ms | -| Log2 | 112 ms | 176 ms | 200 ms | 265 ms | 320 ms | 379 ms | - - -The next table shows the operation timings on CPU when the left input is encrypted and the right is a clear scalar of the same size: - -| Operation \ Size | `FheUint8` | `FheUint16` | `FheUint32` | `FheUint64` | `FheUint128` | `FheUint256` | -|--------------------------------------------------------|------------|-------------|-------------|-------------|--------------|--------------| -| Add / Sub (`+`,`-`) | 75.9 ms | 95.3 ms | 119 ms | 150 ms | 182 ms | 224 ms | -| Mul (`x`) | 79.3 ms | 163 ms | 211 ms | 273 ms | 467 ms | 1.09 s | -| Equal / Not Equal (`eq`, `ne`) | 31.2 ms | 30.9 ms | 34.4 ms | 54.5 
ms | 57.0 ms | 58.0 ms | -| Comparisons (`ge`, `gt`, `le`, `lt`) | 38.6 ms | 56.3 ms | 76.1 ms | 99.0 ms | 124 ms | 141 ms | -| Max / Min (`max`,`min`) | 74.0 ms | 103 ms | 122 ms | 144 ms | 171 ms | 214 ms | -| Bitwise operations (`&`, `\|`, `^`) | 19.0 ms | 19.8 ms | 20.5 ms | 21.6 ms | 23.8 ms | 25.8 ms | -| Div (`/`) | 192 ms | 255 ms | 322 ms | 459 ms | 877 ms | 2.61 s | -| Rem (`%`) | 336 ms | 482 ms | 650 ms | 871 ms | 1.39 s | 3.05 s | -| Left / Right Shifts (`<<`, `>>`) | 19.5 ms | 20.2 ms | 20.7 ms | 22.1 ms | 23.8 ms | 25.6 ms | -| Left / Right Rotations (`left_rotate`, `right_rotate`) | 19.0 ms | 20.0 ms | 20.8 ms | 21.7 ms | 23.9 ms | 25.7 ms | - -All timings are based on parallelized Radix-based integer operations where each block is encrypted using the default parameters `PARAM_MESSAGE_2_CARRY_2_KS_PBS`. To ensure predictable timings, we perform operations in the `default` mode, which propagates the carry bit as needed. You can minimize operational costs by selecting from 'unchecked', 'checked', or 'smart' modes, each balancing performance and security differently. - -For more details about parameters, see [here](../references/fine-grained-apis/shortint/parameters.md). You can find the benchmark results on GPU for all these operations [here](../guides/run\_on\_gpu.md#benchmarks). - -## Shortint operations - -The next table shows the execution time of some operations using various parameter sets of tfhe-rs::shortint. Except for `unchecked_add`, we perform all the operations in the `default` mode. This mode ensures predictable timings along the entire circuit by clearing the carry space after each operation. The configuration is Concrete FFT + AVX-512. 
- -| Parameter set | PARAM\_MESSAGE\_1\_CARRY\_1 | PARAM\_MESSAGE\_2\_CARRY\_2 | PARAM\_MESSAGE\_3\_CARRY\_3 | PARAM\_MESSAGE\_4\_CARRY\_4 | -| ---------------------------------- |-----------------------------|-----------------------------|-----------------------------|-----------------------------| -| unchecked\_add | 559 ns | 544 ns | 2.26 µs | 9.53 µs | -| add | 9.98 ms | 14.1 ms | 113 ms | 873 ms | -| mul\_lsb | 9.79 ms | 13.8 ms | 113 ms | 794 ms | -| keyswitch\_programmable\_bootstrap | 9.85 ms | 13.9 ms | 114 ms | 791 ms | - -## Boolean operations - -The next table shows the execution time of a single binary Boolean gate. - -### tfhe-rs::boolean - -| Parameter set | Concrete FFT + AVX-512 | -| ---------------------------------------------------- |------------------------| -| DEFAULT\_PARAMETERS\_KS\_PBS | 9.98 ms | -| PARAMETERS\_ERROR\_PROB\_2\_POW\_MINUS\_165\_KS\_PBS | 17.0 ms | -| TFHE\_LIB\_PARAMETERS | 9.64 ms | - -#### tfhe-lib - -Using the same hpc7a.96xlarge machine as the one for tfhe-rs, the timings are as follows: - -| Parameter set | spqlios-fma | -| ------------------------------------------------ | ----------- | -| default\_128bit\_gate\_bootstrapping\_parameters | 13.5 ms | - -### OpenFHE (v1.1.2) - -Following the official instructions from OpenFHE, we use `clang14` and the following command to setup the project: `cmake -DNATIVE_SIZE=32 -DWITH_NATIVEOPT=ON -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DWITH_OPENMP=OFF ..` - -The following example shows how to initialize the configuration to use the HEXL library: - -```bash -export CXX=clang++ -export CC=clang - -scripts/configure.sh -Release -> y -hexl -> y - -scripts/build-openfhe-development-hexl.sh -``` - -Using the same hpc7a.96xlarge machine as the one for tfhe-rs, the timings are as follows: - -| Parameter set | GINX | GINX w/ Intel HEXL | -| --------------------------------- | ------- |--------------------| -| FHEW\_BINGATE/STD128\_OR | 25.5 ms | 24,0 ms | -| 
FHEW\_BINGATE/STD128\_LMKCDEY\_OR | 25.4 ms | 23.6 ms | - -## Reproducing TFHE-rs benchmarks - -**TFHE-rs** benchmarks can be easily reproduced from the [source](https://github.com/zama-ai/tfhe-rs). - -{% hint style="info" %} -AVX512 is now enabled by default for benchmarks when available -{% endhint %} - -The following example shows how to reproduce **TFHE-rs** benchmarks: - -```shell -#Boolean benchmarks: -make bench_boolean - -#Integer benchmarks: -make bench_integer - -#Shortint benchmarks: -make bench_shortint -``` diff --git a/tfhe/docs/getting_started/benchmarks/cpu_benchmarks.md b/tfhe/docs/getting_started/benchmarks/cpu_benchmarks.md new file mode 100644 index 0000000000..84cf4c70ca --- /dev/null +++ b/tfhe/docs/getting_started/benchmarks/cpu_benchmarks.md @@ -0,0 +1,52 @@ +# CPU Benchmarks + +This document details the CPU performance benchmarks of homomorphic operations using **TFHE-rs**. + +By their nature, homomorphic operations run slower than their cleartext equivalents. The following are the timings for basic operations, including benchmarks from other libraries for comparison. + +{% hint style="info" %} +All CPU benchmarks were launched on an `AWS hpc7a.96xlarge` instance equipped with an `AMD EPYC 9R14 CPU @ 2.60GHz` and 740GB of RAM. +{% endhint %} + +## Integer operations + +The following tables benchmark the execution time of some operation sets using `FheUint` (unsigned integers). The `FheInt` (signed integers) performs similarly. 
+
+The next table shows the operation timings on CPU when all inputs are encrypted:
+
+{% embed url="https://docs.google.com/spreadsheets/d/1Z2NZvWEkDnbHPYE4Su0Oh2Zz1VBnT9dWbo3E29-LcDg/edit?usp=sharing" %}
+
+The next table shows the operation timings on CPU when the left input is encrypted and the right is a clear scalar of the same size:
+
+{% embed url="https://docs.google.com/spreadsheets/d/1NGPnuBhRasES9Ghaij4ixJJTpXVMqDzbqMniX-qIMGc/edit?usp=sharing" %}
+
+All timings are based on parallelized Radix-based integer operations where each block is encrypted using the default parameters `PARAM_MESSAGE_2_CARRY_2_KS_PBS`. To ensure predictable timings, we perform operations in the `default` mode, which ensures that the input and output encodings are similar (i.e., the carries are always emptied).
+
+You can minimize operational costs by selecting from the `unchecked`, `checked`, or `smart` modes of [the fine-grained APIs](../../references/fine-grained-apis/quick\_start.md), each balancing performance and correctness differently. For more details about parameters, see [here](../../references/fine-grained-apis/shortint/parameters.md). You can find the benchmark results on GPU for all these operations [here](gpu\_benchmarks.md).
+
+## Programmable bootstrapping
+
+The next table shows the execution time of a keyswitch followed by a programmable bootstrapping for each input message precision, along with the associated parameter set. The configuration is Concrete FFT + AVX-512.
+
+{% embed url="https://docs.google.com/spreadsheets/d/1OdZrsk0dHTWSLLvstkpiv0u5G5tE0mCqItTb7WixGdg/edit?usp=sharing" %}
+
+## Reproducing TFHE-rs benchmarks
+
+**TFHE-rs** benchmarks can be easily reproduced from the [source](https://github.com/zama-ai/tfhe-rs).
+
+{% hint style="info" %}
+AVX-512 is now enabled by default for benchmarks when available.
+{% endhint %}
+
+The following example shows how to reproduce **TFHE-rs** benchmarks:
+
+```shell
+# Boolean benchmarks:
+make bench_boolean
+
+# Integer benchmarks:
+make bench_integer
+
+# Shortint benchmarks:
+make bench_shortint
+```
diff --git a/tfhe/docs/getting_started/benchmarks/gpu_benchmarks.md b/tfhe/docs/getting_started/benchmarks/gpu_benchmarks.md
new file mode 100644
index 0000000000..14deca317e
--- /dev/null
+++ b/tfhe/docs/getting_started/benchmarks/gpu_benchmarks.md
@@ -0,0 +1,27 @@
+# GPU Benchmarks
+
+This document details the GPU performance benchmarks of homomorphic operations using **TFHE-rs**.
+
+All GPU benchmarks presented here were obtained on H100 GPUs and rely on the multithreaded PBS algorithm. The cryptographic parameters `PARAM_GPU_MULTI_BIT_MESSAGE_2_CARRY_2_GROUP_3_KS_PBS` were used.
+
+## 1xH100
+The following results were obtained on a single H100.
+The following table shows the performance when the inputs of the benchmarked operation are encrypted:
+
+{% embed url="https://docs.google.com/spreadsheets/d/1dhNYXm7oY0l2qjX3dNpSZKjIBJElkEZtPDIWHZ4FA_A/edit?usp=sharing" %}
+
+The following table shows the performance when the left input of the benchmarked operation is encrypted and the other is a clear scalar of the same size:
+
+{% embed url="https://docs.google.com/spreadsheets/d/1wtnFnOwHrSOvfTWluUEaDoTULyveseVl1ZsYo3AOFKk/edit?usp=sharing" %}
+
+## 2xH100
+
+The following results were obtained on two H100s.
+The following table shows the performance when the inputs of the benchmarked operation are encrypted:
+
+{% embed url="https://docs.google.com/spreadsheets/d/1_2AUeu3ua8_PXxMfeJCh-pp6b9e529PGVEYUuZRAThg/edit?usp=sharing" %}
+
+
+The following table shows the performance when the left input of the benchmarked operation is encrypted and the other is a clear scalar of the same size:
+
+{% embed url="https://docs.google.com/spreadsheets/d/1nLPt_m1MbkSdhMop0iKDnSN_c605l_JdMpK5JC90N_Q/edit?usp=sharing" %}
diff --git a/tfhe/docs/getting_started/benchmarks/summary.md b/tfhe/docs/getting_started/benchmarks/summary.md
new file mode 100644
index 0000000000..4734b7166e
--- /dev/null
+++ b/tfhe/docs/getting_started/benchmarks/summary.md
@@ -0,0 +1,7 @@
+# Benchmarks
+
+This document summarizes the timings of some homomorphic operations over 64-bit encrypted integers, depending on the hardware. More details are given for [the CPU](cpu\_benchmarks.md), [the GPU](gpu\_benchmarks.md), and [zero-knowledge proofs](zk\_proof\_benchmarks.md).
+
+### Operation time (ms) over `FheUint64`
+
+{% embed url="https://docs.google.com/spreadsheets/d/1ZbgsKnFH8eKrFjy9khFeaLYnUhbSV8Xu4H6rwulo0o8/edit?usp=sharing" %}
diff --git a/tfhe/docs/getting_started/benchmarks/zk_proof_benchmarks.md b/tfhe/docs/getting_started/benchmarks/zk_proof_benchmarks.md
new file mode 100644
index 0000000000..2cb3a298b6
--- /dev/null
+++ b/tfhe/docs/getting_started/benchmarks/zk_proof_benchmarks.md
@@ -0,0 +1,7 @@
+# Zero-knowledge proof benchmarks
+
+This document details the performance benchmarks of [zero-knowledge proofs](../../guides/zk-pok.md) for [compact public key encryption](../../guides/public_key.md) using **TFHE-rs**.
+
+Benchmarks for the zero-knowledge proofs were run on an `m6i.4xlarge` instance with 16 cores to simulate a typical client configuration. Verification was done on an `hpc7a.96xlarge` AWS instance to mimic a powerful server.
+ +{% embed url="https://docs.google.com/spreadsheets/d/1llCYHCz2CyLdTwXkiqhjVzJLzxW_RqdjHxmk72m1jm4/edit?usp=sharing" %} diff --git a/tfhe/docs/guides/run_on_gpu.md b/tfhe/docs/guides/run_on_gpu.md index d9def76818..da28334964 100644 --- a/tfhe/docs/guides/run_on_gpu.md +++ b/tfhe/docs/guides/run_on_gpu.md @@ -178,70 +178,5 @@ Depending on the platform, this can restrict the number of GPUs used to perform There is **nothing to change in the code to execute on multiple GPUs**, when they are available and have peer access to GPU 0 via NVLink. To keep the API as user-friendly as possible, the configuration is automatically set, i.e., the user has no fine-grained control over the number of GPUs to be used. -## Benchmarks - -All GPU benchmarks presented here were obtained on H100 GPUs, and rely on the multithreaded PBS algorithm. The cryptographic parameters `PARAM_GPU_MULTI_BIT_MESSAGE_2_CARRY_2_GROUP_3_KS_PBS` were used. - -### 1xH100 -Below come the results for the execution on a single H100. 
-The following table shows the performance when the inputs of the benchmarked operation are encrypted: - -| Operation \ Size | `FheUint8` | `FheUint16` | `FheUint32` | `FheUint64` | `FheUint128` | `FheUint256` | -|--------------------------------------------------------|------------|-------------|-------------|-------------|--------------|--------------| -| Negation (`-`) | 18.6 ms | 24.9 ms | 34.9 ms | 52.4 ms | 101 ms | 197 ms | -| Add / Sub (`+`,`-`) | 18.7 ms | 25.0 ms | 35.0 ms | 52.4 ms | 101 ms | 197 ms | -| Mul (`x`) | 35.0 ms | 59.7 ms | 124 ms | 378 ms | 1.31 s | 5.01 s | -| Equal / Not Equal (`eq`, `ne`) | 10.5 ms | 11.1 ms | 17.2 ms | 19.5 ms | 27.9 ms | 45.2 ms | -| Comparisons (`ge`, `gt`, `le`, `lt`) | 19.8 ms | 25.0 ms | 31.3 ms | 40.2 ms | 53.2 ms | 85.2 ms | -| Max / Min (`max`,`min`) | 30.2 ms | 37.1 ms | 46.6 ms | 61.4 ms | 91.8 ms | 154 ms | -| Bitwise operations (`&`, `\|`, `^`) | 4.83 ms | 5.3 ms | 6.36 ms | 8.26 ms | 15.3 ms | 25.4 ms | -| Div / Rem (`/`, `%`) | 221 ms | 528 ms | 1.31 s | 3.6 s | 11.0 s | 40.0 s | -| Left / Right Shifts (`<<`, `>>`) | 30.4 ms | 41.4 ms | 60.0 ms | 119 ms | 221 ms | 435 ms | -| Left / Right Rotations (`left_rotate`, `right_rotate`) | 30.4 ms | 41.4 ms | 60.1 ms | 119 ms | 221 ms | 435 ms | - -The following table shows the performance when the left input of the benchmarked operation is encrypted and the other is a clear scalar of the same size: - -| Operation \ Size | `FheUint8` | `FheUint16` | `FheUint32` | `FheUint64` | `FheUint128` | `FheUint256` | -|--------------------------------------------------------|------------|-------------|-------------|-------------|--------------|--------------| -| Add / Sub (`+`,`-`) | 19.0 ms | 25.0 ms | 35.0 ms | 52.4 ms | 101 ms | 197 ms | -| Mul (`x`) | 28.1 ms | 43.9 ms | 75.4 ms | 177 ms | 544 ms | 1.92 s | -| Equal / Not Equal (`eq`, `ne`) | 11.5 ms | 11.9 ms | 12.5 ms | 18.9 ms | 21.7 ms | 30.6 ms | -| Comparisons (`ge`, `gt`, `le`, `lt`) | 12.5 ms | 17.4 ms | 22.7 ms | 
29.9 ms | 39.1 ms | 57.2 ms | -| Max / Min (`max`,`min`) | 22.5 ms | 28.9 ms | 37.4 ms | 50.6 ms | 77.4 ms | 126 ms | -| Bitwise operations (`&`, `\|`, `^`) | 4.92 ms | 5.51 ms | 6.47 ms | 8.37 ms | 15.5 ms | 25.6 ms | -| Div (`/`) | 46.8 ms | 70.0 ms | 138 ms | 354 ms | 1.10 s | 3.83 s | -| Rem (`%`) | 90.0 ms | 140 ms | 250 ms | 592 ms | 1.75 s | 6.06 s | -| Left / Right Shifts (`<<`, `>>`) | 4.82 ms | 5.36 ms | 6.38 ms | 8.26 ms | 15.3 ms | 25.4 ms | -| Left / Right Rotations (`left_rotate`, `right_rotate`) | 4.81 ms | 5.36 ms | 6.30 ms | 8.19 ms | 15.3 ms | 25.3 ms | - -### 2xH100 - -Below come the results for the execution on two H100's. -The following table shows the performance when the inputs of the benchmarked operation are encrypted: - -| Operation \ Size | `FheUint8` | `FheUint16` | `FheUint32` | `FheUint64` | `FheUint128` | `FheUint256` | -| ------------------------------------------------------ | ---------- | ----------- | ----------- | ----------- | ------------ | ------------ | -| Negation (`-`) | 16.1 ms | 20.3 ms | 27.7 ms | 38.2 ms | 54.7 ms | 83.0 ms | -| Add / Sub (`+`,`-`) | 16.1 ms | 20.4 ms | 27.8 ms | 38.3 ms | 54.9 ms | 83.2 ms | -| Mul (`x`) | 31.0 ms | 49.6 ms | 92.4 ms | 267 ms | 892 ms | 3.45 s | -| Equal / Not Equal (`eq`, `ne`) | 11.2 ms | 12.9 ms | 20.4 ms | 27.3 ms | 38.8 ms | 67.0 ms | -| Max / Min (`max`,`min`) | 53.4 ms | 59.3 ms | 70.4 ms | 89.6 ms | 120 ms | 177 ms | -| Bitwise operations (`&`, `\|`, `^`) | 4.16 ms | 4.62 ms | 5.61 ms | 7.52 ms | 10.2 ms | 15.7 ms | -| Div / Rem (`/`, `%`) | 299 ms | 595 ms | 1.36 s | 3.12 s | 7.8 s | 21.1 s | -| Left / Right Shifts (`<<`, `>>`) | 26.9 ms | 34.5 ms | 48.7 ms | 70.2 ms | 108 ms | 220 ms | -| Left / Right Rotations (`left_rotate`, `right_rotate`) | 26.8 ms | 34.5 ms | 48.7 ms | 70.1 ms | 108 ms | 220 ms | - - -The following table shows the performance when the left input of the benchmarked operation is encrypted and the other is a clear scalar of the same size: - -| Operation \ 
Size | `FheUint8` | `FheUint16` | `FheUint32` | `FheUint64` | `FheUint128` | `FheUint256` | -| ------------------------------------------------------ |------------|-------------|-------------|-------------|--------------|--------------| -| Add / Sub (`+`,`-`) | 16.4 ms | 20.5 ms | 28.0 ms | 38.4 ms | 54.9 ms | 83.1 ms | -| Mul (`x`) | 25.3 ms | 36.8 ms | 62.0 ms | 130 ms | 377 ms | 1.35 s | -| Equal / Not Equal (`eq`, `ne`) | 36.4 ms | 36.5 ms | 39.3 ms | 47.1 ms | 58.0 ms | 78.0 ms | -| Max / Min (`max`,`min`) | 53.6 ms | 60.8 ms | 71.9 ms | 89.4 ms | 119 ms | 173 ms | -| Bitwise operations (`&`, `\|`, `^`) | 4.33 ms | 4.76 ms | 6.4 ms | 7.65 ms | 10.4 ms | 15.7 ms | -| Div (`/`) | 40.9 ms | 59.7 ms | 109.0 ms | 248.5 ms | 806.1 ms | 2.9 s | -| Rem (`%`) | 80.6 ms | 116.1 ms | 199.9 ms | 412.9 ms | 1.2 s | 4.3 s | -| Left / Right Shifts (`<<`, `>>`) | 4.15 ms | 4.57 ms | 6.19 ms | 7.48 ms | 10.3 ms | 15.7 ms | -| Left / Right Rotations (`left_rotate`, `right_rotate`) | 4.15 ms | 4.57 ms | 6.18 ms | 7.46 ms | 10.2 ms | 15.6 ms | +## Benchmark +Please refer to the [GPU benchmarks](../getting_started/benchmarks/gpu_benchmarks.md) for detailed performance benchmark results. diff --git a/tfhe/docs/guides/zk-pok.md b/tfhe/docs/guides/zk-pok.md index 09d30d6706..790f11f96f 100644 --- a/tfhe/docs/guides/zk-pok.md +++ b/tfhe/docs/guides/zk-pok.md @@ -134,20 +134,5 @@ pub fn main() -> Result<(), Box> { } ``` -### Benchmarks -Benchmarks for the proofs have been run on a `m6i.4xlarge` with 16 cores to simulate an usual client configuration. The verification are done on a `hpc7a.96xlarge` AWS instances to mimic a powerful server. - -Timings in the case where the workload is mainly on the prover, i.e., with the `ZkComputeLoad::Proof` option. 
- -| Inputs | Proving | Verifying | -|--------------|---------|-----------| -| 1xFheUint64 | 2.79s | 197ms | -| 10xFheUint64 | 3.68s | 251ms | - - -Timings in the case where the workload is mainly on the verifier, i.e., with the `ZkComputeLoad::Verify` option. - -| Inputs | Proving | Verifying | -|--------------|---------|-----------| -| 1xFheUint64 | 730ms | 522ms | -| 10xFheUint64 | 1.08s | 682ms | +## Benchmark +Please refer to the [Zero-knowledge proof benchmarks](../getting_started/benchmarks/zk_proof_benchmarks.md) for detailed performance benchmark results. diff --git a/tfhe/docs/references/fine-grained-apis/shortint/parameters.md b/tfhe/docs/references/fine-grained-apis/shortint/parameters.md index 70c0881f4a..e2eeaf8482 100644 --- a/tfhe/docs/references/fine-grained-apis/shortint/parameters.md +++ b/tfhe/docs/references/fine-grained-apis/shortint/parameters.md @@ -34,7 +34,7 @@ fn main() { ## Impact of parameters on the operations -As shown [here](../../../getting\_started/benchmarks.md), the choice of the parameter set impacts the operations available and their efficiency. +As shown [here](../../../getting\_started/benchmarks/cpu\_benchmarks.md), the choice of the parameter set impacts the operations available and their efficiency. ### Generic bi-variate functions.