Commit
chore(docs): benchmark regrouping and visualization
1 parent 41fae73, commit 7ec22b3
Showing 9 changed files with 103 additions and 212 deletions.
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
# CPU Benchmarks | ||
|
||
This document details the CPU performance benchmarks of homomorphic operations using **TFHE-rs**. | ||
|
||
By their nature, homomorphic operations run slower than their cleartext equivalents. The following are the timings for basic operations, including benchmarks from other libraries for comparison. | ||
|
||
{% hint style="info" %} | ||
All CPU benchmarks were run on an AWS `hpc7a.96xlarge` instance equipped with an `AMD EPYC 9R14 CPU @ 2.60GHz` and 740 GB of RAM. | ||
{% endhint %} | ||
|
||
## Integer operations | ||
|
||
The following tables show the execution time of selected operations on `FheUint` (unsigned integers). `FheInt` (signed integers) performs similarly. | ||
|
||
The next table shows the operation timings on CPU when all inputs are encrypted: | ||
|
||
{% embed url="https://docs.google.com/spreadsheets/d/1Z2NZvWEkDnbHPYE4Su0Oh2Zz1VBnT9dWbo3E29-LcDg/edit?usp=sharing" %} | ||
|
||
The next table shows the operation timings on CPU when the left input is encrypted and the right is a clear scalar of the same size: | ||
|
||
{% embed url="https://docs.google.com/spreadsheets/d/1NGPnuBhRasES9Ghaij4ixJJTpXVMqDzbqMniX-qIMGc/edit?usp=sharing" %} | ||
|
||
All timings are based on parallelized radix-based integer operations where each block is encrypted using the default parameters `PARAM_MESSAGE_2_CARRY_2_KS_PBS`. To keep timings predictable, operations run in the `default` mode, which ensures that the input and output encodings are similar (i.e., the carries are always emptied). | ||
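
As a rough illustration of the radix representation mentioned above, the sketch below computes how many blocks an integer would be split into. It assumes each block carries 2 message bits, as the name `PARAM_MESSAGE_2_CARRY_2_KS_PBS` suggests. The helper `radix_blocks` is hypothetical and is not part of TFHE-rs.

```shell
# Hypothetical helper, not part of TFHE-rs: number of radix blocks needed
# for an integer of BITS bits when each block holds MSG_BITS message bits.
radix_blocks() {
  bits=$1
  msg_bits=${2:-2}   # assumption: 2 message bits per block
  # Ceiling division, so that a partially filled block is still counted.
  echo $(( (bits + msg_bits - 1) / msg_bits ))
}

radix_blocks 64   # 64-bit integer with 2 message bits per block: prints 32
```

Under this assumption, a 64-bit encrypted integer would consist of 32 blocks, which is why the integer operations are parallelized across blocks.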
|
||
You can reduce the cost of operations by choosing the `unchecked`, `checked`, or `smart` mode from [the fine-grained APIs](../../references/fine-grained-apis/quick\_start.md), each of which balances performance and correctness differently. For more details about parameters, see [the parameters reference](../../references/fine-grained-apis/shortint/parameters.md). The GPU benchmark results for all these operations are available [here](../../guides/run\_on\_gpu.md#benchmarks). | ||
|
||
## Programmable bootstrapping | ||
|
||
The next table shows the execution time of a keyswitch followed by a programmable bootstrapping, depending on the precision of the input message, along with the associated parameter set. The configuration is Concrete FFT + AVX-512. | ||
|
||
{% embed url="https://docs.google.com/spreadsheets/d/1OdZrsk0dHTWSLLvstkpiv0u5G5tE0mCqItTb7WixGdg/edit?usp=sharing" %} | ||
|
||
## Reproducing TFHE-rs benchmarks | ||
|
||
**TFHE-rs** benchmarks can be easily reproduced from the [source](https://github.com/zama-ai/tfhe-rs). | ||
|
||
{% hint style="info" %} | ||
AVX-512 is now enabled by default for benchmarks when available. | ||
{% endhint %} | ||
|
||
The following example shows how to reproduce **TFHE-rs** benchmarks: | ||
|
||
```shell | ||
# Boolean benchmarks: | ||
make bench_boolean | ||
|
||
# Integer benchmarks: | ||
make bench_integer | ||
|
||
# Shortint benchmarks: | ||
make bench_shortint | ||
``` |
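
Because AVX-512 is only used when the CPU supports it, it can help to check for it before comparing your timings with the published ones. The following is a minimal sketch, assuming a Linux host that exposes CPU flags in `/proc/cpuinfo`. The helper `has_avx512` is hypothetical and is not part of the TFHE-rs Makefile.

```shell
# Hypothetical helper: succeeds if the given cpuinfo file lists the avx512f flag.
# Assumes a Linux-style /proc/cpuinfo; pass another path as $1 to check a copy.
has_avx512() {
  grep -qw avx512f "${1:-/proc/cpuinfo}" 2>/dev/null
}

if has_avx512; then
  echo "AVX-512 detected"
else
  echo "AVX-512 not detected"
fi
```

On hosts without `/proc/cpuinfo` (for example macOS), the check reports that AVX-512 was not detected, so treat a negative result as inconclusive there.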
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
# GPU Benchmarks | ||
|
||
This document details the GPU performance benchmarks of homomorphic operations using **TFHE-rs**. | ||
|
||
All GPU benchmarks presented here were obtained on H100 GPUs, and rely on the multithreaded PBS algorithm. The cryptographic parameters `PARAM_GPU_MULTI_BIT_MESSAGE_2_CARRY_2_GROUP_3_KS_PBS` were used. | ||
|
||
## 1xH100 | ||
The results below were obtained on a single H100. | ||
The following table shows the performance when the inputs of the benchmarked operation are encrypted: | ||
|
||
{% embed url="https://docs.google.com/spreadsheets/d/1dhNYXm7oY0l2qjX3dNpSZKjIBJElkEZtPDIWHZ4FA_A/edit?usp=sharing" %} | ||
|
||
The following table shows the performance when the left input of the benchmarked operation is encrypted and the other is a clear scalar of the same size: | ||
|
||
{% embed url="https://docs.google.com/spreadsheets/d/1wtnFnOwHrSOvfTWluUEaDoTULyveseVl1ZsYo3AOFKk/edit?usp=sharing" %} | ||
|
||
## 2xH100 | ||
|
||
The results below were obtained on two H100s. | ||
The following table shows the performance when the inputs of the benchmarked operation are encrypted: | ||
|
||
{% embed url="https://docs.google.com/spreadsheets/d/1_2AUeu3ua8_PXxMfeJCh-pp6b9e529PGVEYUuZRAThg/edit?usp=sharing" %} | ||
|
||
|
||
The following table shows the performance when the left input of the benchmarked operation is encrypted and the other is a clear scalar of the same size: | ||
|
||
{% embed url="https://docs.google.com/spreadsheets/d/1nLPt_m1MbkSdhMop0iKDnSN_c605l_JdMpK5JC90N_Q/edit?usp=sharing" %} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
# Benchmarks | ||
|
||
This document summarizes the timings of some homomorphic operations over 64-bit encrypted integers, depending on the hardware. More details are given for [the CPU](cpu\_benchmarks.md), [the GPU](gpu\_benchmarks.md), or [zero-knowledge proofs](zk\_proof\_benchmarks.md). | ||
|
||
### Operation time (ms) over FheUint64 | ||
|
||
{% embed url="https://docs.google.com/spreadsheets/d/1ZbgsKnFH8eKrFjy9khFeaLYnUhbSV8Xu4H6rwulo0o8/edit?usp=sharing" %} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
# Zero-knowledge proof benchmarks | ||
|
||
This document details the performance benchmarks of [zero-knowledge proofs](../../guides/zk-pok.md) for [compact public key encryption](../../guides/public_key.md) using **TFHE-rs**. | ||
|
||
Benchmarks for the zero-knowledge proofs were run on an `m6i.4xlarge` instance with 16 cores to simulate a typical client configuration. Verification was done on an `hpc7a.96xlarge` AWS instance to mimic a powerful server. | ||
|
||
{% embed url="https://docs.google.com/spreadsheets/d/1llCYHCz2CyLdTwXkiqhjVzJLzxW_RqdjHxmk72m1jm4/edit?usp=sharing" %} | ||
|