improve NTT performance of avx2 ml-dsa #571

franziskuskiefer · 2024-09-11T06:22:01Z · edited by jschneider-bensch

[ML-DSA] AVX2 performance improvements in NTT #584
~~- [ ] Things to investigate (increasing order of effort)~~
~~- [ ] try different instructions for shuffling (e.g. vpshufd vs vmovshdup)~~
~~- [ ] additionally unroll layers 2 through 0~~
- [ ] use a different shuffling strategy altogether: instead of shuffle-in -> butterfly -> shuffle-out for each of layers 2-0, shuffle once in layers 5-3 and unshuffle when writing out the final result (?)
Apply effective optimizations to inverse NTT #657
~~- Puzzle, potentially a waste of time: in our multiplication, subtractions seem to be disproportionately expensive, although judging by instruction count they shouldn't be. Why is that?~~
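
For context on why only layers 2-0 involve shuffling, here is a minimal scalar sketch. It is not code from the repository. It assumes that layer `n` pairs coefficient `j` with coefficient `j + 2^n`, and that one AVX2 register holds eight 32-bit coefficients; the constant and function names are hypothetical.

```rust
/// Number of coefficients in an ML-DSA polynomial.
const COEFFICIENTS: usize = 256;
/// Assumption: eight 32-bit coefficients per 256-bit AVX2 register.
const LANES: usize = 8;

/// Returns true if every butterfly pair of `layer` lies inside a single
/// 8-lane vector. Assumes layer `n` pairs index `j` with index `j + 2^n`.
fn partners_share_vector(layer: u32) -> bool {
    let step = 1usize << layer;
    (0..COEFFICIENTS)
        .step_by(2 * step) // start of each butterfly block
        .flat_map(|start| start..start + step) // first index of each pair
        .all(|j| j / LANES == (j + step) / LANES)
}
```

Under these assumptions, `partners_share_vector` returns `true` for layers 0 through 2 and `false` for layers 3 through 7. In the lower layers both butterfly inputs sit in the same register, so lanes have to be rearranged before a vertical add/subtract can combine them. In the upper layers the partners already sit in different registers.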


jschneider-bensch · 2024-11-06T07:59:28Z

We found that experimenting with different shuffles made essentially no difference. The one possible exception is keeping NTT-domain vectors in a shuffled state, which would avoid some shuffles altogether. That would require touching every place where NTT-domain vectors are handled and is probably not worth the effort for the performance gained, since easier changes remain, e.g. applying the butterfly optimization to the inverse NTT.
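
As a reminder of what the butterflies compute, here is a scalar sketch of the forward (Cooley-Tukey) and inverse (Gentleman-Sande) butterfly modulo the ML-DSA prime q = 8380417. It illustrates the arithmetic only. It is not the AVX2 code from the linked pull requests, it uses plain `rem_euclid` reduction instead of whatever reduction the implementation uses, and all function names are hypothetical.

```rust
/// ML-DSA modulus q = 2^23 - 2^13 + 1.
const Q: i64 = 8_380_417;

/// Reduces `x` into the range [0, Q).
fn reduce(x: i64) -> i64 {
    x.rem_euclid(Q)
}

/// Computes base^exp mod Q by square-and-multiply.
fn pow_mod(mut base: i64, mut exp: u64) -> i64 {
    let mut acc = 1;
    base = reduce(base);
    while exp > 0 {
        if exp & 1 == 1 {
            acc = reduce(acc * base);
        }
        base = reduce(base * base);
        exp >>= 1;
    }
    acc
}

/// Forward (Cooley-Tukey) butterfly: (a, b) -> (a + zeta*b, a - zeta*b).
fn forward_butterfly(a: i64, b: i64, zeta: i64) -> (i64, i64) {
    let t = reduce(zeta * b);
    (reduce(a + t), reduce(a - t))
}

/// Inverse (Gentleman-Sande) butterfly: (x, y) -> (x + y, zeta^-1 * (x - y)).
fn inverse_butterfly(x: i64, y: i64, zeta_inv: i64) -> (i64, i64) {
    (reduce(x + y), reduce(zeta_inv * reduce(x - y)))
}
```

With `zeta_inv = pow_mod(zeta, (Q - 2) as u64)`, applying `inverse_butterfly` to the output of `forward_butterfly(a, b, zeta)` yields `(2a, 2b)` mod q. The factor of 2 per layer is what the final scaling step of an inverse NTT removes.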

franziskuskiefer assigned jschneider-bensch Sep 11, 2024

jschneider-bensch mentioned this issue Sep 11, 2024

Fix ML-DSA benchmarks #573

Merged

franziskuskiefer mentioned this issue Sep 16, 2024

[ML-DSA] AVX2 performance improvements in NTT #584

Merged

2 tasks

franziskuskiefer mentioned this issue Sep 11, 2024

ML-DSA 65 AVX2 Implementation #275

Closed

10 tasks

jschneider-bensch linked a pull request Nov 14, 2024 that will close this issue

More efficient butterfly in inverse NTT layers 0-2 #671

Merged

franziskuskiefer closed this as completed in #671 Nov 15, 2024
