improve NTT performance of avx2 ml-dsa #571

Closed
2 tasks done
Tracked by #275
franziskuskiefer opened this issue Sep 11, 2024 · 1 comment · Fixed by #671

franziskuskiefer commented Sep 11, 2024

  • [ML-DSA] AVX2 performance improvements in NTT #584
    - Things to investigate (in increasing order of effort):
    - [ ] try different instructions for shuffling (e.g. vpshufd vs. vmovshdup)
    - [ ] additionally unroll layers 2 through 0
    - [ ] use a different shuffling strategy altogether (i.e. instead of shuffle-in -> butterfly -> shuffle-out for each of layers 2 through 0, shuffle once in layers 5 through 3 and unshuffle only when writing out the final result (?))
  • Apply effective optimizations to inverse NTT #657
    - Puzzle (potentially a waste of time): in our multiplication, subtractions seem to be disproportionately expensive, although judging by instruction count they shouldn't be. Why is that?

@jschneider-bensch (Collaborator) commented

We found that trying different shuffles made essentially no difference, with the possible exception of keeping vectors in the NTT domain in a shuffled state to avoid some shuffles altogether. That would require touching every place where NTT-domain vectors are handled and is probably not worth it in terms of performance, since there are still easier changes to make, e.g. applying the butterfly optimization to the inverse NTT.
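
For readers unfamiliar with the shuffle-in -> butterfly -> shuffle-out pattern discussed in this thread, here is a minimal scalar sketch in Rust. It is not the project's AVX2 code: the function names (`butterfly`, `layer0_direct`, `layer0_shuffled`) are hypothetical, plain arrays stand in for vector registers, ordinary modular reduction replaces whatever reduction the real implementation uses, and the twiddle factors passed in are arbitrary placeholders. It only illustrates that de-interleaving the lanes, applying a lane-wise butterfly, and re-interleaving gives the same result as butterflies on adjacent pairs.

```rust
// ML-DSA prime modulus q = 2^23 - 2^13 + 1.
const Q: i64 = 8_380_417;

// Reduce to the canonical range [0, Q).
fn reduce(x: i64) -> i32 {
    x.rem_euclid(Q) as i32
}

// Cooley-Tukey butterfly: (a, b) -> (a + zeta*b, a - zeta*b) mod Q.
fn butterfly(a: i32, b: i32, zeta: i32) -> (i32, i32) {
    let t = (zeta as i64) * (b as i64) % Q;
    (reduce(a as i64 + t), reduce(a as i64 - t))
}

// Reference: butterflies applied directly to adjacent pairs (stride 1).
fn layer0_direct(v: [i32; 8], zetas: [i32; 4]) -> [i32; 8] {
    let mut out = v;
    for i in 0..4 {
        let (x, y) = butterfly(v[2 * i], v[2 * i + 1], zetas[i]);
        out[2 * i] = x;
        out[2 * i + 1] = y;
    }
    out
}

// Same layer as shuffle-in -> lane-wise butterfly -> shuffle-out,
// mimicking what a SIMD version does with two registers.
fn layer0_shuffled(v: [i32; 8], zetas: [i32; 4]) -> [i32; 8] {
    // Shuffle in: gather even and odd lanes into separate "registers".
    let evens = [v[0], v[2], v[4], v[6]];
    let odds = [v[1], v[3], v[5], v[7]];

    // Lane-wise butterfly: lane i of `evens` pairs with lane i of `odds`.
    let mut lo = [0i32; 4];
    let mut hi = [0i32; 4];
    for i in 0..4 {
        let (x, y) = butterfly(evens[i], odds[i], zetas[i]);
        lo[i] = x;
        hi[i] = y;
    }

    // Shuffle out: interleave the results back into the original order.
    let mut out = [0i32; 8];
    for i in 0..4 {
        out[2 * i] = lo[i];
        out[2 * i + 1] = hi[i];
    }
    out
}
```

Calling both functions on the same input, for example `layer0_direct(v, z)` and `layer0_shuffled(v, z)` with any `v: [i32; 8]` and `z: [i32; 4]`, should return identical arrays. In this model, the idea of keeping data in a shuffled state corresponds to skipping the final interleave and letting later code consume `lo` and `hi` directly, which is why every consumer of NTT-domain vectors would need to change.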
