More efficient butterfly in inverse NTT layers 0-2 #671

Merged 6 commits on Nov 15, 2024

jschneider-bensch
Collaborator

@jschneider-bensch jschneider-bensch commented Nov 13, 2024

This PR applies the optimization of batching two butterflies in AVX2 to layers 0 through 2 of the inverse NTT, as is already done in the forward NTT.
I've also moved the entire inverse NTT into the respective instantiation modules, in the hope that this lets the compiler optimize it better.
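
For readers unfamiliar with the idea, here is a minimal scalar sketch of what batching two butterflies could look like for a layer 2 inverse step. It is not the code from this PR. It assumes 8-lane units, a Gentleman-Sande style butterfly `(x, y) -> (x + y, (x - y) * zeta)`, and plain modular arithmetic instead of Montgomery reduction and AVX2 intrinsics. The function names `inv_layer2_single` and `inv_layer2_batched` are hypothetical.

```rust
/// The ML-DSA prime modulus.
const Q: i64 = 8_380_417;

/// Reduce a value into the range [0, Q).
fn reduce(x: i64) -> i64 {
    x.rem_euclid(Q)
}

/// Unbatched reference: one layer-2 inverse step on a single 8-lane unit.
/// Only 4 butterflies are computed, so a vectorized version of this
/// would use just half of the available lanes.
fn inv_layer2_single(u: [i64; 8], zeta: i64) -> [i64; 8] {
    let mut out = [0i64; 8];
    for i in 0..4 {
        out[i] = reduce(u[i] + u[i + 4]);
        out[i + 4] = reduce(reduce(u[i] - u[i + 4]) * zeta);
    }
    out
}

/// Batched variant (hypothetical): two units are shuffled together so that
/// a single full-width butterfly covers both of them.
fn inv_layer2_batched(
    a: [i64; 8],
    b: [i64; 8],
    zeta_a: i64,
    zeta_b: i64,
) -> ([i64; 8], [i64; 8]) {
    // "Shuffle": gather the low halves of both units into one vector,
    // the high halves into another, and line up the matching twiddles.
    let mut lo = [0i64; 8];
    let mut hi = [0i64; 8];
    let mut zetas = [0i64; 8];
    for i in 0..4 {
        lo[i] = a[i];
        lo[i + 4] = b[i];
        hi[i] = a[i + 4];
        hi[i + 4] = b[i + 4];
        zetas[i] = zeta_a;
        zetas[i + 4] = zeta_b;
    }

    // One butterfly over all 8 lanes; this loop models the SIMD operation.
    let mut sum = [0i64; 8];
    let mut diff = [0i64; 8];
    for i in 0..8 {
        sum[i] = reduce(lo[i] + hi[i]);
        diff[i] = reduce(reduce(lo[i] - hi[i]) * zetas[i]);
    }

    // "Unshuffle": scatter the results back into the two original units.
    let mut out_a = [0i64; 8];
    let mut out_b = [0i64; 8];
    for i in 0..4 {
        out_a[i] = sum[i];
        out_a[i + 4] = diff[i];
        out_b[i] = sum[i + 4];
        out_b[i + 4] = diff[i + 4];
    }
    (out_a, out_b)
}
```

Under these assumptions, the batched function returns the same values as calling `inv_layer2_single` on each unit separately. The potential gain in a real AVX2 implementation comes from doing the add, subtract, and multiply once at full vector width, at the cost of the extra shuffles.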

Member

@franziskuskiefer franziskuskiefer left a comment

Thanks, let's get this in.

The main slowdown comes from #646 not propagating the inline annotations for the hashing. Simply adding them back gets us into trouble with the stack again, but gets us back to what I saw after my changes last time.

@franziskuskiefer franziskuskiefer added this pull request to the merge queue Nov 15, 2024
Merged via the queue into main with commit dc479b8 Nov 15, 2024
53 checks passed
@franziskuskiefer franziskuskiefer deleted the jonas/invntt-butterfly branch November 15, 2024 11:00
Development

Successfully merging this pull request may close these issues.

Apply effective optimizations to inverse NTT
improve NTT performance of avx2 ml-dsa