
remove fast_pbkdf2 dependency #11

Closed
wants to merge 1 commit into from

Conversation

@benoitc commented Aug 14, 2023

This PR removes the usage of fast_pbkdf2 by using the latest crypto API in Erlang. If needed I will add some legacy support for fast_pbkdf2. Or should it be an option?

I removed this library because any library based on OpenSSL is hard to deploy across multiple environments, and I don't think the performance is lower with this change, though I didn't measure it precisely first.
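
For reference, the replacement presumably boils down to a direct call into OTP's crypto module, something like this (a sketch in Elixir syntax, not the exact diff from the PR; :crypto.pbkdf2_hmac/5 is available since OTP 24.2):

salt = :crypto.strong_rand_bytes(16)
# digest, password, salt, iteration count, and derived key length in bytes
derived_key = :crypto.pbkdf2_hmac(:sha256, "password", salt, 10_000, 32)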

@patatoid left a comment

I had questions about the KeyLength, but it looks to be the same here as in fast_pbkdf2.
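
For context, the key length in question appears to be the digest size reported by :crypto.hash_info/1, which is also the output length fast_pbkdf2 produces (a sketch in Elixir, assuming sha256):

%{size: len} = :crypto.hash_info(:sha256)  # len == 32 for sha256
:crypto.pbkdf2_hmac(:sha256, "password", <<0::128>>, 10_000, len)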

LGTM

@NelsonVides (Collaborator) commented Aug 14, 2023

Amazing idea, I've wanted to try this for a while, thank you so much for the contribution!

Nevertheless, I'll get picky about performance: in pretty much all cases it is indeed 1.85x slower.

I tried benchee with these inputs:

Benchee.run(
 %{
   "fast" => fn {hash, pass, salt, it} -> :fast_pbkdf2.pbkdf2(hash, pass, salt, it) end,
   "otp" => fn {hash, pass, salt, it} ->
                %{ size: len } = :crypto.hash_info(hash)
                :crypto.pbkdf2_hmac(hash, pass, salt, it, len)
            end
 },
 inputs: %{
   "sha256-12password-1024it" => {:sha256,
                :base64.encode(:crypto.strong_rand_bytes(8)),
                :base64.encode(:crypto.strong_rand_bytes(8)),
                1024},
   "sha256-12password-10000it" => {:sha256,
                :base64.encode(:crypto.strong_rand_bytes(8)),
                :base64.encode(:crypto.strong_rand_bytes(8)),
                10000},
   "sha256-12password-500000it" => {:sha256,
                :base64.encode(:crypto.strong_rand_bytes(8)),
                :base64.encode(:crypto.strong_rand_bytes(8)),
                500000},

   "sha256-32password-1024it" => {:sha256,
                :base64.encode(:crypto.strong_rand_bytes(22)),
                :base64.encode(:crypto.strong_rand_bytes(8)),
                1024},
   "sha256-32password-10000it" => {:sha256,
                :base64.encode(:crypto.strong_rand_bytes(22)),
                :base64.encode(:crypto.strong_rand_bytes(8)),
                10000},
   "sha256-32password-500000it" => {:sha256,
                :base64.encode(:crypto.strong_rand_bytes(22)),
                :base64.encode(:crypto.strong_rand_bytes(8)),
                500000}
 },
 memory_time: 5
)

Memory-wise, all of them take 72B for fast_pbkdf2 and 328B for libcrypto, so with memory being a constant I don't mind at all, not even half a kilobyte. But speed-wise it's an interesting difference. To me that 1.85x is quite close to 2x, which makes me suspicious of libcrypto's implementation of PBKDF2. The cost of the algorithm grows linearly with the iteration count, so both are O(n), but a naive implementation performs roughly 4i compression-function calls, whereas a proper one, which precomputes the HMAC key blocks once, needs only 2 + 2i. So while both are O(n), that 2x constant makes a difference.
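
To make that count concrete, here is a minimal naive sketch of the PBKDF2 block function in Elixir (the module name is made up for illustration; fast_pbkdf2 precomputes the HMAC key blocks once instead of re-deriving them from the raw password on every call):

defmodule NaivePbkdf2 do
  # Naive PBKDF2 block function: every iteration calls :crypto.mac/4 with the
  # raw password as the HMAC key, so the ipad/opad key blocks are re-derived
  # each time (~4 compressions per iteration). Precomputing those key blocks
  # once brings the total down to roughly 2 + 2i compressions.
  def block(hash, password, salt, iterations, block_index \\ 1) do
    u1 = :crypto.mac(:hmac, hash, password, salt <> <<block_index::32>>)
    iterate(hash, password, u1, u1, iterations - 1)
  end

  defp iterate(_hash, _password, _u, acc, 0), do: acc

  defp iterate(hash, password, u, acc, n) do
    u_next = :crypto.mac(:hmac, hash, password, u)
    iterate(hash, password, u_next, :crypto.exor(acc, u_next), n - 1)
  end
end

For a single sha256 block, NaivePbkdf2.block(:sha256, pass, salt, it) should match :crypto.pbkdf2_hmac(:sha256, pass, salt, it, 32).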

My implementation surely does the proper version (that's the whole point of why I wrote it :D), so that 1.85x difference makes me think libcrypto has something unoptimised there 😅

Even more interestingly, when setting parallelism to 12 (the number of cores in my machine), libcrypto's version is only 1.35x slower, so good for us. But when setting parallelism to 60 (5x the number of cores), libcrypto's results vary a lot with the iteration count: 1024 iterations is 1.85x slower as before, 10000 is 1.30x slower for the small password and 2.3x slower for the big password, and 500000 (0.5M) is 4.3x slower for both password sizes. My hypothesis is that it's using dirty schedulers and blocking operations.
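
For reference, those parallel runs can be reproduced by passing Benchee's parallel option (a sketch; jobs and inputs stand for the same job and input maps as in the snippet above):

# jobs and inputs are the maps from the benchmark above
Benchee.run(jobs, inputs: inputs, parallel: 12)  # 12 = number of cores
Benchee.run(jobs, inputs: inputs, parallel: 60)  # 60 = 5x the number of cores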

Still, I agree that simplicity of dependencies is a great win, so I wouldn't say no even to a 2x factor. But at such a factor I'd like to make it possible to choose. After all, half a million iterations is not unheard of. Do you think we can make this configurable? 🙂
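
A hypothetical way to keep both backends selectable would be an application-environment toggle (the :my_app and :pbkdf2_backend names are made up for illustration and not part of this PR):

defmodule Pbkdf2Backend do
  # Hypothetical toggle: default to OTP's crypto, fall back to fast_pbkdf2
  # when explicitly configured (fast_pbkdf2 would stay an optional dependency).
  def derive(hash, pass, salt, iterations) do
    %{size: len} = :crypto.hash_info(hash)

    case Application.get_env(:my_app, :pbkdf2_backend, :crypto) do
      :crypto -> :crypto.pbkdf2_hmac(hash, pass, salt, iterations, len)
      :fast_pbkdf2 -> :fast_pbkdf2.pbkdf2(hash, pass, salt, iterations)
    end
  end
end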

@NelsonVides self-assigned this Aug 14, 2023
@NelsonVides self-requested a review August 14, 2023 19:21
@NelsonVides (Collaborator)

@benoitc any news on this contribution?
I also saw that CI fails for older OTP versions, but I'm totally in favour of pushing for the newest ones anyway.
Let me know if I can help you with this 🙂

@NelsonVides (Collaborator)

Closed in favour of #13
