
Make vector times matrix faster #1937

Draft · wants to merge 2 commits into master
Conversation

thofma (Member) commented Oct 31, 2024

@fieker It is working well for vector * matrix, but I am having trouble in the other case. I guess it does not like my

```julia
bb.rows = reinterpret(Ptr{Ptr{ZZRingElem}}, pointer([pointer(be) + i for i in 0:length(b) - 1]))
```

but I am not sure what I am doing wrong.

```julia
julia> A = matrix(ZZ, 2, 2, [1, 2, 3, 4]); b = ZZRingElem[10, 0];

julia> A * b
2-element Vector{ZZRingElem}:
 7690
 0
```
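One plausible cause of garbage values like this (an assumption on my part; fieker's emailed reply below is not shown in the thread): the pointer array built inline in the comprehension is a temporary with no live binding, so the GC may reclaim it before FLINT dereferences `bb.rows`. A minimal sketch of the GC-safe pattern, with made-up names unrelated to the Nemo internals:

```julia
# Hypothetical sketch of the unsafe-pointer pitfall, not the actual PR code.
# `pointer(tmp)` is only valid while `tmp` is provably alive; GC.@preserve
# pins the array for the duration of the block so unsafe_load cannot race
# the garbage collector.
function sum_via_pointer(xs::Vector{Int})
    tmp = copy(xs)                     # named binding, unlike an inline literal
    s = 0
    GC.@preserve tmp begin
        p = pointer(tmp)               # valid only inside the @preserve block
        for i in 0:length(tmp)-1
            s += unsafe_load(p + i * sizeof(Int))  # Ptr arithmetic is in bytes
        end
    end
    return s
end
```

Without the named binding (or `GC.@preserve`), the same loop is undefined behavior even if it happens to print the right answer.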

fieker (Contributor) commented Oct 31, 2024 via email

thofma (Member, Author) commented Nov 1, 2024

ah right, thanks

lgoettgens (Collaborator):
I am surprised that this is faster than just calling flint in these cases. Can you (once it works) post some benchmarks here?

thofma (Member, Author) commented Nov 1, 2024

It is still calling FLINT, but now it eventually calls `fmpz_mat_mul` and no longer `fmpz_mat_mul_fmpz_vec_ptr`.
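The idea, as I read it (a plain-Julia stand-in for illustration, not the Nemo internals): wrap the vector in a 1×n (or n×1) matrix header so the whole product reduces to a single matrix-matrix multiply, which can then use the fast `fmpz_mat_mul` path:

```julia
# Plain-Julia illustration of the approach: view the vector as a one-row
# (or one-column) matrix so that vector*matrix becomes matrix*matrix.
row_times_matrix(b::Vector{Int}, A::Matrix{Int}) = vec(reshape(b, 1, :) * A)
matrix_times_col(A::Matrix{Int}, b::Vector{Int}) = vec(A * reshape(b, :, 1))
```

With `A = [1 2; 3 4]` and `b = [10, 0]` these return `[10, 20]` and `[10, 30]` respectively; the latter is what the buggy session above should have printed instead of `[7690, 0]`.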

thofma (Member, Author) commented Nov 2, 2024

Timings for $b \cdot A$, old versus new, with $n \times n$ matrices and entries of size $2^B$. The bigger the number, the faster the new method is. (Rows correspond to $n$, columns to $B$.)

| n \ B | 5 | 10 | 25 | 50 | 100 | 200 | 500 | 1000 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 5 | 0.843564 | 0.851028 | 0.844431 | 2.0885 | 1.3113 | 0.845667 | 0.979706 | 0.97876 |
| 10 | 1.82766 | 1.7989 | 1.74988 | 4.35319 | 2.11531 | 1.16725 | 1.04208 | 1.01656 |
| 25 | 2.65245 | 2.67014 | 2.70464 | 8.41888 | 3.09119 | 1.35045 | 1.07404 | 1.05941 |
| 50 | 2.96443 | 3.02035 | 2.96937 | 9.97713 | 3.35594 | 1.33684 | 1.10265 | 1.0275 |
| 100 | 2.84638 | 2.90909 | 2.905 | 10.4487 | 3.4957 | 1.23185 | 1.10099 | 1.03541 |
| 200 | 2.17148 | 2.69597 | 2.63228 | 8.77398 | 3.05716 | 1.20955 | 1.10312 | 1.0478 |
| 500 | 1.15165 | 1.07107 | 1.06398 | 4.81586 | 2.81056 | 0.991455 | 1.34582 | 1.02989 |
| 1000 | 1.125 | 0.989656 | 0.993372 | 4.69268 | 3.41699 | 1.18048 | 1.19918 | 1.01886 |

Here is the same for new versus `fmpz_mat_mul` (larger means that `fmpz_mat_mul` is faster):

| n \ B | 5 | 10 | 25 | 50 | 100 | 200 | 500 | 1000 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 5 | 2.16462 | 2.32268 | 2.23238 | 2.17938 | 1.41913 | 1.19593 | 1.06215 | 1.02352 |
| 10 | 1.48754 | 1.47796 | 1.46382 | 1.39716 | 1.13425 | 1.06692 | 1.01278 | 1.0 |
| 25 | 1.14498 | 1.18499 | 1.13666 | 1.1293 | 1.0352 | 1.01304 | 1.00559 | 1.00327 |
| 50 | 1.04787 | 1.04418 | 1.00511 | 1.04592 | 1.00784 | 1.00937 | 0.995975 | 1.00301 |
| 100 | 1.04294 | 1.0394 | 1.01154 | 1.01961 | 0.972145 | 1.01542 | 1.01409 | 1.00043 |
| 200 | 0.927182 | 0.747128 | 1.01613 | 1.16729 | 1.00899 | 1.00981 | 0.993286 | 1.01096 |
| 500 | 0.985183 | 0.991277 | 1.01906 | 0.99012 | 1.03796 | 1.03933 | 1.0104 | 1.01454 |
| 1000 | 0.999137 | 1.00473 | 1.00524 | 0.995211 | 0.99897 | 1.01552 | 1.02201 | 0.996246 |

Here is the same for $A \cdot b$ (first old versus new, then new versus `fmpz_mat_mul`):

| n \ B | 5 | 10 | 25 | 50 | 100 | 200 | 500 | 1000 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 5 | 0.713331 | 0.709169 | 0.717162 | 1.78398 | 1.14858 | 0.832763 | 0.964895 | 0.970126 |
| 10 | 1.49522 | 1.48134 | 1.41823 | 3.90519 | 1.91063 | 1.11299 | 1.03962 | 0.991704 |
| 25 | 2.24352 | 2.4983 | 2.52025 | 7.38161 | 2.80567 | 1.23921 | 1.07674 | 1.03358 |
| 50 | 2.86203 | 2.75761 | 2.84369 | 9.62656 | 3.05153 | 1.16304 | 1.07525 | 1.02103 |
| 100 | 2.86225 | 2.82905 | 2.72735 | 10.25 | 2.98033 | 1.10712 | 1.07306 | 1.03141 |
| 200 | 2.14486 | 2.16051 | 1.8628 | 9.48113 | 3.21884 | 1.17413 | 1.08709 | 1.04623 |
| 500 | 1.04457 | 0.997934 | 1.00282 | 4.26627 | 1.66195 | 1.04183 | 0.950636 | 0.978479 |
| 1000 | 1.01089 | 0.928815 | 0.929939 | 4.42736 | 3.82359 | 1.30189 | 1.09177 | 0.983552 |
And the analogous comparison with `fmpz_mat_mul`:

| n \ B | 5 | 10 | 25 | 50 | 100 | 200 | 500 | 1000 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 5 | 2.00944 | 2.06203 | 2.01763 | 2.05396 | 1.39988 | 1.18995 | 1.06161 | 1.02158 |
| 10 | 1.43761 | 1.41539 | 1.40961 | 1.39092 | 1.1386 | 1.05608 | 1.01143 | 1.00837 |
| 25 | 1.14498 | 1.15764 | 1.14169 | 1.14774 | 1.0663 | 1.01935 | 1.01487 | 0.990403 |
| 50 | 1.03928 | 1.072 | 1.05319 | 1.03687 | 1.02635 | 1.00876 | 1.01412 | 1.00174 |
| 100 | 1.019 | 1.01891 | 1.05361 | 1.03803 | 1.01385 | 1.00785 | 1.00676 | 1.00027 |
| 200 | 1.15244 | 1.11367 | 1.30248 | 0.778307 | 1.00798 | 1.00869 | 1.00363 | 1.01093 |
| 500 | 0.986566 | 1.00716 | 1.00155 | 1.01458 | 1.00353 | 0.996473 | 1.08001 | 1.00105 |
| 1000 | 1.00048 | 1.00431 | 1.00267 | 1.00689 | 1.00857 | 1.17954 | 1.02424 | 1.00669 |

So we might have to do some tuning for small dimensions.
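One way to do that tuning (purely illustrative; the cutoff value and names below are made up, not from this PR) is a size-based dispatch that falls back to the old kernel for small dimensions, where the tables above show ratios below 1:

```julia
# Illustrative size-based kernel selection. _SMALL_DIM_CUTOFF is a made-up
# threshold that would need to be calibrated against benchmarks like the
# tables above (the old path wins roughly for n = 5 at small bit sizes).
const _SMALL_DIM_CUTOFF = 8

choose_kernel(n::Int) =
    n < _SMALL_DIM_CUTOFF ? :mul_fmpz_vec_ptr : :mul_via_fmpz_mat_mul
```

The multiplication routine would then branch once on `choose_kernel(nrows(A))` before doing any work.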

codecov bot commented Nov 2, 2024

Codecov Report

Attention: Patch coverage is 87.17949% with 5 lines in your changes missing coverage. Please review.

Project coverage is 87.47%. Comparing base (3286337) to head (727faf5).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/flint/fmpz_mat.jl | 87.17% | 5 Missing ⚠️ |

Additional details and impacted files:

```
@@            Coverage Diff             @@
##           master    #1937      +/-   ##
==========================================
- Coverage   87.51%   87.47%   -0.04%     
==========================================
  Files          97       97              
  Lines       35537    35574      +37     
==========================================
+ Hits        31101    31120      +19     
- Misses       4436     4454      +18     
```


```diff
@@ -1784,13 +1784,69 @@ end
addmul!(z::ZZMatrixOrPtr, a::ZZMatrixOrPtr, b::Integer) = addmul!(z, a, flintify(b))
addmul!(z::ZZMatrixOrPtr, a::IntegerUnionOrPtr, b::ZZMatrixOrPtr) = addmul!(z, b, a)

function _very_unsafe_convert(::Type{ZZMatrix}, a::Vector{ZZRingElem}, row = true)
```
Collaborator (review comment on `_very_unsafe_convert`):

Suggested change:

```diff
-function _very_unsafe_convert(::Type{ZZMatrix}, a::Vector{ZZRingElem}, row = true)
+function _very_unsafe_convert(::Type{ZZMatrix}, a::Vector{ZZRingElem}, ::Val{row} = Val(true)) where {row}
```

this should push performance ever so slightly further, as it eliminates the runtime branch on `if row`
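The mechanism behind that suggestion, in isolation (a sketch with hypothetical helper names, not the PR's code): lifting `row` from a runtime `Bool` into the type domain via `Val` lets the compiler specialize each variant and fold the branch away:

```julia
# With a runtime Bool argument the shape choice needs an `if row` branch;
# dispatching on Val{true}/Val{false} instead compiles each case separately.
dims_for(::Val{true},  n::Int) = (1, n)   # treat the vector as a 1×n row
dims_for(::Val{false}, n::Int) = (n, 1)   # treat the vector as an n×1 column

# Convenience wrapper; fastest when `row` is a compile-time constant at the
# call site, so the Val(row) construction itself is folded away.
make_dims(n::Int; row::Bool = true) = dims_for(Val(row), n)
```

The speedup only materializes when the caller passes a constant, since `Val(row)` on a runtime value reintroduces a dynamic dispatch.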

3 participants