Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make use of the full range of AUX matrices #33

Merged
merged 2 commits into from
Nov 1, 2024

Conversation

gdonval
Copy link
Contributor

@gdonval gdonval commented Oct 5, 2023

The gist of it is that we have this signature mr2d_malloc(Int n), bounding the internal malloc to Int, which is Scalapack's indexing type. This is not how malloc is defined and means that maximum AUX matrix size is artifically limited on 64-bit systems to 2GB.

This PR operates this transformation: mr2d_malloc(Int n) -> mr2d_malloc(size_t n), yet ensures Int -> size_t conversion does not involve Int-negative values and ensures that no 64-bit values are passed to malloc on 32-bit systems.

The main advantage of this on 64-bit system allows the use of the full signed 32-bit indexing range instead of range / element size. E.g. the max AUX matrix size is now 16GB instead of 2GB previously.

With this, "standard" 32-bit Scalapack and Blas/Lapack can still be used in programs like GPAW.

Details (and full commit history) are in
Reference-ScaLAPACK/scalapack#85 which does not seem to receive much attention.

This patch is a way to provide that feature to conda users in the meantime.

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@conda-forge-webservices
Copy link

conda-forge-webservices bot commented Oct 5, 2023

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

@gdonval
Copy link
Contributor Author

gdonval commented Oct 5, 2023

@conda-forge-admin, please rerender

@gdonval
Copy link
Contributor Author

gdonval commented Oct 5, 2023

Those osx failures are unrelated to my code (different file, different folder, error completely unrelated to anything I have done).

@gdonval
Copy link
Contributor Author

gdonval commented Oct 5, 2023

Alright, I took the easy approach to solving the osx problems: ignore those specific warnings.

  1. Those warnings are also shown with Linux and can only be solved with a new Scalapack release.
  2. Those warnings are ignored on Linux already (and one of them is already ignored in the existing recipe for osx-arm specifically).
  3. The builds are tested on Linux and Osx so a problem here should show up there.

In short, this is ready for release.

@Chronum94
Copy link

Any news/updates on this? Or is this specific feedstock abandoned?

@gdonval
Copy link
Contributor Author

gdonval commented Mar 13, 2024

Well. still waiting for review... and since I'm not added as a maintainer yet, nothing I can do.

@dalcinl
Copy link
Contributor

dalcinl commented Mar 13, 2024

@gdonval I'm not in a position to review this PR. But if you ask me explicitly to merge and you take accountability for the outcome, I'm very happy to click the button.

@dalcinl
Copy link
Contributor

dalcinl commented Mar 13, 2024

@conda-forge-admin rerender

@gdonval
Copy link
Contributor Author

gdonval commented Mar 13, 2024

@dalcinl: I'll recheck tomorrow to make sure all looks good and I'll ping you if so. Thanks a lot!

@traversaro
Copy link
Contributor

In short, this is ready for release.

In the future, please add a mention to the mantainer team (i.e. @conda-forge/scalapack) when you write that a PR is ready for review. Otherwise, it is quite easy for mantainers to miss a message.

recipe/build.sh Outdated
Comment on lines 5 to 9
# Workaround for implicit function declarations on osx
export CFLAGS="$CFLAGS -std=gnu99 -Wno-implicit-function-declaration"
export CXXFLAGS="$CXXFLAGS -std=gnu99 -Wno-implicit-function-declaration"
export FFLAGS="${FFLAGS} -fallow-argument-mismatch"
export OMPI_FCFLAGS=${FFLAGS}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why this is necessary?

recipe/build.sh Outdated
@@ -8,7 +17,7 @@ if [[ "$target_platform" == "osx-64" ]]; then
fi
fi

if [[ "$target_platform" == linux-* || "$target_platform" == "osx-arm64" ]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the workaround is not needed anymore in osx-arm64 due to an update in the compiler, at this point probably we can also remove the comment in the if referring to gfortran 9?

@Chronum94
Copy link

@gdonval did you get a chance to check things? Just a ping since this will improve a few downstream things.

@traversaro
Copy link
Contributor

@gdonval @Chronum94 are you still interested in this? @Chronum94 even if you were not the original author of the PR, if the original author does not reply you can open a new PR with your own improvements/addressing the reviews.

@minrk minrk mentioned this pull request Nov 1, 2024
The gist of it is that we have this signature `mr2d_malloc(Int n)`,
bounding the internal `malloc` to `Int`, which is Scalapack's indexing type.
This is not how `malloc` is defined and means that maximum AUX matrix
size is artifically limited on 64-bit systems to 2GB.

This PR operates this transformation: `mr2d_malloc(Int n) -> mr2d_malloc(size_t n)`,
yet ensures `Int -> size_t` conversion does not involve Int-negative values
and ensures that no 64-bit values are passed to `malloc` on 32-bit systems.

The main advantage of this on 64-bit system allows the use of
the **full** _signed_ 32-bit indexing range instead of `range / element size`.
E.g. the max AUX matrix size is now 16GB instead of 2GB previously.

With this, "standard" 32-bit Scalapack and Blas/Lapack can still be used
in programs like GPAW.

Details (and full commit history) are in
Reference-ScaLAPACK/scalapack#85
which does not seem to receive much attention.

This patch is a way to provide that feature to conda users in the meantime.
@minrk
Copy link
Member

minrk commented Nov 1, 2024

Rebased to just the patch commit, since this no longer needs any build.sh patches. Thanks @gdonval!

@minrk minrk merged commit eb393b2 into conda-forge:main Nov 1, 2024
13 checks passed
@gdonval
Copy link
Contributor Author

gdonval commented Nov 2, 2024

Sorry for the delay. Happy this got merged! 🎉

@gdonval gdonval deleted the fix-malloc branch November 2, 2024 05:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants