Conference call notes 20211027
Kenneth Hoste edited this page Oct 27, 2021 · 7 revisions
Notes on the 184th EasyBuild conference call, Wednesday Oct 27th 2021 (15:00 UTC)
Alphabetical list of attendees (10):
- Sebastian Achilles (Jülich Supercomputing Centre, Germany)
- Simon Branford (Univ. of Birmingham, UK)
- Alex Domingo (Vrije Universiteit Brussel, Belgium)
- Jorge Guerra (Universidad Politécnica de Madrid, Spain)
- Kenneth Hoste (HPC-UGent, Belgium)
- Adam Huffman (Big Data Institute, Oxford, UK)
- Kurt Lust (Univ. of Antwerp, Belgium + LUMI User Support Team)
- Alan O'Cais (Jülich Supercomputing Centre, Germany)
- Jurij Pečar (EMBL, Germany)
- Jörg Saßmannshausen (NIHR Biomedical Research Centre, UK)
- overview of recent developments
- update on progress towards EasyBuild v4.5.0 release
- promoting candidates for '2021b' common toolchains
- Q&A
- release timeline
- latest release: EasyBuild v4.4.2 (Sept 7th 2021)
- next release
- EasyBuild v4.5.0 (some significant enhancements already merged to `develop`) - ETA: end of this week (?)
- project with target issues/PRs for this release: https://github.com/orgs/easybuilders/projects/14
- TODO:
- merge experimental support for installing R extensions in parallel
- deprecate old toolchains (< 2019?)
- promote `foss/2021.07` to `foss/2021b` + `intel/2021.09` to `intel/2021b`
- add documentation for:
- experimental support for installing (R) extensions in parallel
- integration with Rich (progress bars)
- recent changes
- framework
- bug fixes
- refactor EasyBlock to decouple collecting of information on extension source/patch files from downloading them (PR #3860)
- fixes downloading of sources for extensions with `--module-only` (issue #3849)
- deprecated `fetch_extension_sources` (replaced with `collect_exts_file_info`)
- fix library paths to add to $LDFLAGS for intel-compilers toolchain component (PR #3866)
- remove `--depth 1` from git clone when `commit` is specified in `git_config` (PR #3871 + PR #3872)
- fixes regression introduced in EasyBuild v4.4.2 for easyconfigs that use `git_config` with a specific `commit` for downloading sources (issue #3870)
- enhancements
- use separate progress bars for different aspects of the installations being performed (PR #3844, PR #3864, PR #3867)
- filter out duplicate paths added to module files (PR #3770 + PR #3874)
- warning is printed when duplicate paths for module file are suppressed: "Suppressed adding the following path(s) to $%s of the module as they were already added"
- add `check_async_cmd` function to facilitate checking on asynchronously running commands (PR #3865)
- add support for `--insecure-download` configuration option (PR #3859)
- make `intelfftw` toolchain component aware of `imkl-FFTW` module (PR #3859)
- for `intel/2021.09` (soon to be promoted to `intel/2021b`), imkl FFTW wrappers were separated from `imkl` installation; see easyconfigs PR #14195
- changes
- ...
- easyblocks
- bug fixes
- restore RPATH wrappers for OpenMPI sanity check (PR #2582)
- avoid that path to CUDA install directory is added to $PATH (PR #2593)
- make version regex in OpenSSL wrapper easyblock less strict (version string does not always end with a letter) (PR #2597)
- redefine `collect_exts_file_info` instead of now deprecated `fetch_extension_sources` in OCaml easyblock (PR #2603)
- required because of changes in https://github.com/easybuilders/easybuild-framework/pull/3860
- fix support for recent imkl version in numexpr easyblock (PR #2606)
- enhancements
- new easyblocks
- `imkl-FFTW` (PR #)
- changes
- (none)
- easyconfigs
- over 75 easyconfig PRs merged since last conf call
- bug fixes
- fix source URL for SCOTCH 6.1.0 (PR #14099)
- add patch for OpenBLAS 0.3.17 + 0.3.18 to fix segfault triggered by scipy tests (PR #14178)
- fix AmberTools v20 easyconfig using intel/2020a toolchain (PR #14028)
- add patch to fix PMIx detection in OpenMPI v4.0.3, v4.0.5, v4.1.0 (PR #14177)
- fix `spatstat.*` downloads for Seurat v4.0.1 (PR #14199)
- enhancements
- new software
- ...
- noteworthy software updates
- changes
- updates for `foss/2021.07` (to be promoted to `foss/2021b`):
- updates for `intel/2021.09`
- use `imkl` with `system` toolchain + `imkl-FFTW` (PR #14195)
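
As an illustration of the duplicate-path filtering listed under the framework enhancements above, here is a minimal Python sketch. It is not the code from PR #3770 or PR #3874: the function name `filter_duplicate_paths` and its arguments are hypothetical, and only the first part of the warning text is taken from these notes (appending the list of suppressed paths is an assumption).

```python
import warnings


def filter_duplicate_paths(env_var, paths, already_added):
    """Return the paths that still need to be added to $env_var, in order.

    Hypothetical helper for illustration only, not EasyBuild framework code.
    'already_added' is the list of paths added earlier; it is updated in place.
    """
    kept, suppressed = [], []
    for path in paths:
        if path in already_added or path in kept:
            suppressed.append(path)
        else:
            kept.append(path)

    if suppressed:
        # warning text from the notes; the trailing list of paths is an assumption
        msg = "Suppressed adding the following path(s) to $%s of the module as they were already added: %s"
        warnings.warn(msg % (env_var, ', '.join(suppressed)))

    already_added.extend(kept)
    return kept
```

Calling it twice with the same path for the same environment variable returns the path only the first time; the second call returns an empty list and emits the warning.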
- to merge/fix/tackle soon
- framework
- reported bugs / bug fixes
- failing GitHub tests in CI with recent `git` version (issue #3873)
- FlexiBLAS module and toolchain use $EBROOTFLEXIBLAS/include and not $EBROOTFLEXIBLAS/include/flexiblas (issue #3868)
- enhancements
- add initial/experimental support for installing extensions in parallel (PR #3667)
- opt-in via `--parallel-extensions-install --experimental`
- two requirements for extensions:
- must be able to determine list of required dependencies (via `required_deps` property method in easyblock)
- must be able to start installation asynchronously via `run_async` + check for completion via `async_cmd_check`
- works well with `R` easyconfigs: installation is several hours faster with enough cores available
- known limitations:
- doesn't work yet for `R-bundle-Bioconductor`, because all required dependencies must be listed in `exts_list`
- skipping of installed extensions and sanity check for extensions is still done sequentially
- detect download failure due to outdated certificate and print helpful warning? (issue #3863)
- changes
- print the hook messages only for debug-mode (PR #3843)
- easyblocks
- reported bugs / bug fixes
- ...
- enhancements
- add support for installing R extensions in parallel (PR #2408)
- detect problem with compiling CPU detection code in configure output in GROMACS easyblock (PR #2609)
- add custom easyconfig parameter `blas_libs` in FlexiBLAS easyblock to specify backends (PR #2605)
- `backends` would be a better name?
- changes
- don't use `--config=mkl` for TensorFlow 2.4+ (PR #2583)
- cfr. reported performance problems for CPU-only TensorFlow installations (issue #2577), which can be worked around via `export OMP_NUM_THREADS=1`
- also fixes other problem with TensorFlow `tf.matmul` that ends up using CPU backend for 32bit floats (issue #14120)
- TensorFlow 2.5.0 and 2.6.0 easyconfigs were tweaked to skip broken `mkl_fused_batch_norm_op_test` test (easyconfigs PR #14151 + PR #14153); see also https://github.com/tensorflow/tensorflow/issues/52151
- new software
- (nothing major?)
- easyconfigs
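
The two requirements for installing extensions in parallel (a known list of required dependencies, plus a way to start an installation asynchronously and check for completion) can be sketched as a small scheduling loop. This is a rough illustration under stated assumptions, not the implementation in PR #3667: the `Ext` class and `install_in_parallel` function are hypothetical, and only the names `required_deps`, `run_async` and `async_cmd_check` are borrowed from the notes above.

```python
import subprocess
import time


class Ext:
    """Hypothetical stand-in for an extension; not EasyBuild's Extension class."""

    def __init__(self, name, required_deps, cmd):
        self.name = name
        self.required_deps = required_deps  # names of extensions that must be installed first
        self.cmd = cmd                      # installation command, as a list of arguments
        self.proc = None

    def run_async(self):
        # start the installation command without waiting for it to finish
        self.proc = subprocess.Popen(self.cmd)

    def async_cmd_check(self):
        # return True once the command has completed, raise an error if it failed
        exit_code = self.proc.poll()
        if exit_code is None:
            return False
        if exit_code != 0:
            raise RuntimeError("installation of %s failed" % self.name)
        return True


def install_in_parallel(exts, max_parallel=4, poll_interval=0.05):
    """Start each extension only after all of its required dependencies are installed."""
    todo = list(exts)
    running, installed = [], []
    while todo or running:
        # collect installations that have completed
        for ext in running[:]:
            if ext.async_cmd_check():
                running.remove(ext)
                installed.append(ext.name)
        # start extensions for which all required dependencies are installed
        for ext in todo[:]:
            if len(running) >= max_parallel:
                break
            if all(dep in installed for dep in ext.required_deps):
                todo.remove(ext)
                ext.run_async()
                running.append(ext)
        # nothing is running and nothing could be started: dependencies can't be resolved
        if todo and not running:
            raise RuntimeError("unresolvable dependencies for: %s" % [e.name for e in todo])
        time.sleep(poll_interval)
    return installed
```

With extensions `a` (no dependencies), `b` (requires `a`) and `c` (requires `a` and `b`), the function returns the names in the order `a`, `b`, `c` regardless of the order in which they were passed in. The sketch also shows why the `R-bundle-Bioconductor` limitation matters: a dependency that is missing from the list passed in can never be marked as installed, so the loop ends with an error.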
- ready to promote `foss/2021.07` to `foss/2021b` and `intel/2021.09` to `intel/2021b`?
- first merge FFTW 3.3.10 + update in `foss/2021.07`
- Kurt: status of easystack support?
- some issues with using multiple versionsuffix for a specific software version?
- Kurt: how compatible is a custom repository with easyconfigs/easyblocks using GPLv3 with the central repositories using GPLv2?
- some sites are using a GPLv3 license in their repository, does that lead to a compatibility problem?
- see https://www.ifross.org/en/what-difference-between-gplv2-and-gplv3
- Kurt: some interest from PDC in Sweden to work together with LUMI on using EasyBuild common toolchains on Cray Slingshot systems
- Alex: NVIDIA only provides NCCL builds with specific CUDA versions, while we make different combinations, is that a problem?
- recent NCCL versions are built from source in EasyBuild, so should be OK?
- NVIDIA only provides recent NCCL builds for CUDA 11.0 and 11.4
- probably because CUDA 11.0 is considered a long-term release?
- similar issue with cuDNN compatibility matrix
- can we make the NCCL easyblock run tests?
- Alan: impact of splitting up `imkl` and `imkl-FFTW`
- opens the door to pull `SciPy-bundle` down to compiler-only level (if we split out `mpi4py`)
- Sebastian: is ROCm being built with EasyBuild at LUMI?
- Kurt: not currently, ROCm is provided as a part of the Cray software stack, but there's interest in changing this (cfr. collaboration with PDC)
- Sebastian is asking in the context of a meeting with AMD (Michael Klemm)
- it seems they're willing to help out if needed
- see Jürgen's work on ROCm at https://github.com/easybuilders/easybuild-easyconfigs/pull/14156
- Sebastian is planning to reach out to Jürgen regarding potential collaboration with AMD
- some hardcoded stuff (like assuming `/opt/rocm` or having everything installed in a single place) is slowly being removed from ROCm (partially due to pressure from Spack)
- would be nice to figure out first what we would like to get out of it + which questions we have
- Kurt: multiple different compilers from AMD, unclear which is best option for what, how compatible they are, etc.
- should we ask Michael Klemm if he's up for doing an EasyBuild Tech Talk on this?
- ROCm software stack, HIP & related tools, different AMD compilers, ...
- Jörg: is Intel oneAPI free for use in academia?
- as far as we know: yes (no license is needed)
- you only need to pay to get support
- this was changed compared to previous versions
- cfr. https://www.intel.com/content/www/us/en/developer/articles/news/free-intel-software-developer-tools.html