Math libraries on AMD Rome (20201130)
Kenneth Hoste edited this page Dec 5, 2020 · 2 revisions
- talk by Sebastian Achilles (JSC): performance of math libraries on AMD Rome (Zen2)
- significant performance benefits for BLIS compared to Intel MKL (2020.2) and OpenBLAS
- BLIS: version 2.2 (AMD fork), but same performance as stock BLIS
- AMD-specific kernels have been backported to upstream
- full node tests (128 cores @ JUSUF system at JSC)
- large performance gaps for several BLAS functions (dgemm, zgemm, etc.)
- single-threaded performance difference is smaller, but still in favor of BLIS
- switch from OpenBLAS to BLIS in foss toolchain?
- needs testing on Intel systems as well, compare BLIS with OpenBLAS
- Sebastian will share his benchmark scripts so others can test as well
- also compared FFTW 3.3.8 vs patched FFTW 3.3.8 by AMD
- both significantly faster than Intel MKL 2020.2 on AMD Rome
- patched FFTW shows even better performance
- should we use AMD-patched FFTW in `foss` toolchain?
- AMD patches introduce `--enable-amd` configuration option, so may be safe to also apply on Intel systems
- makes providing optimized FFTW for AMD easier in `foss` toolchains
- if needed we can pick which FFTW installation to use on AMD systems at installation time
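Picking the FFTW configuration per CPU vendor at installation time, as suggested above, could look roughly like the sketch below. It is only an illustration: the helper names `fftw_configure_opts` and `detect_cpu_vendor` are hypothetical, the `--enable-amd` spelling is taken from these notes and should be checked against the AMD-patched sources, and vendor detection assumes a Linux `/proc/cpuinfo`. The remaining flags are standard FFTW configure options.

```shell
# Hypothetical helper: print FFTW configure options for a given CPU vendor string.
fftw_configure_opts() {
    vendor="$1"
    # Standard FFTW configure options, used for every vendor
    opts="--enable-shared --enable-openmp --enable-sse2 --enable-avx --enable-avx2"
    case "$vendor" in
        AuthenticAMD)
            # Option name as written in the notes (assumption: verify in the patched sources)
            opts="$opts --enable-amd"
            ;;
    esac
    echo "$opts"
}

# Hypothetical helper: read the CPU vendor string on Linux (prints nothing elsewhere).
detect_cpu_vendor() {
    awk -F': *' '/^vendor_id/ { print $2; exit }' /proc/cpuinfo 2>/dev/null
}

# Possible usage from an unpacked FFTW source directory:
#   ./configure $(fftw_configure_opts "$(detect_cpu_vendor)")
```

On a non-AMD vendor string the helper simply omits the AMD option, so the same build recipe could be shared between AMD and Intel systems.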
- notes:
- Intel MKL 2020.2 has some Zen2-specific kernels, but still falls back to Intel Pentium 4 code paths
- some performance improvements in Intel MKL 2020.4, but BLIS is still significantly better
- Intel MKL can be "convinced" to use AVX2-optimized code paths
- easy to do in imkl 2020.0, just use `export MKL_DEBUG_CPU_TYPE=5`
- harder in more recent imkl 2020 versions, requires patching of binaries/libraries (see https://danieldk.eu/Posts/2020-08-31-MKL-Zen.html)
- and that's against the EULA, and you don't really know what you're getting (or if it works correctly)
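For imkl 2020.0, the variable can also be set for a single command instead of being exported for the whole session, which makes it easier to compare runs with and without it. A minimal sketch, where `with_mkl_avx2` is a hypothetical wrapper name and `./my_benchmark` stands in for any MKL-linked program; per the notes above, this is not expected to have the same effect with more recent imkl 2020 versions:

```shell
# Hypothetical wrapper: run one command with MKL_DEBUG_CPU_TYPE=5 in its environment only.
with_mkl_avx2() {
    env MKL_DEBUG_CPU_TYPE=5 "$@"
}

# Possible usage (./my_benchmark is a placeholder for an MKL-linked binary):
#   with_mkl_avx2 ./my_benchmark   # variable set for this run only
#   ./my_benchmark                 # default dispatch, for comparison
```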