Compiling Chroma
The whole suite of USQCD software consists of a handful of building modules. Before compiling, it is important to choose which modules to use. Not every module can be compiled on every architecture, and not every module works well with every other module.
All modules use the GNU Autotools for configuration and compilation. In principle, all modules should be compilable with ./configure, make, and make install. However, there are many configure flags to set correctly. Also, the version of the GNU Autotools installed on the computer can differ from the one used to generate the files in the module. In that case, one needs to update the files that are shipped with the module. It has proven helpful to run the following commands (Bash shell) before running configure:
if [[ -f .gitmodules ]]; then
    git submodule foreach autoreconf -f
fi
autoreconf -f
While dynamic linkage is preferred for system wide installations of software, a supercomputer benefits from static linkage. This eases the loading of the program into every node. Even with options to force the static linkage, some dynamic linkage might slip through. Therefore the shared object files can be deleted after each installation step like this:
pushd $prefix/lib
rm -f *.so *.so.*
popd
Modules like Chroma and QDP++ use a lot of git submodules. It is very important to clone recursively, with git clone --recursive. Some of the repositories use SSH remote URLs for GitHub. If you do not have an SSH key on the JURECA frontend that is registered with GitHub, the clone will be denied, even though the repository is public, with a message like the following:
Cloning into 'other_libs/filedb'...
Warning: Permanently added the RSA host key for IP address '192.30.253.113' to the list of known hosts.
Permission denied (publickey).
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
Clone of 'git@github.com:usqcd-software/filedb.git' into submodule path 'other_libs/filedb' failed
There are a couple of options:
- Copy the SSH key that you have registered with GitHub to the frontend. This way, git can log into GitHub via SSH. This is not recommended because anybody who gains access to that key can access all your GitHub repositories.
- Create a new SSH key pair on the frontend and register it with GitHub. This has the same disadvantage as above, but at least the key can be removed from GitHub later on.
- Change all the remote URLs from SSH to HTTPS. This needs to be done on every level of the submodule tree and is a bit cumbersome.
- The best way is to create an SSH key pair on the frontend and register it as a deploy key for an arbitrary repository on your GitHub account. Currently, this is done by navigating to the repository's Settings and then to Deploy Keys. A deploy key only grants pull permissions by default, which adds nothing for a public repository. You can then clone repositories over SSH because the key that is used is registered somewhere in GitHub.
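The third option, rewriting SSH remotes to HTTPS, can be sketched roughly as follows. The helper names ssh_to_https and rewrite_gitmodules are hypothetical and not part of any USQCD tooling, and the sketch assumes that all SSH remotes have the form git@github.com:owner/repo.git, like the one in the error message above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: convert a GitHub SSH remote URL to its HTTPS form.
# URLs that do not start with "git@github.com:" are printed unchanged.
ssh_to_https() {
    local url=$1
    local ssh_prefix='git@github.com:'
    if [[ $url == "$ssh_prefix"* ]]; then
        printf 'https://github.com/%s\n' "${url#"$ssh_prefix"}"
    else
        printf '%s\n' "$url"
    fi
}

# Hypothetical helper: rewrite every GitHub SSH URL in one .gitmodules file.
# A backup copy with the suffix .bak is left next to the file.
rewrite_gitmodules() {
    local file=$1
    sed -i.bak 's#git@github\.com:#https://github.com/#g' "$file"
}

# Example with the URL from the error message above.
ssh_to_https 'git@github.com:usqcd-software/filedb.git'
```

After rewriting a .gitmodules file, git submodule sync followed by git submodule update --init makes git use the new URLs at that level; the same has to be repeated inside each submodule that has submodules of its own. As an alternative that does not touch any files, git's url.<base>.insteadOf configuration setting can apply the same rewrite.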
The JURECA system consists of E5-2680 v3 Haswell CPUs. They support the AVX2 instruction set. For an Intel Xeon, the selection that has served well is libxml2, QMP, QDP++, QPhiX, and Chroma. The Remez algorithm for the rational approximation also needs the GMP (GNU Multi Precision) library.
In this section, all the steps needed to get Chroma running on JURECA will be presented. The whole script can be downloaded and executed; it should bootstrap the whole software stack automatically.
The recommended compiler for JURECA without GPUs is the Intel C++ compiler. It can take best advantage of the features of the Xeon Phi. QPhiX also uses some non-standard Intel extensions which can only be used with the Intel compiler. It is recommended to specify the compiler version to load explicitly. The default version might change at any time, and the compilation might then stop working. If a newer version of the compiler has been installed, it is worth checking whether it gives better performance than the old one.
MPI versions of the compiler are called mpiicc and mpiicpc for the C and C++ compilers, respectively. The full paths need to be passed to configure using the CC and CXX variables.
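One way to obtain the full paths is to look the wrappers up in $PATH after the compiler module has been loaded. The following is a minimal sketch; the helper name resolve_tool and the install prefix $HOME/local are placeholders chosen for illustration, and the configure line is only printed, not executed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the absolute path of an executable found in
# $PATH, or fail with a message on stderr if it is not available.
resolve_tool() {
    local tool_path
    if ! tool_path=$(type -P "$1"); then
        echo "error: $1 not found in PATH" >&2
        return 1
    fi
    printf '%s\n' "$tool_path"
}

# Only assemble the configure command line if both MPI wrappers are present.
if cc=$(resolve_tool mpiicc) && cxx=$(resolve_tool mpiicpc); then
    # "$HOME/local" is a placeholder install prefix.
    echo ./configure CC="$cc" CXX="$cxx" --prefix="$HOME/local"
fi
```

Failing early when a wrapper is missing avoids a configure run that silently falls back to the system compiler.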
During the compilation of Chroma, one might encounter a couple of errors. Some typical errors are listed here, each with a way to work around it.
Problem:
fatal: unable to connect to git.gnome.org:
Solution:
There seems to be some firewall rule on JURECA that prevents this. Just clone from JUDAC.
Problem:
configure: error: Cannot compile/link a program with libxml2.
Solution:
The binary xml2-config of the local installation must be found first in the $PATH. If the system wide installation of libxml2 is found first, it will be used. That version has been compiled with the standard system compiler (usually an older GCC), which causes this error.
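A quick way to make sure the local xml2-config wins is to put the bin directory of the local installation at the front of $PATH and then check which binary is found. This is a sketch only; the helper name prepend_path is hypothetical, and $HOME/local stands in for whatever prefix the local libxml2 was installed to.

```bash
#!/usr/bin/env bash
# Hypothetical helper: put a directory at the front of PATH unless it is
# already the first entry.
prepend_path() {
    local dir=$1
    case ":$PATH:" in
        ":$dir:"*) ;;                 # already first, nothing to do
        *) PATH="$dir:$PATH" ;;
    esac
    export PATH
}

# "$HOME/local" is a placeholder for the prefix of the local libxml2.
prefix=${prefix:-$HOME/local}
prepend_path "$prefix/bin"

# Show which xml2-config configure would pick up, if any.
type -P xml2-config || echo "xml2-config not found in PATH" >&2
```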
Problem:
./configure: line 13042: syntax error near unexpected token `Z,zlib,'
./configure: line 13042: ` PKG_CHECK_MODULES(Z,zlib,'
Solution:
The PKG_CHECK_MODULES is a GNU M4 macro that has not been properly resolved during the generation of the configure script. The needed macros are defined in /usr/share/aclocal/pkg.m4. In the source of libxml2, one needs to create a directory called m4 and symlink that file into it. Another run of autoreconf -f will pick up the changes and create a configure file without unresolved macros.
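The workaround can be wrapped in a small function, sketched below. The name link_m4_macro is hypothetical; the function only creates the m4 directory and the symlink, and the autoreconf run is left to the caller.

```bash
#!/usr/bin/env bash
# Hypothetical helper: link an M4 macro file into the m4/ directory of a
# source tree so that a later autoreconf run can find the macro definitions.
# The macro file should be given as an absolute path so the symlink resolves.
link_m4_macro() {
    local source_dir=$1 macro_file=$2
    if [[ ! -f $macro_file ]]; then
        echo "error: $macro_file does not exist" >&2
        return 1
    fi
    mkdir -p "$source_dir/m4"
    ln -sf "$macro_file" "$source_dir/m4/"
}

# Assumed usage inside the libxml2 source directory:
#   link_m4_macro . /usr/share/aclocal/pkg.m4 && autoreconf -f
```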
Problem:
error: cannot open source file "qio_config_internal.h"
Solution:
The source checkout of QIO that is a submodule of QDP++ might ship with a Makefile that is not up to date. Delete other_libs/qio/Makefile and run ./autoreconf.sh inside other_libs/qio.
Problem:
configure.ac:6: error: version mismatch. This is Automake 1.15,
Solution:
The git repositories sometimes contain GNU Autotools files generated by a version different from the one installed on the frontend. In that case, one needs to run aclocal in each submodule to update those files.
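To see in advance whether a module is affected, the version recorded in its aclocal.m4 can be compared with the installed one. The sketch below assumes that aclocal.m4 starts with the usual header line "# generated automatically by aclocal <version>" and that the first line of aclocal --version ends with the version number; both helper names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the aclocal version recorded in the header of a
# shipped aclocal.m4. Assumes the header line
# "# generated automatically by aclocal <version> ...".
shipped_aclocal_version() {
    sed -n 's/^# generated automatically by aclocal \([0-9][0-9.]*\).*/\1/p' "$1" | head -n 1
}

# Hypothetical helper: print the version of the aclocal found in PATH,
# taken from the last field of the first line of "aclocal --version".
installed_aclocal_version() {
    aclocal --version | awk 'NR == 1 { print $NF }'
}

# Compare both versions for the module in the current directory, if possible.
if [[ -f aclocal.m4 ]] && type -P aclocal >/dev/null; then
    shipped=$(shipped_aclocal_version aclocal.m4)
    installed=$(installed_aclocal_version)
    if [[ $shipped != "$installed" ]]; then
        echo "aclocal.m4 was generated by $shipped, installed is $installed"
    fi
fi
```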
Problem:
CDPATH="${ZSH_VERSION+.}:" && cd /homec/hbn28/hbn28e/Sources/qphix && aclocal-1.13
/bin/sh: aclocal-1.13: command not found
Solution:
This is also a version mismatch. Run the Autotools reset commands (git submodule foreach autoreconf -f and autoreconf -f) given at the beginning of this page.
Problem:
QMP_mem.c(345): error: a value of type "void *" cannot be assigned to an entity of type "char *"
mh->base = mm->mem;
^
Solution:
In C, this implicit conversion from void * is allowed; in C++ it is not. This can happen if the C and C++ compilers are mixed up and C code is compiled with a C++ compiler.
Problem:
error: 'asm' undeclared (first use in this function)
Solution:
The asm keyword cannot be used with GCC in strict ISO C mode. Use --std=gnu99.
Problem:
Error: unrecognized opcode: `qvlfdx'
Solution:
This and similar instructions are QPX intrinsics that only work on BG/Q. GCC does not seem to support them on JUQUEEN; use LLVM instead.
Problem:
error: inlining failed in call to always_inline '__m256d _mm256_set1_pd(double)': target specific option mismatch
Solution:
GCC needs the option -mavx2 for this to compile.
Problem:
../include/qphix/dslash_body.h:584:17: error: expected ']' before ':' token
Solution:
CEAN is an Intel C++ extension. Drop --enable-cean.