V2.1.01 (#249)
* incorrect max diffusion with resistivity (#244)

Fix a bug that could result in too restrictive timesteps when resistivity is enabled

fix #242

* fix documentation for reflective boundary conditions (#246)

fix #228

* Per proc normalisation (#247)

- show performance per sub-domain during integration
- add performance measures in documentation
- update link to method paper
- update acknowledgements

* Documentation fixes (#248)

* directly ask kokkos for its execution space

* remove replace source files, as this doesn't work with header files
(.hpp)

* add proper readme

* clean up hdf5 mess in readme (is already in the full doc)

* add Async malloc option to JZ configuration
glesur authored Jun 19, 2024
1 parent 158f2aa commit 767fbcd
Showing 12 changed files with 131 additions and 49 deletions.
13 changes: 13 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,19 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [2.1.01] 2024-06-20
### Changed
- Fix a bug that could result in too restrictive timesteps when resistivity is enabled (#244)
- Fix documentation for reflective boundary conditions (#246)
- Changed performance metric: the performance is now measured per MPI process (and not globally) (#249)
- Remove documentation for replace_idefix_source, as this can't work for .hpp file (#248)

### Added
- Kokkos execution space configuration is now shown on startup (#248)
- Add CUDA_MALLOC_ASYNC flags in Jean Zay documentation to deal with MPI issues when using Kokkos 4.3 (#248)
- Add a description and link to documentation in readme (#248)
- Add indicative expected performances in documentation (#249)

## [2.1.0] 2024-05-10
### Changed
- VTK slices are automatically produced along with standard VTK when an emergency abort is triggered.
2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -4,7 +4,7 @@ set (CMAKE_CXX_STANDARD 17)

set(Idefix_VERSION_MAJOR 2)
set(Idefix_VERSION_MINOR 1)
-set(Idefix_VERSION_PATCH 00)
+set(Idefix_VERSION_PATCH 01)

project (idefix VERSION 2.1.00)
option(Idefix_MHD "enable MHD" OFF)
33 changes: 30 additions & 3 deletions README.md
@@ -3,6 +3,8 @@

<!-- toc -->

- [What is Idefix?](#what-is-idefix)
- [Documentation](#documentation)
- [Download:](#download)
- [Installation:](#installation)
- [Compile an example:](#compile-an-example)
@@ -17,6 +19,33 @@

<!-- tocstop -->

What is Idefix?
---------------
Idefix is a computational fluid dynamics code based on a high-order finite-volume Godunov method, originally designed for astrophysical fluid dynamics applications. Idefix is designed to be performance-portable, and uses the [Kokkos](https://github.com/kokkos/kokkos) framework to achieve this goal. This means that it can run both on your laptop's CPU and on the largest GPU exascale clusters. More technically, Idefix can run in serial, use OpenMP and/or MPI (Message Passing Interface) for parallelization, and use GPU acceleration when available (based on Nvidia CUDA, AMD HIP, etc.). All these capabilities are embedded within a single code, so the code relies on relatively abstract C++17 classes and objects, which are not necessarily familiar to astrophysicists. A large effort has been devoted to simplifying this level of abstraction so that the code can be modified by researchers and students who are familiar with C and aware of basic object-oriented concepts.


Idefix currently supports the following physics:

* Compressible hydrodynamics in 1D, 2D, 3D
* Compressible magnetohydrodynamics using constrained transport in 1D, 2D, 3D
* Multiple geometry (cartesian, polar, spherical)
* Variable mesh spacing
* Multiple parallelisation strategies (OpenMP, MPI, GPU offloading, etc...)
* Full non-ideal MHD (Ohmic, ambipolar, Hall)
* Viscosity and thermal diffusion
* Super-timestepping for all parabolic terms
* Orbital advection (Fargo-like)
* Self-gravity
* Multi dust species modelled as pressureless fluids
* Multi-planet interaction

Documentation
-------------

Full online documentation is available on [Read the Docs](https://idefix.readthedocs.io/latest/).


Download:
---------

@@ -56,10 +85,8 @@ Configure the code launching cmake (version >= 3.16) in the example directory:
cmake $IDEFIX_DIR
```

-Several options can be enabled from the command line (a complete list is available with `cmake $IDEFIX_DIR -LH`). For instance: `-DIdefix_RECONSTRUCTION=Parabolic` (enable PPM reconstruction), `-DIdefix_MPI=ON` (enable mpi), `-DKokkos_ENABLE_OPENMP=ON` (enable openmp parallelisation), etc... For more complex target architectures, it is recommended to use cmake GUI launching `ccmake $IDEFIX_DIR` in place of `cmake` and then switching on the required options.
+Several options can be enabled from the command line (a complete list is available with `cmake $IDEFIX_DIR -LH`). For instance: `-DIdefix_RECONSTRUCTION=Parabolic` (enable PPM reconstruction), `-DIdefix_MPI=ON` (enable mpi), `-DKokkos_ENABLE_OPENMP=ON` (enable openmp parallelisation), etc... For more complex target architectures, it is recommended to use cmake GUI launching `ccmake $IDEFIX_DIR` in place of `cmake` and then switching on the required options. See the [online documentation](https://idefix.readthedocs.io/latest/) for details.

-Optional xdmf(hdf5+xmf) file dumping feature has been added to `Idefix`. This uses either serial or parallel implementation of `hdf5` library which needs to be made available. These xdmf file pairs can be easily visualized in `ParaView` or `VisIt` by loading the `xmf` files. The `hdf5` files can also be loaded easily in `python` (using `h5py`) for post-processing and post-run analysis. One can turn on `xdmf` data dumps by using `-DIdefix_HDF5=ON`. The `[Output]` block of `.ini` file is checked during runtime for a `xdmf` entry whih controls the frequency of xdmf file dumps during code execution.
-<!-- TODO: HDF5 Chunking and Compression filters -->

One can then compile the code:

2 changes: 1 addition & 1 deletion doc/source/conf.py
@@ -23,7 +23,7 @@
author = 'Geoffroy Lesur'

# The full version, including alpha/beta/rc tags
-release = '2.1.00'
+release = '2.1.01'



14 changes: 11 additions & 3 deletions doc/source/index.rst
@@ -55,7 +55,7 @@ Terms and condition of Use
===========================
*Idefix* is distributed freely under the `CeCILL license <https://en.wikipedia.org/wiki/CeCILL>`_, a free software license adapted to both international and French legal matters, in the spirit of and retaining
compatibility with the GNU General Public License (GPL). We expect *Idefix* to be referenced and acknowledeged by authors in their publications. At the minimum, the authors
-should cite the *Idefix* `method paper <https://ui.adsabs.harvard.edu/abs/2023arXiv230413746L/abstract>`_.
+should cite the *Idefix* `method paper <https://ui.adsabs.harvard.edu/abs/2023A%26A...677A...9L/abstract>`_.

*Idefix* data structure and algorithm are derived from Andrea Mignone's `PLUTO code <http://plutocode.ph.unito.it/>`_, released under the GPL license.
*Idefix* also relies on the `Kokkos <https://github.com/kokkos/kokkos>`_ performance portability programming ecosystem released under the terms
@@ -74,6 +74,9 @@ Soufiane Baghdadi
Gaylor Wafflard-Fernandez
planet-disc interaction

Jonah Mauxion
self-gravity module

Clément Robert
gitlab integration, linter

@@ -96,8 +99,12 @@ This documentation has automatically been generated on |today| from the followin
Acknowledgements
===================

-The developement of *Idefix* is supported by the European Research Council (ERC)
-under the European Union Horizon 2020 research and innovation programme (Grant agreement No. 815559 (MHDiscs))
+The development of *Idefix* was supported by the European Research Council (ERC)
+under the European Union Horizon 2020 research and innovation programme (Grant agreement No. 815559 (MHDiscs)).
+The Idefix development team is partly funded by the `PEPR Origins <https://pepr-origins.fr>`_ through the project "MHD@Exascale".
+The Idefix collaboration benefited from funding from the “Programme National de Physique Stellaire” (PNPS),
+“Programme National Soleil-Terre” (PNST), “Programme National de Hautes Energies” (PNHE) and
+“Programme National de Planétologie” (PNP) of CNRS/INSU co-funded by CEA and CNES.


.. toctree::
@@ -108,6 +115,7 @@ under the European Union Horizon 2020 research and innovation programme (Grant a
reference
modules
programmingguide
performances
kokkos
contributing
faq
48 changes: 48 additions & 0 deletions doc/source/performances.rst
@@ -0,0 +1,48 @@
======================
Performances
======================

We report below the performance obtained with Idefix on various architectures. The reference test
is the 3D MHD Orszag-Tang test problem with 2nd order reconstruction and uct_contact EMFs bundled in
the Idefix test suite, computed with a 128\ :sup:`3` resolution per MPI sub-domain on GPUs or 32\ :sup:`3`
per MPI sub-domain on CPUs. All of the performance measurements were obtained with MPI enabled on
*one full node*, but we report here the performance *per GPU*
(i.e. with 2 GCDs on AMD Mi250) or *per core* (on CPU), i.e. the node performance divided by the number of GPUs/cores,
to simplify the comparison with other clusters.

The complete scalability tests are available in the Idefix `method paper <https://ui.adsabs.harvard.edu/abs/2023A%26A...677A...9L/abstract>`_.
The performance figures mentioned below are updated for each major revision of Idefix, so they might differ slightly from the method paper.

.. note::

    You might expect
    slower performance at lower resolution when using GPUs. The overall performance also depends on
    the physical modules activated, the reconstruction scheme, and the efficiency of the parallel network
    on which you are running. The performance reported below is therefore purely indicative. We encourage
    you to use the embedded profiler (see :ref:`commandLine`) when performance is lower than expected.


CPU performances
================

+---------------------+--------------------+----------------------------------------------------+
| Cluster name        | Processor          | Performances (in 10\ :sup:`6` cell/s/core)         |
+=====================+====================+====================================================+
| TGCC/Irene Rome     | AMD EPYC Rome      | 0.29                                               |
+---------------------+--------------------+----------------------------------------------------+
| IDRIS/Jean Zay      | Intel Cascade Lake | 0.62                                               |
+---------------------+--------------------+----------------------------------------------------+


GPU performances
================

+----------------------+--------------------+----------------------------------------------------+
| Cluster name         | GPU                | Performances (in 10\ :sup:`6` cell/s/GPU)          |
+======================+====================+====================================================+
| IDRIS/Jean Zay       | NVIDIA V100        | 110                                                |
+----------------------+--------------------+----------------------------------------------------+
| IDRIS/Jean Zay       | NVIDIA A100        | 194                                                |
+----------------------+--------------------+----------------------------------------------------+
| CINES/Adastra        | AMD Mi250          | 250                                                |
+----------------------+--------------------+----------------------------------------------------+
3 changes: 2 additions & 1 deletion doc/source/reference/idefix.ini.rst
@@ -343,7 +343,8 @@ and ``X1-end``, ``X2-end``, ``X3-end`` for the right boundaries. Each boundary c
+----------------+------------------------------------------------------------------------------------------------------------------+
| periodic | Periodic boundary conditions. Each field is copied between beg and end sides of the boundary. |
+----------------+------------------------------------------------------------------------------------------------------------------+
-| reflective | The normal component of the velocity is systematically reversed. Otherwise identical to ``outflow``. |
+| reflective | | Mirror the normal component of the velocity field and the tangential components of the magnetic field. |
+| | | Zero gradient on the other components (tangential velocity and normal field). |
+----------------+------------------------------------------------------------------------------------------------------------------+
| shearingbox | Shearing-box boudary conditions. |
+----------------+------------------------------------------------------------------------------------------------------------------+
44 changes: 16 additions & 28 deletions doc/source/reference/makefile.rst
@@ -108,15 +110,17 @@ We recommend the following modules and environement variables on AdAstra:

.. code-block:: bash
-module load PrgEnv-cray-amd
-module load cray-mpich
-module load craype-network-ofi
-module load cce
-module load cpe
-module load rocm/5.2.0
-export LDFLAGS="-L${ROCM_PATH}/lib -lamdhip64 -lstdc++fs"
-The last line being there to guarantee the link to the HIP library and the access to specific
+module load cpe/23.12
+module load craype-accel-amd-gfx90a craype-x86-trento
+module load PrgEnv-cray
+module load amd-mixed/5.7.1
+module load rocm/5.7.1 # needed because of a path bug that is not fixed yet
+export HIPCC_COMPILE_FLAGS_APPEND="-isystem ${CRAY_MPICH_PREFIX}/include"
+export HIPCC_LINK_FLAGS_APPEND="-L${CRAY_MPICH_PREFIX}/lib -lmpi ${PE_MPICH_GTL_DIR_amd_gfx90a} ${PE_MPICH_GTL_LIBS_amd_gfx90a} -lstdc++fs"
+export CXX=hipcc
+export CC=hipcc
+The `-lstdc++fs` option being there to guarantee the link to the HIP library and the access to specific
C++17 <filesystem> functions.

Finally, *Idefix* can be configured to run on Mi250 by enabling HIP and the desired architecture with the following options to ccmake:
@@ -144,15 +146,16 @@ We recommend the following modules and environement variables on Jean Zay:

.. code-block:: bash
--DKokkos_ENABLE_CUDA=ON -DKokkos_ENABLE_VOLTA70=ON
+-DKokkos_ENABLE_CUDA=ON -DKokkos_ENABLE_VOLTA70=ON -DKokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC=OFF
While Ampere A100 GPUs are enabled with

.. code-block:: bash
--DKokkos_ENABLE_CUDA=ON -DKokkos_ENABLE_AMPERE80=ON
+-DKokkos_ENABLE_CUDA=ON -DKokkos_ENABLE_AMPERE80=ON -DKokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC=OFF
-MPI (multi-GPU) can be enabled by adding ``-DIdefix_MPI=ON`` as usual.
+MPI (multi-GPU) can be enabled by adding ``-DIdefix_MPI=ON`` as usual. Async malloc is disabled here to work around a bug when using PSM2 with asynchronous
+CUDA malloc, which can lead to OpenMPI crashes or hangs on the Jean Zay machine.

.. _setupSpecificOptions:

@@ -174,7 +177,7 @@ explicitely the options as they are required, using the functions ``set_idefix_p
.. _customSourceFiles:

-Add/replace custom source files
+Add custom source files
+++++++++++++++++++++++++++++++

It is possible to add custom source files to be compiled and linked against *Idefix*. This can be useful
@@ -189,21 +192,6 @@ say you want to add source files for an analysis, your ``CMakeLists.txt`` should
add_idefix_source(analysis.hpp)
-*Idefix* also allows one to replace a source file in `$IDEFIX_DIR` by your own implementation. This is useful when developping new functionnalities without touching
-the main directory of your *Idefix* repository. For instance, say one wants to replace the implementation of viscosity in `$IDEFIX_SRC/src/hydro/viscosity.cpp`,
-with a customised `myviscosity.cpp` in the problem directory, one should add a ``CMakeLists.txt`` in the problem directory reading
-
-.. code-block::
-:caption: CMakeLists.txt
-replace_idefix_source(hydro/viscosity.cpp myviscosity.cpp)
-Note that the first parameter of ``replace_idefix_source`` is used as a search pattern in `$IDEFIX_DIR`. Hence it is possible to ommit the parent directory
-of the file being replaced if there is only one file with that name in the *Idefix* source directory, which is not guaranteed (some classes may implement
-methods with the same name). It is therefore recommended to add the parent directory in the first argument of ``replace_idefix_source``.


.. tip::

Don't forget to delete `CMakeCache.txt` before attempting to reconfigure the code when adding a problem-specific
2 changes: 1 addition & 1 deletion src/fluid/addNonIdealMHDFlux.hpp
@@ -258,7 +258,7 @@ void Fluid<Phys>::AddNonIdealMHDFlux(const real t) {
#if HAVE_ENERGY
Flux(ENG,k,j,i) += - Bx1 * eta * Jx2 + Bx2 * eta * Jx1;
#endif
-dMax(k,j,i) += eta;
+locdmax += eta;
}

if(haveAmbipolar) {
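
A note on the hunk above: the removed line added `eta` straight into the stored `dMax(k,j,i)`, while the new line accumulates into a local `locdmax`. The sketch below illustrates, under stated assumptions, one way the first pattern can lead to an overly restrictive timestep. It assumes that the flux routine is called once per direction without `dMax` being reset in between, and that the surrounding code (not shown in this diff) folds `locdmax` back into `dMax` with a maximum. `Cell`, `addOld`, `addNew` and `parabolicDt` are hypothetical names, and the `0.5 dx^2 / D` limit is only an illustrative parabolic constraint, not necessarily the expression used in Idefix.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for one cell of the dMax array.
struct Cell {
  double dMax = 0.0;
};

// Pattern of the removed line: add eta directly to the stored value.
inline void addOld(Cell &c, double eta) {
  c.dMax += eta;
}

// Pattern of the new line: accumulate in a local variable first.
// Folding it back with a maximum is an assumption about the
// surrounding code, which this diff does not show.
inline void addNew(Cell &c, double eta) {
  double locdmax = 0.0;
  locdmax += eta;
  c.dMax = std::max(c.dMax, locdmax);
}

// Illustrative explicit parabolic limit dt = 0.5 dx^2 / D.
inline double parabolicDt(double dx, double dMax) {
  assert(dMax > 0.0);
  return 0.5 * dx * dx / dMax;
}
```

Calling `addOld` three times (once per direction) with `eta = 0.5` leaves `dMax = 1.5`, whereas three calls to `addNew` leave `dMax = 0.5`, so the illustrative timestep limit is three times larger with the second pattern.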
15 changes: 6 additions & 9 deletions src/input.cpp
@@ -226,6 +226,12 @@ void Input::ShowConfig() {
idfx::cout << "-----------------------------------------------------------------------------"
<< std::endl;

+std::stringstream os;
+Kokkos::DefaultExecutionSpace().print_configuration(os, true);
+idfx::cout << "Input: Kokkos configuration" << std::endl << os.str();
+idfx::cout << "-----------------------------------------------------------------------------"
+<< std::endl;

#ifdef SINGLE_PRECISION
idfx::cout << "Input: Compiled with SINGLE PRECISION arithmetic." << std::endl;
#else
@@ -237,15 +243,6 @@ void Input::ShowConfig() {
#ifdef WITH_MPI
idfx::cout << "Input: MPI ENABLED." << std::endl;
#endif
-#ifdef KOKKOS_ENABLE_HIP
-idfx::cout << "Input: Kokkos HIP target ENABLED." << std::endl;
-#endif
-#ifdef KOKKOS_ENABLE_CUDA
-idfx::cout << "Input: Kokkos CUDA target ENABLED." << std::endl;
-#endif
-#ifdef KOKKOS_ENABLE_OPENMP
-idfx::cout << "Input: Kokkos OpenMP ENABLED." << std::endl;
-#endif
}

// This routine is called whenever a specific OS signal is caught
2 changes: 1 addition & 1 deletion src/main.cpp
@@ -200,7 +200,7 @@ int main( int argc, char* argv[] ) {
n_seconds = divres.rem;

double perfs = timer.seconds() / grid.np_int[IDIR] / grid.np_int[JDIR]
-/ grid.np_int[KDIR] / Tint.GetNCycles();
+/ grid.np_int[KDIR] / Tint.GetNCycles() * idfx::psize;

idfx::cout << "Main: Reached t=" << data.t << std::endl;
idfx::cout << "Main: Completed in ";
2 changes: 1 addition & 1 deletion src/timeIntegrator.cpp
@@ -88,7 +88,7 @@ TimeIntegrator::TimeIntegrator(Input & input, DataBlock & data) {
void TimeIntegrator::ShowLog(DataBlock &data) {
if(isSilent) return;
double rawperf = (timer.seconds()-lastLog)/data.mygrid->np_int[IDIR]/data.mygrid->np_int[JDIR]
-/data.mygrid->np_int[KDIR]/cyclePeriod;
+/data.mygrid->np_int[KDIR]/cyclePeriod * idfx::psize;
#ifdef WITH_MPI
// measure time spent in expensive MPI calls
double mpiCycleTime = idfx::mpiCallsTimer - lastMpiLog;

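The normalisation change in `src/main.cpp` and `src/timeIntegrator.cpp` can be sketched as follows. This is a minimal illustration, assuming that `np_int` holds the number of interior cells of the global grid and that `idfx::psize` is the number of MPI processes. The function names and the numbers in the usage note are hypothetical.

```cpp
#include <cassert>

// Wall-clock seconds per cell update, normalised per MPI process.
// nx, ny, nz : interior cells of the global grid (np_int in the diff, assumed)
// ncycles    : number of integration cycles that were timed
// nproc      : number of MPI processes (idfx::psize in the diff, assumed)
inline double secondsPerCellPerProcess(double seconds, long nx, long ny, long nz,
                                       long ncycles, int nproc) {
  assert(nx > 0 && ny > 0 && nz > 0 && ncycles > 0 && nproc > 0);
  return seconds / nx / ny / nz / ncycles * nproc;
}

// Inverse of the above: cell updates per second and per process.
inline double cellUpdatesPerSecondPerProcess(double seconds, long nx, long ny, long nz,
                                             long ncycles, int nproc) {
  return 1.0 / secondsPerCellPerProcess(seconds, nx, ny, nz, ncycles, nproc);
}
```

For a hypothetical 256^3 global grid split over 8 processes (128^3 cells each) that completes 100 cycles in 80 s, this gives about 2.6 x 10^6 cell updates per second per process, an eighth of the global throughput of about 2.1 x 10^7.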