Legio X TeaLeaf

Project for Advanced Computer Architectures course @ Politecnico di Milano.

TeaLeaf heat conduction mini-app over Legio MPI fault-tolerance library.

The objective is to perform a performance and functionality analysis in scenarios where the MPI infrastructure is affected by failures.

👨‍👨‍👦‍👦 Authors

TeaLeaf :: About

A C++-based implementation of the TeaLeaf heat conduction mini-app. It replicates the functionality of the reference version of TeaLeaf (https://github.com/UK-MAC/TeaLeaf_ref).

This implementation has support for building with and without MPI. When MPI is enabled, all models will adjust accordingly for asynchronous MPI send/recv.

Legio :: About

Legio is a library that introduces fault-tolerance in MPI applications in the form of graceful degradation. It's designed for embarrassingly parallel applications. It is based on ULFM.

A key aspect of Legio is its transparent integration: no changes to the application code are needed, because integration is performed at link time. Legio uses the PMPI profiling interface to intercept all MPI calls and wrap them with the appropriate code.

TeaLeaf :: Programming models

Together with MPI, TeaLeaf is currently implemented in the following parallel programming models, listed in no particular order.

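| NAME | COMPILER | REFERENCE |
| --- | --- | --- |
| - | | `serial` |
| OpenMP 3, 4.5 | | `omp` |
| CUDA | | `cuda` |
| HIP | | `hip` |
| Kokkos >= 4 | | `kokkos` |
| C++ Parallel STL (StdPar) | | `std-indices` |
| SYCL | | `sycl-acc` |
| SYCL 2020 | | `sycl-acc` or `sycl-usm` |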

Installing Legio

Prerequisites

  • CMake >= 3.10
  • ULFM features in MPI implementation

Steps

Follow the steps defined in the Legio repository.

Building Legio-X-TeaLeaf

Prerequisites

  • CMake >= 3.13

Steps

# Configure the build
# 'Release' as default build-type
# -DMODEL option is required
foo@bar:~/path/to/Legio-X-TeaLeaf$ cmake -Bbuild -H. -DMODEL=<model> -DENABLE_MPI=ON <model options through -D...> -Dlegio_DIR=<path/to/Legio/install/lib/cmake>

# Compile
foo@bar:~/path/to/Legio-X-TeaLeaf$ cmake --build build

Notes

  • MODEL :: selects the programming model implementation of TeaLeaf to build (references shown above); the source code for each model's implementations is located in ./src/<model>

Executing Legio-X-TeaLeaf

Steps

# Run executables in ./build
foo@bar:~/path/to/Legio-X-TeaLeaf$ <mpirun> -n <num_ranks> --with-ft ulfm ./build/<model>-tealeaf

Notes

  • <mpirun> :: mpirun executable with ULFM features
  • --with-ft ulfm :: fault-tolerance support via ULFM (built in by default in Open MPI v5.0.x)
  • ./build/<model>-tealeaf :: executable path and filename, generated according to the selected model
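
For scripted experiments (for example, sweeping over rank counts), the launch line above can be assembled programmatically. The following is a minimal Python sketch; the helper name `tealeaf_command` and its defaults (`mpirun` on the PATH, binaries under `./build`) are assumptions made for illustration and are not part of the repository.

```python
import shlex


def tealeaf_command(model, num_ranks, mpirun="mpirun", build_dir="./build"):
    """Build the argument list for a ULFM-enabled TeaLeaf run.

    `model` is one of the MODEL references (e.g. "omp"); `mpirun` and
    `build_dir` are illustrative defaults, not values fixed by the project.
    """
    if num_ranks < 1:
        raise ValueError("num_ranks must be at least 1")
    return [
        mpirun,
        "-n", str(num_ranks),
        "--with-ft", "ulfm",          # enable ULFM fault tolerance
        f"{build_dir}/{model}-tealeaf",
    ]


# Print the command as a single shell-safe string.
print(shlex.join(tealeaf_command("omp", 4)))
# prints: mpirun -n 4 --with-ft ulfm ./build/omp-tealeaf
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues when paths contain spaces.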

TeaLeaf :: File Input

The contents of tea.in define the geometric and runtime information, apart from task and thread counts.

A complete list of options is given below, where <R> shows the option takes a real number as an argument. Similarly <I> is an integer argument.

Not all configuration properties of the TeaLeaf_ref application are implemented.

| OPTION | DESCRIPTION |
| --- | --- |
| `xmin <R>`, `xmax <R>`, `ymin <R>`, `ymax <R>` | Size of the computational domain. The default domain size is a 10 cm square. |
| `x_cells <I>`, `y_cells <I>` | Number of discrete cells into which the computational domain is decomposed along each of the two axes. The default is 10 cells in each direction. |
| `state 1 density <R> energy <R>` | State of the ambient material of the computational domain; geometry information is ignored here. Regions not covered by other defined states receive the energy and density of state 1. |
| `state <I> density <R> energy <R> geometry rectangle xmin <R> ymin <R> xmax <R> ymax <R>` | State of a rectangular region in the computational domain. Note that the generator is simple and the defined state completely fills any cell it intersects. In case of overlapping regions, the last state takes priority. |
| `state <I> density <R> energy <R> geometry circular xmin <R> ymin <R> radius <R>` | State of a circular region in the computational domain. Note that the generator is simple and the defined state completely fills any cell it intersects. In case of overlapping regions, the last state takes priority. Hence, a circular region will have a stepped interface. |
| `state <I> density <R> energy <R> geometry point xmin <R> ymin <R>` | State of a point in the computational domain. Note that the generator is simple and the defined state completely fills any cell it intersects. In case of overlapping regions, the last state takes priority. Hence, a point region will fill the cell it lies in. |
| `visit_frequency <I>` | Step frequency of visualisation dumps. The files produced are text-based VTK files and are easily viewed in applications such as VisIt. The default is to output no graphical data. Note that the overhead of output is high, so it should not be enabled when performance benchmarking is being carried out. |
| `summary_frequency <I>` | Step frequency of summary dumps. This requires a global reduction and the associated synchronisation, so performance is slightly affected as the frequency increases. By default, a summary dump is produced every 10 steps and at the end of the simulation. |
| `initial_timestep <R>` | Initial time step. This time step stays constant through the entire simulation. The default value is 0.1. |
| `end_time <R>` | End time for the simulation. When the simulation time is greater than this value, the simulation stops. |
| `end_step <I>` | Number of the end step for the simulation. When the simulation step is equal to this value, the simulation stops. If both `end_step` and `end_time` are set, the simulation terminates on whichever is reached first. |
| `preconditioner_on` | Whether to apply a preconditioner before the linear solve. Editor's note: this property appears to be read but not used anywhere in the TeaLeaf code. |
| `use_jacobi` | Jacobi method to solve the linear system. Note that this is a very slowly converging method compared to the other options. This is the default method if no method is explicitly selected. |
| `use_cg` | Conjugate Gradient method to solve the linear system. |
| `use_ppcg` | Polynomially Preconditioned Conjugate Gradient (PPCG) method to solve the linear system. |
| `use_chebyshev` | Chebyshev method to solve the linear system. |
| `presteps <I>` | Number of Conjugate Gradient iterations to be completed before the Chebyshev method is started. This is necessary to provide approximate minimum and maximum eigenvalues to start the Chebyshev method. The default value is 30. |
| `ppcg_inner_steps <I>` | Number of inner steps to run when using the PPCG solver. The default value is 10. |
| `errswitch` | If enabled alongside the Chebyshev/PPCG solver, switch when a certain error is reached instead of when a certain number of steps is reached. This is off by default. |
| `epslim` | Default error at which to switch from CG to Chebyshev when using the Chebyshev solver with the `tl_cg_ch_errswitch` option enabled. The default value is 1e-5. |
| `max_iters <I>` | Upper limit on the number of iterations used for the linear solve in a step. If this limit is reached, the solution vector at that iteration is used as the solution anyway. The default value is 1000. |
| `eps <R>` | Convergence criterion for the selected solver. It uses the least squares measure of the residual. The default value is 1.0e-10. |
| `coefficient_density` | Use the density as the conduction coefficient. This is the default option. |
| `coefficient_inverse_density` | Use the inverse density as the conduction coefficient. |
| `halo_depth` | |
| `num_chunks_per_rank` | Editor's note: settable, but it only works with 1 chunk per rank. |
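
The state rules above (a state fills every cell it intersects, later states override earlier ones, and state 1 is the ambient fallback) can be illustrated with a small Python sketch. This is not the TeaLeaf generator: the function names and the exact intersection tests are assumptions chosen to match the wording of the option descriptions, and the real code may, for instance, test cell centres instead.

```python
import math


def rect_hits_cell(cell, xmin, ymin, xmax, ymax):
    """True if the rectangle overlaps the cell (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = cell
    return x0 < xmax and x1 > xmin and y0 < ymax and y1 > ymin


def circle_hits_cell(cell, cx, cy, radius):
    """True if the circle centred at (cx, cy) touches the cell."""
    x0, y0, x1, y1 = cell
    # Closest point of the cell to the circle centre.
    nx = min(max(cx, x0), x1)
    ny = min(max(cy, y0), y1)
    return math.hypot(nx - cx, ny - cy) <= radius


def point_hits_cell(cell, px, py):
    """True if the point lies in the (half-open) cell."""
    x0, y0, x1, y1 = cell
    return x0 <= px < x1 and y0 <= py < y1


def generate(x_cells, y_cells, xmin, xmax, ymin, ymax, states):
    """Return a y_cells x x_cells grid of state ids.

    `states` is a list of (state_id, predicate) pairs in definition
    order; a later state overwrites an earlier one where they overlap.
    """
    dx = (xmax - xmin) / x_cells
    dy = (ymax - ymin) / y_cells
    grid = [[1] * x_cells for _ in range(y_cells)]  # state 1 is the ambient
    for state_id, hits in states:
        for j in range(y_cells):
            for i in range(x_cells):
                cell = (xmin + i * dx, ymin + j * dy,
                        xmin + (i + 1) * dx, ymin + (j + 1) * dy)
                if hits(cell):
                    grid[j][i] = state_id  # whole cell takes the state
    return grid


# A 10 x 10 grid on a 10 x 10 domain: a rectangle, then a circle on top.
example = generate(10, 10, 0.0, 10.0, 0.0, 10.0, [
    (2, lambda c: rect_hits_cell(c, 0.0, 0.0, 5.0, 5.0)),
    (3, lambda c: circle_hits_cell(c, 5.0, 5.0, 1.5)),
])
for row in reversed(example):  # print with y increasing upwards
    print("".join(str(s) for s in row))
```

In the printed grid the circle shows up as a block of whole cells, which is the stepped interface mentioned for circular regions, and it overwrites the corner of the rectangle because it is defined later.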

New properties added by TeaLeaf with respect to TeaLeaf_ref are

| OPTION | DESCRIPTION |
| --- | --- |
| `check_result` | Standard test with a "known" solution. The known solutions are scanned until one matching the configured `x_cells`, `y_cells` and `end_step` is found. Note that the known solution for an iterative solver is not an analytic solution; it is the solution of a single-core simulation with IEEE options enabled with the Intel compiler and a strict convergence of 1.0e-15. The difference from the expected solution is reported at the end of the simulation in the tea.out file. There is no default value for this option. |
| `use_fortran_kernels` | |
| `use_c_kernels` | |

The following properties have been implemented in this fork.

| OPTION | DESCRIPTION |
| --- | --- |
| `visit_frequency <I>` | Step frequency of visualisation dumps. The files produced are text-based VTK files and are easily viewed in applications such as VisIt and ParaView. The default is to output no graphical data. Note that the overhead of visualisation output is high, so it should not be enabled when performance benchmarking is being carried out. |

Legio-X-TeaLeaf postprocessing

Just like TeaLeaf_ref, this application has been extended so that each node produces its own file in VTK (Visualization Toolkit) format. Each VTK file can be opened and visualized in applications such as VisIt and ParaView.

To simplify managing VTK files in these applications, a postprocessing script is supplied that merges the VTK files produced by different nodes for the same iteration.
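
The merge step can be pictured as first grouping the per-node files by iteration. The sketch below is not the supplied `postprocess.py`: it assumes a hypothetical file naming scheme `<prefix>.<rank>.<step>.vtk`, which may differ from the names the application actually writes, and it only shows the grouping, not the VTK merge itself.

```python
import re
from collections import defaultdict

# Hypothetical naming scheme, e.g. "tea.3.20.vtk" = rank 3, step 20.
VTK_NAME = re.compile(r"^(?P<prefix>.+)\.(?P<rank>\d+)\.(?P<step>\d+)\.vtk$")


def group_by_step(filenames):
    """Map each step number to its VTK files, ordered by rank.

    Names that do not match the assumed pattern (such as "tea.visit")
    are ignored.
    """
    groups = defaultdict(list)
    for name in filenames:
        match = VTK_NAME.match(name)
        if match is None:
            continue
        step = int(match.group("step"))
        rank = int(match.group("rank"))
        groups[step].append((rank, name))
    return {
        step: [name for _, name in sorted(files)]
        for step, files in sorted(groups.items())
    }


files = ["tea.1.10.vtk", "tea.0.10.vtk", "tea.0.20.vtk", "tea.visit"]
print(group_by_step(files))
# prints: {10: ['tea.0.10.vtk', 'tea.1.10.vtk'], 20: ['tea.0.20.vtk']}
```

Each value in the resulting dictionary is the set of files that would be merged into a single output file for that iteration.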

Prerequisites

  • Python >= 3
  • PIP >= 24.0

Steps

# One-time setup :: Install required packages
foo@bar:~/path/to/Legio-X-TeaLeaf$ pip3 install -r postprocess-requirements.txt

# Postprocess
foo@bar:~/path/to/Legio-X-TeaLeaf$ python3 postprocess.py [-options]

The script accepts the following options.

| OPTION | DEFAULT | DESCRIPTION |
| --- | --- | --- |
| `-i <string>`, `--input <string>` | `target/vtk` | The directory containing the input VTK files. |
| `-o <string>`, `--output <string>` | `target/vtk/postprocess` | The directory to produce the merged VTK files in. |
| `-p <string>`, `--output-prefix <string>` | `tea` | The prefix to use in output VTK filenames for `...` naming. |
| `-v <string>`, `--visit <string>` | `target/vtk` | The directory containing the `tea.visit` file. |
| `--bin <bool>` | `false` | Whether the VTK files should be generated in a binary format. |
| `--rm <bool>` | `false` | Whether to remove the VTK files in the input directory after postprocessing. |
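
For readers who want to see how such an interface is typically declared, here is a hedged `argparse` sketch that mirrors the option list above. It is not the code of `postprocess.py`; the helper names `str2bool` and `build_parser`, and the set of strings accepted for `<bool>`, are assumptions made for illustration.

```python
import argparse


def str2bool(value):
    """Parse a textual boolean; the accepted spellings are an assumption."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Declare the options listed above with their documented defaults."""
    parser = argparse.ArgumentParser(
        description="Merge per-node VTK files (illustrative sketch).")
    parser.add_argument("-i", "--input", default="target/vtk",
                        help="directory containing the input VTK files")
    parser.add_argument("-o", "--output", default="target/vtk/postprocess",
                        help="directory for the merged VTK files")
    parser.add_argument("-p", "--output-prefix", default="tea",
                        help="prefix for output VTK filenames")
    parser.add_argument("-v", "--visit", default="target/vtk",
                        help="directory containing the 'tea.visit' file")
    parser.add_argument("--bin", type=str2bool, default=False,
                        help="write VTK files in a binary format")
    parser.add_argument("--rm", type=str2bool, default=False,
                        help="remove input VTK files after postprocessing")
    return parser


args = build_parser().parse_args(["--bin", "true", "-o", "out"])
print(args.input, args.output, args.output_prefix, args.bin, args.rm)
# prints: target/vtk out tea True False
```

Using an explicit converter for `<bool>` avoids the common pitfall of `type=bool`, which would treat any non-empty string, including "false", as true.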

Utilities

Shortcut to delete all VTK files in default paths

Useful before starting another run with a different end_step or visit_frequency.

# Remove all target/**/*.vtk files
foo@bar:~/path/to/Legio-X-TeaLeaf$ ./clear-vtk.sh