DEISA is a library that couples MPI simulation codes with Dask analytics.
The DEISA plugin is built on the PDI Data Interface.
As a PDI plugin, DEISA requires the PDI Data Interface to be installed with Python support.
DEISA also requires the deisa versions of Dask and Dask Distributed, which have been adapted to work with the concepts newly introduced in DEISA.
Check here for spack installation.
Alternatively, it can be installed on top of PDI by running:
cmake -DCMAKE_INSTALL_PREFIX=$HOME/local/ -DPython3_EXECUTABLE=~/.conda/envs/yourenv/bin/python3.8 ../
make install
A simulation can be instrumented with PDI to make its internal data available to DEISA and, through it, to Dask. At startup, each simulation process reads the YAML configuration file and loads the DEISA and MPI plugins of PDI.
Internally, one DEISA Bridge is created per MPI process, and each bridge connects to Dask. The bridge associated with MPI rank 0 reads the deisa_virtual_arrays section of the YAML file and sends it to the Dask client.
Whenever a piece of data is shared with PDI, the Bridge checks whether it is included in the contract. If it is, the Bridge sends it, under a specific key, to a worker chosen in a round-robin fashion; otherwise it returns without sending anything.
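The contract check and round-robin dispatch described above can be sketched in plain Python. This is only an illustration of the logic, not the DEISA implementation: the class name BridgeSketch, the publish method, the worker addresses, and the (name, timestep, rank) key layout are all assumptions made for this example, and the send is replaced by an in-memory record.

```python
from itertools import cycle


class BridgeSketch:
    """Hypothetical stand-in for a DEISA Bridge; not the real class."""

    def __init__(self, rank, workers, contract):
        self.rank = rank
        self.contract = set(contract)   # array names the analytics asked for
        self._workers = cycle(workers)  # round-robin iterator over worker addresses
        self.sent = []                  # records (key, worker) instead of a real send

    def publish(self, name, timestep, data):
        # Data that is not part of the contract is ignored: return immediately.
        if name not in self.contract:
            return None
        worker = next(self._workers)       # next worker in round-robin order
        key = (name, timestep, self.rank)  # assumed key layout, illustration only
        self.sent.append((key, worker))
        return key, worker


bridge = BridgeSketch(rank=0, workers=["w0", "w1"], contract=["temperature"])
first = bridge.publish("temperature", 0, [1.0, 2.0])
skipped = bridge.publish("pressure", 0, [3.0])
second = bridge.publish("temperature", 1, [1.5, 2.5])
```

Because the worker iterator only advances when data is actually sent, data outside the contract does not disturb the round-robin order.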
The DEISA Python library implements a DEISA Adaptor. This component is used on the Dask client side to create Dask arrays describing the data generated by the simulation. The DEISA Adaptor waits for the contract to be sent by the DEISA Bridge on MPI rank 0, selects the data it needs, and returns the signed contract. It then uses the information contained in the contract to create Dask arrays, which can be retrieved by calling the get_deisa_arrays() method and then selecting the needed array.
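The contract exchange on the adaptor side can be pictured as a simple filtering step. The sketch below is a plain-Python illustration under stated assumptions: the function sign_contract and the metadata layout (shape, dtype) are hypothetical and are not taken from the DEISA API.

```python
def sign_contract(offered, needed):
    """Keep only the offered arrays that the analytics needs.

    Both arguments are hypothetical: `offered` maps array names to
    metadata sent by the bridge on rank 0, `needed` lists the names
    the analytics wants to receive.
    """
    missing = [name for name in needed if name not in offered]
    if missing:
        raise KeyError(f"not offered by the simulation: {missing}")
    return {name: offered[name] for name in needed}


# Assumed contract content, for illustration only.
offered = {
    "temperature": {"shape": (10, 64, 64), "dtype": "float64"},
    "pressure": {"shape": (10, 64, 64), "dtype": "float64"},
}
signed = sign_contract(offered, ["temperature"])
```

In this sketch the signed contract contains only "temperature", so a bridge that receives it would stop sending "pressure" data altogether.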
An example is included in this repository. It includes submission scripts that assume a prior Spack installation.