-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Xios interface with tests #496
Conversation
78e35a4
to
e9c96f5
Compare
I can see that you have the same
|
When I try to run
|
Yeah but I figured there's no point fixing it if we move to a |
4c26312
to
e9c96f5
Compare
bb5401d
to
b0a551d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overview
The XIOS interface changes all seem sensible to me, as do the tests. I can confirm that all tests pass locally for me, including the new one with 2 MPI processes.
Installation notes
Starting from a working nextsim (develop
branch) and xios (trunk
), I tried to manually rebuild nextsim using this branch (xios-interface-with-tests
) with xios enabled. I kept hitting an error saying that netcdf_par.h
couldn't be found. There is a fix on the xios side (search for netcdf_par
on this webpage), but I couldn't figure out how to get this picked up by nextsim. In the end I resorted to the spack approach and it worked the first time.
Worth noting somewhere in the docs that running with -DENABLE_XIOS=ON
requires that -DENABLE_MPI=ON
, too.
## Nextsim CI \*\*now with added Docker\*\* This PR mainly pertains to the ubuntu build of the code. Previously the CI workflow built `nextsim` using a GitHub VM instance that would `apt install` basic dependencies required for the build process. However, with the addition of `xios` (#496 ) this is no longer a tractable solution. Furthermore, it would take ~15 mins just to build `xios` before then going on to build `nextsim`. Now I have created a `Dockerfile` and associated pre-built image (currently on my Dockerhub: `tommelt/nextsimdg-dev-env:with-xios`) which contains all the prerequisites to build and test the code for our typical CI workflow. In theory it could also be repurposed by developers as a dev environment (although I personally don't love this). ### more about the dockerfile The `Dockerfile` is based on a `spack` environment file (`spack.yaml`) was generated originally using the following command: ``` spack containerize > Dockerfile ``` However, I needed to make some manual adjustments to the `Dockerfile` (necessary for installing `xios` and some other dependencies). `Dockerfiles/Dockerfile` is the one that was used to generate the docker image `tommelt/nextsimdg-dev-env:with-xios`. The image is `2.41GB` and contains all prerequisites for `MPI` and `XIOS`. When uploaded to dockerhub (`docker push tommelt/nextsimdg-dev-env:with-xios`) it is compressed to `625.97 MB`. ### benefits of Docker in CI * It's fast - only takes 45s to download the image and then 150s to build `nextsim` :fire: - a full CI (for ubuntu) is now less than 4 minutes -- building `XIOS` alone used to take ~15 minutes * It's reproducible - one of the problems of github CI is it's hard to debug when things don't work - you can now pull the docker image (`docker pull tommelt/nextsimdg-dev-env:with-xios`) and test stuff locally * It's easy - `nextsim` is necessarily adding complexity, which increases the required dependencies - the docker image can be built with all dependencies ready-to-go ### Changes to the parallel (MPI) test naming convention * There was only one existing MPI test before I added the xios one (`testRectGrid_MPI`). * I think it would be sensible to name all MPI tests something like `testname_MPIX` where `X` = the number of ranks the test should be run with * This is required because `doctest` specifies minimum/maximum numbers of procs for the test to run. A test that requires 2 procs to run, run with 1 MPI proc instead will "pass" but with a message like "test skipped". This is not a situation we want to end up i.e., thinking our tests pass when they have actually not been run. ### To Do: - [x] TM to add docker image to [github container registry](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry) - [x] ~@timspainNERSC to fix MacOS build.~ I tried to fix MPI/xios build on mac (using [tmate plugin](https://github.com/mxschmitt/action-tmate) to debug github action live) but the fix requires a lot of effort. (Compiling most/all dependencies from source) and it may not be worth it. We could consider not running the MPI/xios tests on mac? For now I have simply disabled MPI and XIOS on the mac build. One other work around would be create a dedicated spack build cache. (decided to not run tests for MPI/XIOS build on mac) - [x] add instructions to build the docker image (`docker build .`) - [x] add documentation around new docker image - [x] rebase to include Kacpers latest changes and Tim's fix of mac build - [x] ~should we keep existing docker images? or "rebase" them? or just have one with all dependencies and users can build their own version of nextsim to meet their needs?~ dev environment build is sufficient for now. - [x] move pre-built image to nextsim dockerhub repo (do we have a NextSim docker account yet?)
* updated/tidied Dockerfiles * modified the CI to use the docker image hosted on ghcr.io * disabled XIOS and MPI for the mac build * added spack.yaml needed to generate dockerfile
64836aa
to
f23042a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GTG
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we get some doxygen comments on these methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the plan to convert to the nextSIM-DG time classes before merging this PR?
Nextsim XIOS Interface
The previous XIOS work has diverged from the current development branch.
This PR has a minimum working example (MWE) in the form of a test
testXiosInit
(source./core/test/XiosInit_test.cpp
).The test checks most of the functionality added by the XIOS C++ interface
./core/src/Xios.cpp
./core/src/include/Xios.hpp
which depend on a C interface included here:
./core/src/include/xios_c_interface.hpp
The core functionality of XIOS is included as part of this PR as the other two components will require it.
XIOS functionality
Still to discuss:
lxios
,lpnetcdf
etc.Also how do we want to handle mac builds going forward? I cannot simulate mac (virt OS/docker) so it's not easy for me to "fix" CIwe will not support MPI/xios on macThis PR does the following:
XIOS
tocmake
XIOS
to the CI workflowdoctest
I think this can be merged in relatively soon. The remaining steps are (for a future PR):
testXiosInit
is a stand-alone binary)XIOS
datetime types withnextsim
'sSteps required to run
testXiosInit
XIOS
(see the github workflow for an idea how to installXIOS
)nextsim
withXIOS
support using-DENABLE_XIOS=ON
and-Dxios_DIR=/path/to/xios/install
cd
tobuild/core/test
and runmpirun -n 2 ./testXiosInit
The test should pass all tests:
And produce the following output:
I don't currently verify the output of
diagnostic.nc
as part oftestXiosInit
but this can be added.This PR is to update the state of XIOS related PRs and Issues. This will close some stale issues
closes #218
closes #291
closes #293