Sync meeting 2023 12 12
Caspar van Leeuwen edited this page Dec 12, 2023 · 1 revision
attending: Kenneth (HPC-UGent), Caspar (SURF), Thomas (UiB), Bob (RUG), Danilo (HPCNow!), Elisabeth (HPCNow!), Erica (HPCNow!), Helena (HPCNow!), Nadia (HPCNow!), Neja (NIC), Pedro (RUG), Lara (HPC-UGent), Susana (HPCNow!)
Updates by task, and decide on action for M12 tasks (see https://github.com/orgs/multixscale/projects/1):
-
Task 1.1
- [Move?] Build software for EESSI pilot 2023.06 #3. ESPResSo, waLBerla, LAMMPS, OpenFOAM, ALL. Some in pilot.eessi-hpc.org, not all. Redefine task to 'build for eessi.io' and move to M15?
- Same for Build Walberla #55 and add ESPResSo
- ideally these should be available by time of project review in software.eessi.io
- [Keep] Support for Nvidia GPUs #1
- PR for installing in host_injections merged (software-layer #368)
- TODO: ship install scripts in repo
- TODO: get bot to install CUDA in host_injections on build cluster.
- MultiXscale deliverable D1.1 mentions that initial GPU support is available (since it's an explicit milestone)
- [Close / create new issue?] Setting up the CVMFS network #14. Enough was done for deliverable, but more to do wrt private stratum 1 support.
- Bob will close issue #14 and open more focused issues on creating additional Stratum-1 servers to increase resilience, start-up performance, add synchronisation server, etc.
- [Move?] EESSI 2023.06 release (software.eessi.io) #92. Overview of building software (can always do more), add ESPResSo, build waLBerla, tests for all MultiXscale software
- issue #92 can be closed, since 2023.06 version of software.eessi.io is in place, and there are separate issues for installing additional software
- [Keep] Deliverable D1.1: Report on shared software stack prototype #101
- basically ready for final review by Alan, Pedro will contact Alan on this
-
Task 1.2 (emerging architecture)
- some initial steps are being taken w.r.t. Arm support, like figuring out how to deal with broken tests and missing installations
-
Task 1.3
- [Move] Implement low level test for interconnect #30
- [Keep] Implement low level test for GPU #111
- not required for M12 milestone, can be done early next year, focus is on getting GPU support in place first
- [Keep] Deliverable D1.2: Plan for the design of a portable test suite #102
- Alan has already done final review, Caspar will review final minor changes done by Alan
-
Task 5.1
- [Close?] Set up support rotation #27
- Some change needed for Alan for first 3 months of 2024, so not ready to close yet
- [Keep] Deliverable D5.2: support portal #28
- Kenneth will finish D5.2 today, so it can get final review by Alan
- D5.2 got major revision in the last couple of days after initial feedback provided by Thomas
-
Task 5.2
- [Closed] Decide on schedule for periodic testing #35
- We now test 1-node and 2-node runs on Vega, Karolina and AWS Magic Castle through daily cron jobs
- CI configuration defined by test suite #93
- [Closed] Alter AWS and Vega test runs so they use v0.1 of test-suite #112
- [Closed] Setup CI runs on AWS MC #128
-
Task 5.3
- [Move] Integrate testing in build-and-deploy workflow #23
- Modify software-layer's test.sh to run test suite
- end date should be moved to end of 2024
- should be broken up into smaller tests, like actually running tests in the software layer, testing on different OSes, etc.
- [Move] GitHub App (v1.0) #44
- can be rescheduled to end of 2024, until test step +
- [Move] GitHub App (overview ticket, different versions) #41
- [Close?] Set up infrastructure to build/test/deploy software in EESSI using bot #45
- Magic Castle cluster is 'done'? => yes in AWS
- Thomas will open issue on enabling cross-cloud testing before deployment
- [Closed] document contribution policy for adding software to EESSI #108
- policy is live at https://www.eessi.io/docs/adding_software/contribution_policy
- [Keep] Deliverable 5.1 #48
-
WP 6 updates:
- Magic Castle: Terraforming the Cloud to Teach HPC
- at SC23 by Alan + Félix-Antoine, ~20 participants
- Best Practices for CernVM-FS in HPC (https://www.youtube.com/watch?v=L0Mmy7NBXDU)
- online by Kenneth, Mon 4 Dec'23
- over 200 registrations, ~130 attending in Zoom
- tutorial material available at https://multixscale.github.io/cvmfs-tutorial-hpc-best-practices
- developed in collaboration with CernVM-FS developers
- Streaming optimised scientific software: an introduction to EESSI (https://www.youtube.com/watch?v=KAYI9oKFLxA)
- over 100 registrations, ~60 attending in Zoom
- Talk on EESSI by Bob at SURF Advanced Computing User Day (Dutch HPC community)
- Elevator pitch is available on the shared files
- Also used in the CVMFS tutorial, we can cut that out from the recording and reuse
- Can do a dedicated recording of elevator pitch
- MultiXscale website will be down for a few days for maintenance
- Petra Papez will be working on this
- EuroHPC talk by Daniel Paralka. Discussed which CoEs fit well in the EuroHPC strategy, and gave MultiXscale as an example of a good fit
- see 11:38:00 at https://webcast.ec.europa.eu/eurohpc-ju-user-day-2023-12-11
- invitation to join the TeraTec forum
- see https://www.forumteratec.com
- 29-30 May 2024 in Paris
- European corner, 15-20 projects max.
- participation cost ~1,200 euro
- we need to let them know by Jan'24 if we're interested
- DoItNow has participated in past editions, and also has a booth there
- meeting was set up with organisers (Elisabeth, Susana & ??)
- can costs be covered by MultiXscale?
-
WP 7 updates:
- New newsletter coming up, in preparation. Draft will be sent around.
- make sure to include tutorials & Paralka's mention of MultiXscale
-
WP 8:
- [Keep] First periodic report #130
- [Ask Neja to elaborate on what needs to be done]
- Deadline: 8 Jan 2024
- work done + impact by each WP leader
- update of exploitation plan by every partner
- preliminary financial statements
- see template that was sent via email
- notify Neja of the dates when milestones were accomplished
- needs to be entered in EU portal
-
CASTIEL2 updates
- CASTIEL2 will provide template presentation for project review
-
status check on deliverables M12
- (see notes above)
- D1.1 (RUG - Bob/Pedro) Report on shared software stack prototype
- Written by Pedro/Bob
- Reviewed by Caspar. Pedro is finalizing the implementation of the review comments.
- D1.2 (SURF - Caspar) Plan for the design of a portable test suite
- Written by Caspar
- Reviewed by Kenneth. Review comments processed.
- Next review: Alan (?)
- D5.1 (UiB) Community contribution policy and GitHub App
- ...
- D5.2 (UGent) Support portal
- ...