Sync meeting 2023 12 12

Caspar van Leeuwen edited this page Dec 12, 2023 · 1 revision

attending: Kenneth (HPC-UGent), Caspar (SURF), Thomas (UiB), Bob (RUG), Danilo (HPCNow!), Elisabeth (HPCNow!), Erica (HPCNow!), Helena (HPCNow!), Nadia (HPCNow!), Neja (NIC), Pedro (RUG), Lara (HPC-UGent), Susana (HPCNow!)

Updates by task, and decide on action for M12 tasks (see https://github.com/orgs/multixscale/projects/1):

  • Task 1.1

    • [Move?] Build software for EESSI pilot 2023.06 #3. ESPResSo, waLBerla, LAMMPS, OpenFOAM, ALL. Some in pilot.eessi-hpc.org, not all. Redefine task to 'build for eessi.io' and move to M15?
    • [Keep] Support for Nvidia GPUs #1
      • PR for installing in host_injections merged software-layer #368
      • TODO: ship install scripts in repo
      • TODO: get bot to install CUDA in host_injections on build cluster.
      • MultiXscale deliverable D1.1 mentions that initial GPU support is available (since it's an explicit milestone)
    • [Close / create new issue?] Setting up the CVMFS network #14. Enough was done for deliverable, but more to do wrt private stratum 1 support.
      • Bob will close issue #14 and open more focused issues on creating additional Stratum-1 servers to increase resilience, start-up performance, add synchronisation server, etc.
    • [Move?] EESSI 2023.06 release (software.eessi.io) #92. Overview of building software (can always do more), add ESPResSo, build waLBerla, tests for all MultiXscale software
      • issue #92 can be closed, since 2023.06 version of software.eessi.io is in place, and there are separate issues for installing additional software
    • [Keep] Deliverable D1.1: Report on shared software stack prototype #101
      • basically ready for final review by Alan, Pedro will contact Alan on this
  • Task 1.2 (emerging architecture)

    • some initial steps being done w.r.t. Arm support, like figuring out how to deal with broken tests, missing installations
  • Task 1.3

    • [Move] Implement low level test for interconnect #30
    • [Keep] Implement low level test for GPU #111
      • not required for M12 milestone, can be done early next year, focus is on getting GPU support in place first
    • [Keep] Deliverable D1.2: Plan for the design of a portable test suite #102
      • Alan has already done final review, Caspar will review final minor changes done by Alan
  • Task 5.1

    • [Close?] Set up support rotation #27
      • Some change needed for Alan for first 3 months of 2024, so not ready to close yet
    • [Keep] Deliverable D5.2: support portal #28
      • Kenneth will finish D5.2 today, so it can get final review by Alan
      • D5.2 got major revision in the last couple of days after initial feedback provided by Thomas
  • Task 5.2

    • [Closed] Decide on schedule for periodic testing #35
      • We now test 1-node and 2-node runs on Vega, Karolina and AWS Magic Castle through daily cronjobs
      • CI configuration defined by test suite #93
    • [Closed] Alter AWS and Vega test runs so they use v0.1 of test-suite #112
    • [Closed] Setup CI runs on AWS MC #128
  • Task 5.3

    • [Move] Integrate testing in build-and-deploy workflow #23
      • Modify software-layer's test.sh to run test suite
      • end date should be moved to end of 2024
      • should be broken up into smaller parts, e.g. actually running tests in the software layer, testing on different OSes
    • [Move] GitHub App (v1.0) #44
      • can be rescheduled to end of 2024, until test step +
    • [Move] GitHub App (overview ticket, different versions) #41
    • [Close?] Set up infrastructure to build/test/deploy software in EESSI using bot #45
      • Magic Castle cluster is 'done'? => yes in AWS
      • Thomas will open issue on enabling cross-cloud testing before deployment
    • [Closed] document contribution policy for adding software to EESSI #108
    • [Keep] Deliverable 5.1 #48
  • WP 6 updates:

    • Magic Castle: Terraforming the Cloud to Teach HPC
      • at SC23 by Alan + Félix-Antoine, ~20 participants
    • Best Practices for CernVM-FS in HPC (https://www.youtube.com/watch?v=L0Mmy7NBXDU)
    • Streaming optimised scientific software: an introduction to EESSI (https://www.youtube.com/watch?v=KAYI9oKFLxA)
      • over 100 registrations, ~60 attending in Zoom
    • Talk on EESSI by Bob at SURF advanced computing user Day (Dutch HPC community)
    • Elevator pitch is available on the shared files
      • Also used in the CVMFS tutorial, we can cut that out from the recording and reuse
      • Can do a dedicated recording of elevator pitch
    • MultiXscale website will be down for a few days for maintenance
      • Petra Papez will be working on this
    • EuroHPC talk by Daniel Paralka. Discussed which CoEs fit well in the EuroHPC strategy. Gave MultiXscale as an example of a good fit
    • invitation to join in TeraTec forum
      • see https://www.forumteratec.com
      • 29-30 May 2024 in Paris
      • European corner, 15-20 projects max.
      • participation cost ~1,200 euro
      • we need to let them know by Jan'24 if we're interested
      • DoItNow has participated in past editions, and also has a booth there
      • meeting was set up with organisers (Elisabeth, Susana & ??)
      • open question: can costs be covered by MultiXscale?
  • WP 7 updates:

    • New newsletter coming up, in preparation. Draft will be sent around.
      • make sure to include tutorials & Paralka's reference of MultiXscale
  • WP 8:

    • [Keep] First periodic report #130
      • [Ask Neja to elaborate on what needs to be done]
      • Deadline: 8 Jan 2024
      • work done + impact by each WP leader
      • update of exploitation plan by every partner
      • preliminary financial statements
      • see template that was sent via email
    • notify Neja of the dates when milestones were accomplished
      • needs to be entered in EU portal
  • Castiel 2 updates

    • CASTIEL2 will provide template presentation for project review
  • status check on deliverables M12

    • (see notes above)
    • D1.1 (RUG - Bob/Pedro) Report on shared software stack prototype
      • Written by Pedro/Bob
      • Reviewed by Caspar. Pedro is finalizing the implementation of the review comments.
    • D1.2 (SURF - Caspar) Plan for the design of a portable test suite
      • Written by Caspar
      • Reviewed by Kenneth. Review comments processed.
      • Next review: Alan (?)
    • D5.1 (UiB) Community contribution policy and GitHub App
      • ...
    • D5.2 (UGent) Support portal
      • ...