
MPI benchmark driver #138

Open · thomasgibson wants to merge 4 commits into develop

Conversation

thomasgibson
Contributor

This PR modifies the current driver main.cpp and adds MPI support for launching the benchmark across multiple devices. The main takeaways:

  • Each MPI rank is assigned a specific GPU and launches the benchmark.
  • There is no direct GPU-to-GPU communication.
    • For the dot kernel, the resulting partial sums are reduced across all MPI ranks (on the host) and broadcast back to each rank (via MPI_Allreduce; see the sketch after this list).
    • Benchmark error checking is performed on all ranks.
  • Measured bandwidths are aggregated across all ranks.
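
A minimal sketch of the pattern described above (not the PR's actual code; the CUDA calls and the OMPI_COMM_WORLD_LOCAL_RANK variable are illustrative assumptions, since device selection is model- and launcher-specific):

#include <mpi.h>
#include <cuda_runtime.h>
#include <cstdlib>

int main(int argc, char *argv[])
{
  MPI_Init(&argc, &argv);

  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  // Pick one GPU per rank, e.g. by node-local rank
  // (OMPI_COMM_WORLD_LOCAL_RANK is an Open MPI convention).
  int num_devices;
  cudaGetDeviceCount(&num_devices);
  const char *local = std::getenv("OMPI_COMM_WORLD_LOCAL_RANK");
  const int local_rank = local ? std::atoi(local) : rank;
  cudaSetDevice(local_rank % num_devices);

  // ... each rank runs the benchmark kernels on its own device ...

  // Dot kernel: reduce the per-rank partial sums on the host and hand the
  // global result back to every rank (MPI_Allreduce = reduce + broadcast),
  // so that error checking can run on all ranks.
  double local_sum = 0.0; // produced by this rank's dot kernel
  double global_sum;
  MPI_Allreduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

  MPI_Finalize();
  return 0;
}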

The only major question I have is how MPI should be handled in the CMake build. I am open to suggestions and happy to comply with whatever you all prefer.

Contributor

@tomdeakin tomdeakin left a comment


Please see above for comments on this. I'm also nervous about merging this without some evidence that it works with all supported models.

Can you also think about how we can update the CI to test that this is valid?

src/main.cpp
@@ -137,8 +182,15 @@ std::vector<std::vector<double>> run_all(Stream<T> *stream, T& sum)
timings[3].push_back(std::chrono::duration_cast<std::chrono::duration<double> >(t2 - t1).count());

// Execute Dot
#if USE_MPI
// Synchronize ranks before computing dot-product
MPI_Barrier(MPI_COMM_WORLD);
Contributor


This overhead is not timed, and waits for 4 kernels to finish before starting the timer for dot. Can you explain the motivation for not including synchronisation time for each kernel?
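
For reference, the timing pattern in question looks roughly like this (reassembled from the diff context above; stream->dot() and the timings index are assumptions, not quotes from the PR):

#if USE_MPI
// Untimed: the barrier completes before t1 is taken, so rank skew and the
// wait for the preceding kernels never appear in the dot timing.
MPI_Barrier(MPI_COMM_WORLD);
#endif
t1 = std::chrono::high_resolution_clock::now();
sum = stream->dot(); // only the dot kernel itself falls inside the timer
t2 = std::chrono::high_resolution_clock::now();
timings[4].push_back(std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1).count());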

{
// MiB = 2^20
std::cout << std::setprecision(1) << std::fixed
#if USE_MPI
Contributor


There must be a better way to do this, as this is hard to read. Maybe create a string variable with the label, initialised depending on the MPI case, e.g.:

#ifdef USE_MPI
const char * array_size_str = "Array size (per rank): ";
#else
const char * array_size_str = "Array size: ";
#endif
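
The label could then be streamed unconditionally, e.g. (ARRAY_SIZE stands in for whatever size variable is in scope):

std::cout << array_size_str << ARRAY_SIZE << std::endl;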

#if USE_MPI
MPI_Datatype MPI_DTYPE = use_float ? MPI_FLOAT : MPI_DOUBLE;

// Collect global min/max timings
Contributor


It's not clear that the output will be for a single device, even though this is an MPI code. Is that what we expect, or do we want to (additionally?) report aggregate bandwidth?
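
For concreteness, one way an aggregate figure could be computed (a sketch under assumptions, not the PR's code): since the slowest rank bounds the whole job, the per-rank best times combine with MPI_MAX and the byte count scales with the number of ranks.

#include <mpi.h>
#include <algorithm>
#include <vector>

// Sketch: aggregate bandwidth for one kernel, identical on all ranks.
// 'timings' holds this rank's per-iteration times; the first (warm-up)
// iteration is skipped, as in the existing per-device reporting.
double aggregate_bandwidth(const std::vector<double>& timings, double bytes_per_rank)
{
  double local_min = *std::min_element(timings.begin() + 1, timings.end());

  double slowest; // the slowest rank's best time bounds the aggregate
  MPI_Allreduce(&local_min, &slowest, 1, MPI_DOUBLE, MPI_MAX, MPI_COMM_WORLD);

  int num_ranks;
  MPI_Comm_size(MPI_COMM_WORLD, &num_ranks);
  return (num_ranks * bytes_per_rank) / slowest;
}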

<< std::left << std::setw(12) << std::setprecision(5) << average
<< std::endl;
<< "--------------------------------"
<< std::endl << std::fixed
Contributor


Why do you need to change this? Especially adding the line?

@tomdeakin tomdeakin added this to the v6.0 milestone Oct 6, 2023
@tomdeakin tomdeakin removed this from the v6.0 milestone May 13, 2024
@tomdeakin
Contributor

We've got a large general refactor of the main driver coming in #186.
We should also think some more about what bandwidth we expect an MPI+X version to measure, given that there is no communication apart from the dot product. I think we discussed it, but it would be good to document the reasons for wanting MPI+X versions of BabelStream versus running this benchmark on multiple nodes concurrently with pdsh, srun, etc., and post-processing the results.
