Add devicemap to spmm benchmark #294

Merged: 2 commits, Jul 26, 2024

Contributor

@devreal devreal commented Jun 27, 2024

Add a simple device mapping for the SPMM benchmark. This mapping combines the i and j indices so that every k for a given [i, j] runs on the same device. Maybe there are better approaches?
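
As a rough illustration of the idea (not the code in this PR), a device map of this kind could look like the sketch below. `TileKey`, `device_for_tile`, and `num_tile_cols` are hypothetical names introduced here, and the row-major fold of (i, j) is only one assumed way of combining the two indices. Indices are assumed to be non-negative.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical task key for illustration: tile row i, tile column j,
// and inner (reduction) index k.
struct TileKey {
  std::int64_t i;
  std::int64_t j;
  std::int64_t k;
};

// Hypothetical helper: chooses a device from i and j only, so k never
// influences the choice. num_tile_cols is the assumed number of tile columns.
inline int device_for_tile(const TileKey& key, std::int64_t num_tile_cols,
                           int num_devices) {
  assert(num_devices > 0 && num_tile_cols > 0);
  // Fold (i, j) into a single row-major index; k is deliberately ignored.
  const std::int64_t linear = key.i * num_tile_cols + key.j;
  return static_cast<int>(linear % num_devices);
}
```

Because k does not enter the computation, all contributions to one [i, j] tile land on the same device, which would avoid moving the accumulating tile between devices. Note that with an even `num_tile_cols` and two devices this particular fold reduces to `j % 2`, so a different combination may balance the load better.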

Here is the performance on Seawulf:

1 device:

TTG-PARSEC PxQxR=   1 1 1 1 average_NB= 1.00033e+06 M= 1200 N= 16384 K= 16384 t= 1024 T=1024 Tiling= RandomIrregularTiling A_density= 1 B_density= 1 gflops= 644.245 seconds= 0.163009 gflops/s= 3952.21
TTG-PARSEC PxQxR=   1 1 1 1 average_NB= 1.00033e+06 M= 1200 N= 16384 K= 16384 t= 1024 T=1024 Tiling= RandomIrregularTiling A_density= 1 B_density= 1 gflops= 644.245 seconds= 0.115536 gflops/s= 5576.14
TTG-PARSEC PxQxR=   1 1 1 1 average_NB= 1.00033e+06 M= 1200 N= 16384 K= 16384 t= 1024 T=1024 Tiling= RandomIrregularTiling A_density= 1 B_density= 1 gflops= 644.245 seconds= 0.115596 gflops/s= 5573.25
TTG-PARSEC PxQxR=   1 1 1 1 average_NB= 1.00033e+06 M= 1200 N= 16384 K= 16384 t= 1024 T=1024 Tiling= RandomIrregularTiling A_density= 1 B_density= 1 gflops= 644.245 seconds= 0.104487 gflops/s= 6165.79
TTG-PARSEC PxQxR=   1 1 1 1 average_NB= 1.00033e+06 M= 1200 N= 16384 K= 16384 t= 1024 T=1024 Tiling= RandomIrregularTiling A_density= 1 B_density= 1 gflops= 644.245 seconds= 0.104266 gflops/s= 6178.86

2 devices without mapping:

TTG-PARSEC PxQxR=   1 1 1 1 average_NB= 1.00033e+06 M= 1200 N= 16384 K= 16384 t= 1024 T=1024 Tiling= RandomIrregularTiling A_density= 1 B_density= 1 gflops= 644.245 seconds= 0.2646 gflops/s= 2434.79
TTG-PARSEC PxQxR=   1 1 1 1 average_NB= 1.00033e+06 M= 1200 N= 16384 K= 16384 t= 1024 T=1024 Tiling= RandomIrregularTiling A_density= 1 B_density= 1 gflops= 644.245 seconds= 0.370376 gflops/s= 1739.44
TTG-PARSEC PxQxR=   1 1 1 1 average_NB= 1.00033e+06 M= 1200 N= 16384 K= 16384 t= 1024 T=1024 Tiling= RandomIrregularTiling A_density= 1 B_density= 1 gflops= 644.245 seconds= 0.413399 gflops/s= 1558.41
TTG-PARSEC PxQxR=   1 1 1 1 average_NB= 1.00033e+06 M= 1200 N= 16384 K= 16384 t= 1024 T=1024 Tiling= RandomIrregularTiling A_density= 1 B_density= 1 gflops= 644.245 seconds= 0.411325 gflops/s= 1566.27
TTG-PARSEC PxQxR=   1 1 1 1 average_NB= 1.00033e+06 M= 1200 N= 16384 K= 16384 t= 1024 T=1024 Tiling= RandomIrregularTiling A_density= 1 B_density= 1 gflops= 644.245 seconds= 0.27798 gflops/s= 2317.6

2 devices with mapping:

TTG-PARSEC PxQxR=   1 1 1 1 average_NB= 1.00033e+06 M= 1200 N= 16384 K= 16384 t= 1024 T=1024 Tiling= RandomIrregularTiling A_density= 1 B_density= 1 gflops= 644.245 seconds= 0.120567 gflops/s= 5343.46
TTG-PARSEC PxQxR=   1 1 1 1 average_NB= 1.00033e+06 M= 1200 N= 16384 K= 16384 t= 1024 T=1024 Tiling= RandomIrregularTiling A_density= 1 B_density= 1 gflops= 644.245 seconds= 0.076282 gflops/s= 8445.57
TTG-PARSEC PxQxR=   1 1 1 1 average_NB= 1.00033e+06 M= 1200 N= 16384 K= 16384 t= 1024 T=1024 Tiling= RandomIrregularTiling A_density= 1 B_density= 1 gflops= 644.245 seconds= 0.076096 gflops/s= 8466.21
TTG-PARSEC PxQxR=   1 1 1 1 average_NB= 1.00033e+06 M= 1200 N= 16384 K= 16384 t= 1024 T=1024 Tiling= RandomIrregularTiling A_density= 1 B_density= 1 gflops= 644.245 seconds= 0.075933 gflops/s= 8484.39
TTG-PARSEC PxQxR=   1 1 1 1 average_NB= 1.00033e+06 M= 1200 N= 16384 K= 16384 t= 1024 T=1024 Tiling= RandomIrregularTiling A_density= 1 B_density= 1 gflops= 644.245 seconds= 0.08367 gflops/s= 7699.83
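
For reading the numbers above: the last column appears to be the `gflops` field divided by `seconds`. The small sketch below recomputes that rate and compares the best wall times of each configuration. The helper names are hypothetical, and using the best (minimum) time as the comparison metric is an assumption made here, not something stated in the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Rate as it appears in the last log column: total GFLOP over wall time.
inline double gflops_per_second(double gflop, double seconds) {
  assert(seconds > 0.0);
  return gflop / seconds;
}

// Smallest wall time in a non-empty set of repetitions.
inline double best_time(const std::vector<double>& seconds) {
  assert(!seconds.empty());
  return *std::min_element(seconds.begin(), seconds.end());
}

// Speedup of a candidate configuration over a baseline, using best times.
inline double speedup(const std::vector<double>& baseline,
                      const std::vector<double>& candidate) {
  return best_time(baseline) / best_time(candidate);
}

// Wall times (seconds) copied from the runs listed above.
const std::vector<double> one_device = {0.163009, 0.115536, 0.115596,
                                        0.104487, 0.104266};
const std::vector<double> two_devices_unmapped = {0.2646, 0.370376, 0.413399,
                                                  0.411325, 0.27798};
const std::vector<double> two_devices_mapped = {0.120567, 0.076282, 0.076096,
                                                0.075933, 0.08367};
```

With these inputs, `speedup(one_device, two_devices_mapped)` comes out to roughly 1.37, while `speedup(one_device, two_devices_unmapped)` is below 1, meaning two devices without the mapping are slower than one.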

Signed-off-by: Joseph Schuchart <joseph.schuchart@stonybrook.edu>
@devreal devreal requested review from evaleev and therault June 27, 2024 13:42
Thanks to Nilesh for catching this.

Signed-off-by: Joseph Schuchart <joseph.schuchart@stonybrook.edu>
Contributor

@therault therault left a comment

See my comment inline. Otherwise, the checks are failing because some tests in PaRSEC fail to compile... Is this a new behavior? This is unrelated to the changes in this PR.

examples/spmm/spmm_cuda.cc
Contributor

@therault therault left a comment

LGTM

Contributor Author

devreal commented Jul 26, 2024

@therault comments are noted. I will merge this to incorporate it into #273 and include it in the restructuring happening there. We will revisit the mapping later.

@devreal devreal merged commit 7fb4027 into TESSEorg:master Jul 26, 2024
4 checks passed