Add coefficient of variation to the benchmark report. #1554

chengjunlu · 2024-07-03T05:04:06Z

Add the coefficient of variation (CV) to the micro-benchmark report.
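For readers unfamiliar with the metric: the coefficient of variation is the standard deviation of repeated measurements divided by their mean, so it expresses run-to-run noise as a fraction of the typical value. The sketch below is illustrative only. The helper name `coefficient_of_variation` and the sample timings are hypothetical, and the choice of the sample standard deviation (n - 1 denominator) is an assumption; this thread does not say which variant the change itself uses.

```python
import statistics


def coefficient_of_variation(samples):
    """Return std / mean for a sequence of measurements (hypothetical helper)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    mean = statistics.fmean(samples)
    if mean == 0:
        raise ValueError("mean is zero; CV is undefined")
    # Sample standard deviation (n - 1 denominator). This is an assumption;
    # a population standard deviation (statistics.pstdev) is equally plausible.
    return statistics.stdev(samples) / mean


# Made-up per-run timings in milliseconds, for illustration only.
timings_ms = [1.00, 1.02, 0.98, 1.01, 0.99]
cv = coefficient_of_variation(timings_ms)
print(f"CV = {cv:.4f}")
```

A CV close to 0 means the repeated runs agree closely; larger values mean the reported mean is less trustworthy.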

pbchekin · 2024-07-03T16:52:44Z

Please test with the actual workflow (you can run "Triton benchmarks" for your branch). Currently the changes do not work:
https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/9781720957/job/27006440292

pbchekin · 2024-07-08T05:02:15Z

Successful run: https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/9833683619/job/27144280954

softmax-performance:
         N  Triton-GB/s  XeTLA-GB/s  Triton-GB/s-min  XeTLA-GB/s-min  Triton-GB/s-max  XeTLA-GB/s-max  Triton-TFlops  XeTLA-TFlops  Triton-TFlops-min  XeTLA-TFlops-min  Triton-TFlops-max  XeTLA-TFlops-max  Triton-CV  XeTLA-CV
0    256.0   666.959247  751.912338       639.375598      476.625457       708.497308      873.813292       0.666959      0.751912           0.639376          0.476625           0.708497          0.873813   0.020683  0.112134
1   1024.0   852.178476  871.008855       845.625798      794.375734       866.591724     1205.259785       0.852178      0.871009           0.845626          0.794376           0.866592          1.205260   0.006625  0.057388
2   2048.0  1326.681649  924.058975      1152.281316      822.412594      1407.484513     1327.311359       1.326682      0.924059           1.152281          0.822413           1.407485          1.327311   0.047909  0.077342
3   4096.0   777.372987  774.463015       718.202711      716.975062       812.849658     1158.647559       0.777373      0.774463           0.718203          0.716975           0.812850          1.158648   0.030980  0.068119
4   8192.0   797.892135  746.956533       772.431690      724.404855       870.187483      812.062733       0.797892      0.746957           0.772432          0.724405           0.870187          0.812063   0.020727  0.021783
5  16384.0   771.169338  753.176996       761.908050      745.654015       794.752057      782.519359       0.771169      0.753177           0.761908          0.745654           0.794752          0.782519   0.010341  0.009802
6  32768.0   840.465288  839.332881       834.064957      832.409566       848.834653      852.284270       0.840465      0.839333           0.834065          0.832410           0.848835          0.852284   0.005450  0.006801

XeTLA-CV for N=256 is 11%.
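One way to spot such rows without reading the table by eye is to flag every CV above a cutoff. This is a sketch only: the CV values are copied from the first six rows of the table above, the function name `noisy_rows` is hypothetical, and the 10% threshold is an arbitrary assumption rather than a project policy.

```python
# (N, Triton-CV, XeTLA-CV), copied from the first six rows of the table above.
rows = [
    (256, 0.020683, 0.112134),
    (1024, 0.006625, 0.057388),
    (2048, 0.047909, 0.077342),
    (4096, 0.030980, 0.068119),
    (8192, 0.020727, 0.021783),
    (16384, 0.010341, 0.009802),
]

CV_THRESHOLD = 0.10  # hypothetical cutoff, chosen only for illustration


def noisy_rows(rows, threshold=CV_THRESHOLD):
    """Return (N, provider, cv) for every measurement whose CV exceeds threshold."""
    flagged = []
    for n, triton_cv, xetla_cv in rows:
        for provider, cv in (("Triton", triton_cv), ("XeTLA", xetla_cv)):
            if cv > threshold:
                flagged.append((n, provider, cv))
    return flagged


for n, provider, cv in noisy_rows(rows):
    print(f"N={n}: {provider} CV is {cv:.0%}")
```

With these values, only the XeTLA measurement at N=256 exceeds the cutoff.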

chengjunlu · 2024-07-08T05:21:29Z

Let's use issue #1566 to track the XeTLA-CV outlier at N=256 and the other issue found here.

chengjunlu requested review from ESI-SYD and pbchekin July 3, 2024 05:04

chengjunlu linked an issue Jul 3, 2024 that may be closed by this pull request

[softmax] Investigate performance variation / degradation from c141986 to 93d168c #1350

Closed

ESI-SYD approved these changes Jul 3, 2024

chengjunlu force-pushed the chengjun/llvm-target-add-cv-in-microbench branch from 277cfe3 to c0f624d Compare July 8, 2024 01:22

pbchekin approved these changes Jul 8, 2024

Add coefficient of variance to the bench mark report.

c0f624d

pbchekin merged commit 69eba17 into llvm-target Jul 8, 2024
7 checks passed

pbchekin deleted the chengjun/llvm-target-add-cv-in-microbench branch July 8, 2024 14:49
