Add tritonbench OSS CI #2243

Closed
wants to merge 11 commits into from

@xuzhao9
Contributor

xuzhao9 commented Apr 25, 2024

Generate the userbenchmark JSON file when the benchmark is run with the --ci option.

Test Plan:

$ python run_benchmark.py triton --ci

  x_val    naive_softmax-gbps    naive_softmax-latency    triton_softmax-gbps    triton_softmax-latency
-------  --------------------  -----------------------  ---------------------  ------------------------
    256               190.512                 0.044032                585.143                  0.014336
    384               240.941                 0.052224                682.667                  0.018432
    512               287.439                 0.058368                862.316                  0.019456
    640               310.303                 0.067584                930.909                  0.022528
    768               323.368                 0.077824                983.04                   0.0256
    896               333.395                 0.088064               1061.93                   0.027648
   1024               330.99                  0.101376               1057.03                   0.031744
   1152               323.368                 0.116736               1053.26                   0.03584
   1280               315.077                 0.13312                1077.89                   0.038912
   1408               306.503                 0.150528               1126.4                    0.04096
{
    "name": "triton",
    "environ": {
        "pytorch_git_version": "734a000f16b60b3a4e18404e5047a467e2bc96d4",
        "pytorch_version": "2.4.0.dev20240425+cu121",
        "triton_version": "3.0.0+45fff310c8"
    },
    "metrics": {
        "tritonbench_softmax[x_256-naive_softmax-gbps]": 190.51162788428252,
        "tritonbench_softmax[x_256-naive_softmax-latency]": 0.04403200000524521,
        "tritonbench_softmax[x_256-triton_softmax-gbps]": 585.1428491169094,
        "tritonbench_softmax[x_256-triton_softmax-latency]": 0.014336000196635723,
        "tritonbench_softmax[x_384-naive_softmax-gbps]": 240.9411808385707,
        "tritonbench_softmax[x_384-naive_softmax-latency]": 0.05222399905323982,
        "tritonbench_softmax[x_384-triton_softmax-gbps]": 682.6666425201638,
        "tritonbench_softmax[x_384-triton_softmax-latency]": 0.018432000651955605,
        "tritonbench_softmax[x_512-naive_softmax-gbps]": 287.43859091066304,
        "tritonbench_softmax[x_512-naive_softmax-latency]": 0.058368001133203506,
        "tritonbench_softmax[x_512-triton_softmax-gbps]": 862.3158277685969,
        "tritonbench_softmax[x_512-triton_softmax-latency]": 0.01945599913597107,
        "tritonbench_softmax[x_640-naive_softmax-gbps]": 310.3030278794365,
        "tritonbench_softmax[x_640-naive_softmax-latency]": 0.06758400052785873,
        "tritonbench_softmax[x_640-triton_softmax-gbps]": 930.9090836383094,
        "tritonbench_softmax[x_640-triton_softmax-latency]": 0.02252800017595291,
        "tritonbench_softmax[x_768-naive_softmax-gbps]": 323.3684354132239,
        "tritonbench_softmax[x_768-naive_softmax-latency]": 0.07782399654388428,
        "tritonbench_softmax[x_768-triton_softmax-gbps]": 983.0400248336799,
        "tritonbench_softmax[x_768-triton_softmax-latency]": 0.025599999353289604,
        "tritonbench_softmax[x_896-naive_softmax-gbps]": 333.3953487974944,
        "tritonbench_softmax[x_896-naive_softmax-latency]": 0.08806400001049042,
        "tritonbench_softmax[x_896-triton_softmax-gbps]": 1061.9259241356608,
        "tritonbench_softmax[x_896-triton_softmax-latency]": 0.027648000046610832,
        "tritonbench_softmax[x_1024-naive_softmax-gbps]": 330.9899085677046,
        "tritonbench_softmax[x_1024-naive_softmax-latency]": 0.1013759970664978,
        "tritonbench_softmax[x_1024-triton_softmax-gbps]": 1057.0322723626846,
        "tritonbench_softmax[x_1024-triton_softmax-latency]": 0.03174399957060814,
        "tritonbench_softmax[x_1152-naive_softmax-gbps]": 323.36841477449593,
        "tritonbench_softmax[x_1152-naive_softmax-latency]": 0.11673600226640701,
        "tritonbench_softmax[x_1152-triton_softmax-gbps]": 1053.2571147256976,
        "tritonbench_softmax[x_1152-triton_softmax-latency]": 0.035840000957250595,
        "tritonbench_softmax[x_1280-naive_softmax-gbps]": 315.07692221918046,
        "tritonbench_softmax[x_1280-naive_softmax-latency]": 0.13312000036239624,
        "tritonbench_softmax[x_1280-triton_softmax-gbps]": 1077.894784710746,
        "tritonbench_softmax[x_1280-triton_softmax-latency]": 0.03891199827194214,
        "tritonbench_softmax[x_1408-naive_softmax-gbps]": 306.50340379369584,
        "tritonbench_softmax[x_1408-naive_softmax-latency]": 0.15052799880504608,
        "tritonbench_softmax[x_1408-triton_softmax-gbps]": 1126.4000284552583,
        "tritonbench_softmax[x_1408-triton_softmax-latency]": 0.04095999896526337
    }
}
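The metric keys in the JSON above appear to combine the operator name, the x_val of each table row, and the table's column header. A minimal sketch of that mapping follows, assuming the key pattern tritonbench_<op>[x_<x_val>-<column>] inferred from the output. The helper name flatten_metrics and the rows layout are hypothetical and are not taken from the PR's code.

```python
import json


def flatten_metrics(op_name, rows):
    """Flatten per-input benchmark rows into userbenchmark-style metric keys.

    `rows` maps each x_val to a dict of "<backend>-<metric>" column values.
    This helper is hypothetical; it only mirrors the key pattern seen in
    the JSON output above.
    """
    metrics = {}
    for x_val, columns in rows.items():
        for column, value in columns.items():
            metrics[f"tritonbench_{op_name}[x_{x_val}-{column}]"] = value
    return metrics


# Two rows copied from the table above (rounded values as printed there).
rows = {
    256: {
        "naive_softmax-gbps": 190.512,
        "naive_softmax-latency": 0.044032,
        "triton_softmax-gbps": 585.143,
        "triton_softmax-latency": 0.014336,
    },
    384: {
        "naive_softmax-gbps": 240.941,
        "naive_softmax-latency": 0.052224,
        "triton_softmax-gbps": 682.667,
        "triton_softmax-latency": 0.018432,
    },
}

report = {"name": "triton", "metrics": flatten_metrics("softmax", rows)}
print(json.dumps(report, indent=4))
```

Keeping one flat key per (input, backend, metric) triple means each value can be tracked as its own time series across CI runs without parsing the table.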

@facebook-github-bot
Contributor

@xuzhao9 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@bertmaher
Contributor

Cool! Are results going to be visible somewhere after this diff? Or is more to come?

@xuzhao9
Contributor Author

xuzhao9 commented Apr 25, 2024

> Cool! Are results going to be visible somewhere after this diff? Or is more to come?

There will be follow-up PRs to make the results publicly visible. Stay tuned!

@facebook-github-bot
Contributor

@xuzhao9 merged this pull request in f24df5d.

@xuzhao9 deleted the xz9/tritonbench branch on April 26, 2024, 02:21