Add t4 for llm perf leaderboard #238

Merged: 18 commits merged into main on Aug 19, 2024
Conversation

baptistecolle (Collaborator) commented Aug 1, 2024

Summary

This PR adds the NVIDIA T4 to the LLM-Perf Leaderboard.

Fixes

  • Fixed the trust_remote_code handling that was broken in the CI/CD pipeline.

Features

  • Added a new machine configuration (T4) to run the LLM performance benchmarking code.
  • Updated the A100 configuration to use the new runs-on definition (without tags); see the sketch after this list.
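
For illustration, here is a minimal sketch of the two runs-on styles, assuming the "without tags" change refers to GitHub Actions runner groups; the labels and the group name below are hypothetical, not the actual values used in this repo:

# Tag/label-based selection (older style; hypothetical labels):
runs-on: [self-hosted, single-gpu, nvidia-gpu, t4]

# Group-based selection (newer style, without tags; hypothetical group name):
runs-on:
  group: aws-g4dn-2xlarge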

Related PRs and Discussions

Workflow Trigger Changes

One thing I am unsure about: I modified the workflow trigger. To reduce unnecessary compute, the workflow can now be triggered manually, and it also runs on each new release of the repo.

I think this could be better. What do you think, @IlyasMoutawwakil?

@baptistecolle changed the title from "WIP: Add t4 for llm perf leaderboard" to "Add t4 for llm perf leaderboard" on Aug 8, 2024
@baptistecolle marked this pull request as ready for review on August 12, 2024
Comment on lines 4 to 6 of the workflow file:
workflow_dispatch: # Manual trigger
release: # Trigger on new release
  types: [published]
IlyasMoutawwakil (Member) commented Aug 15, 2024

I don't think this needs commenting, and why on release?

baptistecolle (Collaborator, Author)

OK, I can remove the comments.

Good question. I think it would be more efficient to run the full benchmark with each release of the pip package, rather than on a daily basis. Running it daily seems wasteful, as the hardware remains unchanged and we’re simply repeating the benchmark for every code change. Since users are likely to benchmark using the PyPI package, it makes more sense to align this workflow with each release. We could also run the benchmarks manually if we discover any issues with them. However, if you prefer running the benchmark daily, I can revert to that schedule. Just let me know your preference.

IlyasMoutawwakil (Member)

I guess there's a misunderstanding. The daily trigger runs different benchmarks (different model + optimization + quantization combinations) each time because it skips configurations that have already been benchmarked. It is also a way to benchmark all configurations without being limited by the 6-hour time constraint of the runners.
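
For reference, a minimal sketch of what such a daily trigger looks like in a GitHub Actions workflow; the cron time below is hypothetical, and the skipping of already benchmarked configurations presumably happens in the benchmarking scripts rather than in the trigger itself:

on:
  workflow_dispatch: # manual trigger for ad hoc runs
  schedule:
    - cron: "0 3 * * *" # hypothetical time: one run per day, each picking up not-yet-benchmarked configurations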

baptistecolle (Collaborator, Author)

Thanks for the explanation, it makes much more sense now. I removed the release trigger and kept the original daily schedule.

IlyasMoutawwakil merged commit 0b69851 into main on Aug 19, 2024
25 of 28 checks passed