update bench cluster #229

Merged
3outeille merged 59 commits into bench_cluster on Sep 5, 2024

3outeille
Member

No description provided.

C-TC and others added 27 commits July 29, 2024 12:32
Fix _RowLinearAsyncCommunication
…heckpoint

Adding checkpoint after training ends
Memory optimization in async tp-linear
log "No checkpoint path provided" only on rank 0
change nanoset args definition to make it compatible with the parser
Match Transformers RoPE implementation
@3outeille 3outeille merged commit 8d2014f into bench_cluster Sep 5, 2024
6 of 7 checks passed
3outeille added a commit that referenced this pull request Sep 5, 2024
This reverts commit 8d2014f, reversing changes made to 971a46a.
3outeille added a commit that referenced this pull request Sep 5, 2024
This reverts commit 8d2014f, reversing changes made to 971a46a.

8 participants