
llama v2 7b #1775

Closed
msaroufim wants to merge 39 commits

Conversation

@msaroufim (Member) commented on Jul 19, 2023:

No description provided.

msaroufim changed the title from msaroufim/llamav2 to WIP msaroufim/llamav2 on Jul 19, 2023
msaroufim changed the title from WIP msaroufim/llamav2 to llama v2 on Jul 20, 2023
msaroufim closed this on Jul 20, 2023
msaroufim reopened this on Jul 20, 2023
@msaroufim (Member, Author) commented:

Ok, weird: there's an OOM error here even though this is passing locally on my A10G.

msaroufim requested a review from xuzhao9 on Jul 20, 2023 at 17:51
msaroufim requested a review from xuzhao9 on Jul 21, 2023 at 00:47
- device: cuda
  test: train
- device: cuda
  test: example
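For context, this hunk appears to come from the model's metadata.yaml, where TorchBench lists device/test pairs that CI should skip. A plausible reconstruction of the surrounding file, assuming the entries sit under a not_implemented key (that key is not visible in the hunk):

    not_implemented:
      - device: cuda
        test: train
      - device: cuda
        test: example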
xuzhao9 (Contributor) commented on the diff:

I am wondering why the cuda example test is disabled. Is it because of OOM?

The example test checks whether the output tensor is deterministic across runs, which will be useful for the accuracy tests.
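For readers outside the project, the determinism check works roughly like the sketch below. This is a minimal illustration of the idea, not the harness's actual code; note the deepcopy step, which is what the DEEPCOPY flag discussed next controls:

    import copy
    import torch

    def example_test(model, inputs):
        # Clone the model so the second run starts from identical weights.
        # For a 7B-parameter model this clone doubles weight memory.
        model_clone = copy.deepcopy(model)
        torch.manual_seed(0)
        out_a = model(*inputs)
        torch.manual_seed(0)
        out_b = model_clone(*inputs)
        # Deterministic if the two runs agree bitwise.
        assert torch.equal(out_a, out_b)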

msaroufim (Member, Author) replied:

Yep, at num_heads = 16 this was the only test that OOM'd, because of some cloning. Lemme get the error message.

xuzhao9 (Contributor) replied:

If cloning is an issue, you can add DEEPCOPY = False to disable it, the same way other models do.

Can you please add DEEPCOPY = False and try again?
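For reference, the suggested change is a class attribute on the benchmark's Model class. A minimal sketch, assuming the standard TorchBench BenchmarkModel layout (the file path and the rest of the class body are illustrative, not the actual PR diff):

    # torchbenchmark/models/llama_v2_7b/__init__.py (illustrative path)
    from torchbenchmark.util.model import BenchmarkModel

    class Model(BenchmarkModel):
        # Ask the harness not to deepcopy the model before the example test;
        # the extra copy doubles peak GPU memory, which can OOM a 7B model.
        DEEPCOPY = False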

msaroufim (Member, Author) replied:

Alright, trying it now in CI.

msaroufim (Member, Author) replied:

Ok, with deepcopy off it still OOMs at the full model size.
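A back-of-the-envelope estimate makes this unsurprising: at 7e9 parameters, the weights alone are about 14 GB in fp16 or 28 GB in fp32, before activations, gradients, or optimizer state, so a 24 GB A10G-class GPU has little headroom even without an extra clone. A quick illustrative calculation (parameter count and dtypes assumed):

    params = 7e9
    for dtype, bytes_per_param in [("fp16", 2), ("fp32", 4)]:
        print(f"{dtype} weights: {params * bytes_per_param / 1e9:.0f} GB")
    # fp16 weights: 14 GB
    # fp32 weights: 28 GB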

@xuzhao9 (Contributor) commented on Jul 21, 2023:

Could you please paste the command and output of running the model locally? For example, the output of:

python run.py llama2 -d cuda
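(run.py also takes a -t flag to run a single test, e.g. python run.py llama2 -d cuda -t train, which can help isolate which test OOMs; flag spelling assumed from the harness at the time.)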

msaroufim requested a review from xuzhao9 on Jul 24, 2023 at 14:53
@xuzhao9 (Contributor) left a review:

Thanks for adding the llama v2 model!

@facebook-github-bot (Contributor) commented:

@msaroufim has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor) commented:

@msaroufim merged this pull request in 196b3b8.

msaroufim changed the title from llama v2 to llama v2 7b on Jul 24, 2023