
Added codellama #2146

Closed
wants to merge 3 commits into from

Conversation

MaanavD
Contributor

@MaanavD MaanavD commented Jan 30, 2024

Adding the codellama model to canary.
[image attached]
(doesn't run on a 16GB GPU)

@xuzhao9
Contributor

xuzhao9 commented Jan 30, 2024

I am good with adding it to canary.
Just asking, does it run on a single 40GB A100?

@MaanavD
Contributor Author

MaanavD commented Jan 31, 2024

@xuzhao9 it runs on A100 :)

$ python run.py codellama -d cuda
Warning: The model codellama cannot be found at core set.
/workspace/bowbao/onnxbench/transformers/src/transformers/utils/hub.py:124: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
config.json: 100%| 637/637 [00:00<00:00, 3.35MB/s]
Running eval method from codellama on cuda in eager mode with input batch size 1 and precision fp16.
GPU Time per batch: 45.310 milliseconds
CPU Wall Time per batch: 45.340 milliseconds
Time to first batch: 75980.8640 ms
GPU 0 Peak Memory: 18.8482 GB
CPU Peak Memory: 1.4941 GB
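The 18.8 GB peak is consistent with back-of-the-envelope fp16 sizing for a 7B-parameter model (an assumption; the log does not state which CodeLlama variant was run): the weights alone need roughly 14 GB, with the KV cache and activations accounting for the rest. That explains why a 16 GB GPU OOMs while a 40 GB A100 has plenty of headroom. A quick arithmetic sketch:

```python
# Rough fp16 memory estimate (hypothetical sizing: the PR does not say
# which CodeLlama variant was benchmarked; 7B parameters is assumed).
PARAMS = 7e9          # assumed parameter count
BYTES_PER_PARAM = 2   # fp16 = 2 bytes per parameter

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
print(f"fp16 weights: {weights_gb:.1f} GB")  # fp16 weights: 14.0 GB
# Observed peak was 18.8 GB = weights + KV cache + activations,
# so a 16 GB card OOMs while a 40 GB A100 fits comfortably.
```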

@xuzhao9
Contributor

xuzhao9 commented Jan 31, 2024

@MaanavD Nice! In that case, I suggest we add this model to models/ instead of canary. We should disable the CPU test because it is too slow and will time out. On A10G it should not OOM, since peak GPU memory is 18.8 GB and the A10G has 24 GB, but we can also disable it there if it does OOM.
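In torchbenchmark, per-model constraints like "skip the CPU test" are typically declared in the model's `metadata.yaml`. A sketch of what that could look like for this model (the field names and syntax are my assumption, modeled on other models in the repo, not taken from this PR):

```yaml
# torchbenchmark/models/codellama/metadata.yaml (hypothetical sketch)
eval_benchmark: false
eval_deterministic: false
eval_nograd: true
train_benchmark: false
train_deterministic: false
not_implemented:
  # Skip CPU eval: too slow, would time out in CI
  # (assumed syntax, based on other models' metadata files)
  - device: cpu
```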

@xuzhao9
Contributor

xuzhao9 commented Feb 23, 2024

LGTM

@facebook-github-bot
Contributor

@xuzhao9 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@xuzhao9 merged this pull request in 4386604.
