
Add more examples for pipeline parallel inference #11372

Merged: 7 commits merged on Jun 21, 2024

Conversation

@sgwhat (Contributor) commented Jun 20, 2024

Description

This PR adds pipeline parallel inference examples for models that have been evaluated.

How to test?

  • Local test
  • Unit test

@sgwhat sgwhat requested a review from plusbang June 21, 2024 09:05
@@ -0,0 +1,25 @@
source /opt/intel/oneapi/setvars.sh
Review comment (Contributor):
Please add license for each script.


# To run CodeLlama-13b-Instruct-hf
# CCL_ZE_IPC_EXCHANGE=sockets torchrun --standalone --nnodes=1 --nproc-per-node $NUM_GPUS \
# generate.py --repo-id-or-model-path 'codellama/CodeLlama-7b-Instruct-hf' --gpu-num $NUM_GPUS
Review comment (Contributor):
'codellama/CodeLlama-7b-Instruct-hf' -> 'codellama/CodeLlama-13b-Instruct-hf'
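
With the reviewer's correction applied, the launch script would look like the following. This is a sketch based on the snippet in the diff; it assumes `$NUM_GPUS` is set, the oneAPI environment is sourced as shown earlier in the script, and `generate.py` accepts the `--repo-id-or-model-path` and `--gpu-num` arguments shown in the diff:

```shell
# Initialize the Intel oneAPI environment (as in the script under review)
source /opt/intel/oneapi/setvars.sh

# Number of GPUs to shard the model across for pipeline parallel inference
export NUM_GPUS=2

# To run CodeLlama-13b-Instruct-hf, with the model path corrected per the review
CCL_ZE_IPC_EXCHANGE=sockets torchrun --standalone --nnodes=1 --nproc-per-node $NUM_GPUS \
  generate.py --repo-id-or-model-path 'codellama/CodeLlama-13b-Instruct-hf' --gpu-num $NUM_GPUS
```

`CCL_ZE_IPC_EXCHANGE=sockets` selects the IPC exchange mechanism for oneCCL on Level Zero devices; the command requires Intel GPU hardware and the matching runtime, so it is shown here only as an illustration of the corrected invocation.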

@sgwhat sgwhat merged commit 0c67639 into intel-analytics:main Jun 21, 2024
30 of 31 checks passed
RyuKosei pushed a commit to RyuKosei/ipex-llm that referenced this pull request on Jul 19, 2024

* add more model examples for pipeline parallel inference

* add mixtral and vicuna models

* add yi model and past_kv support for chatglm family

* add docs

* doc update

* add license

* update
2 participants