Added Cpp Inferencing API to generate the text #156

Merged: 26 commits into quic:main from the cpp_inference branch on Nov 12, 2024

Conversation

asmigosw
Contributor

@asmigosw asmigosw commented Oct 15, 2024

This example demonstrates how to execute a model on AI 100 using Efficient Transformers and C++ APIs. The Efficient Transformers library is utilized for transforming and compiling the model, while the QPC is executed using C++ APIs.

@anujgupt-github
Contributor

Can you move the code files from examples/xyz.cpp to examples/cpp_execution/xyz.cpp?

@anujgupt-github
Contributor

Can you add a README.md to the cpp_execution directory?

@anujgupt-github anujgupt-github added the "good first issue" (Good for newcomers) label Oct 15, 2024
@asmigosw asmigosw force-pushed the cpp_inference branch 3 times, most recently from 95c053c to 99158e7 Compare October 15, 2024 15:33
Signed-off-by: Asmita Goswami <quic_asmigosw@quicinc.com>
@quic-rishinr quic-rishinr added the "in-review" (Review process is ongoing) label Oct 17, 2024
Contributor

@ochougul ochougul left a comment

Add
from QEfficient.utils.logging_utils import logger
and replace all print statements with logger.info.

Also, please paste a screenshot or log of running this file for any model here.
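
A minimal sketch of the print-to-logger change requested above. It uses the standard-library `logging` module as a stand-in so the snippet runs on its own; in the PR the logger would come from `QEfficient.utils.logging_utils`, as the comment says. The helper `report_generation` and its arguments are hypothetical names used only for illustration.

```python
import logging

# Stand-in for `from QEfficient.utils.logging_utils import logger`.
# Assumption: that object behaves like a standard logging.Logger.
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("cpp_execution_example")
logger.setLevel(logging.INFO)


def report_generation(prompt: str, generated_text: str) -> None:
    """Hypothetical helper: log results instead of calling print()."""
    # Before: print("Prompt:", prompt)
    logger.info("Prompt: %s", prompt)
    # Before: print("Generated:", generated_text)
    logger.info("Generated: %s", generated_text)
```

Passing the values as arguments (`%s`) rather than pre-formatting the string lets the logging framework skip the formatting work when the INFO level is disabled.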

make -j 8

# Run the python script to get the generated text
cd ../../../
Contributor

Is this required? Can we not run the file from any other path?
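
One way to address the question above is to resolve paths against the script's own location instead of the current working directory, so that no `cd` is needed before running it. The sketch below is an assumption about how that could look, not the code merged in this PR; the function name `build_dir_for` and the `build` sub-directory name are hypothetical.

```python
from pathlib import Path


def build_dir_for(script_path: str, build_subdir: str = "build") -> Path:
    """Return the build directory that sits next to the given script.

    Resolving against the script's location, not the current working
    directory, lets the script be launched from any path.
    """
    return Path(script_path).resolve().parent / build_subdir


# Typical use inside the example script (not executed here):
#     sys.path.append(str(build_dir_for(__file__)))
```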


## Prerequisite
1. PyBind11
2. Cpp17 or above
Contributor

Is this clear enough? It would be better to state which GCC version you tested on. You can add a line saying that this README was validated with C++ version ** and GCC version **.

This example demonstrates how to execute a model on AI 100 using Efficient Transformers and C++ APIs. The Efficient Transformers library is utilized for transforming and compiling the model, while the QPC is executed using C++ APIs. It is tested on both x86 and ARM platform.

## Prerequisite
1. PyBind11
Contributor

Instead, add the install command: pip3 install pybind11

try:
    import InferenceSetIOBuffer  # noqa: E402
except ImportError:
    logger.info("Error importing InferenceSetIOBuffer Module")
Contributor

Raise an error here. We should not continue execution when we are unable to import the .so file.
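
A minimal sketch of the fail-fast import the comment above asks for. The wrapper `load_extension` is a hypothetical name; it uses `importlib.import_module` so the pattern can be exercised with any module name, whereas the reviewed script imports `InferenceSetIOBuffer` directly.

```python
import importlib
from types import ModuleType


def load_extension(module_name: str) -> ModuleType:
    """Import a compiled extension module, or fail immediately."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with context instead of logging and continuing, so the
        # caller never runs with a missing shared object.
        raise ImportError(
            f"Could not import {module_name!r}; build the shared object first."
        ) from exc
```

Chaining with `from exc` keeps the original traceback, which usually names the missing symbol or library when a compiled module fails to load.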

    except ImportError:
        logger.info("Error importing InferenceSetIOBuffer Module")
else:
    logger.info("so file's folder not found")
Contributor

Raise an error here: FileNotFoundError("Please follow README instructions to first compile the cpp files")
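
The suggestion above could be sketched as follows. The function name `add_build_dir_to_path` and its parameter are hypothetical; only the error type and message come from the review comment.

```python
import os
import sys


def add_build_dir_to_path(build_dir: str) -> None:
    """Make the compiled module importable, or fail if it was never built."""
    if not os.path.isdir(build_dir):
        # Stop here instead of logging: nothing downstream can work
        # without the compiled shared object.
        raise FileNotFoundError(
            "Please follow README instructions to first compile the cpp files"
        )
    if build_dir not in sys.path:
        sys.path.append(build_dir)
```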

aic_enable_depth_first: bool = False,
mos: int = -1,
batch_size: int = 1,
full_batch_size: Optional[int] = None,
Contributor

remove

Comment on lines +141 to +143
enable_debug_logs: bool = False,
stream: bool = True,
full_batch_size: Optional[int] = None,
Contributor

remove

Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
@quic-rishinr quic-rishinr merged commit 244d81f into quic:main Nov 12, 2024
4 checks passed
quic-akuruvil pushed a commit to quic-akuruvil/efficient_transformers that referenced this pull request Nov 12, 2024
* Added Cpp Inferencing API to generate the text

Signed-off-by: Asmita Goswami <quic_asmigosw@quicinc.com>

* fixed rasing errors and README

Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>

---------

Signed-off-by: Asmita Goswami <quic_asmigosw@quicinc.com>
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
Co-authored-by: Onkar Chougule <quic_ochougul@quicinc.com>
Labels
good first issue (Good for newcomers), in-review (Review process is ongoing)
5 participants