
What other libraries does optimum-benchmark support other than transformers? #117

Closed
L1-M1ng opened this issue Feb 1, 2024 · 5 comments

L1-M1ng commented Feb 1, 2024

Can I use Optimum-benchmark to evaluate the performance of qwen.cpp or llama.cpp?

L1-M1ng (Author) commented Feb 1, 2024

I want to evaluate the inference latency, throughput, and parameter count of a custom LLM.
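
For reference, all three of these metrics can be measured directly with plain transformers before any benchmarking harness is involved. A minimal sketch, assuming the custom model loads with AutoModelForCausalLM ("gpt2" stands in for the custom checkpoint):

```python
# Minimal sketch (plain transformers, not optimum-benchmark) of the three
# metrics above; "gpt2" is a placeholder for the custom LLM checkpoint.
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # assumption: swap in the custom model here
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Parameter count: sum the element counts of all weight tensors.
print(f"parameters: {sum(p.numel() for p in model.parameters()):,}")

# Latency and throughput: time a fixed-length greedy generation.
inputs = tokenizer("Hello, my name is", return_tensors="pt")
start = time.perf_counter()
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64, do_sample=False)
elapsed = time.perf_counter() - start

new_tokens = output.shape[1] - inputs["input_ids"].shape[1]
print(f"latency: {elapsed:.3f}s, throughput: {new_tokens / elapsed:.1f} tokens/s")
```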

IlyasMoutawwakil (Member) commented

optimum-benchmark supports transformers, timm, and diffusers as part of the PyTorch backend, and then there's optimum and its subpackages like optimum-intel, optimum-nvidia, etc.
There's also a text-generation-inference (TGI) backend that uses docker-py to benchmark the server end-to-end (i.e. including communication overhead).
Benchmarking llama.cpp is currently not supported. How would you suggest implementing it? It might be possible with a server/client API (like TGI), but I'm not sure.
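
As a rough illustration of the PyTorch backend flow described above, here is a sketch using optimum-benchmark's Python API. The class names and keyword arguments (ProcessConfig, InferenceConfig, scenario=) are assumptions that have shifted between releases, so check them against the installed version:

```python
# Sketch of the PyTorch backend flow via optimum-benchmark's Python API.
# Assumption: class names and keyword arguments (ProcessConfig,
# InferenceConfig, scenario=) match the installed release; they have
# changed between versions, so verify against the local package.
from optimum_benchmark import (
    Benchmark,
    BenchmarkConfig,
    InferenceConfig,
    ProcessConfig,
    PyTorchConfig,
)

benchmark_config = BenchmarkConfig(
    name="pytorch_gpt2",
    launcher=ProcessConfig(),  # run the benchmark in an isolated process
    scenario=InferenceConfig(latency=True, memory=True),
    backend=PyTorchConfig(model="gpt2", device="cpu"),
)
benchmark_report = Benchmark.launch(benchmark_config)
print(benchmark_report)
```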

IlyasMoutawwakil (Member) commented

@L1-M1ng I would love to review a PR adding llama.cpp support; https://github.com/abetlen/llama-cpp-python seems to be the most-starred Python bindings.
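
To give a sense of what such a backend would wrap, here is a minimal latency/throughput sketch against the llama-cpp-python bindings linked above; the GGUF path is a placeholder, and this is not optimum-benchmark code:

```python
# Rough latency/throughput sketch over llama-cpp-python (not optimum-benchmark
# code); the GGUF model path below is a placeholder.
import time

from llama_cpp import Llama

llm = Llama(model_path="models/llama-2-7b.Q4_K_M.gguf")  # placeholder path

start = time.perf_counter()
output = llm("Q: Name the planets in the solar system. A:", max_tokens=64)
elapsed = time.perf_counter() - start

# The call returns an OpenAI-style completion dict with token usage counts.
completion_tokens = output["usage"]["completion_tokens"]
print(f"latency: {elapsed:.3f}s, throughput: {completion_tokens / elapsed:.1f} tokens/s")
```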

IlyasMoutawwakil (Member) commented

llama.cpp support was added in #231 🚀

zaidalyafeai commented

@IlyasMoutawwakil, I am trying to run `optimum-benchmark --config-dir examples/ --config-name llama_cpp_text_generation`, but I get this error:

Error getting class at optimum_benchmark.backends.llama_cpp.backend.LlamaCppBackend: Error loading 'optimum_benchmark.backends.llama_cpp.backend.LlamaCppBackend'
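
(One plausible cause of this class-loading error, though unconfirmed in this thread, is that the llama-cpp-python dependency is missing from the environment, so importing the backend module fails; running `pip install llama-cpp-python` would be the first thing to check.)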
