LLaMA-Inference-Bench

LLM-Inference-Bench: Inference Benchmarking of Large Language Models on AI Accelerators

Matrix of Evaluated Frameworks and Hardware:

Framework / Hardware	NVIDIA A100	NVIDIA H100	NVIDIA GH200	AMD MI250	AMD MI300X	Intel Max1550	Habana Gaudi2	Sambanova SN40L
vLLM	Yes	Yes	Yes	Yes	Yes	Yes	No	N/A
llama.cpp	Yes	Yes	Yes	Yes	Yes	Yes	N/A	N/A
TensorRT-LLM	Yes	Yes	Yes	N/A	N/A	N/A	N/A	N/A
DeepSpeed-MII	Yes	No	No	No	No	No	Yes	N/A
Sambaflow	N/A	N/A	N/A	N/A	N/A	N/A	N/A	Yes
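
The table above can also be queried programmatically, for example to list which frameworks are marked "Yes" for a given accelerator. The snippet below is only a minimal sketch: it transcribes the table into a dictionary and treats a "Yes" cell as a covered combination. The names `HARDWARE`, `MATRIX`, `frameworks_for`, and `hardware_for` are illustrative and are not part of this repository, and the sketch makes no assumption about how "No" differs from "N/A".

```python
# Hypothetical helper, not part of the repository: the support matrix
# transcribed cell by cell from the table above.
HARDWARE = [
    "NVIDIA A100", "NVIDIA H100", "NVIDIA GH200", "AMD MI250",
    "AMD MI300X", "Intel Max1550", "Habana Gaudi2", "Sambanova SN40L",
]

MATRIX = {
    "vLLM":          ["Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "No", "N/A"],
    "llama.cpp":     ["Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "N/A", "N/A"],
    "TensorRT-LLM":  ["Yes", "Yes", "Yes", "N/A", "N/A", "N/A", "N/A", "N/A"],
    "DeepSpeed-MII": ["Yes", "No", "No", "No", "No", "No", "Yes", "N/A"],
    "Sambaflow":     ["N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "Yes"],
}


def frameworks_for(hardware: str) -> list[str]:
    """Return the frameworks whose cell for `hardware` is "Yes"."""
    col = HARDWARE.index(hardware)
    return [fw for fw, row in MATRIX.items() if row[col] == "Yes"]


def hardware_for(framework: str) -> list[str]:
    """Return the accelerators whose cell for `framework` is "Yes"."""
    return [hw for hw, cell in zip(HARDWARE, MATRIX[framework]) if cell == "Yes"]


if __name__ == "__main__":
    print(frameworks_for("Habana Gaudi2"))
    print(hardware_for("TensorRT-LLM"))
```

Keeping the raw cell strings, rather than collapsing them to booleans, preserves the distinction the table draws between "No" and "N/A".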

Key Insights

Cite this work:

@misc{chittyvenkata2024llminferencebenchinferencebenchmarkinglarge,
     title={LLM-Inference-Bench: Inference Benchmarking of Large Language Models on AI Accelerators}, 
     author={Krishna Teja Chitty-Venkata and Siddhisanket Raskar and Bharat Kale and Farah Ferdaus and Aditya Tanikanti and Ken Raffenetti and Valerie Taylor and Murali Emani and Venkatram Vishwanath},
     year={2024},
     eprint={2411.00136},
     archivePrefix={arXiv},
     primaryClass={cs.LG},
     url={https://arxiv.org/abs/2411.00136}, 
}

Acknowledgements

This research used resources of the Argonne Leadership Computing Facility, a U.S. Department of Energy (DOE) Office of Science user facility at Argonne National Laboratory and is based on research supported by the U.S. DOE Office of Science-Advanced Scientific Computing Research Program, under Contract No. DE-AC02-06CH11357. We gratefully acknowledge the computing resources provided and operated by the Joint Laboratory for System Evaluation (JLSE) at Argonne National Laboratory.
