A high-throughput and memory-efficient inference and serving engine for LLMs
Updated Nov 6, 2024 - Python
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators
Stable Diffusion web UI
A deep learning package for many-body potential energy representation and molecular dynamics
stdgpu: Efficient STL-like Data Structures on the GPU
Large-scale LLM inference engine
Agenium Scale vectorization library for CPUs and GPUs
Kubernetes (k8s) device plugin to enable registration of AMD GPU to a container cluster
HPC solver for nonlinear optimization problems
Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm
MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.
Zero-knowledge template library