rocm

Here are 151 public repositories matching this topic...

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

amd cuda inference pytorch transformer llama gpt rocm model-serving tpu hpu mlops xpu llm inferentia llmops llm-serving trainium

Updated Jan 12, 2025
Python

apache / tvm

Open deep learning compiler stack for cpu, gpu and specialized accelerators

javascript machine-learning performance deep-learning metal compiler gpu vulkan opencl tensor spirv rocm tvm

Updated Dec 16, 2024
Python

cupy / cupy

NumPy & SciPy for GPU

python gpu numpy cuda cublas scipy tensor cudnn rocm cupy cusolver nccl curand cusparse nvrtc cutensor nvtx cusparselt

Updated Jan 10, 2025
Python

lshqqytiger / stable-diffusion-webui-amdgpu

Stable Diffusion web UI

web ai deep-learning amd torch image-generation hip amdgpu rocm radeon text2image image2image img2img ai-art directml txt2img stable-diffusion

Updated Dec 25, 2024
Python

dmlc / nnvm

deep-learning deployment metal optimization opencl cuda computation-graph rocm nnvm tvm

Updated Sep 11, 2018
C++

deepmodeling / deepmd-kit

A deep learning package for many-body potential energy representation and molecular dynamics

nodejs python c deep-learning cpp tensorflow cuda molecular-dynamics pytorch computational-chemistry lammps materials-science paddle ipi rocm ase jax potential-energy deepmd

Updated Jan 10, 2025
C++

aphrodite-engine / aphrodite-engine

Large-scale LLM inference engine

machine-learning cuda intel api-rest lora rocm inference-engine tpu inferentia speculative-decoding

Updated Jan 9, 2025
Python

stotko / stdgpu

stdgpu: Efficient STL-like Data Structures on the GPU

cpp gpu modern-cpp openmp cuda stl data-structures gpgpu gpu-acceleration cpp17 stl-containers hip gpu-computing rocm cpp20 stl-like

Updated Nov 26, 2024
C++

ROCm / ROCm-docker

Dockerfiles for the various software layers defined in the ROCm software platform

docker rocm

Updated Aug 21, 2024
Shell

alpaka-group / alpaka

Abstraction Library for Parallel Kernel Acceleration 🦙

cpp hpc gpu openmp cuda header-only cpp17 hip heterogeneous-parallel-programming tbb openacc rocm

Updated Jan 9, 2025
C++

ROCm / rocBLAS

Next generation BLAS implementation for ROCm platform

blas hip rocm

Updated Jan 11, 2025
C++

agenium-scale / nsimd

Agenium Scale vectorization library for CPUs and GPUs

hpc neon cuda avx simd avx2 sse2 simd-programming aarch64 avx512 simd-instructions simd-library sse42 rocm cpp20 sve neon128 cpp20-library vectorization-library

Updated Oct 21, 2021
C

ROCm / k8s-device-plugin

Kubernetes (k8s) device plugin to enable registration of AMD GPU to a container cluster

kubernetes k8s rocm kubernetes-device-plugins

Updated Jan 8, 2025
Go

JuliaGPU / AMDGPU.jl

AMD GPU (ROCm) programming in Julia

julia amdgpu rocm

Updated Jan 11, 2025
Julia

patientx / ComfyUI-Zluda

The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface. Now ZLUDA enhanced for better AMD GPU performance.

windows amd cuda rocm stable-diffusion comfyui zluda

Updated Jan 12, 2025
Python

ROCm / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

pytorch rocm

Updated Jan 11, 2025
Python

LLNL / hiop

HPC solver for nonlinear optimization problems

Updated Dec 16, 2024
C++

ROCm / aomp

AOMP is an open source Clang/LLVM based compiler with added support for the OpenMP® API on Radeon™ GPUs. Use this repository for releases, issues, documentation, packaging, and examples.

amd llvm openmp clang fortran-compiler rocm

Updated Jan 11, 2025
Fortran

eth-cscs / COSMA

Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm

linear-algebra mpi cuda scalapack matrix-multiplication gpu-acceleration rocm matmul communication-optimal pdgemm

Updated Dec 11, 2024
C++

ROCm / MIVisionX

MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.

Updated Jan 10, 2025
C++
