# fused-gemm

Here is 1 public repository matching this topic...

Bruce-Lee-LY / cuda_back2back_hgemm

Uses Tensor Cores to compute back-to-back HGEMM (half-precision general matrix multiplication) with the MMA PTX instruction.

gpu cuda cublas nvidia gemm matrix-multiply tensor-core hgemm back2back-hgemm fused-hgemm back2back-gemm fused-gemm

Updated Nov 3, 2023
Cuda
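
To make the term "back-to-back GEMM" concrete, here is a minimal CPU sketch in plain C++. It is not the repository's kernel: it uses `float` instead of half precision, uses no Tensor Cores or MMA PTX, and the names `Matrix`, `gemm`, `back2back_unfused`, and `back2back_fused` are hypothetical. It assumes the operation is D = (A × B) × C and shows only the fusion idea: the intermediate product is consumed one row at a time from a small buffer instead of being written out as a full matrix.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major matrix stored as a flat vector (hypothetical helper type).
using Matrix = std::vector<float>;

// Plain GEMM: returns A * B, where A is m x k and B is k x n.
Matrix gemm(const Matrix& a, const Matrix& b,
            std::size_t m, std::size_t k, std::size_t n) {
    assert(a.size() == m * k && b.size() == k * n);
    Matrix out(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t kk = 0; kk < k; ++kk)
            for (std::size_t j = 0; j < n; ++j)
                out[i * n + j] += a[i * k + kk] * b[kk * n + j];
    return out;
}

// Unfused version: the full m x n intermediate T = A * B is materialized,
// then D = T * C is computed by a second GEMM. C is n x p, D is m x p.
Matrix back2back_unfused(const Matrix& a, const Matrix& b, const Matrix& c,
                         std::size_t m, std::size_t k,
                         std::size_t n, std::size_t p) {
    return gemm(gemm(a, b, m, k, n), c, m, n, p);
}

// Fused version: only one row of the intermediate is live at a time.
// On a GPU, this small buffer would play the role of registers or shared
// memory (an assumption for illustration, not a description of the repo).
Matrix back2back_fused(const Matrix& a, const Matrix& b, const Matrix& c,
                       std::size_t m, std::size_t k,
                       std::size_t n, std::size_t p) {
    assert(a.size() == m * k && b.size() == k * n && c.size() == n * p);
    Matrix d(m * p, 0.0f);
    std::vector<float> t_row(n);
    for (std::size_t i = 0; i < m; ++i) {
        // First GEMM, restricted to row i of A.
        std::fill(t_row.begin(), t_row.end(), 0.0f);
        for (std::size_t kk = 0; kk < k; ++kk)
            for (std::size_t j = 0; j < n; ++j)
                t_row[j] += a[i * k + kk] * b[kk * n + j];
        // Second GEMM consumes the row immediately.
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t q = 0; q < p; ++q)
                d[i * p + q] += t_row[j] * c[j * p + q];
    }
    return d;
}
```

Both functions accumulate in the same order, so they produce identical results for the same inputs. The fused form only changes where the intermediate lives, which is the usual motivation for fusing two GEMMs on a GPU: it avoids a round trip of the intermediate matrix through global memory.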
