Use tensor core to calculate back-to-back HGEMM (half-precision general matrix multiplication) with MMA PTX instruction.
gpu cuda cublas nvidia gemm matrix-multiply tensor-core hgemm back2back-hgemm fused-hgemm back2back-gemm fused-gemm
-
Updated
Nov 3, 2023 - Cuda