[TUTORIALS] persistent kernel - fp8 matmul (#4099)
Includes a performance comparison between a naive matmul (an improved version of the tutorial matmul), the cuBLAS implementation, and the persistent kernel without and with TMA.