Optimized-Matrix-Multiplication

This implementation uses NVIDIA CUDA and the cuBLAS library to accelerate matrix multiplication through the `cublasGemmEx` function. Running on the GPU, and on Tensor Cores where the hardware supports them, makes the computation significantly faster than a single-threaded CPU implementation.

Overview

This application performs basic matrix multiplication: $A \times B = C$.

Matrix dimensions:

- $A$: rowsA x rank
- $B$: rank x colsB
- $C$: rowsA x colsB

Array representation:
Each matrix is stored in a flat one-dimensional array addressed through a single raw pointer, e.g.:
```
float* A = new float[sizeA];
```

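Accessing elements:
Elements are stored in column-major order, so element (i, j) of a matrix with `rows` rows lives at index `j * rows + i`. Printing a matrix row by row therefore looks like this:

```
// Print matrix A (rows x cols, column-major) one row per line.
for (size_t i = 0; i < rows; ++i)
{
    for (size_t j = 0; j < cols; ++j)
    {
        std::cout << A[j * rows + i] << " ";
    }
    std::cout << "\n";
}
```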
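The indexing pattern above can be wrapped in a small helper and reused for a CPU reference multiplication. The following is a minimal sketch, not code from this repository: the names `colMajorIndex` and `multiplyHost` are hypothetical, and `std::vector` is used in place of raw pointers for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset of element (row, col) in a column-major matrix with `rows` rows.
// (Hypothetical helper, not part of the repository.)
inline std::size_t colMajorIndex(std::size_t row, std::size_t col, std::size_t rows)
{
    return col * rows + row;
}

// Hypothetical single-threaded reference:
// C (rowsA x colsB) = A (rowsA x rank) * B (rank x colsB), all column-major.
std::vector<float> multiplyHost(const std::vector<float>& A,
                                const std::vector<float>& B,
                                std::size_t rowsA, std::size_t rank, std::size_t colsB)
{
    assert(A.size() == rowsA * rank && B.size() == rank * colsB);
    std::vector<float> C(rowsA * colsB, 0.0f);
    for (std::size_t j = 0; j < colsB; ++j)
    {
        for (std::size_t k = 0; k < rank; ++k)
        {
            const float b = B[colMajorIndex(k, j, rank)];
            // Innermost loop walks down a column, i.e. contiguous memory.
            for (std::size_t i = 0; i < rowsA; ++i)
            {
                C[colMajorIndex(i, j, rowsA)] += A[colMajorIndex(i, k, rowsA)] * b;
            }
        }
    }
    return C;
}
```

A result computed this way can serve as a baseline for checking the GPU output element by element, within a tolerance suited to the input precision.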

Data types:
Matrices $A$ and $B$ can use either 16-bit or 32-bit floats, but the result matrix $C$ is always 32-bit.
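To get a feel for what 16-bit inputs with a 32-bit result mean numerically, here is a CPU-only sketch that is not taken from this repository. The hypothetical `roundToHalfPrecision` rounds a float to the 11 significant bits of an IEEE half-precision value (it deliberately ignores half-precision overflow and subnormals), and the hypothetical `dotHalfInputsFloatAccumulate` multiplies such rounded inputs while keeping the running sum in 32-bit float.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Round x to 11 significant bits, the precision of a normal FP16 value.
// Assumption: x is within the normal FP16 range; overflow and subnormals
// are not modelled in this sketch.
float roundToHalfPrecision(float x)
{
    if (x == 0.0f || !std::isfinite(x))
    {
        return x;
    }
    int exponent = 0;
    const float mantissa = std::frexp(x, &exponent);   // x = mantissa * 2^exponent
    const float scaled = std::ldexp(mantissa, 11);      // keep 11 significant bits
    return std::ldexp(std::nearbyint(scaled), exponent - 11);
}

// Dot product with FP16-precision inputs and a 32-bit float accumulator.
float dotHalfInputsFloatAccumulate(const std::vector<float>& a,
                                   const std::vector<float>& b)
{
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        sum += roundToHalfPrecision(a[i]) * roundToHalfPrecision(b[i]);
    }
    return sum;
}
```

The inputs lose precision when narrowed to 16 bits, but the accumulated sum keeps full 32-bit precision, which is one reason to keep the result matrix in 32-bit floats.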

Performance:
GPU (device) execution significantly outperforms single-threaded CPU (host) execution.
Starting with cuBLAS version 11, Tensor Cores are used automatically whenever possible. More details are available in the NVIDIA cuBLAS documentation.
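One way to measure such a host-versus-device comparison is a small wall-clock timer. The sketch below is an assumption about how this could be done, not the repository's own measurement code, and `elapsedMilliseconds` is a hypothetical name.

```cpp
#include <chrono>
#include <utility>

// Run `work` once and return the elapsed wall-clock time in milliseconds.
// (Hypothetical helper, not part of the repository.)
template <typename Work>
double elapsedMilliseconds(Work&& work)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

When timing GPU work this way, the callable should wait for the device to finish (for example with `cudaDeviceSynchronize()`) before returning, because GPU work is launched asynchronously with respect to the host and the timer would otherwise stop too early.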

Build Instructions

Linux

Install CUDA toolkit dependencies:
```
sudo apt install nvidia-cuda-toolkit
```
Build the application:
```
make all
```
Run the executable:
```
./main.out
```
