Overview

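Measures the performance of matrix multiplication (MatMul) using cuBLAS.
For each matrix size (16, 32, 64, 128, 256, 512, 1024, 2048), the execution time is measured a configurable number of times and the median is reported.
Titan RTX, RTX 3090, and RTX 4090 are supported. To add another GPU, add the corresponding value to CMAKE_CUDA_ARCHITECTURES in CMakeLists.txt.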
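As a rough guide to which value to add: CMAKE_CUDA_ARCHITECTURES takes the GPU's compute capability with the dot removed. The sketch below uses the compute capabilities NVIDIA publishes for the three supported cards (7.5, 8.6, 8.9). The helper names and the `set(...)` form of the output line are illustrative assumptions, not taken from this repository's CMakeLists.txt.

```python
# Compute capabilities published by NVIDIA for the three supported GPUs.
COMPUTE_CAPABILITY = {
    "TITAN RTX": "7.5",  # Turing
    "RTX 3090": "8.6",   # Ampere
    "RTX 4090": "8.9",   # Ada Lovelace
}


def cmake_arch_value(compute_capability):
    """Convert a compute capability such as '8.6' to the CMake value '86'."""
    return compute_capability.replace(".", "")


def cmake_set_line(gpus):
    """Build a set(CMAKE_CUDA_ARCHITECTURES ...) line for the given GPUs.

    Hypothetical helper: it only formats a string and does not read or
    modify CMakeLists.txt.
    """
    values = sorted({cmake_arch_value(COMPUTE_CAPABILITY[g]) for g in gpus}, key=int)
    return "set(CMAKE_CUDA_ARCHITECTURES " + " ".join(values) + ")"


if __name__ == "__main__":
    print(cmake_set_line(COMPUTE_CAPABILITY))
```

For a GPU not listed here, look up its compute capability in NVIDIA's documentation and add the corresponding two-digit value in the same way.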

Requirements

CUDA
g++
CMake
python
- seaborn
- matplotlib
- pandas

How To Use

mkdir build
cd build
cmake .. && make
./main

Pass a command-line argument to change the number of loop iterations.

./main 1000
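To illustrate what the loop count controls, here is a minimal Python sketch of the measure-repeatedly-and-take-the-median protocol. It is not the code in main.cpp: the workload is a naive CPU matrix multiply standing in for the cuBLAS call, and the function names and the default of 10 loops are assumptions made for the example.

```python
import statistics
import sys
import time


def median_runtime_ms(fn, loops):
    """Run fn() `loops` times and return the median wall-clock time in ms."""
    samples = []
    for _ in range(loops):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)


def naive_matmul(n):
    """CPU stand-in workload; the real benchmark calls cuBLAS instead."""
    a = [[1.0] * n for _ in range(n)]
    b = [[1.0] * n for _ in range(n)]
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def main(argv):
    # The default of 10 loops is an assumption for this sketch only.
    loops = int(argv[1]) if len(argv) > 1 else 10
    for n in (16, 32):
        print(n, median_runtime_ms(lambda: naive_matmul(n), loops))


if __name__ == "__main__":
    main(sys.argv)
```

The median is less sensitive than the mean to occasional slow iterations, so a larger loop count mainly makes the reported value more stable at the cost of a longer run.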

plot.py generates graphs from the results.
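The following is a sketch of what such a plotting step could look like, not the contents of plot.py. It assumes a CSV with hypothetical columns `size` and `time_ms` (the real header and units may differ), and it uses the standard csv module with plain matplotlib rather than the pandas and seaborn stack listed in the requirements.

```python
import csv
import io


def load_results(csv_text):
    """Parse CSV text into sorted (size, time_ms) pairs.

    The column names `size` and `time_ms` are hypothetical; adjust them
    to match the header actually written by ./main.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(int(r["size"]), float(r["time_ms"])) for r in reader]
    return sorted(rows)


def plot_results(rows, out_path, log_scale=False):
    """Draw time against matrix size and save the figure to out_path."""
    # Imported here so load_results works without a plotting stack installed.
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    sizes = [s for s, _ in rows]
    times = [t for _, t in rows]
    fig, ax = plt.subplots()
    ax.plot(sizes, times, marker="o")
    ax.set_xlabel("matrix size")
    ax.set_ylabel("median time [ms]")  # unit is an assumption
    if log_scale:
        ax.set_xscale("log")
        ax.set_yscale("log")
    fig.savefig(out_path)
    plt.close(fig)
```

Calling `plot_results` once with `log_scale=False` and once with `log_scale=True` would give a linear-scale and a log-scale figure for the same data.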

Note to self

If the code has not been changed:

cd build && make && ./main 100000 && cd .. && python3 plot.py

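Performance

Linear-scale

titan

[Figure: output_titan_linear.png]

3090

[Figure: output_3090_linear.png]

4090

[Figure: output_4090_linear.png]

Log-scale

titan

[Figure: output_titan_log.png]

3090

[Figure: output_3090_log.png]

4090

[Figure: output_4090_log.png]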

fukushimalab/cuBLAS_benchmark
