forked from paboyle/Grid
Intel i7 980X benchmarks
Guido Cossu edited this page May 30, 2015 · 1 revision
Test system: Intel® Core™ i7-980X Processor Extreme Edition (12M Cache, 3.33 GHz, 6.40 GT/s Intel® QPI)
Compiler: Clang++ 3.5 -O3 -msse4
Wilson Dirac operator
Grid is setup to use 1 threads
norm result 4.88357e+06
norm ref 4.88357e+06
mflop/s = 10988
norm diff 3.47483e-08
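To put the reported difference in context, it can be expressed relative to the reference norm. The sketch below only divides the two printed values. It assumes both are norms of the same kind (for example squared 2-norms), which the output itself does not state.

```python
# Values copied from the Wilson Dirac operator output above.
norm_ref = 4.88357e+06
norm_diff = 3.47483e-08

# Assumption: "norm diff" and "norm ref" are the same kind of norm,
# so their ratio is a meaningful relative difference.
relative = norm_diff / norm_ref

print(f"relative difference: {relative:.3e}")
```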
Memory bandwidth
/Grid/benchmarks$ ./Grid_memory_bandwidth
Grid is setup to use 1 threads
====================================================================================================
= Benchmarking fused AXPY bandwidth
====================================================================================================
L bytes GB/s Gflop/s
----------------------------------------------------------
4 2.46e+04 40.3 6.71
8 3.93e+05 37.5 6.25
12 1.99e+06 38.4 6.4
16 6.29e+06 37.7 6.28
20 1.54e+07 28.4 4.74
24 3.19e+07 27.6 4.6
28 5.9e+07 26.4 4.4
32 1.01e+08 29.2 4.86
====================================================================================================
= Benchmarking a*x + y bandwidth
====================================================================================================
L bytes GB/s Gflop/s
----------------------------------------------------------
4 2.46e+04 38.3 6.39
8 3.93e+05 38.8 6.47
12 1.99e+06 38.8 6.47
16 6.29e+06 38.6 6.44
20 1.54e+07 30.8 5.13
24 3.19e+07 27.4 4.56
28 5.9e+07 27.3 4.54
32 1.01e+08 29.3 4.88
====================================================================================================
= Benchmarking SCALE bandwidth
====================================================================================================
L bytes GB/s Gflop/s
----------------------------------------------------------
4 1.64e+04 26.7 3.34
8 2.62e+05 18.9 2.37
12 1.33e+06 18.3 2.29
16 4.19e+06 17.8 2.23
20 1.02e+07 18.2 2.27
24 2.12e+07 18.5 2.31
28 3.93e+07 19.6 2.45
32 6.71e+07 18.2 2.27
====================================================================================================
= Benchmarking READ bandwidth
====================================================================================================
L bytes GB/s Gflop/s
----------------------------------------------------------
4 8.19e+03 19.2 9.59
8 1.31e+05 23.6 11.8
12 6.64e+05 21.8 10.9
16 2.1e+06 21.8 10.9
20 5.12e+06 19.1 9.55
24 1.06e+07 19.3 9.66
28 1.97e+07 22.4 11.2
32 3.36e+07 22.5 11.3
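The bytes column and the GB/s to Gflop/s ratio in the four tables above can be reproduced with a simple counting model. This is a sketch inferred from the printed numbers, not taken from the benchmark source: it assumes an L^4 lattice with 8 single-precision reals per site, and the per-kernel array and flop counts in `KERNELS` are guesses chosen to match the tables. The cache size is the 12M figure from the test system line.

```python
NVEC = 8                      # assumed reals per lattice site
REAL_BYTES = 4                # assumed single precision
CACHE_BYTES = 12 * 1024**2    # 12M cache quoted for the i7-980X

# kernel -> (arrays touched, flops per real); inferred from the table ratios
KERNELS = {
    "AXPY": (3, 2),   # read x, read y, write z; one multiply and one add
    "SCALE": (2, 1),  # read x, write z; one multiply
    "READ": (1, 2),   # read x; multiply and add, as in a norm
}

def footprint(kernel, L):
    """Return (bytes moved, flops) for one pass over an L^4 lattice."""
    arrays, flops_per_real = KERNELS[kernel]
    reals = L**4 * NVEC
    return arrays * reals * REAL_BYTES, flops_per_real * reals

for kernel in KERNELS:
    for L in range(4, 33, 4):
        nbytes, nflops = footprint(kernel, L)
        fits = "fits in cache" if nbytes <= CACHE_BYTES else "exceeds cache"
        print(f"{kernel:5s} L={L:2d} bytes={nbytes:.3g} "
              f"bytes/flop={nbytes / nflops:.0f} {fits}")
```

Under these assumptions the AXPY working set crosses the cache size between L=16 and L=20, which is where the AXPY tables show the bandwidth dropping from about 38 GB/s to about 28 GB/s.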
SU3 matrix multiply
Grid/benchmarks$ ./Grid_su3
Grid is setup to use 1 threads
====================================================================================================
= Benchmarking SU3xSU3 x= x*y
====================================================================================================
L bytes GB/s GFlop/s
----------------------------------------------------------
2 2.3e+03 2.27 2.08
4 3.69e+04 5.59 5.13
6 1.87e+05 6.26 5.74
8 5.9e+05 6.29 5.76
10 1.44e+06 6.33 5.8
12 2.99e+06 6.35 5.82
14 5.53e+06 6.33 5.8
16 9.44e+06 5.98 5.49
18 1.51e+07 5.71 5.24
20 2.3e+07 5.64 5.17
22 3.37e+07 5.66 5.19
24 4.78e+07 5.64 5.17
====================================================================================================
= Benchmarking SU3xSU3 z= x*y
====================================================================================================
L bytes GB/s GFlop/s
----------------------------------------------------------
2 3.46e+03 5.75 5.27
4 5.53e+04 14 12.9
6 2.8e+05 15.6 14.3
8 8.85e+05 15.6 14.3
10 2.16e+06 15.8 14.5
12 4.48e+06 15.3 14
14 8.3e+06 15.6 14.3
16 1.42e+07 15.4 14.1
18 2.27e+07 15.5 14.2
20 3.46e+07 15.1 13.8
22 5.06e+07 15.4 14.1
24 7.17e+07 15.1 13.8
====================================================================================================
= Benchmarking SU3xSU3 mult(z,x,y)
====================================================================================================
L bytes GB/s GFlop/s
----------------------------------------------------------
2 3.46e+03 16 14.7
4 5.53e+04 15.1 13.8
6 2.8e+05 15.3 14
8 8.85e+05 15 13.7
10 2.16e+06 14.8 13.6
12 4.48e+06 14.5 13.3
14 8.3e+06 14.8 13.6
16 1.42e+07 14.3 13.1
18 2.27e+07 12.6 11.5
20 3.46e+07 13.5 12.4
22 5.06e+07 14.8 13.6
24 7.17e+07 15 13.8
====================================================================================================
= Benchmarking SU3xSU3 mac(z,x,y)
====================================================================================================
L bytes GB/s GFlop/s
----------------------------------------------------------
2 3.46e+03 1.14 1.14
4 5.53e+04 1.13 1.13
6 2.8e+05 1.15 1.15
8 8.85e+05 1.15 1.15
10 2.16e+06 1.15 1.15
12 4.48e+06 1.15 1.15
14 8.3e+06 1.15 1.15
16 1.42e+07 1.13 1.13
18 2.27e+07 1.14 1.14
20 3.46e+07 1.13 1.13
22 5.06e+07 1.15 1.15
24 7.17e+07 1.15 1.15
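The bytes column of the SU3 tables can be checked in the same way. The sketch below assumes single-precision complex 3x3 matrices on an L^4 lattice. The printed bytes match three matrices per site for the z= x*y, mult and mac tables, and two matrices per site for the x= x*y table. The flop count of 198 per site is the usual count for a 3x3 complex matrix product and is consistent with the GB/s to GFlop/s ratio of the z= x*y and mult tables. It is not claimed to describe the other two tables.

```python
NC = 3                                   # number of colours
COMPLEX_BYTES = 8                        # assumed single-precision complex
MATRIX_BYTES = NC * NC * COMPLEX_BYTES   # 72 bytes per 3x3 matrix

# Per output element: NC complex multiplies (6 flops each)
# plus NC - 1 complex adds (2 flops each).
SU3_FLOPS = NC * NC * (6 * NC + 2 * (NC - 1))

def su3_bytes(L, matrices):
    """Bytes touched on an L^4 lattice when `matrices` fields are accessed."""
    return matrices * L**4 * MATRIX_BYTES

for L in range(2, 25, 2):
    three = su3_bytes(L, 3)
    two = su3_bytes(L, 2)
    ratio = three / (SU3_FLOPS * L**4)
    print(f"L={L:2d} bytes(3 fields)={three:.3g} bytes(2 fields)={two:.3g} "
          f"bytes/flop={ratio:.3f}")
```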