Regression from hoisting out of nested loop #71059

Closed
performanceautofiler bot opened this issue Jun 21, 2022 · 11 comments
Labels: arch-x64, area-CodeGen-coreclr, os-linux, runtime-coreclr

@performanceautofiler

Run Information

| Architecture | x64 |
| --- | --- |
| OS | ubuntu 18.04 |
| Baseline | e5acd4dfd7b31ed790687e9f3da642fe1c964dc3 |
| Compare | 261574bf4121b40c7023c73571fb7a690397bd0f |

Regressions in Benchstone.MDBenchI.MDGeneralArray

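| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Test - Duration of single invocation | 13.31 ms | 16.47 ms | 1.24 | 0.00 | False | | | | | |
| Test2 - Duration of single invocation | 13.21 ms | 16.47 ms | 1.25 | 0.00 | False | | | | | |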

Repro

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net6.0 --filter 'Benchstone.MDBenchI.MDGeneralArray*'

Histogram

Benchstone.MDBenchI.MDGeneralArray.Test


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 16.47376638701923 > 13.9781989521875.
IsChangePoint: Marked as a change because one of 6/14/2022 10:05:19 AM, 6/21/2022 6:07:15 AM falls between 6/12/2022 5:56:07 PM and 6/21/2022 6:07:15 AM.
IsRegressionStdDev: Marked as regression because -367.3133210148429 (T) = (0 -16562240.309642797) / Math.Sqrt((970113886.8242489 / (30)) + (1800460994.5065267 / (39))) is less than -1.9960083540247138 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (30) + (39) - 2, .025) and -0.24455339319226346 = (13307778.03526843 - 16562240.309642797) / 13307778.03526843 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
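
The `IsRegressionStdDev` line above can be checked by hand. The sketch below recomputes the statistic for `Benchstone.MDBenchI.MDGeneralArray.Test` from the logged means, variances, and sample sizes. One assumption is worth flagging: the logged expression prints `0` as the first term of the numerator, but the reported T value is only reproduced when the baseline mean (13307778.03526843) is used there, so the code uses the baseline mean. The function and variable names are illustrative and are not taken from the perf tooling.

```python
import math


def two_sample_t(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Two-sample t statistic using unpooled variances."""
    return (mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)


# Values copied from the IsRegressionStdDev log line for MDGeneralArray.Test.
baseline_mean, baseline_var, baseline_n = 13307778.03526843, 970113886.8242489, 30
compare_mean, compare_var, compare_n = 16562240.309642797, 1800460994.5065267, 39

# Assumption: the numerator is (baseline mean - compare mean), not (0 - compare mean).
t_stat = two_sample_t(baseline_mean, baseline_var, baseline_n,
                      compare_mean, compare_var, compare_n)
relative_change = (baseline_mean - compare_mean) / baseline_mean

# Critical value exactly as logged: StudentT.InvCDF(0, 1, (30) + (39) - 2, .025).
T_CRITICAL = -1.9960083540247138

# Both conditions from the log line must hold for the check to flag a regression.
is_regression = t_stat < T_CRITICAL and relative_change < -0.05

print(f"T = {t_stat:.4f}, relative change = {relative_change:.4f}, "
      f"regression = {is_regression}")
```

Under that assumption the recomputed T agrees with the logged -367.31, and the relative change agrees with the logged -0.2446, which is the roughly 24% slowdown shown in the Test/Base column.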

Benchstone.MDBenchI.MDGeneralArray.Test2


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 16.472966578125 > 13.863741196189906.
IsChangePoint: Marked as a change because one of 6/14/2022 10:05:19 AM, 6/21/2022 6:07:15 AM falls between 6/12/2022 5:56:07 PM and 6/21/2022 6:07:15 AM.
IsRegressionStdDev: Marked as regression because -314.5082831379247 (T) = (0 -16561237.119994191) / Math.Sqrt((2077293600.8144262 / (30)) + (1676453099.0390358 / (39))) is less than -1.9960083540247138 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (30) + (39) - 2, .025) and -0.2518517315082016 = (13229391.870586464 - 16561237.119994191) / 13229391.870586464 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

Run Information

| Architecture | x64 |
| --- | --- |
| OS | ubuntu 18.04 |
| Baseline | e5acd4dfd7b31ed790687e9f3da642fe1c964dc3 |
| Compare | 261574bf4121b40c7023c73571fb7a690397bd0f |

Regressions in Burgers

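| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Test3 - Duration of single invocation | 87.95 ms | 115.96 ms | 1.32 | 0.08 | False | | | | | |
| Test1 - Duration of single invocation | 170.74 ms | 261.06 ms | 1.53 | 0.00 | False | | | | | |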

Repro

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net6.0 --filter 'Burgers*'

Histogram

Burgers.Test3


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 115.955026975 > 92.70167878125.
IsChangePoint: Marked as a change because one of 6/14/2022 10:05:19 AM, 6/21/2022 6:07:15 AM falls between 6/12/2022 5:56:07 PM and 6/21/2022 6:07:15 AM.
IsRegressionStdDev: Marked as regression because -76.86974915808065 (T) = (0 -117696004.90405983) / Math.Sqrt((712968655004.949 / (30)) + (4743413455200.672 / (39))) is less than -1.9960083540247138 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (30) + (39) - 2, .025) and -0.3316230571523148 = (88385376.23083334 - 117696004.90405983) / 88385376.23083334 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Burgers.Test1


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 261.05918285714284 > 179.29468991538465.
IsChangePoint: Marked as a change because one of 6/14/2022 10:05:19 AM, 6/21/2022 6:07:15 AM falls between 6/12/2022 5:56:07 PM and 6/21/2022 6:07:15 AM.
IsRegressionStdDev: Marked as regression because -1718.0951023682271 (T) = (0 -261284212.0427163) / Math.Sqrt((23363070275.099754 / (30)) + (77899970943.01614 / (39))) is less than -1.9960083540247138 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (30) + (39) - 2, .025) and -0.530140609621464 = (170758301.8186508 - 261284212.0427163) / 170758301.8186508 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Run Information

| Architecture | x64 |
| --- | --- |
| OS | ubuntu 18.04 |
| Baseline | e5acd4dfd7b31ed790687e9f3da642fe1c964dc3 |
| Compare | 261574bf4121b40c7023c73571fb7a690397bd0f |

Regressions in Benchstone.BenchI.Array2

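| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Test - Duration of single invocation | 679.53 ms | 906.73 ms | 1.33 | 0.00 | False | | | | | |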

Repro

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net6.0 --filter 'Benchstone.BenchI.Array2*'

Histogram

Benchstone.BenchI.Array2.Test


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 906.7278614615384 > 712.94808081875.
IsChangePoint: Marked as a change because one of 6/14/2022 10:05:19 AM, 6/21/2022 6:07:15 AM falls between 6/12/2022 5:56:07 PM and 6/21/2022 6:07:15 AM.
IsRegressionStdDev: Marked as regression because -605.4595292637864 (T) = (0 -907427594.9044238) / Math.Sqrt((3461697173857.5376 / (30)) + (1015343504891.7698 / (39))) is less than -1.9960083540247138 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (30) + (39) - 2, .025) and -0.33497085726601056 = (679735883.3456594 - 907427594.9044238) / 679735883.3456594 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

@performanceautofiler performanceautofiler bot added the CoreClr and untriaged labels Jun 21, 2022
@kunalspathak kunalspathak changed the title from "[Perf] Changes at 6/14/2022 2:27:40 PM" to "Regression from hoisting out of nested loop" Jun 21, 2022
@kunalspathak kunalspathak transferred this issue from dotnet/perf-autofiling-issues Jun 21, 2022
@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@kunalspathak kunalspathak assigned kunalspathak and unassigned EgorBo Jun 21, 2022
@kunalspathak
Member

#68061

@kunalspathak
Member

kunalspathak commented Jun 21, 2022

@kunalspathak
Member

There are some interesting diffs to look into. In particular, the loop variable is stored and loaded inside the inner loops.

image

Full diff: https://www.diffchecker.com/BDiOlcSB

@jeffschwMSFT jeffschwMSFT added the area-CodeGen-coreclr label Jun 22, 2022
@ghost

ghost commented Jun 22, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: performanceautofiler[bot]
Assignees: kunalspathak
Labels: area-CodeGen-coreclr, untriaged, refs/heads/main, ubuntu 18.04, RunKind=micro, Regression, CoreClr, x64
Milestone: -

@JulieLeeMSFT JulieLeeMSFT removed the untriaged label Jun 23, 2022
@JulieLeeMSFT JulieLeeMSFT added this to the 7.0.0 milestone Jun 23, 2022
@kunalspathak
Member

The MD array regressions are fixed by #70271, but I will double-check whether these benchmarks would have improved even more without the hoisting PR.

image

image

@kunalspathak
Member

I spent some time investigating Burgers.Test1, and I don't see any significant difference after the loop-hoisting change that would have regressed this benchmark. In fact, the change removes a "load from stack" from the hot loop. Complete diffs: https://www.diffchecker.com/vdWhemwy

I do notice that after the loop-hoisting change, the hot loop ends with a "JCC erratum" hit, and I am guessing that could be the reason for the regression.

image
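
For context on the JCC erratum guess above: on affected Intel processors, the microcode mitigation stops jump instructions that cross or end on a 32-byte boundary from being served out of the decoded-instruction cache, which can slow down a tight loop whose closing branch lands on such a boundary. The sketch below is a minimal check for that condition. The addresses and instruction lengths are hypothetical and do not come from the diffs in this issue, and the function name is illustrative.

```python
BOUNDARY = 32  # bytes; the mitigation is defined in terms of 32-byte blocks


def hits_jcc_erratum(start: int, length: int) -> bool:
    """Return True if a jump occupying [start, start + length) crosses
    or ends on a 32-byte boundary."""
    end = start + length  # address of the first byte after the instruction
    crosses = start // BOUNDARY != (end - 1) // BOUNDARY
    ends_on_boundary = end % BOUNDARY == 0
    return crosses or ends_on_boundary


# Hypothetical (start address, length in bytes) pairs for a loop's closing jump.
examples = [(0x10, 2), (0x1E, 2), (0x1F, 2), (0x20, 6), (0x3A, 6)]
for start, length in examples:
    print(f"{start:#06x} len={length} affected={hits_jcc_erratum(start, length)}")
```

For a macro-fused compare-and-branch pair, the same check would be applied to the start of the compare and the combined length of both instructions. Because the result depends only on where the branch happens to land, an unrelated change earlier in the method can shift a loop's closing jump onto or off a boundary.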

@kunalspathak
Member

Likewise for Benchstone.BenchI.Array2, there is no change to the generated code that would have impacted the performance.

https://www.diffchecker.com/NyBN3k3F

@kunalspathak
Member

I added a flag to disable "hoisting expression out of nested loop" in kunalspathak@a29daa4 to see how it affects performance after #70271. Hoisting doesn't change the generated code, as seen in https://www.diffchecker.com/wQsuJBye. There, the left side is the code generated without "hoisting out of nested loop" and the right side is the code generated with it (same as main).

In fact, we hoist fetching upper/lower bounds out of nested loops as seen below.

image
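
As a source-level analogy for hoisting bound fetches out of nested loops, the sketch below contrasts a nested loop that re-reads a dimension's lower bound and length on every iteration with one that reads them once before the loops start. It is only an illustration: the JIT performs this transformation on its internal representation, and the `MDArray` class and its methods are hypothetical stand-ins, not .NET APIs.

```python
class MDArray:
    """Hypothetical 2-D array with per-dimension lower bounds."""

    def __init__(self, rows, cols, lower=(0, 0)):
        self._lower = lower
        self._shape = (rows, cols)
        self._data = [[r * cols + c for c in range(cols)] for r in range(rows)]
        self.bound_reads = 0  # counts how often a bound or length is fetched

    def lower_bound(self, dim):
        self.bound_reads += 1
        return self._lower[dim]

    def length(self, dim):
        self.bound_reads += 1
        return self._shape[dim]

    def get(self, i, j):
        return self._data[i - self._lower[0]][j - self._lower[1]]


def sum_unhoisted(a):
    total = 0
    i = a.lower_bound(0)
    while i < a.lower_bound(0) + a.length(0):
        j = a.lower_bound(1)
        # The bounds of dimension 1 are re-read on every inner iteration.
        while j < a.lower_bound(1) + a.length(1):
            total += a.get(i, j)
            j += 1
        i += 1
    return total


def sum_hoisted(a):
    # The loop-invariant bounds are read once, before either loop runs.
    lo0 = a.lower_bound(0)
    hi0 = lo0 + a.length(0)
    lo1 = a.lower_bound(1)
    hi1 = lo1 + a.length(1)
    total = 0
    for i in range(lo0, hi0):
        for j in range(lo1, hi1):
            total += a.get(i, j)
    return total


a = MDArray(3, 4, lower=(1, 1))
b = MDArray(3, 4, lower=(1, 1))
print(sum_unhoisted(a), a.bound_reads)
print(sum_hoisted(b), b.bound_reads)
```

Both functions return the same sum, but the hoisted version fetches the bounds far fewer times. The cost is that the hoisted values stay live across both loops, which is the source-level counterpart of the register pressure discussed in this thread.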

To conclude, the hoisting changes may have caused a regression by increasing register pressure, but the MD array optimizations bring that down. Comparing against the code generated without loop hoisting also shows little difference, so we can conclude that hoisting has not blocked any of the further improvements seen in #71059 (comment). No further action is needed for this issue, so I am closing it.

@mrsharm
Member

mrsharm commented Aug 8, 2022

@kunalspathak, this regression showed up in multiple configurations while working on the monthly perf report. Should this regression be triaged as "by design"?

Thanks!

Benchstone.BenchI.Array2.Test

| Result | Ratio | Alloc Delta | Operating System | Bit | Processor Name |
| --- | --- | --- | --- | --- | --- |
| Slower | 0.86 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Same | 1.02 | +0 | Windows 11 | Arm64 | Microsoft SQ1 3.0 GHz |
| Same | 0.94 | +0 | macOS Monterey 12.3 | Arm64 | Apple M1 Max |
| Slower | 0.89 | +0 | Windows 10 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Same | 0.90 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Slower | 0.64 | +0 | Windows 10 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Slower | 0.68 | +0 | Windows 10 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) |
| Slower | 0.87 | +0 | Windows 10 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Same | 0.95 | +0 | Windows 11 | X64 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Same | 1.01 | +0 | Windows 11 | X64 | AMD Ryzen 9 3950X |
| Slower | 0.89 | +0 | Windows 11 | X64 | AMD Ryzen 9 5900X |
| Same | 0.97 | +0 | Windows 11 | X64 | AMD Ryzen 9 5950X |
| Slower | 0.66 | +0 | Windows 11 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Same | 0.94 | +0 | Windows 11 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Slower | 0.66 | -288 | Windows 11 | X64 | 11th Gen Intel Core i9-11900H 2.50GHz |
| Same | 1.03 | +0 | ubuntu 18.04 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Same | 1.04 | +0 | ubuntu 18.04 | X64 | Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge) |
| Slower | 0.77 | +0 | ubuntu 18.04 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) |
| Same | 0.93 | +0 | ubuntu 20.04 | X64 | AMD Ryzen 9 5900X |
| Same | 1.02 | +0 | ubuntu 20.04 | X64 | Intel Core i9-10900K CPU 3.70GHz |
| Same | 1.00 | +0 | Windows 10 | X86 | Intel Xeon CPU E5-1650 v4 3.60GHz |
| Same | 1.00 | +0 | Windows 10 | X86 | Intel Core i7-6700 CPU 3.40GHz (Skylake) |
| Same | 0.99 | +0 | Windows 11 | X86 | AMD Ryzen Threadripper PRO 3945WX 12-Cores |
| Slower | 0.85 | +0 | macOS Big Sur 11.6.8 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |
| Same | 0.92 | -672 | macOS Monterey 12.3.1 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) |
| Same | 0.91 | +0 | macOS Monterey 12.4 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) |

@kunalspathak
Member

I re-checked the Array2 benchmarks on unix/x64 as well as on Arm64, with and without loop hoisting, and nothing stands out. The code is definitely better where we hoist certain invariants out of the nested loop. The JCC erratum could be triggering (at least in the unix x64 case) and causing the regression, e.g. https://www.diffchecker.com/XdBiwwH4

The diffs for all the other methods involved in the benchmark are included in the zip file.

71059.zip

I will call it "by design".

@ghost ghost locked as resolved and limited conversation to collaborators Sep 8, 2022
@jeffhandley jeffhandley added the runtime-coreclr, os-linux, and arch-x64 labels and removed the CoreClr label Dec 28, 2022