tolerate `run` without `pass/failed` for benchmarks #438

pohly · 2024-09-11T12:38:17Z

Something in Go 1.20 changed so that there is a `run` event for benchmarks. As before, there is no `pass` or `failed`. This triggers the "test is running and thus must have failed" logic in gotestsum, which causes it to report the benchmark as failed.

As this is the behavior of Go (whether it's correct or not...), gotestsum should accept such output. To reduce the risk that genuine problems go undetected, such incomplete benchmarks get reported when a panic was detected (the original motivation for this workaround).

Related-to: #413 (comment)
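The behavior described above can be illustrated with a minimal sketch. It is not gotestsum's actual implementation: the `event` type and the `unfinished` helper are hypothetical names made up for this example, and panic detection is reduced to a simple prefix check on the output. The action names (`run`, `pass`, `fail`, `skip`, `output`) are the ones emitted by `go test -json`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// event is a hypothetical, trimmed-down stand-in for one `go test -json` record.
type event struct {
	Action string // "run", "pass", "fail", "skip", "output", ...
	Test   string
	Output string
}

// unfinished returns the tests that got a "run" event but no terminal event.
// Benchmarks in that state are tolerated unless a panic was seen in the output.
func unfinished(events []event) []string {
	running := map[string]bool{}
	panicked := false
	for _, e := range events {
		switch e.Action {
		case "run":
			running[e.Test] = true
		case "pass", "fail", "skip":
			delete(running, e.Test)
		case "output":
			// Simplified panic detection, an assumption for this sketch.
			if strings.HasPrefix(e.Output, "panic: ") {
				panicked = true
			}
		}
	}

	var failed []string
	for name := range running {
		if strings.HasPrefix(name, "Benchmark") && !panicked {
			continue // benchmark with "run" but no end event: accept it
		}
		failed = append(failed, name)
	}
	sort.Strings(failed)
	return failed
}

func main() {
	bench := []event{
		{Action: "run", Test: "BenchmarkDiscardLogInfoOneArg"},
		{Action: "output", Test: "BenchmarkDiscardLogInfoOneArg",
			Output: "BenchmarkDiscardLogInfoOneArg-36 \t12695464\t 91.33 ns/op\n"},
	}
	fmt.Println(unfinished(bench)) // prints "[]"

	withPanic := append(bench, event{Action: "output", Output: "panic: fabricated panic\n"})
	fmt.Println(unfinished(withPanic)) // prints "[BenchmarkDiscardLogInfoOneArg]"
}
```

A regular test that is still in the running set at the end is reported as failed in both cases; only benchmarks get the extra tolerance.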


pohly · 2024-09-11T12:40:35Z

testjson/testdata/summary/go-1-20-benchmark-panic

+BenchmarkDiscardLogInfoOneArg
+BenchmarkDiscardLogInfoOneArg-36    	12695464	        91.33 ns/op
+
+DONE 1 tests, 1 failure in 0.000s


Note that the default format is not informative enough when there was a panic: that panic output is not shown.

Other formats don't have that problem:

$ go run ./ --format=standard-verbose --raw-command cat testjson/testdata/input/go-1-20-panicked-benchmark.out
goos: linux
goarch: amd64
pkg: github.com/go-logr/logr/benchmark
cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz
=== RUN BenchmarkDiscardLogInfoOneArg
BenchmarkDiscardLogInfoOneArg
BenchmarkDiscardLogInfoOneArg-36 12695464 91.33 ns/op
PASS
ok github.com/go-logr/logr/benchmark 1.265s
panic: fabricated panic

=== Failed
=== FAIL: github.com/go-logr/logr/benchmark BenchmarkDiscardLogInfoOneArg (unknown)
=== RUN BenchmarkDiscardLogInfoOneArg
BenchmarkDiscardLogInfoOneArg
BenchmarkDiscardLogInfoOneArg-36 12695464 91.33 ns/op

DONE 1 tests, 1 failure in 0.001s

This is unrelated to this PR, I just noticed because the tests use the default format.

pohly · 2024-09-11T12:43:30Z

testjson/execution.go

@@ -276,6 +281,13 @@ func (p *Package) end() []TestEvent {
 			continue
 		}

+		if tc.Test.IsBenchmark() && !p.panicked {


It's debatable whether this should check for a panic. I opted for staying closer to the current behavior unless a benchmark completed successfully.

Alternatively, the exit code could also be considered - but it is not always available?
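One way the exit-code idea could be combined with the panic check is sketched below. This is only an illustration of the trade-off, not code from this PR: `treatAsFailed` is a hypothetical helper, and the exit code is passed as a pointer so that `nil` can stand for "not available" (for example when the input is replayed from a file).

```go
package main

import "fmt"

// treatAsFailed decides whether a test that has a "run" event but no end event
// should be reported as failed. Hypothetical helper, for illustration only.
func treatAsFailed(isBenchmark, panicked bool, exitCode *int) bool {
	if !isBenchmark {
		return true // unfinished regular tests keep the current behavior
	}
	if panicked {
		return true // a detected panic always counts as a failure
	}
	if exitCode != nil {
		return *exitCode != 0 // trust the exit code when it is known
	}
	return false // no panic and no exit code: tolerate the benchmark
}

func main() {
	zero, one := 0, 1
	fmt.Println(treatAsFailed(true, false, nil))   // prints "false"
	fmt.Println(treatAsFailed(true, false, &zero)) // prints "false"
	fmt.Println(treatAsFailed(true, false, &one))  // prints "true"
	fmt.Println(treatAsFailed(true, true, &zero))  // prints "true"
}
```

With the exit code missing, this falls back to exactly the panic-only check from the diff above.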

pohly · 2024-09-16T19:28:23Z

Gentle ping... this is kind of urgent because of kubernetes/kubernetes#127245.

dnephin · 2024-09-16T21:26:05Z

Thank you for the PR! And sorry for the delay. I have been thinking about this change, and I don't think it's quite right.

The problem is that benchmarks don't send run events, but this fix is ignoring failures, which seems like the wrong fix for the problem.

I think #416 is the right fix for this. The problem is that we're adding an empty event into the running map, which then fails later on in this code you are changing. Instead of ignoring the failure, I think we need to fix the missing event.

dnephin · 2024-09-16T21:29:22Z

I added a comment to that issue with some other ideas about a fix. I like your suggestion of looking at the exit status to figure out if we might have failing tests with missing end events, but I'm not sure yet exactly what that should look like.

As a workaround, is it possible to run the benchmarks without gotestsum, or do they get run together with tests?

pohly · 2024-09-17T10:10:06Z

> I think #416 is the right fix for this.

I got odd results when I tried it - see #416 (comment).

> As a workaround, is it possible to run the benchmarks without gotestsum, or do they get run together with tests?

The shell scripts which run the benchmarks don't know that they are running benchmarks 😢 It all depends on what's in the package and which parameters are passed through.

The workaround while we try to figure out a solution would be to revert to what Kubernetes was doing previously: run `go test -json`, capture output, then summarize with gobenchstat. It would still produce erroneous "Failed" output in the summary, but at least the jobs would be considered as passing again because the non-zero exit code of gobenchstat would get ignored.

pohly mentioned this pull request Sep 13, 2024

[Failing Test] Strange ci-benchmark-scheduler-perf-master behavior kubernetes/kubernetes#127245

Open

pohly mentioned this pull request Sep 17, 2024

Fix panic from missing root test #416

Open
