Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Show date of last data point and use confidence intervals. #14

Merged
merged 4 commits into from
Nov 13, 2024

Conversation

vext01
Copy link
Contributor

@vext01 vext01 commented Oct 30, 2024

Example output here:
https://ykjit.github.io/yk-benchmarks/

The error bars are for the most part much slimmer. The geomean plot still has wide error bars.

I've checked and double-checked my working and don't see anything obviously wrong, so I assume that's the nature of geomean confidence intervals.

@ltratt
Copy link
Contributor

ltratt commented Oct 30, 2024

Excellent -- thanks!

The geomean plot still has wide error bars.

Notice that the y-axis scale on the geomean has changed: they are still quite big, but nowhere as big as before when you control for that.

@ltratt ltratt added this pull request to the merge queue Oct 30, 2024
@vext01
Copy link
Contributor Author

vext01 commented Oct 30, 2024

Notice that the y-axis scale on the geomean has changed

Ah hah! Thanks.

@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 30, 2024
@vext01
Copy link
Contributor Author

vext01 commented Oct 30, 2024

This failed for unrelated reasons. The Queens and Permute benchmarks failed:

12:23:36 Executing run:Queens (YkLua, awfy, 1000) default None None None
12:23:36     cmd: /ci/yklua/src/yklua  harness.lua Queens 1  1000
12:23:36     cwd: /ci/suites/awfy/Lua
12:23:36     Run failed. Return code: 134
12:23:36     Output:
12:23:36 
12:23:36         Starting Queens benchmark ...
12:23:36         thread '<unnamed>' panicked at ykrt/src/compile/jitc_yk/codegen/x64/mod.rs:2299:18:
12:23:36         Unexpected HotLocationKind
12:23:36         note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
12:23:36         Queens: iterations=1 runtime: 4175462us
12:23:36         Queens: iterations=1 average: 4175462us total: 4175462us
12:23:36         
12:23:36         Total Runtime: 4175462us
12:23:36         fatal runtime error: failed to initiate panic, error 5
12:23:36         Aborted (core dumped)
12:23:36         
12:23:36     Execution has failed, benchmark is aborted.
12:24:20 Executing run:Permute (YkLua, awfy, 1000) default None None None
12:24:20     cmd: /ci/yklua/src/yklua  harness.lua Permute 1  1000
12:24:20     cwd: /ci/suites/awfy/Lua
12:24:20     Run failed. Return code: 1
12:24:20     Output:
12:24:20 
12:24:20         Starting Permute benchmark ...
12:24:20         /ci/yklua/src/yklua: ./permute.lua:42: bad 'for' initial value (number expected, got function)
12:24:20         stack traceback:
12:24:20         	./permute.lua:42: in function 'permute.permute'
12:24:20         	./permute.lua:44: in function 'permute.permute'
12:24:20         	./permute.lua:41: in function 'permute.permute'
12:24:20         	./permute.lua:44: in function 'permute.permute'
12:24:20         	./permute.lua:44: in function 'permute.permute'
12:24:20         	./permute.lua:29: in function 'permute.benchmark'
12:24:20         	./benchmark.lua:27: in function 'benchmark.inner_benchmark_loop'
12:24:20         	harness.lua:49: in method 'measure'
12:24:20         	harness.lua:60: in method 'do_runs'
12:24:20         	harness.lua:43: in method 'run_benchmark'
12:24:20         	harness.lua:97: in main chunk
12:24:20         	[C]: in ?
12:24:20         
12:24:20     Execution has failed, benchmark is aborted.

The first one should be easy to track down, but the second is likely miscompilation and could be tricky.

@vext01
Copy link
Contributor Author

vext01 commented Oct 30, 2024

(I'm going to add a benchmarking run to yk CI to try to catch these things before they go in)

@vext01
Copy link
Contributor Author

vext01 commented Oct 30, 2024

I also noticed that the commits aren't signed, so I should do that at some point when we are ready.

@ltratt ltratt added this pull request to the merge queue Oct 30, 2024
@ltratt ltratt removed this pull request from the merge queue due to a manual request Oct 30, 2024
@ltratt
Copy link
Contributor

ltratt commented Oct 30, 2024

Please squash.

@vext01
Copy link
Contributor Author

vext01 commented Oct 30, 2024

Mind if I squash/sign first?

@vext01
Copy link
Contributor Author

vext01 commented Oct 30, 2024

squashed + signed.

@ltratt
Copy link
Contributor

ltratt commented Oct 30, 2024

What's the "reporter" prefix mean?

@vext01
Copy link
Contributor Author

vext01 commented Oct 30, 2024

The reporter is the name of the rust crate that generates the report.

This repo contains both benchmarks and the code to make the html report.

@ltratt
Copy link
Contributor

ltratt commented Oct 30, 2024

Isn't it obvious from the commit files being changed what's been changed? I think the prefixes will make searching for more changes a bit more annoying. If you agree (but you don't have to agree!) please force push an update.

@vext01
Copy link
Contributor Author

vext01 commented Oct 30, 2024

done

@ltratt
Copy link
Contributor

ltratt commented Oct 30, 2024

Thanks!

@ltratt ltratt added this pull request to the merge queue Oct 30, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 30, 2024
@vext01
Copy link
Contributor Author

vext01 commented Oct 31, 2024

This failure is curious:

...
14:25:36 [ykbuild 0.1.0] [5555/5991] Building CXX object tools/lld/wasm/CMakeFiles/lldWasm.dir/OutputSegment.cpp.o
14:25:36 [ykbuild 0.1.0] [5556/5991] Building CXX object tools/lld/wasm/CMakeFiles/lldWasm.dir/MarkLive.cpp.o
14:25:36 [ykbuild 0.1.0] [5557/5991] Building Opts.inc...
14:25:42 time="2024-10-30T14:25:42Z" level=error msg="error waiting for container: unexpected EOF"

Looks like a something went wrong with docker.

I think we could retry this, but after our benchmarking session is complete. I'll ping when it's time.

@ltratt ltratt added this pull request to the merge queue Nov 1, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 1, 2024
@ltratt
Copy link
Contributor

ltratt commented Nov 1, 2024

I'm about to file an issue for permute.lua.

@vext01
Copy link
Contributor Author

vext01 commented Nov 12, 2024

I think this should merge now.

@ltratt ltratt added this pull request to the merge queue Nov 12, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 12, 2024
@ltratt ltratt added this pull request to the merge queue Nov 13, 2024
@vext01 vext01 removed this pull request from the merge queue due to a manual request Nov 13, 2024
@vext01
Copy link
Contributor Author

vext01 commented Nov 13, 2024

This needs a little more work before it will merge. Something is crashing in stats-ci, investigating.

stats-ci crashes if you try to compute a confidence interval from a
single sample:
xdefago/stats-ci#3
@vext01
Copy link
Contributor Author

vext01 commented Nov 13, 2024

That last push fixes the aforementioned crash and updates the plot labels as requested.

No squash required.

@ltratt ltratt added this pull request to the merge queue Nov 13, 2024
Merged via the queue into ykjit:main with commit 0e2f150 Nov 13, 2024
2 checks passed
@vext01 vext01 deleted the ci branch November 13, 2024 12:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants