Add performance benchmarking scripts and run in CI #1978
Conversation
A few fewer replications, some more seeds. Every benchmark now takes between 10 and 20 seconds (on my machine).
That allows switching branches without benchmark results disappearing.
Prints some progress information while running and saves the results to a pickle file afterwards.
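A minimal sketch of what that saving step could look like; the result structure and the file naming are assumptions for illustration, not necessarily what the actual script does:

```python
import pickle
import time

# Hypothetical result structure collected by the benchmark loop:
# {model_name: {seed: (init_time, run_time)}}.
results: dict = {}

# Use a unique file name so results from one branch are not overwritten
# when you switch branches and run the benchmarks again.
filename = f"timings_{int(time.time())}.pickle"
with open(filename, "wb") as f:
    pickle.dump(results, f)
print(f"Saved benchmark results to {filename}")
```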
- The bootstrap_speedup_confidence_interval function calculates the mean speedup and its confidence interval using bootstrapping, which is more suitable for paired data.
- The mean speedup and confidence interval are calculated for both initialization and run times.
- Positive values indicate an increase in time (longer duration), and negative values indicate a decrease (shorter duration).
- The results are displayed in a DataFrame with the percentage changes and their confidence intervals.
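A minimal sketch of how such a paired bootstrap could look; the function name follows the list above, while the signature, number of resamples, and input layout are assumptions for illustration:

```python
import numpy as np

def bootstrap_speedup_confidence_interval(old, new, n_boot=10_000, ci=95, rng=None):
    """Mean relative change (new vs. old) with a bootstrap confidence interval.

    `old` and `new` are paired arrays of timings (same seeds/replications).
    Returns (mean, lower, upper) as fractions; multiply by 100 for percent.
    """
    rng = np.random.default_rng(rng)
    old = np.asarray(old, dtype=float)
    new = np.asarray(new, dtype=float)
    # Relative change per pair: positive means the new code is slower.
    rel_change = (new - old) / old
    n = len(rel_change)
    # Resample whole pairs with replacement and record the mean of each resample.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_means = rel_change[idx].mean(axis=1)
    alpha = (100 - ci) / 2
    lower, upper = np.percentile(boot_means, [alpha, 100 - alpha])
    return rel_change.mean(), lower, upper
```

Resampling whole pairs, rather than resampling the old and new timings independently, preserves the per-seed pairing, which is what makes the interval suitable for paired data.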
/rerun-benchmarks
/rerun-benchmarks
Edit: https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#issue_comment
(I'm truly sorry if someone has set email notifications for "ready for review".)
I think there is a VS Code extension that lets you run GitHub Actions locally, maybe give it a try?
I don't, but why are you doing this? To trigger GitHub Actions?
Unfortunately, I'm 90% sure. The error:
implies
Okay, there are a few minor things, like Black and ruff failing because the added benchmark models don't adhere to their standards, but the benchmark scripts are otherwise ready. General usage (once this is merged to main):
- On one branch (for example main), run global_benchmarks.py.
- Switch to the branch you want to compare and run global_benchmarks.py again.
- Run compare_timings.py.
It should output a table like this:
Positive indicates an increase in runtime, negative a decrease. I will add fancy emojis tomorrow to make it really clear whether it's good, bad, or insignificant.
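A rough sketch of how compare_timings.py could assemble such a table from two result pickles; the file names, data layout, column labels, and the helper module import are assumptions for illustration, not the actual script's interface:

```python
import pickle

import pandas as pd

# Hypothetical helper module containing the bootstrap sketch shown earlier.
from benchmark_stats import bootstrap_speedup_confidence_interval


def load(path):
    with open(path, "rb") as f:
        return pickle.load(f)


# Assumed layout of each pickle: {model_name: {seed: (init_time, run_time)}}.
old = load("timings_main.pickle")
new = load("timings_branch.pickle")

rows = []
for model in old:
    # Pair the timings by seed so the comparison is paired, not pooled.
    seeds = sorted(old[model])
    old_init, old_run = zip(*(old[model][s] for s in seeds))
    new_init, new_run = zip(*(new[model][s] for s in seeds))

    init_mean, init_lo, init_hi = bootstrap_speedup_confidence_interval(old_init, new_init)
    run_mean, run_lo, run_hi = bootstrap_speedup_confidence_interval(old_run, new_run)

    rows.append({
        "model": model,
        "init time change": f"{init_mean:+.1%} [{init_lo:+.1%}, {init_hi:+.1%}]",
        "run time change": f"{run_mean:+.1%} [{run_lo:+.1%}, {run_hi:+.1%}]",
    })

print(pd.DataFrame(rows).to_string(index=False))
```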
I think you can set the permission inside the workflow file similar to this https://github.com/projectmesa/mesa/blob/main/.github%2Fworkflows%2Frelease.yml#L18-L20
According to ChatGPT it's something different:
That was just meant as an example. I would guess you need
If the 95% confidence interval is:
- fully below -3%: 🟢
- fully above +3%: 🔴
- else: 🔵
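A sketch of how that classification could be expressed in the comparison script; the thresholds follow the comment above, while the function name and signature are assumptions:

```python
def classify_change(lower: float, upper: float, threshold: float = 0.03) -> str:
    """Map a 95% confidence interval (as fractions, e.g. 0.05 == +5%) to an emoji.

    🟢 significant speedup, 🔴 significant slowdown, 🔵 no clear change.
    """
    if upper < -threshold:   # whole interval below -3%: clearly faster
        return "🟢"
    if lower > threshold:    # whole interval above +3%: clearly slower
        return "🔴"
    return "🔵"              # interval overlaps the ±3% band: inconclusive
```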
You never know what helps.
@Corvince Thanks for your insights. They're really useful. I think I zoomed in further on the issue: https://docs.github.com/en/actions/security-guides/automatic-token-authentication#permissions-for-the-github_token Check the column "Maximum access for pull requests from public forked repositories". Everything there is read-only. I'm pulling this PR in from my own fork EwoutH/mesa. Many contributors will do that, so this workflow has to work from public forks. But I understand very clearly why it's disabled by default; otherwise anyone could open any PR and do anything. So the solution stays the same as suggested here: we need to make a new, very specific but permissive token with "pull-requests: write" access that only runs on this workflow and is only allowed to make pull request comments.
Weird stuff. Can't figure it out without access to tokens and secrets unfortunately. Will wait until I have the permissions.
@EwoutH Do you mind if I commit to this branch with various minor updates?
Go ahead, and thanks for asking. With #1979 merged, I just wanted to rebase this branch on that PR; shall I do that first?
I'm going to take a bit of a weekend; I will take a look at the CI again tomorrow or Monday. (@jackiekazil or @tpike3, I would greatly appreciate it if I have the permissions by then.)
I see a major performance difference between this branch and master (factor 5 or so). I want to dig in and see which commit explains this difference. Will keep you posted.
In runtime? Quite sure it's this one: bc61345. For this branch I set the replications to 1 to iterate faster.
Should this one be closed in favor of the other PR?
Let's see how many attempts this is going to take