Add Github Actions workflow to trigger pipeline performance test #4231

lrcouto · 2024-10-15T16:05:04Z

Description

Adds a workflow to trigger a pipeline performance test on the performance test repo if the user adds the "performance" label to the pull request they wish to test.

The test will run a comparison in elapsed time between the branch with the performance label against the latest kedro release.

Development notes

Developer Certificate of Origin

We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

Checklist

Read the contributing guidelines
Signed off each commit with a Developer Certificate of Origin (DCO)
Opened this PR as a 'Draft Pull Request' if it is work-in-progress
Updated the documentation to reflect the code changes
Added a description of this change in the RELEASE.md file
Added tests to cover my changes
Checked if this change will affect Kedro-Viz, and if so, communicated that with the Viz team

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

ankatiyar · 2024-10-15T16:18:46Z

My initial thought is that this should be triggered not on every push or based on the commit message because someone might want to trigger this after making all the changes so making a random commit just to trigger a pipeline test seems counterintuitive.
Alternatives - Adding a "performance" label to the PR: https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#pull_request
WDYT?

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

lrcouto · 2024-10-15T17:22:27Z

My initial thought is that this should be triggered not on every push or based on the commit message because someone might want to trigger this after making all the changes so making a random commit just to trigger a pipeline test seems counterintuitive. Alternatives - Adding a "performance" label to the PR: https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#pull_request WDYT?

I like the idea!

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

noklam · 2024-10-15T23:02:04Z

Agree on above, I like the idea about label better. How can we trigger it multiple times with label? Perhaps is it possible to trigger the workflow manually with parameter (commit hash/branch name?)

ankatiyar · 2024-10-16T10:21:22Z

.github/workflows/pipeline-performance-test.yml

+      run: |
+        echo "Commit message contains 'PERFORMANCE TEST'. Triggering performance test."
+
+    - name: Trigger performance test workflow in test project


Also I think you can use the checkout action to clone the performance test repo - https://github.com/actions/checkout
and then call kedro run

Has this been addressed?

lrcouto · 2024-10-16T14:58:25Z

Agree on above, I like the idea about label better. How can we trigger it multiple times with label? Perhaps is it possible to trigger the workflow manually with parameter (commit hash/branch name?)

Maybe we can have an alternative way to trigger from the test repo itself with a specific branch.

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

lrcouto · 2024-10-16T18:50:05Z

I've changed the trigger method to use a label, and created a "performance" label to test it out. It will trigger when the label is added, and you can repeat the test by removing the label and adding it again or running the job again.

Now for a little bit of a philosophical question - My intention was to use commits not only to trigger the test itself but also to set those hook/save/load delay parameters that the test has. If we use a label instead of the commit to trigger the run, how could we set the delays? Maybe we could have both methods available, so you can use the label for a simple test but a commit if you want to manually alter the delays?

This is the test branch I'm using, by the way: #4233

merelcht · 2024-10-17T10:58:08Z

I've changed the trigger method to use a label, and created a "performance" label to test it out. It will trigger when the label is added, and you can repeat the test by removing the label and adding it again or running the job again.

I really like the label trigger!

Now for a little bit of a philosophical question - My intention was to use commits not only to trigger the test itself but also to set those hook/save/load delay parameters that the test has. If we use a label instead of the commit to trigger the run, how could we set the delays? Maybe we could have both methods available, so you can use the label for a simple test but a commit if you want to manually alter the delays?

This is the test branch I'm using, by the way: #4233

@lrcouto How do you set those parameters in the commit? And what are the default now?

lrcouto · 2024-10-17T14:54:20Z

@lrcouto How do you set those parameters in the commit? And what are the default now?

The default right now was no delays. My initial idea was passing the parameter line on the commit message and sending it on the POST payload, same as it's being done with the branch name right now. But I'm looking for alternatives that would work with the label solution.

lrcouto · 2024-10-17T15:56:03Z

@lrcouto How do you set those parameters in the commit? And what are the default now?

The default right now was no delays. My initial idea was passing the parameter line on the commit message and sending it on the POST payload, same as it's being done with the branch name right now. But I'm looking for alternatives that would work with the label solution.

I thought of two possible alternatives for this issue that we could use. What I would like to do is allow the user to pass all of the parameters to kedro run, so not only you can use the delay values specific to the test project but you could also, for example, set it to run with a specific runner.

The first option would be using PR comments to pass parameters and run the workflow. It's easy to use but the disadvantage is that we would pollute the PRs with parameter comments, which makes me not like this option very much.

The second would be to have a manually dispatched workflow on the test repo, in which you could go and input the parameters. Then you could use the performance label one for a quick run with all default values, or you could do something with more customization by going to the test repo itself and running it manually. I'm leaning more towards this one.

Let me know what you think!

merelcht · 2024-10-18T10:49:26Z

Maybe for a first iteration just triggering the test as is, is fine and we can create a follow up ticket to explore parameters? #4210 is already meant to check the different runners, so my question is whether it makes sense to also have that as a changing factor for this setup?

noklam · 2024-10-18T12:00:26Z

Agree with @merelcht, I think having parameters in commit are also a bit overkill and quite easy to break. I would prefer a sensible default (that are non-zero, otherwise it doesn't shows anything). If there are more needs to extend the parameters, we can always clone the repository and run with source code directly.

lrcouto · 2024-10-18T13:44:53Z

We can leave this as it is for now, and extend it in the future then.

merelcht

LGTM! 👍

ankatiyar · 2024-10-18T14:59:30Z

I do think you could checkout the other repo in this workflow and time and run it so one could see the results in the PR itself instead of triggering the workflow on the repo that contains the project where we would have to look through the Actions tab to see what the results are.

noklam

This looks good to me besides some comments left by @ankatiyar that haven't been addressed.

noklam · 2024-10-21T14:42:55Z

.github/workflows/pipeline-performance-test.yml

+      run: |
+        echo "Commit message contains 'PERFORMANCE TEST'. Triggering performance test."
+
+    - name: Trigger performance test workflow in test project


Has this been addressed?

Add Github Actions workflow to trigger pipeline performance test

5953f4d

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

lrcouto requested review from noklam and ankatiyar October 15, 2024 16:05

lrcouto marked this pull request as ready for review October 15, 2024 16:05

lrcouto requested a review from merelcht as a code owner October 15, 2024 16:05

Lint

f405cdb

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

Lint

5fd5453

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

ankatiyar reviewed Oct 16, 2024

View reviewed changes

noklam linked an issue Oct 16, 2024 that may be closed by this pull request

[Stress Testing] Setup Kedro performance test for executing #4200

Open

Merge branch 'main' into add-pipe-performance-workflow

be18d5d

Change trigger to PR label

a9573e5

Signed-off-by: Laura Couto <laurarccouto@gmail.com>

lrcouto linked an issue Oct 17, 2024 that may be closed by this pull request

[Stress Testing] - Create example projects to assess Kedro performance for complex pipelines #3866

Open

Merge branch 'main' into add-pipe-performance-workflow

f1d6c49

merelcht approved these changes Oct 18, 2024

View reviewed changes

lrcouto requested review from merelcht and ankatiyar October 18, 2024 14:38

Merge branch 'main' into add-pipe-performance-workflow

9af45ff

noklam reviewed Oct 21, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Github Actions workflow to trigger pipeline performance test #4231

Add Github Actions workflow to trigger pipeline performance test #4231

lrcouto commented Oct 15, 2024 •

edited

Loading

ankatiyar commented Oct 15, 2024

lrcouto commented Oct 15, 2024

noklam commented Oct 15, 2024

ankatiyar Oct 16, 2024 •

edited

Loading

noklam Oct 21, 2024

lrcouto commented Oct 16, 2024

lrcouto commented Oct 16, 2024 •

edited

Loading

merelcht commented Oct 17, 2024

lrcouto commented Oct 17, 2024

lrcouto commented Oct 17, 2024

merelcht commented Oct 18, 2024

noklam commented Oct 18, 2024

lrcouto commented Oct 18, 2024

merelcht left a comment

ankatiyar commented Oct 18, 2024

noklam left a comment

noklam Oct 21, 2024

Add Github Actions workflow to trigger pipeline performance test #4231

Are you sure you want to change the base?

Add Github Actions workflow to trigger pipeline performance test #4231

Conversation

lrcouto commented Oct 15, 2024 • edited Loading

Description

Development notes

Developer Certificate of Origin

Checklist

ankatiyar commented Oct 15, 2024

lrcouto commented Oct 15, 2024

noklam commented Oct 15, 2024

ankatiyar Oct 16, 2024 • edited Loading

Choose a reason for hiding this comment

noklam Oct 21, 2024

Choose a reason for hiding this comment

lrcouto commented Oct 16, 2024

lrcouto commented Oct 16, 2024 • edited Loading

merelcht commented Oct 17, 2024

lrcouto commented Oct 17, 2024

lrcouto commented Oct 17, 2024

merelcht commented Oct 18, 2024

noklam commented Oct 18, 2024

lrcouto commented Oct 18, 2024

merelcht left a comment

Choose a reason for hiding this comment

ankatiyar commented Oct 18, 2024

noklam left a comment

Choose a reason for hiding this comment

noklam Oct 21, 2024

Choose a reason for hiding this comment

lrcouto commented Oct 15, 2024 •

edited

Loading

ankatiyar Oct 16, 2024 •

edited

Loading

lrcouto commented Oct 16, 2024 •

edited

Loading