Fix PipelineRun reconciler panic for computed timeouts #6886

Merged 1 commit on Jul 14, 2023

lbernick
Member

Prior to this commit, log lines in the pipelinerun reconciler assumed that if a PipelineRun had reached its tasks timeout, spec.timeouts.tasks was set explicitly rather than computed from spec.timeouts.pipeline and spec.timeouts.finally, and likewise for the finally timeout. When the timeout was computed, the field was nil, and dereferencing it in the log call caused the controller to panic.

This commit updates log messages to avoid the controller panic, and adds tests for this fix.

/kind bug
closes #6885

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

  • n/a Has Docs if any changes are user facing, including updates to minimum requirements e.g. Kubernetes version bumps
  • Has Tests included if any functionality added or changed
  • Follows the commit message standard
  • Meets the Tekton contributor standards (including functionality, content, code)
  • Has a kind label. You can add one by adding a comment on this PR that contains /kind <type>. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tep
  • Release notes block below has been updated with any user facing changes (API changes, bug fixes, changes requiring upgrade notices or deprecation warnings). See some examples of good release notes.
  • n/a Release notes contains the string "action required" if the change requires additional action from users switching to the new release

Release Notes

bug fix: Avoid controller panics for computed timeouts

@tekton-robot added the release-note and kind/bug labels Jun 28, 2023
@tekton-robot added the size/L label Jun 28, 2023
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/reconciler/pipelinerun/pipelinerun.go 91.5% 91.5% 0.0

@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage-df to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/reconciler/pipelinerun/pipelinerun.go 91.5% 91.5% 0.0

Member

@QuanZhang-William QuanZhang-William left a comment

Thanks for the patch. lgtm!

@tekton-robot added the approved label Jul 3, 2023
@@ -636,7 +638,7 @@ func (c *Reconciler) reconcile(ctx context.Context, pr *v1.PipelineRun, getPipel
 		}
 	}
 	if tasksToTimeOut.Len() > 0 {
-		logger.Infof("PipelineRun tasks timeout of %s reached, cancelling tasks", pr.Spec.Timeouts.Tasks.Duration.String())
+		logger.Infof("PipelineRun tasks timeout of %s reached, cancelling tasks", tasksTimeout)
Member

NIT: If tasksTimeout is nil, this will not panic, but the message won't be very helpful. Also, I wonder if we really need to log at Info level every time a timeout fires. That's something we can fix in a separate PR, though.

Member Author

Changed log level to debug! I think if tasksTimeout is nil, that would signify a bug in our implementation; not sure what the best way to handle that would be?

@afrittoli afrittoli added this to the Pipelines v0.50 (LTS) milestone Jul 11, 2023
@afrittoli
Member

/cherry-pick release-v0.47.x

@tekton-robot
Collaborator

@afrittoli: once the present PR merges, I will cherry-pick it on top of release-v0.47.x in a new PR and assign it to you.

In response to this:

/cherry-pick release-v0.47.x

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@afrittoli
Member

/cherry-pick release-v0.44.x

@tekton-robot
Collaborator

@afrittoli: once the present PR merges, I will cherry-pick it on top of release-v0.44.x in a new PR and assign it to you.

In response to this:

/cherry-pick release-v0.44.x

@lbernick
Member Author

@afrittoli would you mind taking another look at this PR?

Member

@jerop jerop left a comment

/lgtm

thanks @lbernick

@tekton-robot added the lgtm label Jul 14, 2023
@tekton-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jerop, vdemeester

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot merged commit 7ab37a1 into tektoncd:main Jul 14, 2023
2 checks passed
@tekton-robot
Collaborator

@afrittoli: #6886 failed to apply on top of branch "release-v0.47.x":

Applying: Fix PipelineRun reconciler panic for computed timeouts
Using index info to reconstruct a base tree...
M	pkg/reconciler/pipelinerun/pipelinerun.go
M	pkg/reconciler/pipelinerun/pipelinerun_test.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/reconciler/pipelinerun/pipelinerun_test.go
CONFLICT (content): Merge conflict in pkg/reconciler/pipelinerun/pipelinerun_test.go
Auto-merging pkg/reconciler/pipelinerun/pipelinerun.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Fix PipelineRun reconciler panic for computed timeouts
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-v0.47.x

@tekton-robot
Collaborator

@afrittoli: #6886 failed to apply on top of branch "release-v0.44.x":

Applying: Fix PipelineRun reconciler panic for computed timeouts
Using index info to reconstruct a base tree...
M	pkg/reconciler/pipelinerun/pipelinerun.go
M	pkg/reconciler/pipelinerun/pipelinerun_test.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/reconciler/pipelinerun/pipelinerun_test.go
CONFLICT (content): Merge conflict in pkg/reconciler/pipelinerun/pipelinerun_test.go
Auto-merging pkg/reconciler/pipelinerun/pipelinerun.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Fix PipelineRun reconciler panic for computed timeouts
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-v0.44.x

@vdemeester
Member

@lbernick @afrittoli this didn't get cherry-picked to the previous LTS branches. Did we cherry-pick manually, or is it something we still need to do?

@afrittoli
Member

> @lbernick @afrittoli this didn't get cherry-picked to the previous LTS branches. Did we cherry-pick manually, or is it something we still need to do?

Oh, good catch @vdemeester, I did not cherry-pick it, so unless someone else did, it still needs to be done.

@lbernick
Member Author

opened #6999 and #7000

Labels
approved, kind/bug, lgtm, release-note, size/L
Development

Successfully merging this pull request may close these issues.

Tekton controller panic at github.com/tektoncd/pipeline/pkg/reconciler/pipelinerun.(*Reconciler).reconcile
6 participants