CAPI cluster stuck in failed after infrastructure cluster has failureMessage #10991
Comments
This issue is currently awaiting triage.
failureMessage / failureReason are supposed to signal terminal failures, terminal in the sense that they cannot be recovered from. That is why there are no code paths to unset them again. (We've been trying to get rid of the concept of terminal failure for a while; it looks like with v1beta2 we'll get around to actually doing it, see #10897.)
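To illustrate why the cluster stays in failed, here is a minimal sketch of the latching behavior described above. It is not the actual controller code: InfraStatus, ClusterStatus, reconcile, and the "Provisioned" fallback phase are hypothetical simplifications, and the real logic in cluster_controller_phases.go handles many more cases.

```go
package main

import "fmt"

// InfraStatus is a hypothetical, simplified stand-in for the status of an
// infrastructure cluster object.
type InfraStatus struct {
	FailureReason  string
	FailureMessage string
}

// ClusterStatus is a hypothetical, simplified stand-in for the status of a
// CAPI Cluster.
type ClusterStatus struct {
	Phase          string
	FailureReason  *string
	FailureMessage *string
}

// reconcile copies failures from the infrastructure status to the cluster
// status and derives the phase. There is deliberately no branch that clears
// the failure fields, mirroring the "terminal failure" semantics.
func reconcile(infra InfraStatus, cluster *ClusterStatus) {
	if infra.FailureReason != "" {
		reason := infra.FailureReason
		cluster.FailureReason = &reason
	}
	if infra.FailureMessage != "" {
		message := infra.FailureMessage
		cluster.FailureMessage = &message
	}
	if cluster.FailureReason != nil || cluster.FailureMessage != nil {
		cluster.Phase = "Failed"
		return
	}
	// Assumption: any non-failed cluster is reported as "Provisioned" here.
	cluster.Phase = "Provisioned"
}

func main() {
	cluster := &ClusterStatus{}

	// The infrastructure provider reports a failure once.
	reconcile(InfraStatus{FailureMessage: "quota exceeded"}, cluster)
	fmt.Println(cluster.Phase)

	// The provider recovers, but the copied fields are never unset.
	reconcile(InfraStatus{}, cluster)
	fmt.Println(cluster.Phase)

	// Clearing the fields by hand (what a manual status edit would do)
	// lets the next reconcile leave the failed phase.
	cluster.FailureReason, cluster.FailureMessage = nil, nil
	reconcile(InfraStatus{}, cluster)
	fmt.Println(cluster.Phase)
}
```

In this sketch the second reconcile still reports the failed phase even though the infrastructure status is clean, which matches the behavior reported in this issue. Only the manual reset changes the outcome.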
OK, so this is an issue with Cluster API Provider OpenStack (with which this is happening)? They shouldn't set these fields if it's not a terminal failure, then?
Correct!
(The manual workaround is roughly: kubectl edit --subresource=status)
Looking forward to getting #10997 implemented and getting rid of failureMessage / failureReason
/close
@fabriziopandini: Closing this issue.
What steps did you take and what happened?
As seen in cluster-api/internal/controllers/cluster/cluster_controller_phases.go (lines 135 to 144 in 4a0900c) and cluster-api/internal/controllers/cluster/cluster_controller_phases.go (lines 59 to 61 in 4a0900c), the cluster is moved to the failed phase once the infrastructure cluster has a failureMessage.
But I can't find any line that sets failureMessage and failureReason to nil again.
What did you expect to happen?
That failureMessage and failureReason get reset at some point.
Cluster API version
1.6.3, but the same code is in main as well
Kubernetes version
Client Version: v1.30.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.27.14
Anything else you would like to add?
If no one objects, I'd open a PR with the following changes;
Label(s) to be applied
/kind bug
One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.