-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deleting PVCs created by VolumeClaimTemplate is flaky #7138
Comments
Thanks for reporting the bug @jimmyjones2 (see #5776 (comment))! The race condition is caused by the wrong order of purging finalizer and deleting PVCs. The root cause is that the k8s Switching the the order of the calls should fix the race condition. /assign |
fixes [tektoncd#7138][tektoncd#7138]. In `coschedule:pipelineruns` mode, the `pvcs` created by `VolumeClaimTemplate` should be automatically deleted when the owning `pipelinerun` is completed. To delete a `pvc` used by a pod (pipelinerun), the [kubernetes.io/pvc-protection][finalizer] finalizer must be removed. Prior to this commit, the Tekton reconciler attempted to remove the `kubernetes.io/pvc-protection` finalizer first and then deletes the created pvcs. This results in race condition since `PVCProtectionController` tries to add back the finalizer if the `pvc` is in `bounded` status . If the add back action happens before the PVC deletion call is completed, the `PVC` will be left in `terminating` status. This commit changes the reconciler to delete the `PVC` first and then removes the `finalizer` to fix the race condition. This commit also adds integration test to validate `pvc` status when a `pipelinerun` is completed. /kind bug [tektoncd#7138]: tektoncd#7138 [finalizer]: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#storage-object-in-use-protection
fixes [tektoncd#7138][tektoncd#7138]. In `coschedule:pipelineruns` mode, the `pvcs` created by `VolumeClaimTemplate` should be automatically deleted when the owning `pipelinerun` is completed. To delete a `pvc` used by a pod (pipelinerun), the [kubernetes.io/pvc-protection][finalizer] finalizer must be removed. Prior to this commit, the Tekton reconciler attempted to remove the `kubernetes.io/pvc-protection` finalizer first and then deletes the created pvcs. This results in race condition since `PVCProtectionController` tries to add back the finalizer if the `pvc` is in `bounded` status . If the add back action happens before the PVC deletion call is completed, the `PVC` will be left in `terminating` status. This commit changes the reconciler to delete the `PVC` first and then removes the `finalizer` to fix the race condition. This commit removes `TestPurgeFinalizerAndDeletePVCForWorkspace` UT, since the finalizer behavior cannot be mocked when a PVC is deleted. This commit also adds integration test to cover this scenario. /kind bug [tektoncd#7138]: tektoncd#7138 [finalizer]: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#storage-object-in-use-protection
fixes [tektoncd#7138][tektoncd#7138]. In `coschedule:pipelineruns` mode, the `pvcs` created by `VolumeClaimTemplate` should be automatically deleted when the owning `pipelinerun` is completed. To delete a `pvc` used by a pod (pipelinerun), the [kubernetes.io/pvc-protection][finalizer] finalizer must be removed. Prior to this commit, the Tekton reconciler attempted to remove the `kubernetes.io/pvc-protection` finalizer first and then deletes the created pvcs. This results in race condition since `PVCProtectionController` tries to add back the finalizer if the `pvc` is in `bounded` status . If the add back action happens before the PVC deletion call is completed, the `PVC` will be left in `terminating` status. This commit changes the reconciler to delete the `PVC` first and then removes the `finalizer` to fix the race condition. This commit removes `TestPurgeFinalizerAndDeletePVCForWorkspace` UT, since the finalizer behavior cannot be mocked when a PVC is deleted. This commit also adds integration test to cover this scenario. /kind bug [tektoncd#7138]: tektoncd#7138 [finalizer]: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#storage-object-in-use-protection
fixes [tektoncd#7138][tektoncd#7138]. In `coschedule:pipelineruns` mode, the `pvcs` created by `VolumeClaimTemplate` should be automatically deleted when the owning `pipelinerun` is completed. To delete a `pvc` used by a pod (pipelinerun), the [kubernetes.io/pvc-protection][finalizer] finalizer must be removed. Prior to this commit, the Tekton reconciler attempted to remove the `kubernetes.io/pvc-protection` finalizer first and then deletes the created pvcs. This results in race condition since `PVCProtectionController` tries to add back the finalizer if the `pvc` is in `bounded` status . If the add back action happens before the PVC deletion call is completed, the `PVC` will be left in `terminating` status. This commit changes the reconciler to delete the `PVC` first and then removes the `finalizer` to fix the race condition. This commit removes `TestPurgeFinalizerAndDeletePVCForWorkspace` UT, since the finalizer behavior cannot be mocked when a PVC is deleted. This commit also adds integration test to cover this scenario. /kind bug [tektoncd#7138]: tektoncd#7138 [finalizer]: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#storage-object-in-use-protection
fixes [#7138][#7138]. In `coschedule:pipelineruns` mode, the `pvcs` created by `VolumeClaimTemplate` should be automatically deleted when the owning `pipelinerun` is completed. To delete a `pvc` used by a pod (pipelinerun), the [kubernetes.io/pvc-protection][finalizer] finalizer must be removed. Prior to this commit, the Tekton reconciler attempted to remove the `kubernetes.io/pvc-protection` finalizer first and then deletes the created pvcs. This results in race condition since `PVCProtectionController` tries to add back the finalizer if the `pvc` is in `bounded` status . If the add back action happens before the PVC deletion call is completed, the `PVC` will be left in `terminating` status. This commit changes the reconciler to delete the `PVC` first and then removes the `finalizer` to fix the race condition. This commit removes `TestPurgeFinalizerAndDeletePVCForWorkspace` UT, since the finalizer behavior cannot be mocked when a PVC is deleted. This commit also adds integration test to cover this scenario. /kind bug [#7138]: #7138 [finalizer]: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#storage-object-in-use-protection
Expected Behavior
pvcs
created byVolumeClaimTemplate
should be deleted when the owningpipelinerun
is completed withcoschedule:pipelineruns
.Actual Behavior
There is race condition, some
pvcs
are randomly left interminating
statusSteps to Reproduce the Problem
disable-affinity-assistant: "true"
andcoschedule: "pipelineruns"
PipelineRun
multiple timespvcs
and some PVCs are left interminating status randomly
Additional Info
Kubernetes version:
Output of
kubectl version
:Tekton Pipeline version:
Output of
tkn version
orkubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'
The text was updated successfully, but these errors were encountered: