Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow PVC of completed PipelineRun to be deleted #5776

Open
jimmyjones2 opened this issue Nov 19, 2022 · 20 comments · May be fixed by #6635
Open

Allow PVC of completed PipelineRun to be deleted #5776

jimmyjones2 opened this issue Nov 19, 2022 · 20 comments · May be fixed by #6635
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@jimmyjones2
Copy link
Contributor

jimmyjones2 commented Nov 19, 2022

Feature request

For PVCs specified as part of a PipelineRun with volumeClaimTemplate:

Use case

If a PVC is specified as part of a PipelineRun with volumeClaimTemplate, the PVC lifecycle is the same as the PipelineRun. In the case where PipelineRuns are kept for longer periods this is wasteful of storage resources and some users may want to delete the associated PVC before the PipelineRun, perhaps to keep logs, or because the PipelineRun is managed by a gitops controller.

See #1784 for when this was first noticed after PVC protection had been implemented in Kubernetes

@jimmyjones2 jimmyjones2 added the kind/feature Categorizes issue or PR as related to a new feature. label Nov 19, 2022
@jlpettersson
Copy link
Member

If I remember it correctly from implementing volumeClaimTemplate. The PVC cannot be deleted because the Pod has the volume mounted, so the Pod needs to be deleted first.

@jimmyjones2
Copy link
Contributor Author

Hey @jlpettersson - a completed pod no longer has the PVC mounted - easily demonstrated as it doesn't prevent a RWO PVC from being mounted by another node. However the PVC protection feature prevents PVCs from being deleted if referenced by pods, even if in completed state.

The linked Argo workflows PR removes the PVC protection finalizer when it knows all the pods are completed and it's safe to do so

@tekton-robot
Copy link
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 3, 2023
@jimmyjones2
Copy link
Contributor Author

/remove-lifecycle stale Still an issue

@vdemeester
Copy link
Member

/remove-lifecycle stale

@tekton-robot tekton-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 4, 2023
@pritidesai
Copy link
Member

Just came across this issue this morning where I didn't expect the PVC to be seen after the pipelineRun was completed.

I am trying to understand... what is the use of such PVCs? Aren't they unnecessary reserving resources? Shouldn't this be a bug rather than a feature?

I think, the pvc created with volumeClaimTemplate should be deleted once the pipelineRun is completed.

/cc @skaegi

@pritidesai
Copy link
Member

Thanks @skaegi for the references: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#limitations

Deleting and/or scaling a StatefulSet down will not delete the volumes associated with the StatefulSet. This is done to ensure data safety, which is generally more valuable than an automatic purge of all related StatefulSet resources.

The statefulset is deleted when a pipelineRun is completed but the pvc created with volumeClaimTemplate is not cleaned up. We could at the least provide an option either cluster wide or per pipelineRun to clean up pvc so that the operators or the users can choose how they would like to configure their deployments.

@jimmyjones2
Copy link
Contributor Author

@pritidesai An option to clean up PVCs per PipelineRun would be awesome! I suppose in some cases people might want the option to keep them around for debugging?

For various reasons I have to keep completed PipelineRuns around, and if I were using volumeClaimTemplates each would consume it's own PVC which would be wasteful and expensive (I'm building 50+ apps). As a workaround I have to use a shared RWX PVC as the workspace with a subPath 🤮

@jerop
Copy link
Member

jerop commented May 8, 2023

Related: #5412

pritidesai pushed a commit to pritidesai/pipeline that referenced this issue May 9, 2023
PVCs can be specified using volumeClaimTemplates as part of a pipelineRun.
The pipelineRun controller creates a new PVC using the specifications from
the template. Once the pipelineRun is complete, the PVC exist to faciliate
any analysis on the completed pod. Such PVCs are helpful but requires an
extra cleanup if not needed. In the cases where pipelineRuns are kept for a
long period of time, such PVC can waste resources.

This commit introduces an option for the user such that the users can configure
pipelineRun controller to delete such PVCs. There is no API change needed to
support this cleanup.

PipelineRun/TaskRun controller now reads a label from the volumeClaimTemplate
to decide whether to delete PVC or not. Set the label " pvc-protection-finalizer"
to "remove" to take advantage of this kind of cleanup.

kind: PipelineRun
metadata:
  generateName: pipeline-run-
spec:
  workspaces:
    - name: source
      volumeClaimTemplate:
        metadata:
          labels:
            pvc-protection-finalizer: remove

Closes tektoncd#5776

Signed-off-by: pritidesai <pdesai@us.ibm.com>
pritidesai pushed a commit to pritidesai/pipeline that referenced this issue May 9, 2023
PVCs can be specified using volumeClaimTemplates as part of a pipelineRun.
The pipelineRun controller creates a new PVC using the specifications from
the template. Once the pipelineRun is complete, the PVC exist to faciliate
any analysis on the completed pod. Such PVCs are helpful but requires an
extra cleanup if not needed. In the cases where pipelineRuns are kept for a
long period of time, such PVC can waste resources.

This commit introduces an option for the user such that the users can configure
pipelineRun controller to delete such PVCs. There is no API change needed to
support this cleanup.

PipelineRun/TaskRun controller now reads a label from the volumeClaimTemplate
to decide whether to delete PVC or not. Set the label " pvc-protection-finalizer"
to "remove" to take advantage of this kind of cleanup.

kind: PipelineRun
metadata:
  generateName: pipeline-run-
spec:
  workspaces:
    - name: source
      volumeClaimTemplate:
        metadata:
          labels:
            pvc-protection-finalizer: remove

Closes tektoncd#5776

Signed-off-by: pritidesai <pdesai@us.ibm.com>
pritidesai pushed a commit to pritidesai/pipeline that referenced this issue May 9, 2023
PVCs can be specified using volumeClaimTemplates as part of a pipelineRun.
The pipelineRun controller creates a new PVC using the specifications from
the template. Once the pipelineRun is complete, the PVC exist to faciliate
any analysis on the completed pod. Such PVCs are helpful but requires an
extra cleanup if not needed. In the cases where pipelineRuns are kept for a
long period of time, such PVC can waste resources.

This commit introduces an option for the user such that the users can configure
pipelineRun controller to delete such PVCs. There is no API change needed to
support this cleanup.

PipelineRun/TaskRun controller now reads a label from the volumeClaimTemplate
to decide whether to delete PVC or not. Set the label " pvc-protection-finalizer"
to "remove" to take advantage of this kind of cleanup.

kind: PipelineRun
metadata:
  generateName: pipeline-run-
spec:
  workspaces:
    - name: source
      volumeClaimTemplate:
        metadata:
          labels:
            pvc-protection-finalizer: remove

Closes tektoncd#5776

Signed-off-by: pritidesai <pdesai@us.ibm.com>
@pritidesai
Copy link
Member

Hey @jimmyjones2 I have created a PoC here - #6635 Please let me know if this works for you. Thanks! Once we finalize on the design, I will update the PR to include tests. The changes in the PR are tested locally.

pritidesai pushed a commit to pritidesai/pipeline that referenced this issue May 9, 2023
PVCs can be specified using volumeClaimTemplates as part of a pipelineRun.
The pipelineRun controller creates a new PVC using the specifications from
the template. Once the pipelineRun is complete, the PVC exist to faciliate
any analysis on the completed pod. Such PVCs are helpful but requires an
extra cleanup if not needed. In the cases where pipelineRuns are kept for a
long period of time, such PVC can waste resources.

This commit introduces an option for the user such that the users can configure
pipelineRun controller to delete such PVCs. There is no API change needed to
support this cleanup.

PipelineRun/TaskRun controller now reads a label from the volumeClaimTemplate
to decide whether to delete PVC or not. Set the label " pvc-protection-finalizer"
to "remove" to take advantage of this kind of cleanup.

kind: PipelineRun
metadata:
  generateName: pipeline-run-
spec:
  workspaces:
    - name: source
      volumeClaimTemplate:
        metadata:
          labels:
            pvc-protection-finalizer: remove

Closes tektoncd#5776

Signed-off-by: pritidesai <pdesai@us.ibm.com>
pritidesai pushed a commit to pritidesai/pipeline that referenced this issue May 9, 2023
PVCs can be specified using volumeClaimTemplates as part of a pipelineRun.
The pipelineRun controller creates a new PVC using the specifications from
the template. Once the pipelineRun is complete, the PVC exist to faciliate
any analysis on the completed pod. Such PVCs are helpful but requires an
extra cleanup if not needed. In the cases where pipelineRuns are kept for a
long period of time, such PVC can waste resources.

This commit introduces an option for the user such that the users can configure
pipelineRun controller to delete such PVCs. There is no API change needed to
support this cleanup.

PipelineRun/TaskRun controller now reads a label from the volumeClaimTemplate
to decide whether to delete PVC or not. Set the label " pvc-protection-finalizer"
to "remove" to take advantage of this kind of cleanup.

kind: PipelineRun
metadata:
  generateName: pipeline-run-
spec:
  workspaces:
    - name: source
      volumeClaimTemplate:
        metadata:
          labels:
            pvc-protection-finalizer: remove

Closes tektoncd#5776

Signed-off-by: pritidesai <pdesai@us.ibm.com>
Signed-off-by: Priti Desai <pdesai@us.ibm.com>
@tekton-robot
Copy link
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 7, 2023
@tekton-robot
Copy link
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 6, 2023
@jimmyjones2
Copy link
Contributor Author

jimmyjones2 commented Sep 17, 2023

@QuanZhang-William Just tried v0.51.0 with coschedule: workspaces and was expecting the PVC to be deleted as per #6940 - however there seems to be a race condition - sometimes the PVC is deleted, but other times I was left with a PVC in Terminating state as the kubernetes.io/pvc-protection wasn't removed

@QuanZhang-William
Copy link
Member

Hi @jimmyjones2, #6940 only supports purging PVC with coschedule:pipelineruns mode. The change is not applied to coschedule:workspaces mode due to backward compatibility concerns.

You can see more detail in this discussion

@jimmyjones2
Copy link
Contributor Author

Sorry @QuanZhang-William I wrote that incorrectly - when I tried coschedule: pipelineruns I sometimes ended up with terminating PVCs. With cochedule: workspaces I got the previous behavior of the PVC remaining after completion.

@QuanZhang-William
Copy link
Member

@jimmyjones2 . Thanks for reporting the bug! With some investigation, the flake is caused by the wrong order of purging finalizer and deleting PVCs.

It seems like finalizers cannot be patched/purged when the PVC is not in terminating status, so we need to delete the PVC first and then purge the finalizer.

I'm working on a PR to fix it.

@tekton-robot
Copy link
Collaborator

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen with a justification.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

@tekton-robot
Copy link
Collaborator

@tekton-robot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen with a justification.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@vdemeester
Copy link
Member

/reopen
/remove-lifecycle rotten

@tekton-robot
Copy link
Collaborator

@vdemeester: Reopened this issue.

In response to this:

/reopen
/remove-lifecycle rotten

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tekton-robot tekton-robot reopened this Oct 23, 2023
@tekton-robot tekton-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Oct 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature.
Projects
Status: In Progress
7 participants