-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[TEP-0135] Revert PVC creation #6893
[TEP-0135] Revert PVC creation #6893
Conversation
Skipping CI for Draft Pull Request. |
/test all |
/kind feature |
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
@@ -88,7 +88,7 @@ func (c *Reconciler) createOrUpdateAffinityAssistantsPerAABehavior(ctx context.C | |||
} | |||
for claimTemplate, workspaceName := range claimTemplatesToWorkspace { | |||
aaName := getAffinityAssistantName(workspaceName, pr.Name) | |||
err := c.createOrUpdateAffinityAssistant(ctx, aaName, pr, []corev1.PersistentVolumeClaim{*claimTemplate}, nil, unschedulableNodes) | |||
err := c.createOrUpdateAffinityAssistant(ctx, aaName, pr, nil, []corev1.PersistentVolumeClaimVolumeSource{{ClaimName: claimTemplate.Name}}, unschedulableNodes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this works because it assumes that pvcs have already been created from the volume claim template and that the function affinityAssistantStatefulSet only uses the volume claim templates in the statefulset spec. This is a bit hard to understand, and if we change affinityAssistantStatefulSet this will result in incorrect behavior. Maybe affinityAssistantStatefulSet
needs to have some awareness of the affinity assistant mode, or we could use some explanatory comments + better variable naming of args passed into createOrUpdateAffinityAssistant
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good point. I think affinityAssistantStatefulSet
should have the sole responsibility to create an StatefulSet
based on the given input (so usually we don't change it), and all our business logic should go to createOrUpdateAffinityAssistantPerAABehavior
and createOrUpdateAffinityAssistant
.
Let me try to find some better variable name in createOrUpdateAffinityAssistantsPerAABehavior
and put some comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the variable names are accurate, so I added some doc for this part, PTAL 😁 !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's how I would refactor this (involves rewriting CreatePVCsForWorkspacesWithoutAffinityAssistant):
case aa.AffinityAssistantPerWorkspace:
...
for claimTemplate, workspaceBinding := range claimTemplatesToWorkspace {
aaName := getAffinityAssistantName(workspaceBinding, pr.Name)
pvc, err := c.pvcHandler.GetOrCreatePVC(workspaceBinding, pr)
< handle err>
err := c.createOrUpdateAffinityAssistant(ctx, aaName, pr, nil, []corev1.PersistentVolumeClaimVolumeSource{pvc}, unschedulableNodes)
< handle err>
However, I know this PR is just meant to revert deletion behavior for the existing affinity assistant, and I want to keep this PR scoped to the minimal set of changes to do that. (I recognize I've already asked for some refactoring, but this chunk of code can be hard to grapple with :)) Maybe refactoring changes could be split into a separate PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the advice! I think we can put the refactoring to another PR to keep this one scoped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm approving this PR but I have a strong preference that refactoring changes happen first. typically refactoring changes that get deferred to later just get deprioritized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Lee, I did some investigation and realized that doing such refactoring in this PR will introduce a bunch of code changes. This makes this PR quite big and harder to review.
To address your concern, I have created an issue tracking it and assigned to myself: #6915 (will make sure to prioritize). Merging this PR as it is can help unblocking the followup PRs where I will add e2e support for coschedule pipelineruns
mode, and I can work on the refactoring work in parallel (and the reviews for the followup PRs can happen in parallel).
WDTY?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! I appreciate it!
6340ce2
to
f88f1bf
Compare
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
f88f1bf
to
ee4c0c7
Compare
/test pull-tekton-pipeline-beta-integration-tests |
The following is the coverage report on the affected files.
|
SGTM, I have added an integration test to test the AA behavior from end to end. PTAL |
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
/test pull-tekton-pipeline-integration-tests |
knativetest "knative.dev/pkg/test" | ||
) | ||
|
||
func TestAffinityAssistant_PerWorkspace(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! Thanks Quan.
// When Affinity Assistant is enabled, the PersistentVolumeClaims will be created by the Affinity Assistant StatefulSet VolumeClaimTemplate instead. | ||
// This function is only called when the Affinity Assistant behavior is AffinityAssistantPerWorkspace or AffinityAssistantDisabled. | ||
// When the Affinity Assistant behavior is AffinityAssistantPerPipelineRun or AffinityAssistantPerPipelineRunWithIsolation, | ||
// the PersistentVolumeClaims will be created by the Affinity Assistant StatefulSet VolumeClaimTemplate instead. | ||
func (c *defaultPVCHandler) CreatePVCsForWorkspacesWithoutAffinityAssistant(ctx context.Context, wb []v1.WorkspaceBinding, ownerReference metav1.OwnerReference, namespace string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it make sense to change this function name now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SG, renamed it to CreatePVCsForWorkspaces
as the usage of this function to create PVC means it is separated from Affinity Assistant StatefulSet
// This function is only called when the Affinity Assistant behavior is AffinityAssistantPerWorkspace or AffinityAssistantDisabled. | ||
// When the Affinity Assistant behavior is AffinityAssistantPerPipelineRun or AffinityAssistantPerPipelineRunWithIsolation, | ||
// the PersistentVolumeClaims will be created by the Affinity Assistant StatefulSet VolumeClaimTemplate instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a big fan of docstrings that explain how a function is called. I think it would be sufficient to explain that it creates PVCs for workspaces with volumeClaimTemplates
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed the usage docstring for now. Will further look into the explanation in the followup refactor PR
@@ -88,7 +88,7 @@ func (c *Reconciler) createOrUpdateAffinityAssistantsPerAABehavior(ctx context.C | |||
} | |||
for claimTemplate, workspaceName := range claimTemplatesToWorkspace { | |||
aaName := getAffinityAssistantName(workspaceName, pr.Name) | |||
err := c.createOrUpdateAffinityAssistant(ctx, aaName, pr, []corev1.PersistentVolumeClaim{*claimTemplate}, nil, unschedulableNodes) | |||
err := c.createOrUpdateAffinityAssistant(ctx, aaName, pr, nil, []corev1.PersistentVolumeClaimVolumeSource{{ClaimName: claimTemplate.Name}}, unschedulableNodes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm approving this PR but I have a strong preference that refactoring changes happen first. typically refactoring changes that get deferred to later just get deprioritized.
/assign |
aa65860
to
1ea210b
Compare
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
knativetest "knative.dev/pkg/test" | ||
) | ||
|
||
func TestAffinityAssistant_PerWorkspace(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Thanks for the comments step by step in the e2e test. Wondering it might be beneficial to add a summary to the docstring here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM, docstring added
1ea210b
to
b01af4a
Compare
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
@@ -184,14 +185,16 @@ func TestCreateAndDeleteOfAffinityAssistantPerPipelineRun(t *testing.T) { | |||
|
|||
// TestCreateAndDeleteOfAffinityAssistantPerWorkspace tests to create and delete an Affinity Assistant | |||
// per workspace for a given PipelineRun | |||
func TestCreateAndDeleteOfAffinityAssistantPerWorkspace(t *testing.T) { | |||
func TestCreateAndDeleteOfAffinityAssistantPerWorkspaceOrDisabled(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to update the docstring for this test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, fixed!
Part of [tektoncd#6740] and unblocks [tektoncd#6635]. `PVCs` are created if a user specifies `VolumeClaimTemplate` as the source of a `pipelinerun workspace`. Prior to this change, such `pvcs` are created via `affinity assistant statefulset` when `affinity assistant` is enabled (in both `workspaces` or `pipelineruns` coschedule mode). To delete such pvcs when the owning `pipelinerun` is completed, we must explicitly delete those pvcs in the reconciler since the owner of such pvcs is the `affinity assistant statefulset` instead of the `pipelinerun`. The problem of such deletion mechanism is that such `pvcs` are left in `terminating` state when the owning `pipelinerun` is `completed` but not `deleted`. This is because the pvcs are protected by `kubernetes.io/pvc-protection` `finalizer`, which does not allow the `pvcs` to be deleted until the `pipelinerun` consuming the `pvc` is deleted. Leaving pvcs in `terminating` state is confusing to cluster operators. Instead of changing the pvc deletion behavior in such backward incompatible way, it is better to make it configurable so that it is backward compatible, as prototyped in [tektoncd#6635]. This commit reverts the pvc creation behavior `per-workspace` coschedule mode, which changes the owner of the `pvcs` back to the `pipelinerun` instead of the `affinity assistant statefulset`. After the commit, the pvcs created by specifying `VolumeClaimTemplate` are left in `bounded` state instead of `terminating`. This commit is the prerequisite of [tektoncd#6635]. This commit does NOT reverts the pvc creation behavior `per-pipelinerun` coschedule mode as we will enforce the deletion of pvcs when the owning `pipelinerun` is completed. This is a better practice and there is no backward compatability concern since `per-pipelinerun` coschedule mode is a new feature. [tektoncd#6740]: tektoncd#6740 [tektoncd#6635]: tektoncd#6635
b01af4a
to
e40338c
Compare
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for answering the concerns on testing offline and adding the excellent commit message. Discussed couple of implementation choices with @QuanZhang-William offline.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: JeromeJu, lbernick, Yongxuanzhang The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Part of #6740 and unblocks #6635.
PVCs
are created if a user specifiesVolumeClaimTemplate
as the source of apipelinerun workspace
. Prior to this change, suchpvcs
are created viaaffinity assistant statefulset
whenaffinity assistant
is enabled (in bothworkspaces
orpipelineruns
coschedule mode).To delete such pvcs when the owning
pipelinerun
is completed, we must explicitly delete those pvcs in the reconciler since the owner of such pvcs is theaffinity assistant statefulset
instead of thepipelinerun
. The problem of such deletion mechanism is that suchpvcs
are left interminating
state when the owningpipelinerun
iscompleted
but notdeleted
. This is because the pvcs are protected bykubernetes.io/pvc-protection
finalizer
, which does not allow thepvcs
to be deleted until thepipelinerun
consuming thepvc
is deleted. Leaving pvcs interminating
state is confusing to cluster operators. Instead of changing the pvc deletion behavior in such backward incompatible way, it is better to make it configurable so that it is backward compatible, as prototyped in #6635.This commit reverts the pvc creation behavior
per-workspace
coschedule mode, which changes the owner of thepvcs
back to thepipelinerun
instead of theaffinity assistant statefulset
. After the commit, the pvcs created by specifyingVolumeClaimTemplate
are left inbounded
state instead ofterminating
. This commit is the prerequisite of #6635.This commit does NOT reverts the pvc creation behavior
per-pipelinerun
coschedule mode as we will enforce the deletion of pvcs when the owningpipelinerun
is completed. This is a better practice and there is no backward compatability concern sinceper-pipelinerun
coschedule mode is a new feature.Changes
Submitter Checklist
As the author of this PR, please check off the items in this checklist:
/kind <type>
. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tepRelease Notes