docs(proposal): add docs for using cascading deletion
Signed-off-by: chang.qiangqiang <chang.qiangqiang@immomo.com>
CharlesQQ committed Sep 9, 2024
1 parent c426e97 commit 7605dc3
395 changes: 395 additions & 0 deletions docs/proposals/use-cascading-deletion/README.md
---
title: Use Cascading Deletion in Karmada

authors:
- "@CharlesQQ"

reviewers:
- "@robot"
- TBD

approvers:
- "@robot"
- TBD

creation-date: 2024-07-01

---

# Use Cascading Deletion in Karmada

## Summary

This proposal introduces a cascading deletion policy for federated resources. The policy controls whether resources in member clusters are deleted when users delete the corresponding resources in the Karmada control plane. It is similar to Kubernetes [cascading deletion](https://kubernetes.io/docs/tasks/administer-cluster/use-cascading-deletion/).

## Motivation

By default, when a user deletes a resource in the Karmada control plane, the corresponding resources in the member clusters are deleted as well. In some scenarios, however, users want to delete only the control plane resource and keep the resources in the member clusters.

### Goals

- Provide the ability to retain resources in member clusters when the resource template is deleted from the control plane, while cleaning up the labels, annotations, and other information that Karmada attached to those member cluster resources.
- Provide a `karmadactl` subcommand that applies a resource deletion policy, such as `karmadactl delete deployment --cascade=orphan`.

### Non-Goals

- Defining different resource deletion strategies for different member clusters.
- Supporting other deletion strategies, such as retaining Work objects in the Karmada control plane.

## Proposal

### User Stories (Optional)

#### Story 1

As an administrator, I want to be able to roll back immediately to the pre-migration state if anything unexpected happens while migrating workloads to Karmada, for example if the cloud platform cannot publish the application or Pods run into unexpected problems. Karmada should provide this rollback mechanism so that I can limit the damage quickly.

### Notes/Constraints/Caveats (Optional)

### Risks and Mitigations

## Design Details

After extensive preliminary discussions, four solutions have now been proposed, and we need to confirm the final solution.

### Solution One: Extended by Annotation

#### API changes

A new annotation is introduced that users can add to resource templates in the Karmada control plane. Its key is `resourcetemplate.karmada.io/cascadedeletion`. For extensibility, the value is a string enumeration; the currently supported value is:
- `orphan`: retain the resources in member clusters and clean up the labels, annotations, and other information that Karmada attached to them.

When the annotation is not specified, the current behavior applies: resources in member clusters are deleted together with the resource template.

#### Controller logic changes

The `resourcetemplate.karmada.io/cascadedeletion` annotation that users add to a resource template is propagated to `work.spec.workload.manifests`. When the resource template is deleted, the `execution-controller` runs the deletion logic for the Work object. It parses the annotation value from `work.spec.workload.manifests` and decides as follows:
- If the annotation does not exist, delete the resources in the member clusters as well.
- If the annotation value is `orphan`, retain the resources in the member clusters and clean up the labels, annotations, and other information that Karmada attached to them.

![use-cascading-deletion](statics/use-cascading-deletion.png)
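
The annotation check can be sketched in Go as follows. This is a minimal illustration, not Karmada code: it assumes each manifest is available as raw JSON, and the `manifestMeta` type and `shouldOrphan` function are hypothetical names. Values other than `orphan` are treated like a missing annotation, which is an assumption because the proposal only defines `orphan`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cascadeDeletionAnnotation is the annotation key named in this proposal.
const cascadeDeletionAnnotation = "resourcetemplate.karmada.io/cascadedeletion"

// manifestMeta (hypothetical) captures only the part of a manifest that this
// sketch needs: metadata.annotations.
type manifestMeta struct {
	Metadata struct {
		Annotations map[string]string `json:"annotations"`
	} `json:"metadata"`
}

// shouldOrphan (hypothetical) reports whether a raw JSON manifest asks for the
// member cluster resource to be retained. A missing annotation, or any value
// other than "orphan", means the resource is deleted as it is today.
func shouldOrphan(rawManifest []byte) (bool, error) {
	var m manifestMeta
	if err := json.Unmarshal(rawManifest, &m); err != nil {
		return false, fmt.Errorf("decode manifest: %w", err)
	}
	// Indexing a nil map is safe in Go and yields the empty string.
	return m.Metadata.Annotations[cascadeDeletionAnnotation] == "orphan", nil
}
```

With this shape, a controller would call `shouldOrphan` once per manifest while handling Work deletion, and skip the member cluster delete call when it returns `true`.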

#### User usage example
Set the cascade deletion policy to `orphan`:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    propagationpolicy.karmada.io/name: foo
    propagationpolicy.karmada.io/namespace: default
    resourcetemplate.karmada.io/cascadedeletion: orphan
...
```

A variant of this approach adds a `CascadeDeletion` field to the Work API, so that `work.spec.workload.manifests` does not need to be parsed.

Work
```go
// WorkSpec defines the desired state of Work.
type WorkSpec struct {
	...
	// CascadeDeletion declares the cascade deletion strategy.
	// Defaults to nil, which is equivalent to background deletion.
	// +optional
	CascadeDeletion *CascadeDeletionPolicy `json:"cascadeDeletion,omitempty"`
}
```
The `binding-controller` needs to set the `CascadeDeletion` field in the Work object according to the resource annotation.

The `cluster-resource-binding-controller` needs to set the `CascadeDeletion` field in the Work object according to the resource annotation.

The `execution-controller` needs to perform resource deletion based on the `CascadeDeletion` field in Work.
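
As a rough sketch of how a controller might derive the Work field from the resource annotation, consider the Go snippet below. The `policyFromAnnotations` helper is a hypothetical name, and returning `nil` for any value other than `orphan` is an assumption; the type and constant names follow the API types proposed in this document.

```go
package main

// CascadeDeletionPolicy and its orphan value follow the names used in this proposal.
type CascadeDeletionPolicy string

const CascadeDeletionPolicyOrphan CascadeDeletionPolicy = "orphan"

// cascadeDeletionAnnotation is the annotation key named in this proposal.
const cascadeDeletionAnnotation = "resourcetemplate.karmada.io/cascadedeletion"

// policyFromAnnotations (hypothetical) converts the resource template
// annotation into the value for the Work's CascadeDeletion field.
// It returns nil when the annotation is absent or not "orphan", which leaves
// the default deletion behavior in place.
func policyFromAnnotations(annotations map[string]string) *CascadeDeletionPolicy {
	if annotations[cascadeDeletionAnnotation] != string(CascadeDeletionPolicyOrphan) {
		return nil
	}
	policy := CascadeDeletionPolicyOrphan
	return &policy
}
```

Returning a pointer keeps "not set" distinguishable from any explicit value, which matches the `+optional` pointer field in `WorkSpec`.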

### Solution Two: Extend the fields of PropagationPolicy/ClusterPropagationPolicy

This solution extends the `PropagationPolicy`/`ClusterPropagationPolicy` API with a new `cascadeDeletion` field. The field is passed through to `ResourceBinding`/`ClusterResourceBinding` and then to the Work object. Finally, the `execution-controller` determines the cascade deletion strategy from the value of the field in the Work object.

#### API changes

PropagationPolicy/ClusterPropagationPolicy
```go
type CascadeDeletionPolicy string

const (
	// CascadeDeletionPolicyOrphan orphans the dependents.
	CascadeDeletionPolicyOrphan CascadeDeletionPolicy = "orphan"
)

// PropagationSpec represents the desired behavior of PropagationPolicy.
type PropagationSpec struct {
	...
	// CascadeDeletion declares the cascade deletion strategy.
	// Defaults to nil, which is equivalent to background deletion.
	// +optional
	CascadeDeletion *CascadeDeletionPolicy `json:"cascadeDeletion,omitempty"`
}
```
ResourceBinding/ClusterResourceBinding
```go
// ResourceBindingSpec represents the expectation of ResourceBinding.
type ResourceBindingSpec struct {
	...

	// CascadeDeletion declares the cascade deletion strategy.
	// Defaults to nil, which is equivalent to background deletion.
	// +optional
	CascadeDeletion *CascadeDeletionPolicy `json:"cascadeDeletion,omitempty"`
}
```
Work
```go
// WorkSpec defines the desired state of Work.
type WorkSpec struct {
	...

	// CascadeDeletion declares the cascade deletion strategy.
	// Defaults to nil, which is equivalent to background deletion.
	// +optional
	CascadeDeletion *CascadeDeletionPolicy `json:"cascadeDeletion,omitempty"`
}
```
#### Controller logic changes
The `detector` needs to pass `CascadeDeletion` from `PropagationPolicy`/`ClusterPropagationPolicy` to `ResourceBinding`/`ClusterResourceBinding`.

The `binding-controller` needs to pass `CascadeDeletion` from `ResourceBinding` to Work.

The `cluster-resource-binding-controller` needs to pass `CascadeDeletion` from `ClusterResourceBinding` to Work.

The `execution-controller` needs to perform resource deletion based on the `CascadeDeletion` field in Work.
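
The hand-off between the three API objects could look roughly like the Go sketch below. The structs are trimmed to the single field under discussion, and `copyPolicy`, `syncBinding`, `syncWork`, and `orphanOnDelete` are hypothetical helper names rather than existing Karmada functions. Copying the pointed-to value instead of sharing the pointer is a design choice of this sketch, so that a later change to the policy object cannot silently alter an already created Work.

```go
package main

// CascadeDeletionPolicy and its orphan value follow the names used in this proposal.
type CascadeDeletionPolicy string

const CascadeDeletionPolicyOrphan CascadeDeletionPolicy = "orphan"

// The three specs are reduced to the one field this sketch is about.
type PropagationSpec struct {
	CascadeDeletion *CascadeDeletionPolicy
}

type ResourceBindingSpec struct {
	CascadeDeletion *CascadeDeletionPolicy
}

type WorkSpec struct {
	CascadeDeletion *CascadeDeletionPolicy
}

// copyPolicy (hypothetical) returns an independent copy of the policy value.
func copyPolicy(p *CascadeDeletionPolicy) *CascadeDeletionPolicy {
	if p == nil {
		return nil
	}
	c := *p
	return &c
}

// syncBinding sketches the detector step: policy -> binding.
func syncBinding(pp PropagationSpec, rb *ResourceBindingSpec) {
	rb.CascadeDeletion = copyPolicy(pp.CascadeDeletion)
}

// syncWork sketches the binding-controller step: binding -> Work.
func syncWork(rb ResourceBindingSpec, w *WorkSpec) {
	w.CascadeDeletion = copyPolicy(rb.CascadeDeletion)
}

// orphanOnDelete sketches the execution-controller decision: only an explicit
// "orphan" value retains the member cluster resources.
func orphanOnDelete(w WorkSpec) bool {
	return w.CascadeDeletion != nil && *w.CascadeDeletion == CascadeDeletionPolicyOrphan
}
```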

#### User usage example

Set the cascade deletion policy to `orphan`:
```yaml
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: nginx-propagation
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: nginx
  cascadeDeletion: orphan
```

### Solution Three: Extended by adding a new CRD

This solution adds a new CRD. Users create a CR (custom resource) of this CRD to describe the deletion strategy for the target resources.

#### API changes

```go
type CascadeDeletionPolicy struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	// Spec represents the desired cascade deletion behavior.
	Spec CascadeDeletionSpec `json:"spec"`

	// Status represents the status of the cascade deletion.
	// +optional
	Status CascadeDeletionStatus `json:"status,omitempty"`
}

type CascadeDeletionSpec struct {
	// CascadeDeletion declares the cascade deletion strategy.
	// Defaults to nil, which is equivalent to background deletion.
	// +optional
	CascadeDeletion *CascadeDeletionPolicy `json:"cascadeDeletion,omitempty"`

	// ResourceSelectors used to select resources.
	// Nil or empty selector is not allowed and doesn't mean match all kinds
	// of resources for security concerns that sensitive resources(like Secret)
	// might be accidentally propagated.
	// +required
	// +kubebuilder:validation:MinItems=1
	ResourceSelectors []ResourceSelector `json:"resourceSelectors"`
}

// ResourceSelector the resources will be selected.
type ResourceSelector struct {
	// APIVersion represents the API version of the target resources.
	// +required
	APIVersion string `json:"apiVersion"`

	// Kind represents the Kind of the target resources.
	// +required
	Kind string `json:"kind"`

	// Namespace of the target resource.
	// Default is empty, which means inherit from the parent object scope.
	// +optional
	Namespace string `json:"namespace,omitempty"`

	// Name of the target resource.
	// Default is empty, which means selecting all resources.
	// +optional
	Name string `json:"name,omitempty"`

	// A label query over a set of resources.
	// If name is not empty, labelSelector will be ignored.
	// +optional
	LabelSelector *metav1.LabelSelector `json:"labelSelector,omitempty"`
}

type CascadeDeletionStatus struct {
	...
}
```
Work
```go
// WorkSpec defines the desired state of Work.
type WorkSpec struct {
	// CascadeDeletion declares the cascade deletion strategy.
	// Defaults to nil, which is equivalent to background deletion.
	// +optional
	CascadeDeletion *CascadeDeletionPolicy `json:"cascadeDeletion,omitempty"`

	...
}
```
#### Controller logic changes
When creating or updating a Work object, the `binding-controller`/`cluster-resource-binding-controller` checks whether a `CascadeDeletionPolicy` is associated with the target resource. If one is found, its deletion policy is synced to the Work object.

The `execution-controller` performs resource deletion based on the `CascadeDeletion` field in the Work object.
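
One possible shape for the policy lookup is sketched below in Go. Several parts are assumptions and not part of the proposal: the `Policy`, `Resource`, and `DeletionStrategy` types and the `selects` and `strategyFor` functions are hypothetical names (`DeletionStrategy` stands in for the string value so it does not clash with the CRD kind name), label selectors are left out to keep the sketch dependency-free, an empty selector namespace is read as "the namespace of the policy object", and the first matching policy wins because the proposal does not define conflict resolution.

```go
package main

// DeletionStrategy (hypothetical name) is the string value such as "orphan".
type DeletionStrategy string

const DeletionStrategyOrphan DeletionStrategy = "orphan"

// ResourceSelector mirrors the selector in this proposal, without LabelSelector.
type ResourceSelector struct {
	APIVersion string
	Kind       string
	Namespace  string
	Name       string
}

// Policy (hypothetical) is a trimmed stand-in for the CascadeDeletionPolicy CR.
type Policy struct {
	Namespace         string // namespace the policy object lives in
	CascadeDeletion   *DeletionStrategy
	ResourceSelectors []ResourceSelector
}

// Resource (hypothetical) identifies the resource template being propagated.
type Resource struct {
	APIVersion string
	Kind       string
	Namespace  string
	Name       string
}

// selects reports whether one selector of a policy matches the resource.
func selects(sel ResourceSelector, policyNamespace string, r Resource) bool {
	if sel.APIVersion != r.APIVersion || sel.Kind != r.Kind {
		return false
	}
	ns := sel.Namespace
	if ns == "" {
		// Assumption: an empty namespace inherits the policy's own namespace.
		ns = policyNamespace
	}
	if ns != r.Namespace {
		return false
	}
	// An empty name selects all resources of the given kind.
	return sel.Name == "" || sel.Name == r.Name
}

// strategyFor returns the strategy of the first policy that matches the
// resource, or nil when no policy matches (default deletion behavior).
func strategyFor(policies []Policy, r Resource) *DeletionStrategy {
	for _, p := range policies {
		for _, sel := range p.ResourceSelectors {
			if selects(sel, p.Namespace, r) {
				return p.CascadeDeletion
			}
		}
	}
	return nil
}
```

A binding controller following this sketch would call `strategyFor` while building the Work and copy the result into the Work's `CascadeDeletion` field.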

#### User usage example

Set the cascade deletion policy to `orphan`:

```yaml
apiVersion: policy.karmada.io/v1alpha1
kind: CascadeDeletionPolicy
metadata:
  name: foo
spec:
  cascadeDeletion: orphan
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: foo
      namespace: default
```

### Solution Four: Extended by Annotation & Extend the fields of PropagationPolicy/ClusterPropagationPolicy

This is equivalent to supporting both Solution One and Solution Two.

### Solution comparison

| Name           | Supported control plane resources | Extended API resources | User learning cost | Per-resource orphan deletion |
|----------------|-----------------------------------|------------------------|--------------------|------------------------------|
| Solution One   | resource template                 | None                   | Lowest             | Yes                          |
| Solution Two   | resource template                 | PP/CPP/RB/CRB/Work     | Lowest             | No                           |
| Solution Three | all resources                     | new CRD/Work           | Highest            | Yes                          |
| Solution Four  | all resources                     | PP/CPP/RB/CRB/Work     | Lower              | Yes                          |

Solution One disadvantages:
- Handling the decision in the `execution-controller` is sufficient for the resource template migration and rollback scenario. However, resources that are not distributed through a PropagationPolicy, such as Namespace and FederatedResourceQuota, would need the same logic implemented in their own controllers.

Solution Two disadvantages:
- A deletion policy cannot be specified for resources that are not distributed through a PropagationPolicy, such as Namespace and FederatedResourceQuota.
- When one policy matches multiple resources, orphan deletion cannot be applied to a single resource.

Solution Three disadvantages:
- It increases the learning cost for users and the number of resources in the Karmada control plane.

Solution Four disadvantages:
- Supporting both approaches at once is somewhat redundant.

### The cascading deletion policies of dependent resources and main resources are not forcibly bound

Dependent resources may be shared by multiple resource templates, in which case it is hard to decide whose deletion policy should apply to them. The deletion policy of a dependent resource is therefore not bound to that of its main resource. Users decide it themselves, which gives better flexibility and extensibility.

### The cascade deletion strategy for namespace and CRD resources is still specified by the user

Distinguishing workload types in the execution stage is costly, and applying a separate cascade deletion strategy to these two kinds of resources would add to the user's learning cost. Users who need to retain namespace and CRD resources in member clusters can explicitly set their cascade deletion strategy to `orphan`, which keeps it consistent with the strategy of the resource template.

### karmadactl adds command line parameters related to cascade deletion

`karmadactl delete deployment <name> --cascade=orphan` sets the cascade deletion policy on the resource and then deletes it.


Open question: when `cascade=orphan` is used, is it enough to clear only the `karmada.io/managed` label from the workload in the member cluster?
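
To make the two readings of this question concrete, the Go sketch below contrasts a narrow cleanup that removes only the `karmada.io/managed` label with a broader one that removes every key under a `karmada.io` prefix. Both function names are hypothetical, and the prefix rule in the broader variant is an assumption made for illustration; the proposal does not define which keys count as Karmada-attached.

```go
package main

import "strings"

// managedLabel is the label named in the question above.
const managedLabel = "karmada.io/managed"

// withoutManagedLabel (hypothetical) returns a copy of labels with only the
// karmada.io/managed key removed. The input map is not modified.
func withoutManagedLabel(labels map[string]string) map[string]string {
	out := make(map[string]string, len(labels))
	for k, v := range labels {
		if k != managedLabel {
			out[k] = v
		}
	}
	return out
}

// withoutKarmadaKeys (hypothetical) returns a copy of a label or annotation
// map without any key whose prefix, the part before "/", is karmada.io or a
// subdomain of it. The prefix rule is an assumption of this sketch.
func withoutKarmadaKeys(meta map[string]string) map[string]string {
	out := make(map[string]string, len(meta))
	for k, v := range meta {
		prefix, _, hasPrefix := strings.Cut(k, "/")
		if hasPrefix && (prefix == "karmada.io" || strings.HasSuffix(prefix, ".karmada.io")) {
			continue
		}
		out[k] = v
	}
	return out
}
```

The narrow variant leaves keys such as `propagationpolicy.karmada.io/name` on the retained resource, while the broader variant removes them too; which behavior is wanted is exactly what the question asks.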

### Test Plan

TODO

## Alternatives