Add more documentation about the KCP pre-terminate hook
Signed-off-by: Stefan Büringer <buringerst@vmware.com>
sbueringer committed Sep 9, 2024
1 parent a8ae016 commit c7eba21
Showing 2 changed files with 20 additions and 0 deletions.
10 changes: 10 additions & 0 deletions api/v1beta1/machine_types.go
@@ -61,6 +61,16 @@ const (
// search each annotation for during the pre-terminate.delete lifecycle hook
// to pause reconciliation of deletion. These hooks will prevent removal of
// an instance from an infrastructure provider until all are removed.
//
// Notes for Machines managed by KCP (starting with Cluster API v1.8.2):
// * KCP adds its own pre-terminate hook on all Machines it controls. This is done to ensure it can later remove
//   the etcd member right before Machine termination (i.e. before InfraMachine deletion). If the etcd member were
//   removed earlier, kubelet would start failing and the Node drain would not work.
// * Starting with Kubernetes v1.31, the KCP pre-terminate hook waits for all other pre-terminate hooks to finish so
//   that it runs last (thus ensuring that kubelet is still working while other pre-terminate hooks run). This is only done
//   for v1.31 or above because the kubeadm ControlPlaneKubeletLocalMode feature was introduced with kubeadm 1.31. This feature
//   configures the kubelet to communicate with the local apiserver, and only because of that does the kubelet start failing
//   immediately after the etcd member is removed. The ControlPlaneKubeletLocalMode feature is needed with 1.31 to adhere
//   to the kubelet skew policy.
PreTerminateDeleteHookAnnotationPrefix = "pre-terminate.delete.hook.machine.cluster.x-k8s.io"

// MachineCertificatesExpiryDateAnnotation annotation specifies the expiry date of the machine certificates in RFC3339 format.
10 changes: 10 additions & 0 deletions docs/book/src/developer/providers/migrations/v1.8-to-v1.9.md
@@ -17,6 +17,16 @@ maintainers of providers and consumers of our Go API.

### Other

- Notes for Machines managed by KCP (starting with Cluster API v1.8.2):
  - KCP adds its own pre-terminate hook on all Machines it controls. This is done to ensure it can later remove
    the etcd member right before Machine termination (i.e. before InfraMachine deletion). If the etcd member were
    removed earlier, kubelet would start failing and the Node drain would not work.
  - Starting with Kubernetes v1.31, the KCP pre-terminate hook waits for all other pre-terminate hooks to finish so
    that it runs last (thus ensuring that kubelet is still working while other pre-terminate hooks run). This is only done
    for v1.31 or above because the kubeadm ControlPlaneKubeletLocalMode feature was introduced with kubeadm 1.31. This feature
    configures the kubelet to communicate with the local apiserver, and only because of that does the kubelet start failing
    immediately after the etcd member is removed. The ControlPlaneKubeletLocalMode feature is needed with 1.31 to adhere
    to the kubelet skew policy.
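
The ordering rule described in the notes above can be sketched roughly as follows. This is a minimal illustration and not the actual KCP controller code: the `kcpHook` annotation key, the helper names, and passing the Kubernetes minor version as a plain integer are all assumptions made for the example. Only the annotation prefix is taken from `machine_types.go`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Prefix taken from PreTerminateDeleteHookAnnotationPrefix in machine_types.go.
const preTerminatePrefix = "pre-terminate.delete.hook.machine.cluster.x-k8s.io"

// kcpHook is an assumed key for KCP's own hook, used here for illustration only.
const kcpHook = preTerminatePrefix + "/kcp-cleanup"

// otherPreTerminateHooks returns, in sorted order, all pre-terminate hook
// annotations on a Machine except the (assumed) KCP hook.
func otherPreTerminateHooks(annotations map[string]string) []string {
	var hooks []string
	for key := range annotations {
		if strings.HasPrefix(key, preTerminatePrefix) && key != kcpHook {
			hooks = append(hooks, key)
		}
	}
	sort.Strings(hooks)
	return hooks
}

// kcpHookMayRun reports whether the KCP hook may proceed with removing the
// etcd member. For Kubernetes minor versions below 31 it does not wait; from
// 1.31 on it waits until no other pre-terminate hooks remain.
func kcpHookMayRun(kubernetesMinor int, annotations map[string]string) bool {
	if kubernetesMinor < 31 {
		return true
	}
	return len(otherPreTerminateHooks(annotations)) == 0
}

func main() {
	annotations := map[string]string{
		kcpHook:                              "",
		preTerminatePrefix + "/backup-agent": "", // hypothetical third-party hook
	}

	fmt.Println(kcpHookMayRun(30, annotations)) // true: no ordering before v1.31
	fmt.Println(kcpHookMayRun(31, annotations)) // false: another hook is still present

	// Once the other controller removes its hook, the KCP hook can run last.
	delete(annotations, preTerminatePrefix+"/backup-agent")
	fmt.Println(kcpHookMayRun(31, annotations)) // true
}
```

For providers this means that, on v1.31 and above, a custom pre-terminate hook on a KCP-managed Machine can still rely on a working kubelet, because the etcd member is only removed after that hook's annotation is gone.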

### Suggested changes for providers

- The Errors package was created when Cluster API provider implementations ran as machineActuators that needed to vendor core Cluster API to function. There are no usage recommendations today, and its value is questionable since we moved to CRDs that interoperate mostly via conditions. Instead, we plan to drop the dedicated semantics for terminal failures and keep improving the Machine lifecycle signal through conditions. Therefore the Errors package [has been deprecated in v1.8](https://github.com/kubernetes-sigs/cluster-api/issues/10784). It is recommended to remove any usage of the currently exported variables.
