Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

do not annotate the Node CRI socket in the "init upload-config" phase #1924

Closed
ereslibre opened this issue Nov 18, 2019 · 24 comments
Closed
Labels
kind/design Categorizes issue or PR as related to design. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.
Milestone

Comments

@ereslibre
Copy link
Contributor

ereslibre commented Nov 18, 2019

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Context

During Kubernetes upgrades using kubeadm we have the desire to patch the Kubelet configuration. While we could write some flags on a systemd unit to override some flags, the Kubelet flags are deprecated (at least the ones relevant to us at this point) and they will be gone in favour of configuration-only anytime soon.

The preferred way is to use the Kubelet configuration file instead. When upgrading a minor Kubernetes version, the following workflow could be performed:

  • Upgrade kubeadm to the latest desired version (x.y)
  • Run kubeadm init phase upload-config kubelet, so the default kubelet-config-x.y ConfigMap is created
  • Patch the kubelet-config.x.y ConfigMap to include the desired changes
  • Run kubeadm upgrade apply and kubeadm upgrade node on the different nodes comprising the cluster

What happened?

When running kubeadm init phase upload-config kubelet, the CRI socket annotation logic runs as well as part of this phase, and that needs the node registration (thus, it needs a configuration file).

If you run this command without a configuration file you'll see an error like so:

# kubeadm init phase upload-config kubelet
[kubelet] Creating a ConfigMap "kubelet-config-1.16" in namespace kube-system with the configuration for the kubelets in the cluster
error execution phase upload-config/kubelet: Error writing Crisocket information for the control-plane node: timed out waiting for the condition
To see the stack trace of this error execute with --v=5 or higher

What you expected to happen?

I would expect the CRI annotation to not happen during this phase, so it's possible for the upload-config kubelet phase to just upload the config and don't attempt to annotate the CRI socket in any way.

How to reproduce it (as minimally and precisely as possible)?

Run kubeadm init phase upload-config kubelet without a --config parameter.

Anything else we need to know?

The logic is triggered here:

https://github.com/kubernetes/kubernetes/blob/beaf3a2f04f2882b9e1de99e4b4f2f06995e3d82/cmd/kubeadm/app/cmd/phases/init/uploadconfig.go#L127-L130

@ereslibre
Copy link
Contributor Author

/priority important-longterm
/kind bug

Added the bug kind, but at this stage this is mostly a question to understand whether this is a bug or just something we can improve.

@k8s-ci-robot k8s-ci-robot added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. kind/bug Categorizes issue or PR as related to a bug. labels Nov 18, 2019
@ereslibre
Copy link
Contributor Author

/assign @neolit123 @fabriziopandini @rosti @yastij

@ereslibre
Copy link
Contributor Author

Submitted kubernetes/kubernetes#85419 for feedback.

/lifecycle active

@k8s-ci-robot k8s-ci-robot added the lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. label Nov 18, 2019
@neolit123 neolit123 added this to the v1.18 milestone Nov 18, 2019
@fabriziopandini
Copy link
Member

I'm a little bit torn about this.
Trying to conflate "change the cluster" user stories into upgrades, in my opinion, is a no-no, but considering that currently there are no valid alternatives in kubeadm I'm supportive to everything can easy the pain for users at the only condition that we are crystal clear in explaining this is not the target solution.

Getting into the detail of this use case, what are the expectation when running
kubeadm init phase upload-config kubelet without a --config parameter. Is there a reason why we can get this fixed instead of dividing current behavior into sub-phases?

@neolit123
Copy link
Member

/remove-kind bug
/kind design

@k8s-ci-robot k8s-ci-robot added kind/design Categorizes issue or PR as related to design. and removed kind/bug Categorizes issue or PR as related to a bug. labels Dec 1, 2019
@neolit123 neolit123 changed the title runUploadKubeletConfig annotates the CRI socket do not annotate the Node CRI socket in the "init upload-config" phase Jan 1, 2020
@neolit123 neolit123 modified the milestones: v1.18, v1.19 Mar 8, 2020
@neolit123 neolit123 removed the lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. label Jun 2, 2020
@neolit123 neolit123 removed this from the v1.19 milestone Jul 27, 2020
@mtaufen
Copy link

mtaufen commented Mar 29, 2022

AIUI it's backlog work in SIG-Node. If someone's interested in helping move componentconfig forward, I'm happy to help. IMO it should be at least a little easier now that dynamic Kubelet config is being removed. @SergeyKanzhelev @dchen1107

@neolit123
Copy link
Member

neolit123 commented Mar 29, 2022

Are there any plans for adding corresponding component config fields to the --container-runtime and --container-runtime-endpoint command line parameters of the kubelet

The process of migration kubelet flags into config hangs for a period

the --container-runtime flag is going to be removed eventually as its only possible value after the dockershim extraction becomes remote.

there is already a PR for the CRI endpoints to be added in config:
kubernetes/kubernetes#108734
it probably won't make 1.24 as code freeze is today.

if something like that lands, in theory we don't need the CRI socket annotation on the Node object that kubeadm does any more. however, if we plan to remove it, it opens up for a redesign requirement on the kubeadm side.

  • currently users can use any CRI on different hosts and that CRI endpoint would be written on the Node object for that host.
  • if the CRI field is global in the KubeletConfiguration that kubeadm manages, user settings during kubeadm join would have to override the downloaded from cluster global configuration before its written down on disk for the kubelet to consume with --config.
  • kubeadm upgrade re-downloads the KubeletConfiguration from the cluster on each upgrade. but the user settings of preferred CRI socket would only be persisted in the kubelet config file on disk.
  • this would require kubeadm to store the CRI socket configuration from the disk to be able to apply the host specific configuration.
  • in the process, kubeadm would also have to likely stop passing the flag for CRI endpoint and assume the fields in config are set.

such a change would deserve a kubeadm KEP or a design doc at minimum.
would probably end up being a multi-release transition as well.

@neolit123 neolit123 added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Mar 29, 2022
@neolit123 neolit123 modified the milestones: v1.24, v1.25 Mar 29, 2022
@pacoxu
Copy link
Member

pacoxu commented Apr 2, 2022

AIUI it's backlog work in SIG-Node. If someone's interested in helping move componentconfig forward, I'm happy to help. IMO it should be at least a little easier now that dynamic Kubelet config is being removed.

It would be a good approach.

such a change would deserve a kubeadm KEP or a design doc at minimum.
would probably end up being a multi-release transition as well.

I would like the KEP after the kubelet flag-to-configuration migration.

@pacoxu
Copy link
Member

pacoxu commented Nov 6, 2022

kubernetes/kubernetes#112136 was opened for the kubelet configuration migration and needs approval for a period. is now merged in v1.27 release cycle.

@neolit123 neolit123 modified the milestones: v1.26, v1.27 Nov 21, 2022
@pacoxu
Copy link
Member

pacoxu commented Mar 13, 2023

/assign

As this may need a KEP, I will start it once I have time after the code freeze. The target release may be 1.28.

@neolit123
Copy link
Member

consolidating into
#3042
there is plan to move away from the annotation entirely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/design Categorizes issue or PR as related to design. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.
Projects
None yet