Add WaitForKubeAPIServerCertRenewal helper
During 4.17.0-ec.1 testing we found that the kubelet certs may still be
valid while the aggregator one has expired. In that case the apiserver
is not able to get the node info and fails with the following error.

```
INFO Verifying validity of the kubelet certificates...
DEBU Running SSH command: date --date="$(sudo openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -enddate | cut -d= -f 2)" --iso-8601=seconds
DEBU SSH command results: err: <nil>, output: 2025-07-05T03:48:39+00:00
DEBU Running SSH command: date --date="$(sudo openssl x509 -in /var/lib/kubelet/pki/kubelet-server-current.pem -noout -enddate | cut -d= -f 2)" --iso-8601=seconds
DEBU SSH command results: err: <nil>, output: 2025-07-05T03:49:24+00:00
DEBU Running SSH command: date --date="$(sudo openssl x509 -in /etc/kubernetes/static-pod-resources/kube-apiserver-certs/configmaps/aggregator-client-ca/ca-bundle.crt -noout -enddate | cut -d= -f 2)" --iso-8601=seconds
DEBU SSH command results: err: <nil>, output: 2024-07-11T05:50:37+00:00
DEBU Certs have expired, they were valid till: 11 Jul 24 05:50 +0000

DEBU Running SSH command: timeout 5s oc get nodes --context admin --cluster crc --kubeconfig /opt/kubeconfig
DEBU SSH command results: err: Process exited with status 1, output:
DEBU E0722 10:21:40.601631   10967 memcache.go:265] couldn't get current server API group list: Get "https://api.crc.testing:6443/api?timeout=32s": tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2024-07-22T10:21:40Z is after 2024-07-10T05:27:06Z
E0722 10:21:40.604575   10967 memcache.go:265] couldn't get current server API group list: Get "https://api.crc.testing:6443/api?timeout=32s": tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2024-07-22T10:21:40Z is after 2024-07-10T05:27:06Z
```

This PR makes sure we also wait for the API server related certs to be
renewed before checking that the apiserver is responding.
praveenkumar committed Jul 30, 2024
1 parent b88f1e1 commit 6af4fb9
Showing 2 changed files with 6 additions and 2 deletions.
6 changes: 5 additions & 1 deletion pkg/crc/cluster/cert_renewal.go
@@ -43,7 +43,7 @@ func approvePendingCSRs(ctx context.Context, ocConfig oc.Config, expectedSignerN
}, time.Second*5)
}

-func ApproveCSRAndWaitForCertsRenewal(ctx context.Context, sshRunner *ssh.Runner, ocConfig oc.Config, client, server bool) error {
+func ApproveCSRAndWaitForCertsRenewal(ctx context.Context, sshRunner *ssh.Runner, ocConfig oc.Config, client, server, aggregratorClient bool) error {
const (
kubeletClientSignerName = "kubernetes.io/kube-apiserver-client-kubelet"
kubeletServingSignerName = "kubernetes.io/kubelet-serving"
@@ -77,6 +77,10 @@ func ApproveCSRAndWaitForCertsRenewal(ctx context.Context, sshRunner *ssh.Runner
logging.Info("Kubelet serving certificate has expired, waiting for automatic renewal... [will take up to 5 minutes]")
return crcerrors.Retry(ctx, 5*time.Minute, waitForCertRenewal(sshRunner, KubeletServerCert), time.Second*5)
}
+if aggregratorClient {
+logging.Info("Kube API server certificate has expired, waiting for automatic renewal... [will take up to 8 minutes]")
+return crcerrors.Retry(ctx, 8*time.Minute, waitForCertRenewal(sshRunner, AggregatorClientCert), time.Second*5)
+}
return nil
}

2 changes: 1 addition & 1 deletion pkg/crc/machine/start.go
@@ -552,7 +552,7 @@ func (client *client) Start(ctx context.Context, startConfig types.StartConfig)

ocConfig := oc.UseOCWithSSH(sshRunner)

-if err := cluster.ApproveCSRAndWaitForCertsRenewal(ctx, sshRunner, ocConfig, certsExpired[cluster.KubeletClientCert], certsExpired[cluster.KubeletServerCert]); err != nil {
+if err := cluster.ApproveCSRAndWaitForCertsRenewal(ctx, sshRunner, ocConfig, certsExpired[cluster.KubeletClientCert], certsExpired[cluster.KubeletServerCert], certsExpired[cluster.AggregatorClientCert]); err != nil {
logBundleDate(vm.bundle)
return nil, errors.Wrap(err, "Failed to renew TLS certificates: please check if a newer CRC release is available")
}