Cleanup of kind cluster #6768
Labels
area/test/jenkins
Issue about jenkins setup code
duplicate
This issue or pull request already exists
kind/bug
Categorizes issue or PR as related to a bug.
Describe the bug
CI jobs fail because of a panic during cleanup of existing kind clusters. The current cleanup function tries to get the creation timestamp of every available kind cluster with the command
kubectl get nodes --context kind-$kind_cluster_name -o json -l node-role.kubernetes.io/control-plane | jq -r '.items[0].metadata.creationTimestamp'
Sometimes another job on the same VM has just started and its cluster creation is still in progress, so the context for that cluster is not ready yet. When a second job reaches this cleanup step before creating its own cluster, it gets stuck on the command, panics, and fails.
This is not limited to parallel job runs. If a job is aborted during the cluster creation phase, the context of that kind cluster never becomes available. Any new job that later runs the cleanup function on this testbed will fail, because it tries to use the context of every cluster listed by
kind get clusters
with the above command, and panics.
To Reproduce
Trigger two kind jobs at the same time on the same VM. Alternatively, trigger one job, abort it as soon as cluster creation starts, and then trigger a new job on the same testbed. In both cases the second job fails because of the panic.
Expected
The jobs should not fail, and cluster creation should succeed.
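One way the cleanup could tolerate a missing or half-created context is sketched below. This is only an illustration under assumptions, not the existing clean_kind code: the function names `kind_cluster_creation_timestamp` and `list_cluster_timestamps` are hypothetical, and the sketch uses kubectl's built-in `-o jsonpath` output in place of the `jq` pipeline so that a failed lookup shows up directly in kubectl's exit status.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, not taken from the real scripts.

# Print the control-plane node's creationTimestamp for a kind cluster.
# Returns non-zero if the context is unusable or no timestamp is available yet.
kind_cluster_creation_timestamp() {
    local cluster_name="$1"
    local timestamp

    # A missing or not-yet-ready context makes kubectl exit non-zero;
    # report that to the caller instead of aborting the whole job.
    if ! timestamp=$(kubectl get nodes --context "kind-${cluster_name}" \
            -l node-role.kubernetes.io/control-plane \
            -o jsonpath='{.items[0].metadata.creationTimestamp}' 2>/dev/null); then
        return 1
    fi

    # An empty result is treated the same way as a failed lookup.
    [ -n "$timestamp" ] || return 1
    printf '%s\n' "$timestamp"
}

# Walk every cluster reported by kind and mark the ones whose age is unknown.
list_cluster_timestamps() {
    local cluster_name timestamp
    for cluster_name in $(kind get clusters 2>/dev/null); do
        if timestamp=$(kind_cluster_creation_timestamp "$cluster_name"); then
            echo "${cluster_name} ${timestamp}"
        else
            echo "${cluster_name} unknown"
        fi
    done
}
```

A caller could then decide what to do with clusters reported as `unknown`: skipping them is safer when another job may still be creating the cluster, while deleting them would clear leftovers from an aborted job. Which policy fits this testbed is a design decision the sketch does not settle.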
Actual behavior
The job fails with a panic in the cleanup step.
Additional context
Reference to the current implementation of the cleanup function: clean_kind
#5753