fix minikube start failure #2771

Open · wants to merge 7 commits into base: main
9 changes: 5 additions & 4 deletions .github/actions/e2e-deploy-vald-readreplica/action.yaml
@@ -69,8 +69,6 @@ runs:

sleep 3

kubectl wait --for=condition=ready pod -l ${WAIT_FOR_SELECTOR} --timeout=${WAIT_FOR_TIMEOUT}

kubectl get pods

podname=`kubectl get pods --selector=${WAIT_FOR_SELECTOR} | tail -1 | awk '{print $1}'`
@@ -87,12 +85,15 @@ runs:
run: |
make k8s/vald-readreplica/deploy HELM_VALUES=${VALUES} HELM_EXTRA_OPTIONS="${HELM_EXTRA_OPTIONS}"

stern vald-discoverer &
stern vald-lb-gateway &
stern vald-agent-readreplica &
sleep 3

kubectl wait --for=condition=ready pod -l ${WAIT_FOR_SELECTOR} --timeout=${WAIT_FOR_TIMEOUT}

kubectl get pods

kubectl wait --for=condition=ready pod -l ${WAIT_FOR_SELECTOR} --timeout=600s

podname=`kubectl get pods --selector=${WAIT_FOR_SELECTOR} | tail -1 | awk '{print $1}'`
echo "POD_NAME=${podname}" >> $GITHUB_OUTPUT
env:
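The pod-name extraction in the step above can be exercised against canned `kubectl get pods` output (the pod names below are illustrative; real output comes from the cluster). Note that `tail -1` simply picks the last-listed pod:

```shell
# Simulated `kubectl get pods --selector=...` output; in the action this comes
# from the live cluster.
pods='NAME                       READY   STATUS    RESTARTS   AGE
vald-agent-readreplica-0   1/1     Running   0          1m
vald-agent-readreplica-1   1/1     Running   0          1m'

# Same extraction as the action: last line, first column.
podname=$(echo "$pods" | tail -1 | awk '{print $1}')
echo "$podname"
```

The header line survives `tail -1` only when no pods match, so an empty selector result would yield `AGE` here; the action assumes at least one matching pod exists.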
6 changes: 3 additions & 3 deletions .github/actions/e2e-deploy-vald/action.yaml
@@ -79,7 +79,8 @@ runs:

sleep 3

kubectl wait --for=condition=ready pod -l ${WAIT_FOR_SELECTOR} --timeout=${WAIT_FOR_TIMEOUT}
kubectl wait --for=condition=Ready pod -l ${WAIT_FOR_SELECTOR} --timeout=${WAIT_FOR_TIMEOUT}
kubectl wait --for=condition=ContainersReady pod -l ${WAIT_FOR_SELECTOR} --timeout=${WAIT_FOR_TIMEOUT}

kubectl get pods

@@ -96,10 +97,9 @@ runs:
if: ${{ inputs.use_local_charts == 'true' }}
run: |
make k8s/vald/deploy HELM_VALUES=${VALUES} HELM_EXTRA_OPTIONS="${HELM_EXTRA_OPTIONS}"

sleep 3

kubectl wait --for=condition=ready pod -l ${WAIT_FOR_SELECTOR} --timeout=${WAIT_FOR_TIMEOUT}
kubectl wait --for=condition=Ready pod -l ${WAIT_FOR_SELECTOR} --timeout=${WAIT_FOR_TIMEOUT}

kubectl get pods

10 changes: 10 additions & 0 deletions .github/actions/setup-e2e/action.yaml
@@ -36,6 +36,10 @@ inputs:
description: "If minikube is not required, set this to true and set require_k3d to false"
required: false
default: "false"
require_kind:
description: "If kind is not required, set this to true and set require_k3d to false"
required: false
default: "false"
Comment on lines +39 to +42
⚠️ Potential issue

Add mutual exclusivity validation for cluster tools

The new require_kind parameter is well documented, but there's no validation preventing multiple cluster tools (kind, minikube, k3d) from being enabled simultaneously, which could lead to resource conflicts.

Consider adding a composite run step at the beginning to validate inputs:

- name: Validate cluster tool inputs
  shell: bash
  run: |
    ENABLED_TOOLS=0
    [[ "${{ inputs.require_kind }}" == "true" ]] && ((ENABLED_TOOLS++))
    [[ "${{ inputs.require_minikube }}" == "true" ]] && ((ENABLED_TOOLS++))
    [[ "${{ inputs.require_k3d }}" == "true" ]] && ((ENABLED_TOOLS++))
    if [[ $ENABLED_TOOLS -gt 1 ]]; then
      echo "Error: Only one cluster tool can be enabled at a time"
      exit 1
    fi
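The same counting logic can be sketched as a standalone POSIX-shell function, with the three inputs passed as the literal strings "true"/"false" they would hold in GitHub Actions (function name and argument order are illustrative):

```shell
# Sketch of the suggested mutual-exclusivity check, outside GitHub Actions.
validate_cluster_tools() {
  enabled=0
  for flag in "$@"; do
    # Count each tool whose input is the string "true".
    [ "$flag" = "true" ] && enabled=$((enabled + 1))
  done
  if [ "$enabled" -gt 1 ]; then
    echo "Error: Only one cluster tool can be enabled at a time" >&2
    return 1
  fi
  return 0
}

# kind=true, minikube=false, k3d=false: exactly one tool, so this succeeds.
validate_cluster_tools "true" "false" "false" && echo "single tool: ok"
```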

ingress_port:
description: 'If it is not "0", ingress will be exposed to the specified port'
required: false
@@ -91,6 +95,12 @@ runs:
run: |
make minikube/install
make minikube/start
- name: Setup kind environment
if: ${{ inputs.require_kind == 'true' }}
shell: bash
run: |
make kind/install
make kind/vs/start
Comment on lines +98 to +103
🛠️ Refactor suggestion

Add error handling for kind setup

The kind setup step should include error handling and verification of the cluster's readiness.

Consider enhancing the step:

-    - name: Setup kind environment
-      if: ${{ inputs.require_kind == 'true' }}
-      shell: bash
-      run: |
-        make kind/install
-        make kind/vs/start
+    - name: Setup kind environment
+      if: ${{ inputs.require_kind == 'true' }}
+      shell: bash
+      run: |
+        make kind/install || exit 1
+        make kind/vs/start || exit 1
+        # Verify cluster is ready
+        kubectl wait --for=condition=ready node --all --timeout=90s || exit 1

- name: Check Kubernetes cluster
shell: bash
run: |
4 changes: 3 additions & 1 deletion .github/helm/values/values-readreplica.yaml
@@ -16,7 +16,7 @@

defaults:
logging:
level: info
level: debug
networkPolicy:
enabled: true
gateway:
@@ -59,6 +59,8 @@ agent:
snapshot_classname: "csi-hostpath-snapclass"
hpa:
enabled: true
name: vald-agent-readreplica
volume_name: vald-agent-readreplica-pvc
discoverer:
minReplicas: 1
hpa:
2 changes: 1 addition & 1 deletion .github/workflows/_detect-ci-container.yaml
@@ -41,7 +41,7 @@ jobs:
- name: Get Docker image tag from detection result
id: get_tag_name
run: |
TAG=$(echo "$TAGS" | awk '{print $1}' | awk -F '=' '{print $2}')
TAG=$(echo "$TAGS" | awk '{print $1}' | awk -F '=' '{print $2}' | sed -e 's/"//g')
echo "TAG=${TAG}"
echo "TAG=${TAG}" >> $GITHUB_OUTPUT
env:
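The effect of the added `sed` can be seen on a sample detection result (the tag value below is made up; the real `TAGS` string comes from the detect workflow). Without the `sed`, the surrounding double quotes survive into `$TAG`:

```shell
# Hypothetical detection output in KEY="value" form.
TAGS='TAG="nightly" LABEL="ci"'

# Same pipeline as the workflow: first field, value after '=', quotes stripped.
TAG=$(echo "$TAGS" | awk '{print $1}' | awk -F '=' '{print $2}' | sed -e 's/"//g')
echo "TAG=${TAG}"
```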
11 changes: 2 additions & 9 deletions .github/workflows/e2e.yaml
@@ -315,14 +315,7 @@ jobs:
uses: ./.github/actions/setup-e2e
with:
require_k3d: "false"
require_minikube: "true"
- name: Deploy Vald
id: deploy_vald
uses: ./.github/actions/e2e-deploy-vald
with:
helm_extra_options: ${{ steps.setup_e2e.outputs.HELM_EXTRA_OPTIONS }}
values: .github/helm/values/values-readreplica.yaml
wait_for_selector: app=vald-agent
require_kind: "true"
- name: Deploy Vald Read Replica
id: deploy_vald_readreplica
uses: ./.github/actions/e2e-deploy-vald-readreplica
@@ -342,7 +335,7 @@
E2E_UPDATE_COUNT=100 \
E2E_UPSERT_COUNT=100 \
E2E_REMOVE_COUNT=100 \
E2E_WAIT_FOR_CREATE_INDEX_DURATION=3m \
E2E_WAIT_FOR_CREATE_INDEX_DURATION=10m \
E2E_TARGET_POD_NAME=${POD_NAME} \
E2E_TARGET_NAMESPACE=default \
e2e/readreplica
4 changes: 2 additions & 2 deletions .gitignore
@@ -59,8 +59,8 @@ telepresence.log

# Generated by Cargo
# will have compiled files and executables
debug/
target/
rust/debug/
rust/target/
Comment on lines +62 to +63
💡 Codebase verification

Update .gitignore to handle additional target directories

The current .gitignore entries for Rust build artifacts are incomplete. While we have a proper Rust project structure under rust/, there are additional locations with target and debug directories:

  • internal/k8s/vald/mirror/target/
  • k8s/debug/

These directories don't contain any Rust files, suggesting they're unrelated to Rust builds, but the current .gitignore entries might miss actual Rust artifacts in subdirectories like rust/libs/*/target/.

Suggested changes:

  • Update the entries to cover all Rust project subdirectories:
    rust/**/debug/
    rust/**/target/
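Whether `rust/**/target/` really covers nested crates can be checked with `git check-ignore` in a scratch repository (the crate path below is illustrative). Note that in gitignore syntax `**` also matches zero directories, so `rust/target/` itself stays covered:

```shell
# Scratch repo to confirm the suggested globs match nested build directories.
tmp=$(mktemp -d)
cd "$tmp"
git init -q
printf 'rust/**/debug/\nrust/**/target/\n' > .gitignore
mkdir -p rust/libs/ngt/target
touch rust/libs/ngt/target/lib.rlib

# check-ignore prints the path (and exits 0) when the file is ignored.
ignored=$(git check-ignore rust/libs/ngt/target/lib.rlib)
echo "$ignored"
```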
    
🔗 Analysis chain

LGTM! Verify Rust project structure alignment.

The more specific paths for Rust build artifacts provide better organization. However, let's verify that these paths align with the project structure.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the existence and structure of Rust directories

# Check if rust directory exists and contains expected subdirectories
if [ -d "rust" ]; then
  echo "Rust directory structure:"
  fd . rust -t d -d 2
  
  # Check for any misplaced debug/target directories outside rust/
  echo -e "\nChecking for misplaced Rust build directories:"
  fd -t d "^(debug|target)$" -E "rust/*"
fi

Length of output: 335


Script:

#!/bin/bash
# Let's check if there are any Rust-related files in the directories with debug/target
echo "Checking for Rust files in k8s/debug/:"
fd -e rs -e toml . k8s/debug/

echo -e "\nChecking for Rust files in internal/k8s/vald/mirror/target/:"
fd -e rs -e toml . internal/k8s/vald/mirror/target/

echo -e "\nChecking Rust project structure:"
fd -e rs -e toml . rust/

echo -e "\nChecking for Cargo.toml files:"
fd Cargo.toml

Length of output: 1240


# Remove Cargo.lock from gitignore if creating an executable, leave it for libraries
# More information here https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html
14 changes: 4 additions & 10 deletions Makefile.d/e2e.mk
@@ -145,17 +145,11 @@ e2e/actions/run/job: \
## run GitHub Actions E2E test (Stream CRUD with read replica )
e2e/actions/run/readreplica: \
hack/benchmark/assets/dataset/$(E2E_DATASET_NAME) \
minikube/restart
kind/vs/restart
kubectl wait -n kube-system --for=condition=Available deployment/metrics-server --timeout=$(E2E_WAIT_FOR_START_TIMEOUT)
sleep 2
kubectl wait -n kube-system --for=condition=Ready pod -l k8s-app=metrics-server --timeout=$(E2E_WAIT_FOR_START_TIMEOUT)
kubectl wait -n kube-system --for=condition=ContainersReady pod -l k8s-app=metrics-server --timeout=$(E2E_WAIT_FOR_START_TIMEOUT)
sleep 3
$(MAKE) k8s/vald/deploy \
HELM_VALUES=$(ROOTDIR)/.github/helm/values/values-readreplica.yaml
sleep 20
kubectl wait --for=condition=Ready pod -l "app=$(AGENT_NGT_IMAGE)" --timeout=$(E2E_WAIT_FOR_START_TIMEOUT)
kubectl wait --for=condition=ContainersReady pod -l "app=$(AGENT_NGT_IMAGE)" --timeout=$(E2E_WAIT_FOR_START_TIMEOUT)
kubectl wait -n kube-system --for=condition=Ready pod -l app.kubernetes.io/name=metrics-server --timeout=$(E2E_WAIT_FOR_START_TIMEOUT)
kubectl wait -n kube-system --for=condition=ContainersReady pod -l app.kubernetes.io/name=metrics-server --timeout=$(E2E_WAIT_FOR_START_TIMEOUT)
$(MAKE) k8s/vald-readreplica/deploy \
HELM_VALUES=$(ROOTDIR)/.github/helm/values/values-readreplica.yaml
sleep 3
@@ -166,7 +160,7 @@
echo $$pod_name; \
$(MAKE) E2E_TARGET_POD_NAME=$$pod_name e2e/readreplica
$(MAKE) k8s/vald/delete
$(MAKE) minikube/delete
$(MAKE) kind/vs/stop

.PHONY: e2e/actions/run/stream/crud/skip
## run GitHub Actions E2E test (Stream CRUD with SkipExistsCheck = true)
49 changes: 26 additions & 23 deletions Makefile.d/k8s.mk
@@ -111,9 +111,9 @@ k8s/manifest/readreplica/update: \
mv $(TEMP_DIR)/vald-readreplica/templates $(ROOTDIR)/k8s/readreplica
rm -rf $(TEMP_DIR)

.PHONY: k8s/vald/deploy
## deploy vald sample cluster to k8s
k8s/vald/deploy:
.PHONY: k8s/vald/manifests
## generate vald manifest
k8s/vald/manifests:
helm template \
--values $(HELM_VALUES) \
--set defaults.image.tag=$(VERSION) \
@@ -131,6 +131,10 @@ k8s/vald/deploy:
--include-crds \
--output-dir $(TEMP_DIR) \
charts/vald

.PHONY: k8s/vald/deploy
## deploy vald sample cluster to k8s
k8s/vald/deploy: k8s/vald/manifests
@echo "Permitting error because there's some cases nothing to apply"
kubectl apply -f $(TEMP_DIR)/vald/templates/manager/index || true
kubectl apply -f $(TEMP_DIR)/vald/templates/agent || true
@@ -146,26 +150,13 @@ k8s/vald/deploy:
kubectl apply -f $(TEMP_DIR)/vald/templates/index/job/save || true
kubectl apply -f $(TEMP_DIR)/vald/templates/index/job/readreplica/rotate || true
kubectl apply -f $(TEMP_DIR)/vald/templates/index/operator || true
rm -rf $(TEMP_DIR)
kubectl get pods -o jsonpath="{.items[*].spec.containers[*].image}" | tr " " "\n"

@echo "manifest files location: $(TEMP_DIR)"

.PHONY: k8s/vald/delete
## delete vald sample cluster from k8s
k8s/vald/delete:
helm template \
--values $(HELM_VALUES) \
--set defaults.image.tag=$(VERSION) \
--set agent.image.repository=$(CRORG)/$(AGENT_NGT_IMAGE) \
--set agent.sidecar.image.repository=$(CRORG)/$(AGENT_SIDECAR_IMAGE) \
--set discoverer.image.repository=$(CRORG)/$(DISCOVERER_IMAGE) \
--set gateway.filter.image.repository=$(CRORG)/$(FILTER_GATEWAY_IMAGE) \
--set gateway.lb.image.repository=$(CRORG)/$(LB_GATEWAY_IMAGE) \
--set gateway.mirror.image.repository=$(CRORG)/$(MIRROR_GATEWAY_IMAGE) \
--set manager.index.image.repository=$(CRORG)/$(MANAGER_INDEX_IMAGE) \
--set manager.index.operator.image.repository=$(CRORG)/$(INDEX_OPERATOR_IMAGE) \
--include-crds \
--output-dir $(TEMP_DIR) \
charts/vald
k8s/vald/delete: k8s/vald/manifests
kubectl delete -f $(TEMP_DIR)/vald/templates/gateway/mirror || true
kubectl delete -f $(TEMP_DIR)/vald/templates/index/operator || true
kubectl delete -f $(TEMP_DIR)/vald/templates/index/job/readreplica/rotate || true
@@ -182,7 +173,6 @@ k8s/vald/delete:
kubectl delete -f $(TEMP_DIR)/vald/templates/agent/ngt || true
kubectl delete -f $(TEMP_DIR)/vald/templates/agent || true
kubectl delete -f $(TEMP_DIR)/vald/crds || true
rm -rf $(TEMP_DIR)

.PHONY: k8s/multi/vald/deploy
## deploy multiple vald sample clusters to k8s
@@ -246,7 +236,7 @@ k8s/vald-helm-operator/delete:

.PHONY: k8s/vald-readreplica/deploy
## deploy vald-readreplica to k8s
k8s/vald-readreplica/deploy:
k8s/vald-readreplica/deploy: k8s/vald/deploy
helm template \
--values $(HELM_VALUES) \
--set defaults.image.tag=$(VERSION) \
@@ -261,13 +251,26 @@
$(HELM_EXTRA_OPTIONS) \
--output-dir $(TEMP_DIR) \
charts/vald-readreplica
kubectl delete -f $(TEMP_DIR)/vald/templates/gateway || true
kubectl delete -f $(TEMP_DIR)/vald/templates/gateway/lb || true
kubectl get pods
kubectl wait --for=delete pod -l app=vald-lb-gateway --timeout=600s

kubectl apply -f $(TEMP_DIR)/vald-readreplica/templates
sleep 2
sleep 5

kubectl get pods
kubectl wait --for=condition=ready pod -l app=vald-agent --timeout=600s
kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=vald-readreplica --timeout=600s

kubectl apply -f $(TEMP_DIR)/vald/templates/gateway || true
kubectl apply -f $(TEMP_DIR)/vald/templates/gateway/lb || true

kubectl get pods

.PHONY: k8s/vald-readreplica/delete
## delete vald-helm-operator from k8s
k8s/vald-readreplica/delete:
k8s/vald-readreplica/delete: k8s/vald/delete
helm template \
--values $(HELM_VALUES) \
--set defaults.image.tag=$(VERSION) \
47 changes: 44 additions & 3 deletions Makefile.d/kind.mk
@@ -13,11 +13,14 @@
# See the License for the specific language governing permissions and
# limitations under the License.
#

SNAPSHOTTER_VERSION=v8.2.0

.PHONY: kind/install
## install KinD
kind/install: $(BINDIR)/kind

$(BINDIR)/kind:
$(BINDIR)/kind: $(BINDIR)/docker
mkdir -p $(BINDIR)
$(eval DARCH := $(subst aarch64,arm64,$(ARCH)))
curl -fsSL https://github.com/kubernetes-sigs/kind/releases/download/v$(KIND_VERSION)/kind-$(OS)-$(subst x86_64,amd64,$(shell echo $(DARCH) | tr '[:upper:]' '[:lower:]')) -o $(BINDIR)/kind
@@ -45,7 +48,6 @@ kind/restart: \
kind/stop \
kind/start


.PHONY: kind/cluster/start
## start kind (kubernetes in docker) multi node cluster
kind/cluster/start:
@@ -54,7 +56,6 @@ kind/cluster/start:
kubectl apply -f https://projectcontour.io/quickstart/operator.yaml
kubectl apply -f https://projectcontour.io/quickstart/contour-custom-resource.yaml


.PHONY: kind/cluster/stop
## stop kind (kubernetes in docker) multi node cluster
kind/cluster/stop:
@@ -70,3 +71,43 @@ kind/cluster/login:
kind/cluster/restart: \
kind/cluster/stop \
kind/cluster/start

.PHONY: kind/vs/start
## start kind (kubernetes in docker) cluster with volume snapshot
kind/vs/start:
sed -e 's/apiServerAddress: "127.0.0.1"/apiServerAddress: "$(shell grep host.docker.internal /etc/hosts | cut -f1)"/' $(ROOTDIR)/k8s/debug/kind/e2e.yaml | kind create cluster --name $(NAME)-vs --config -
🛠️ Refactor suggestion

Improve reliability of e2e.yaml modification.

The current sed command might fail silently if:

  • host.docker.internal is not in /etc/hosts
  • The apiServerAddress: "127.0.0.1" pattern isn't found in e2e.yaml

Consider adding error checking:

-sed -e 's/apiServerAddress: "127.0.0.1"/apiServerAddress: "$(shell grep host.docker.internal /etc/hosts | cut -f1)"/' $(ROOTDIR)/k8s/debug/kind/e2e.yaml | kind create cluster --name $(NAME)-vs --config - 
+HOST_IP := $(shell grep host.docker.internal /etc/hosts | cut -f1)
+ifeq ($(HOST_IP),)
+	$(error host.docker.internal not found in /etc/hosts)
+endif
+sed -e 's/apiServerAddress: "127.0.0.1"/apiServerAddress: "$(HOST_IP)"/' $(ROOTDIR)/k8s/debug/kind/e2e.yaml > /tmp/e2e_modified.yaml
+test -s /tmp/e2e_modified.yaml || (echo "Failed to modify e2e.yaml" && exit 1)
+kind create cluster --name $(NAME)-vs --config /tmp/e2e_modified.yaml
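The guarded lookup can also be sketched as plain shell, with the hosts file passed in as a parameter so the missing-entry case is easy to exercise (the file path and IP below are illustrative; the action greps the real /etc/hosts):

```shell
# First tab-separated field of the host.docker.internal line; empty when absent.
host_ip_from() {
  grep host.docker.internal "$1" | cut -f1
}

# Fabricated hosts file standing in for /etc/hosts.
hosts_file=$(mktemp)
printf '192.168.65.2\thost.docker.internal\n' > "$hosts_file"

ip=$(host_ip_from "$hosts_file")
if [ -z "$ip" ]; then
  echo "host.docker.internal not found" >&2
  exit 1
fi
echo "apiServerAddress: \"$ip\""
rm -f "$hosts_file"
```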

@make kind/vs/login

kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/$(SNAPSHOTTER_VERSION)/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/$(SNAPSHOTTER_VERSION)/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/$(SNAPSHOTTER_VERSION)/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/$(SNAPSHOTTER_VERSION)/deploy/kubernetes/snapshot-controller/rbac-snapshot-controller.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/$(SNAPSHOTTER_VERSION)/deploy/kubernetes/snapshot-controller/setup-snapshot-controller.yaml
Comment on lines +81 to +85
🛠️ Refactor suggestion

Add verification for CRD and controller installation.

The current implementation doesn't verify if the CRDs and controller are successfully installed. Consider adding checks:

+define verify_resource
+	@echo "Verifying $$1..."
+	@kubectl wait --for=condition=established --timeout=60s crd/$$1 || (echo "Failed to install $$1" && exit 1)
+endef

 kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/$(SNAPSHOTTER_VERSION)/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
+$(call verify_resource,volumesnapshotclasses.snapshot.storage.k8s.io)
 kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/$(SNAPSHOTTER_VERSION)/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml
+$(call verify_resource,volumesnapshotcontents.snapshot.storage.k8s.io)
 kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/$(SNAPSHOTTER_VERSION)/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml
+$(call verify_resource,volumesnapshots.snapshot.storage.k8s.io)
 kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/$(SNAPSHOTTER_VERSION)/deploy/kubernetes/snapshot-controller/rbac-snapshot-controller.yaml
 kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/$(SNAPSHOTTER_VERSION)/deploy/kubernetes/snapshot-controller/setup-snapshot-controller.yaml
+kubectl -n kube-system wait --for=condition=ready pod -l app=snapshot-controller --timeout=60s || (echo "Snapshot controller not ready" && exit 1)


mkdir -p $(TEMP_DIR)/csi-driver-hostpath \
&& curl -fsSL https://github.com/kubernetes-csi/csi-driver-host-path/archive/refs/tags/v1.15.0.tar.gz | tar zxf - -C $(TEMP_DIR)/csi-driver-hostpath --strip-components 1 \
&& cd $(TEMP_DIR)/csi-driver-hostpath \
&& deploy/kubernetes-latest/deploy.sh \
&& kubectl apply -f examples/csi-storageclass.yaml \
&& kubectl apply -f examples/csi-pvc.yaml \
&& rm -rf $(TEMP_DIR)/csi-driver-hostpath

@make k8s/metrics/metrics-server/deploy
helm upgrade --install --set args={--kubelet-insecure-tls} metrics-server metrics-server/metrics-server -n kube-system
sleep $(K8S_SLEEP_DURATION_FOR_WAIT_COMMAND)

Comment on lines +95 to +98
🛠️ Refactor suggestion

Add readiness check for metrics-server

The metrics-server deployment should include a readiness check.

 @make k8s/metrics/metrics-server/deploy
 helm upgrade --install --set args={--kubelet-insecure-tls} metrics-server metrics-server/metrics-server -n kube-system
-sleep $(K8S_SLEEP_DURATION_FOR_WAIT_COMMAND)
+kubectl wait --for=condition=available deployment/metrics-server -n kube-system --timeout=60s || (echo "metrics-server not ready" && exit 1)

.PHONY: kind/vs/stop
## stop kind (kubernetes in docker) cluster with volume snapshot
kind/vs/stop:
kind delete cluster --name $(NAME)-vs

.PHONY: kind/vs/login
## login command for kind (kubernetes in docker) cluster with volume snapshot
kind/vs/login:
kubectl cluster-info --context kind-$(NAME)-vs

.PHONY: kind/vs/restart
## restart kind (kubernetes in docker) cluster with volume snapshot
kind/vs/restart: \
kind/vs/stop \
kind/vs/start
4 changes: 3 additions & 1 deletion Makefile.d/minikube.mk
@@ -13,6 +13,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
#

.PHONY: minikube/install
minikube/install: $(BINDIR)/minikube

@@ -25,11 +26,12 @@ $(BINDIR)/minikube:
# Only use this for development related to Volume Snapshots. Usually k3d is faster.
.PHONY: minikube/start
minikube/start:
minikube start --force
minikube start
minikube addons enable volumesnapshots
minikube addons enable csi-hostpath-driver
minikube addons disable storage-provisioner
minikube addons disable default-storageclass
minikube addons enable metrics-server
kubectl patch storageclass csi-hostpath-sc -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

.PHONY: minikube/delete
8 changes: 8 additions & 0 deletions Makefile.d/tools.mk
@@ -272,3 +272,11 @@ $(BINDIR)/yq:
&& curl -fsSL https://github.com/mikefarah/yq/releases/download/$(YQ_VERSION)/yq_$(OS)_$(subst x86_64,amd64,$(shell echo $(DARCH) | tr '[:upper:]' '[:lower:]')) -o $(BINDIR)/yq \
&& chmod a+x $(BINDIR)/yq

.PHONY: docker-cli/install
docker-cli/install: $(BINDIR)/docker

$(BINDIR)/docker: $(BINDIR)
curl -fsSL https://download.docker.com/linux/static/stable/$(shell uname -m)/docker-$(shell echo $(DOCKER_VERSION) | cut -c2-).tgz -o $(TEMP_DIR)/docker.tgz \
&& tar -xzvf $(TEMP_DIR)/docker.tgz -C $(TEMP_DIR) \
&& mv $(TEMP_DIR)/docker/docker $(BINDIR) \
&& rm -rf $(TEMP_DIR)/docker{.tgz,}
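The cleanup line relies on bash brace expansion: `docker{.tgz,}` names both the tarball and the extracted directory in one pattern. A quick check (invoking bash explicitly, since POSIX sh does not expand braces):

```shell
# Brace expansion (bash-only): {.tgz,} yields ".tgz" and the empty string.
expanded=$(bash -c 'echo /tmp/docker{.tgz,}')
echo "$expanded"
```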
4 changes: 4 additions & 0 deletions charts/vald/templates/_helpers.tpl
@@ -714,6 +714,10 @@ initContainers
{{- $agentReadinessPort := default $.Values.defaults.server_config.healths.readiness.port $.Values.agent.server_config.healths.readiness.port }}
{{- $agentReadinessPath := default $.Values.defaults.server_config.healths.readiness.readinessProbe.httpGet.path .readinessPath }}
until [ "$(wget --server-response --spider --quiet http://{{ $.Values.agent.name }}.{{ $.namespace }}.svc.cluster.local:{{ $agentReadinessPort }}{{ $agentReadinessPath }} 2>&1 | awk 'NR==1{print $2}')" == "200" ]; do
{{- else if eq .target "agent-readreplica" }}
{{- $agentReadReplicaReadinessPort := default $.Values.defaults.server_config.healths.readiness.port $.Values.agent.server_config.healths.readiness.port }}
{{- $agentReadReplicaReadinessPath := default $.Values.defaults.server_config.healths.readiness.readinessProbe.httpGet.path .readinessPath }}
until [ "$(wget --server-response --spider --quiet http://{{ $.Values.agent.readreplica.name }}-0.{{ $.namespace }}.svc.cluster.local:{{ $agentReadReplicaReadinessPort }}{{ $agentReadReplicaReadinessPath }} 2>&1 | awk 'NR==1{print $2}')" == "200" ]; do
{{- else if eq .target "gateway-lb" }}
{{- $lbGatewayReadinessPort := default $.Values.defaults.server_config.healths.readiness.port $.Values.gateway.lb.server_config.healths.readiness.port }}
{{- $lbGatewayReadinessPath := default $.Values.defaults.server_config.healths.readiness.readinessProbe.httpGet.path .readinessPath }}
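The readiness loops in this template parse the HTTP status code out of `wget --server-response` output with `awk 'NR==1{print $2}'`; the first response line carries the code. On a simulated response (the headers below are made up):

```shell
# Simulated first lines of `wget --server-response --spider` output.
response='  HTTP/1.1 200 OK
  Content-Type: text/html'

# NR==1 selects the status line; field 2 is the status code.
status=$(echo "$response" | awk 'NR==1{print $2}')
echo "$status"
```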