update (#228)
* fixes runwhen/platform-core#1198

* update pdb format

* add gcp ingress first task

* add backend health check from annotation

* updates to gce-ingress cb

* template / gen rule updates

* tweak genrule

* debug template

* update name

* test rule

* fix template

* touchups

* add function check

* debug runtime

* failfast

* add debug task

* test none

* play with os path

* add auth to gcloud

* remove debug

* add timeout

* fix typo

* debug

* test

* switch to evaluate

* update

* update

* update

* x

* x

* update

* try backtick

* x

* x

* remove code ticks

* update timeout_seconds for deployment script

* test runbook update

* debug

* test

* test env var

* debug

* more debug

* add defaults

* remove invalid dict

* revert most changes - focus on simple tasks

* fix cb

* set dit details

* add new rules

* try newline edit

* try to escape it

* remove target service

* update issue next steps

* update gcloud

* update env

* gcloud target removal test 2

* target service removals (not all)

* update

* fix kind

* add space

* hardcode kind

* add readme
stewartshea authored Oct 30, 2023
1 parent 7418f6c commit e01d7d4
Showing 20 changed files with 433 additions and 46 deletions.
4 changes: 0 additions & 4 deletions codebundles/cli-test/runbook.robot
@@ -41,7 +41,6 @@ Run CLI and Parse Output For Issues
[Tags] Stdout Test Output Pods
${rsp}= RW.CLI.Run Cli
... cmd=kubectl get pods --context ${CONTEXT} -n ${NAMESPACE}
- ... target_service=${kubectl}
... env=${env}
... secret_file__kubeconfig=${kubeconfig}
# TODO: remove double slashes and find WYSIWYG method for regex passing
@@ -66,7 +65,6 @@ Run CLI and Parse Output For Issues

${rsp}= RW.CLI.Run Cli
... cmd=kubectl get pods --context ${CONTEXT} -n ${NAMESPACE} -ojson
- ... target_service=${kubectl}
... env=${env}
... secret_file__kubeconfig=${kubeconfig}
${rsp}= RW.CLI.Parse Cli Json Output
@@ -89,15 +87,13 @@ Exec Test
[Tags] Remote Exec Command Tags Workload Pod
${df}= RW.CLI.Run Cli
... cmd=df
- ... target_service=${kubectl}
... env=${env}
... secret_file__kubeconfig=${kubeconfig}
... run_in_workload_with_name=deploy/crashi
... optional_namespace=${NAMESPACE}
... optional_context=${CONTEXT}
${ls}= RW.CLI.Run Cli
... cmd=ls
- ... target_service=${kubectl}
... env=${env}
... secret_file__kubeconfig=${kubeconfig}
... run_in_workload_with_labels=app=crashi
1 change: 0 additions & 1 deletion codebundles/cmd-test/runbook.robot
@@ -18,7 +18,6 @@ Run CLI Command
[Tags] stdout test output pods
${rsp}= RW.CLI.Run Cli
... cmd=${CLI_COMMAND}
- ... target_service=${kubectl}
... env={"KUBECONFIG":"./${kubeconfig.key}"}
... secret_file__kubeconfig=${kubeconfig}
RW.Core.Add Pre To Report Command Stdout:\n${rsp.stdout}
9 changes: 3 additions & 6 deletions codebundles/curl-gmp-kong-ingress-inspection/runbook.robot
@@ -10,6 +10,7 @@ Library RW.Core
Library RW.CLI
Library RW.platform
Library RW.NextSteps
+ Library OperatingSystem

Suite Setup Suite Initialization

@@ -21,7 +22,6 @@ Check If Kong Ingress HTTP Error Rate Violates HTTP Error Threshold
${gmp_rsp}= RW.CLI.Run Cli
... cmd=gcloud auth activate-service-account --key-file=$GOOGLE_APPLICATION_CREDENTIALS && response=$(curl -s -d "query=rate(kong_http_requests_total{service='${INGRESS_SERVICE}',code=~'${HTTP_ERROR_CODES}'}[${TIME_SLICE}]) > ${HTTP_ERROR_RATE_THRESHOLD}" -H "Authorization: Bearer $(gcloud auth print-access-token)" 'https://monitoring.googleapis.com/v1/projects/runwhen-nonprod-sandbox/location/global/prometheus/api/v1/query') && echo "$response" | jq -e '.data.result | length > 0' && echo "$response" | jq -r '.data.result[] | "Route:" + .metric.route + " Service:" + .metric.service + " Kong Instance:" + .metric.instance + " HTTP Error Count:" + .value[1]' || echo "No HTTP Error threshold violations found for ${INGRESS_SERVICE}."
... render_in_commandlist=true
- ... target_service=${GCLOUD_SERVICE}
... env=${env}
... secret_file__gcp_credentials_json=${gcp_credentials_json}
${ingress_name}= RW.CLI.Run Cli
@@ -48,7 +48,6 @@ Check If Kong Ingress HTTP Error Rate Violates HTTP Error Threshold
... _line__raise_issue_if_contains=Route
${gmp_json}= RW.CLI.Run Cli
... cmd=gcloud auth activate-service-account --key-file=$GOOGLE_APPLICATION_CREDENTIALS && curl -s -d "query=rate(kong_http_requests_total{service='${INGRESS_SERVICE}',code=~'${HTTP_ERROR_CODES}'}[${TIME_SLICE}])" -H "Authorization: Bearer $(gcloud auth print-access-token)" 'https://monitoring.googleapis.com/v1/projects/runwhen-nonprod-sandbox/location/global/prometheus/api/v1/query' | jq .
- ... target_service=${GCLOUD_SERVICE}
... env=${env}
... secret_file__gcp_credentials_json=${gcp_credentials_json}
${history}= RW.CLI.Pop Shell History
@@ -62,7 +61,6 @@ Check If Kong Ingress HTTP Request Latency Violates Threshold
${gmp_rsp}= RW.CLI.Run Cli
... cmd=gcloud auth activate-service-account --key-file=$GOOGLE_APPLICATION_CREDENTIALS && response=$(curl -s -d "query=histogram_quantile(0.99, sum(rate(kong_request_latency_ms_bucket{service='${INGRESS_SERVICE}'}[${TIME_SLICE}])) by (le)) > ${REQUEST_LATENCY_THRESHOLD}" -H "Authorization: Bearer $(gcloud auth print-access-token)" 'https://monitoring.googleapis.com/v1/projects/runwhen-nonprod-sandbox/location/global/prometheus/api/v1/query') && echo "$response" | jq -e '.data.result | length > 0' && echo "$response" | jq -r '.data.result[] | "Service: ${INGRESS_SERVICE}" + " HTTP Request Latency(ms):" + .value[1]' || echo "No HTTP request latency threshold violations found for ${INGRESS_SERVICE}."
... render_in_commandlist=true
- ... target_service=${GCLOUD_SERVICE}
... env=${env}
... secret_file__gcp_credentials_json=${gcp_credentials_json}
${ingress_name}= RW.CLI.Run Cli
@@ -99,7 +97,6 @@ Check If Kong Ingress Controller Reports Upstream Errors
${gmp_healthchecks_off_rsp}= RW.CLI.Run Cli
... cmd=gcloud auth activate-service-account --key-file=$GOOGLE_APPLICATION_CREDENTIALS && response=$(curl -s -d "query=kong_upstream_target_health{upstream='${INGRESS_UPSTREAM}',state='healthchecks_off'} > 0" -H "Authorization: Bearer $(gcloud auth print-access-token)" 'https://monitoring.googleapis.com/v1/projects/runwhen-nonprod-sandbox/location/global/prometheus/api/v1/query') && echo "$response" | jq -e '.data.result | length > 0' && echo "$response" | jq -r '.data.result[] | "Service: ${INGRESS_UPSTREAM}" + " Healthchecks Disabled!"' || echo "${INGRESS_UPSTREAM} has healthchecks enabled."
... render_in_commandlist=true
- ... target_service=${GCLOUD_SERVICE}
... env=${env}
... secret_file__gcp_credentials_json=${gcp_credentials_json}
${next_steps}= RW.NextSteps.Suggest
@@ -118,7 +115,6 @@ Check If Kong Ingress Controller Reports Upstream Errors
${gmp_healthchecks_rsp}= RW.CLI.Run Cli
... cmd=gcloud auth activate-service-account --key-file=$GOOGLE_APPLICATION_CREDENTIALS && response=$(curl -s -d "query=kong_upstream_target_health{upstream='${INGRESS_UPSTREAM}',state=~'dns_error|unhealthy'} > 0" -H "Authorization: Bearer $(gcloud auth print-access-token)" 'https://monitoring.googleapis.com/v1/projects/runwhen-nonprod-sandbox/location/global/prometheus/api/v1/query') && echo "$response" | jq -e '.data.result | length > 0' && echo "$response" | jq -r '.data.result[] | "Issue detected with Service: ${INGRESS_UPSTREAM}" + " Healthcheck subsystem-state: " + .metric.subsystem + "-" + .metric.state + " Target: " + .metric.target' || echo "${INGRESS_UPSTREAM} is reported as healthy from the Kong ingress controller."
... render_in_commandlist=true
- ... target_service=${GCLOUD_SERVICE}
... env=${env}
... secret_file__gcp_credentials_json=${gcp_credentials_json}
${next_steps}= RW.NextSteps.Suggest
@@ -193,6 +189,7 @@ Suite Initialization
... description=The threshold in ms for request latency to be considered unhealthy.
... pattern=\w*
... example=100
+ ${OS_PATH}= Get Environment Variable PATH
Set Suite Variable ${GCLOUD_SERVICE} ${GCLOUD_SERVICE}
Set Suite Variable ${gcp_credentials_json} ${gcp_credentials_json}
Set Suite Variable ${GCP_PROJECT_ID} ${GCP_PROJECT_ID}
@@ -204,4 +201,4 @@ Suite Initialization
Set Suite Variable ${HTTP_ERROR_CODES} ${HTTP_ERROR_CODES}
Set Suite Variable
... ${env}
- ... {"CLOUDSDK_CORE_PROJECT":"${GCP_PROJECT_ID}","GOOGLE_APPLICATION_CREDENTIALS":"./${gcp_credentials_json.key}"}
+ ... {"CLOUDSDK_CORE_PROJECT":"${GCP_PROJECT_ID}","GOOGLE_APPLICATION_CREDENTIALS":"./${gcp_credentials_json.key}","PATH":"$PATH:${OS_PATH}"}
9 changes: 3 additions & 6 deletions codebundles/curl-gmp-nginx-ingress-inspection/runbook.robot
@@ -9,6 +9,7 @@ Library BuiltIn
Library RW.Core
Library RW.CLI
Library RW.platform
+ Library OperatingSystem

Suite Setup Suite Initialization

@@ -21,22 +22,18 @@ Fetch Nginx Ingress HTTP Errors From GMP And Perform Inspection On Results
${gmp_rsp}= RW.CLI.Run Cli
... cmd=gcloud auth activate-service-account --key-file=$GOOGLE_APPLICATION_CREDENTIALS && curl -d "query=rate(nginx_ingress_controller_requests{host='${INGRESS_HOST}', service='${INGRESS_SERVICE}', status=~'${ERROR_CODES}'}[${TIME_SLICE}]) > 0" -H "Authorization: Bearer $(gcloud auth print-access-token)" 'https://monitoring.googleapis.com/v1/projects/${GCP_PROJECT_ID}/location/global/prometheus/api/v1/query' | jq -r 'if .data.result[0] then "Host:" + .data.result[0].metric.host + " Ingress:" + .data.result[0].metric.ingress + " Namespace:" + .data.result[0].metric.exported_namespace + " Service:" + .data.result[0].metric.service else "" end'
... render_in_commandlist=true
- ... target_service=${GCLOUD_SERVICE}
... env=${env}
... secret_file__gcp_credentials_json=${gcp_credentials_json}
${gmp_json}= RW.CLI.Run Cli
... cmd=gcloud auth activate-service-account --key-file=$GOOGLE_APPLICATION_CREDENTIALS && curl -d "query=rate(nginx_ingress_controller_requests{host='${INGRESS_HOST}', service='${INGRESS_SERVICE}', status=~'${ERROR_CODES}'}[${TIME_SLICE}]) > 0" -H "Authorization: Bearer $(gcloud auth print-access-token)" 'https://monitoring.googleapis.com/v1/projects/${GCP_PROJECT_ID}/location/global/prometheus/api/v1/query'
- ... target_service=${GCLOUD_SERVICE}
... env=${env}
... secret_file__gcp_credentials_json=${gcp_credentials_json}
${k8s_ingress_details}= RW.CLI.Run Cli
... cmd=namespace="${NAMESPACE}"; context="${CONTEXT}"; ingress="${INGRESS_OBJECT_NAME}"; echo "Ingress: $ingress"; health_status="NA"; services=(); backend_services=$(${KUBERNETES_DISTRIBUTION_BINARY} get ingress "$ingress" -n "$namespace" --context "$context" -ojsonpath='{range .spec.rules[*].http.paths[*]}{.backend.service.name}{" "}{.backend.service.port.number}{"\\n"}{end}'); IFS=$'\\n'; for line in $backend_services; do service=$(echo "$line" | cut -d " " -f 1); port=$(echo "$line" | cut -d " " -f 2); if [ -n "$service" ] && [ -n "$port" ]; then echo "Backend Service: $service, Port: $port"; service_exists=$(${KUBERNETES_DISTRIBUTION_BINARY} get service "$service" -n "$namespace" --context "$context" -ojsonpath='{.metadata.name}'); if [ -z "$service_exists" ]; then health_status="Unhealthy"; echo "Validation: Service $service does not exist"; else endpoint_pods=$(${KUBERNETES_DISTRIBUTION_BINARY} get endpoints "$service" -n "$namespace" --context "$context" -ojsonpath='{range .subsets[*].addresses[*]}- Pod Name: {.targetRef.name}, Pod IP: {.ip}\\n{end}'); if [ -z "$endpoint_pods" ]; then health_status="Unhealthy"; echo "Validation: Endpoint for service $service does not have any pods"; else echo "Endpoint Pod:"; echo -e "$endpoint_pods"; for pod in $endpoint_pods; do if [[ $pod == *"- Pod Name: "* ]]; then pod_name="\${pod#*- Pod Name: }"; pod_name="\${pod_name%%,*}"; if [ -n "$pod_name" ]; then owner_kind=$(${KUBERNETES_DISTRIBUTION_BINARY} get pod "$pod_name" -n "$namespace" --context "$context" -o=jsonpath='{.metadata.ownerReferences[0].kind}'); if [ -n "$owner_kind" ]; then if [ "$owner_kind" = "StatefulSet" ] || [ "$owner_kind" = "DaemonSet" ]; then owner_info="$(${KUBERNETES_DISTRIBUTION_BINARY} get pod "$pod_name" -n "$namespace" --context "$context" -o=jsonpath='{.metadata.ownerReferences[0].name}') $owner_kind"; else replicaset=$(${KUBERNETES_DISTRIBUTION_BINARY} get pod "$pod_name" -n "$namespace" --context "$context" 
-o=jsonpath='{.metadata.ownerReferences[0].name}'); if [ -n "$replicaset" ]; then owner_kind=$(${KUBERNETES_DISTRIBUTION_BINARY} get replicaset "$replicaset" -n "$namespace" --context "$context" -o=jsonpath='{.metadata.ownerReferences[0].kind}'); owner_name=$(${KUBERNETES_DISTRIBUTION_BINARY} get replicaset "$replicaset" -n "$namespace" --context "$context" -o=jsonpath='{.metadata.ownerReferences[0].name}'); owner_info="$owner_kind:$owner_name"; fi; fi; fi; if [ -n "$owner_info" ]; then echo "Owner: $owner_info"; fi; fi; fi; done; health_status="Healthy"; fi; fi; services+=("$service"); fi; done; for service in "\${services[@]}"; do service_exists=$(${KUBERNETES_DISTRIBUTION_BINARY} get service "$service" -n "$namespace" --context "$context" -ojsonpath='{.metadata.name}'); if [ -z "$service_exists" ]; then health_status="Unhealthy"; echo "Validation: Service $service does not exist"; else endpoint_exists=$(${KUBERNETES_DISTRIBUTION_BINARY} get endpoints "$service" -n "$namespace" --context "$context" -ojsonpath='{.metadata.name}'); if [ -z "$endpoint_exists" ]; then health_status="Unhealthy"; echo "Validation: Endpoint for service $service does not exist"; fi; fi; done; if [ "$health_status" = "Unhealthy" ]; then echo "Health Status: $health_status"; echo "====================="; elif [ "$health_status" = "Healthy" ]; then echo "Health Status: $health_status"; fi; echo "------------"
- ... target_service=${kubectl}
... env=${env}
... secret_file__kubeconfig=${kubeconfig}
${ingress_owner}= RW.CLI.Run Cli
... cmd=echo "${k8s_ingress_details.stdout}" | grep 'Owner:[^ ]*' | awk -F': ' '{print $2}'
- ... target_service=${GCLOUD_SERVICE}
RW.CLI.Parse Cli Output By Line
... rsp=${gmp_rsp}
... set_severity_level=2
@@ -62,7 +59,6 @@ Find Ingress Owner and Service Health
[Tags] owner ingress service endpoints
${k8s_ingress_details}= RW.CLI.Run Cli
... cmd=namespace="${NAMESPACE}"; context="${CONTEXT}"; ingress="${INGRESS_OBJECT_NAME}"; echo "Ingress: $ingress"; health_status="NA"; services=(); backend_services=$(${KUBERNETES_DISTRIBUTION_BINARY} get ingress "$ingress" -n "$namespace" --context "$context" -ojsonpath='{range .spec.rules[*].http.paths[*]}{.backend.service.name}{" "}{.backend.service.port.number}{"\\n"}{end}'); IFS=$'\\n'; for line in $backend_services; do service=$(echo "$line" | cut -d " " -f 1); port=$(echo "$line" | cut -d " " -f 2); if [ -n "$service" ] && [ -n "$port" ]; then echo "Backend Service: $service, Port: $port"; service_exists=$(${KUBERNETES_DISTRIBUTION_BINARY} get service "$service" -n "$namespace" --context "$context" -ojsonpath='{.metadata.name}'); if [ -z "$service_exists" ]; then health_status="Unhealthy"; echo "Validation: Service $service does not exist"; else endpoint_pods=$(${KUBERNETES_DISTRIBUTION_BINARY} get endpoints "$service" -n "$namespace" --context "$context" -ojsonpath='{range .subsets[*].addresses[*]}- Pod Name: {.targetRef.name}, Pod IP: {.ip}\\n{end}'); if [ -z "$endpoint_pods" ]; then health_status="Unhealthy"; echo "Validation: Endpoint for service $service does not have any pods"; else echo "Endpoint Pod:"; echo -e "$endpoint_pods"; for pod in $endpoint_pods; do if [[ $pod == *"- Pod Name: "* ]]; then pod_name="\${pod#*- Pod Name: }"; pod_name="\${pod_name%%,*}"; if [ -n "$pod_name" ]; then owner_kind=$(${KUBERNETES_DISTRIBUTION_BINARY} get pod "$pod_name" -n "$namespace" --context "$context" -o=jsonpath='{.metadata.ownerReferences[0].kind}'); if [ -n "$owner_kind" ]; then if [ "$owner_kind" = "StatefulSet" ] || [ "$owner_kind" = "DaemonSet" ]; then owner_info="$(${KUBERNETES_DISTRIBUTION_BINARY} get pod "$pod_name" -n "$namespace" --context "$context" -o=jsonpath='{.metadata.ownerReferences[0].name}') $owner_kind"; else replicaset=$(${KUBERNETES_DISTRIBUTION_BINARY} get pod "$pod_name" -n "$namespace" --context "$context" 
-o=jsonpath='{.metadata.ownerReferences[0].name}'); if [ -n "$replicaset" ]; then owner_kind=$(${KUBERNETES_DISTRIBUTION_BINARY} get replicaset "$replicaset" -n "$namespace" --context "$context" -o=jsonpath='{.metadata.ownerReferences[0].kind}'); owner_name=$(${KUBERNETES_DISTRIBUTION_BINARY} get replicaset "$replicaset" -n "$namespace" --context "$context" -o=jsonpath='{.metadata.ownerReferences[0].name}'); owner_info="$owner_name $owner_kind"; fi; fi; fi; if [ -n "$owner_info" ]; then echo "Owner: $owner_info"; fi; fi; fi; done; health_status="Healthy"; fi; fi; services+=("$service"); fi; done; for service in "\${services[@]}"; do service_exists=$(${KUBERNETES_DISTRIBUTION_BINARY} get service "$service" -n "$namespace" --context "$context" -ojsonpath='{.metadata.name}'); if [ -z "$service_exists" ]; then health_status="Unhealthy"; echo "Validation: Service $service does not exist"; else endpoint_exists=$(${KUBERNETES_DISTRIBUTION_BINARY} get endpoints "$service" -n "$namespace" --context "$context" -ojsonpath='{.metadata.name}'); if [ -z "$endpoint_exists" ]; then health_status="Unhealthy"; echo "Validation: Endpoint for service $service does not exist"; fi; fi; done; if [ "$health_status" = "Unhealthy" ]; then echo "Health Status: $health_status"; echo "====================="; elif [ "$health_status" = "Healthy" ]; then echo "Health Status: $health_status"; fi; echo "------------"
- ... target_service=${kubectl}
... env=${env}
... secret_file__kubeconfig=${kubeconfig}
... render_in_commandlist=true
@@ -147,6 +143,7 @@ Suite Initialization
... pattern=\w*
... example=500
... default=500|501|502
+ ${OS_PATH}= Get Environment Variable PATH
Set Suite Variable ${kubeconfig} ${kubeconfig}
Set Suite Variable ${kubectl} ${kubectl}
Set Suite Variable ${KUBERNETES_DISTRIBUTION_BINARY} ${KUBERNETES_DISTRIBUTION_BINARY}
@@ -161,4 +158,4 @@ Suite Initialization
Set Suite Variable ${INGRESS_OBJECT_NAME} ${INGRESS_OBJECT_NAME}
Set Suite Variable
... ${env}
- ... {"CLOUDSDK_CORE_PROJECT":"${GCP_PROJECT_ID}","GOOGLE_APPLICATION_CREDENTIALS":"./${gcp_credentials_json.key}", "KUBECONFIG":"./${kubeconfig.key}"}
+ ... {"CLOUDSDK_CORE_PROJECT":"${GCP_PROJECT_ID}","GOOGLE_APPLICATION_CREDENTIALS":"./${gcp_credentials_json.key}", "KUBECONFIG":"./${kubeconfig.key}","PATH":"$PATH:${OS_PATH}"}
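
Review note on the `env` changes in both runbooks: the new value adds a `PATH` entry that joins a literal `$PATH` with the path Robot Framework reads through `Get Environment Variable`. The Python sketch below only illustrates the shape of that JSON string. `build_env` and its arguments are hypothetical names, and whether `RW.CLI.Run Cli` expands `$PATH` at run time is an assumption that this diff does not confirm.

```python
import json


def build_env(project_id: str, credentials_key: str, os_path: str) -> str:
    """Build a JSON env string shaped like the one set in Suite Initialization.

    Hypothetical helper for illustration; it is not part of the codebundle.
    """
    env = {
        "CLOUDSDK_CORE_PROJECT": project_id,
        "GOOGLE_APPLICATION_CREDENTIALS": f"./{credentials_key}",
        # "$PATH" stays a literal here; any expansion would happen later,
        # in whatever shell runs the command (assumption).
        "PATH": f"$PATH:{os_path}",
    }
    # Compact separators mirror the spacing used in the runbook string.
    return json.dumps(env, separators=(",", ":"))


if __name__ == "__main__":
    print(build_env("my-project", "gcp.json", "/usr/local/bin:/usr/bin"))
```

Building the string from a dict and serializing it avoids hand-written quoting mistakes when more keys are added later.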
@@ -27,5 +27,7 @@ spec:
value: '1.2'
- name: DESIRED_RESPONSE_CODE
value: '200'
+ - name: OWNER_DETAILS
+ value: "{'name':'{{match_resource.resource.metadata.name}}', 'kind':'Ingress','namespace':'{{match_resource.resource.metadata.namespace}}'}"
secretsProvided: []
servicesProvided: []
@@ -27,5 +27,7 @@ spec:
value: "1.2"
- name: DESIRED_RESPONSE_CODE
value: "200"
+ - name: OWNER_DETAILS
+ value: "{'name':'{{match_resource.resource.metadata.name}}', 'kind':'Service','namespace':'{{match_resource.resource.metadata.namespace}}'}"
secretsProvided: []
servicesProvided: []
@@ -27,5 +27,7 @@ spec:
value: '1.2'
- name: DESIRED_RESPONSE_CODE
value: '200'
+ - name: OWNER_DETAILS
+ value: "{'name':'{{match_resource.resource.metadata.name}}', 'kind':'Ingress','namespace':'{{match_resource.resource.metadata.namespace}}'}"
secretsProvided: []
servicesProvided: []
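
Review note on `OWNER_DETAILS`: the rendered value uses single quotes, so it is a Python-style dict literal rather than strict JSON, and `json.loads` would reject it. The sketch below assumes the consumer evaluates the string as a Python literal; the commit log mentions "switch to evaluate", but the consuming code is not part of this diff. `parse_owner_details` and the sample values are hypothetical.

```python
import ast

REQUIRED_KEYS = {"name", "kind", "namespace"}


def parse_owner_details(raw: str) -> dict:
    """Parse a rendered OWNER_DETAILS string into a dict.

    ast.literal_eval accepts only Python literals, so it is safer than
    eval() for template-rendered text.
    """
    details = ast.literal_eval(raw)
    if not isinstance(details, dict):
        raise ValueError("OWNER_DETAILS must render to a dict literal")
    missing = REQUIRED_KEYS - set(details)
    if missing:
        raise ValueError(f"OWNER_DETAILS is missing keys: {sorted(missing)}")
    return details


# Sample of what the template might render to (values are made up).
rendered = "{'name':'my-ingress', 'kind':'Ingress','namespace':'default'}"
owner = parse_owner_details(rendered)
print(owner["kind"], owner["namespace"], owner["name"])
```

If strict JSON were preferred instead, the template would need double quotes inside the value and matching escaping in the YAML.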