Integrating Shipshape with ai-on-gke: Helm Scan Integration (#918)
* Integrating Shipshape with ai-on-gke: Helm Scan Integration
This PR initiates the integration of Shipshape security scans into the ai-on-gke repository, starting with Phase 1: Helm Scan Onboarding.

Purpose:

This PR introduces a Cloud Build workflow to automatically perform Helm scans on the ai-on-gke repository using the Shipshape validation service. This initial integration focuses on scanning Helm charts for the iap and kuberay-tpu-webhook components. These components were selected because they are owned and fully controlled by the ai-on-gke team and do not require a cluster scan.

Implementation:

A new Cloud Build configuration file violation_scan_helm.yaml is added to trigger Helm scans on pull requests.
The workflow utilizes a Docker image from the validation-service-agent repository to execute the scans.
The scan targets all Helm charts within the repository.
Initial allowlist files are included to manage accepted policy exceptions.
The build will fail if any violations are found outside the allowlist.
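
The allowlist behavior described above can be sketched as follows. This is a minimal illustration in Python, not the validation service's actual logic: it assumes a violation counts as allowlisted when its policyName and resourceKey fields match an allowlist entry exactly, and the helper names (entry_key, load_allowlist, unexpected_violations) are hypothetical.

```python
import json
from pathlib import Path


def entry_key(entry):
    """Build a hashable identity from policyName plus the resourceKey fields."""
    rk = entry.get("resourceKey", {})
    return (
        entry.get("policyName"),
        rk.get("group", ""),
        rk.get("version", ""),
        rk.get("kind", ""),
        rk.get("namespace", ""),
        rk.get("name", ""),
    )


def load_allowlist(folder):
    """Collect keys from every *.json file under the allowlist folder."""
    keys = set()
    for path in Path(folder).rglob("*.json"):
        for entry in json.loads(path.read_text()):
            keys.add(entry_key(entry))
    return keys


def unexpected_violations(violations, allowed_keys):
    """Return the violations that are not covered by the allowlist."""
    return [v for v in violations if entry_key(v) not in allowed_keys]
```

Under these assumptions, a build would fail whenever unexpected_violations returns a non-empty list.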
Next Steps:

Phase 2: Identify and onboard focus components based on ownership established in collaboration with the ai-eco team.
Phase 3: Implement a component ignore feature and transition to a secure-by-default model for all remaining components.
Continuously update the allowlist in collaboration with the ai-eco team and establish a prioritized remediation strategy.
Track and report on success metrics, including the number of violations found, PRs blocked, and violations fixed.
Related Issues:

b/377714818
b/378933059
b/382726583
Future Considerations:

Integrate cluster scans for comprehensive security analysis.
Synchronize violations with the Shipshape dashboard for improved visualization and tracking.
Set up office hours with GKE Security experts for consultation and guidance on addressing violations.
This PR marks a significant step towards enhancing the security and compliance of the ai-on-gke project by proactively identifying and addressing potential vulnerabilities in Kubernetes configurations.

* Change to official repo
blackzlq authored Dec 19, 2024
1 parent 2c1c515 commit 450d0fa
Showing 10 changed files with 201 additions and 0 deletions.
21 changes: 21 additions & 0 deletions security_test/allowlist/category/helm/iap/defaultnamespace.json
@@ -0,0 +1,21 @@
[
{
"message": "Ingress \"iap-ingress\" is in the default namespace, which is not allowed.",
"policyName": "defaultnamespace",
"resourceKey": {
"group": "networking.k8s.io",
"kind": "Ingress",
"name": "iap-ingress",
"version": "v1"
}
},
{
"message": "Secret \"iap-secret\" is in the default namespace, which is not allowed.",
"policyName": "defaultnamespace",
"resourceKey": {
"kind": "Secret",
"name": "iap-secret",
"version": "v1"
}
}
]
@@ -0,0 +1,17 @@
[
{
"details": {
"@type": "type.googleapis.com/google.internal.kubernetes.security.validation.v1.ContainerDetails",
"containerName": "kuberay-tpu-webhook"
},
"message": "container \"kuberay-tpu-webhook\" in Deployment \"kuberay-tpu-webhook\" does not set allowPrivilegeEscalation: false in its securityContext. See go/gke-shipshape#allowprivilegeescalation for more details",
"policyName": "allowprivilegeescalation",
"resourceKey": {
"group": "apps",
"kind": "Deployment",
"name": "kuberay-tpu-webhook",
"namespace": "ray-system",
"version": "v1"
}
}
]
@@ -0,0 +1,17 @@
[
{
"details": {
"@type": "type.googleapis.com/google.internal.kubernetes.security.validation.v1.ContainerDetails",
"containerName": "kuberay-tpu-webhook"
},
"message": "container \"kuberay-tpu-webhook\" in Deployment \"kuberay-tpu-webhook\" does not drop all capabilities in its securityContext. See go/gke-shipshape#capabilities for more details",
"policyName": "capabilities",
"resourceKey": {
"group": "apps",
"kind": "Deployment",
"name": "kuberay-tpu-webhook",
"namespace": "ray-system",
"version": "v1"
}
}
]
@@ -0,0 +1,18 @@
[
{
"details": {
"@type": "type.googleapis.com/google.internal.kubernetes.security.validation.v1.ContainerDetails",
"containerName": "kuberay-tpu-webhook",
"image": "us-docker.pkg.dev/ai-on-gke/kuberay-tpu-webhook/tpu-webhook:v1.2.1-gke.0"
},
"message": "container \"kuberay-tpu-webhook\" in Deployment \"kuberay-tpu-webhook\" has an image \"us-docker.pkg.dev/ai-on-gke/kuberay-tpu-webhook/tpu-webhook:v1.2.1-gke.0\" with no digest; valid image format: image[:tag]@sha256:\u003cdigest\u003e",
"policyName": "imagedigest",
"resourceKey": {
"group": "apps",
"kind": "Deployment",
"name": "kuberay-tpu-webhook",
"namespace": "ray-system",
"version": "v1"
}
}
]
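
The imagedigest message above names the expected format, image[:tag]@sha256:<digest>. A rough way to check a reference against that format is sketched below. The regular expression is an assumption (it only looks for a trailing sha256 digest of 64 lowercase hex characters) and is not taken from the scanner; is_pinned_by_digest is a hypothetical helper name.

```python
import re

# Assumed pattern: "@sha256:" followed by 64 lowercase hex characters at the end.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned_by_digest(image):
    """Return True if the image reference ends with a sha256 digest."""
    return DIGEST_RE.search(image) is not None
```

With this check, the tag-only reference tpu-webhook:v1.2.1-gke.0 flagged above would be reported as unpinned, while a reference that ends in @sha256: followed by a full digest would pass.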
@@ -0,0 +1,18 @@
[
{
"details": {
"@type": "type.googleapis.com/google.internal.kubernetes.security.validation.v1.ContainerDetails",
"containerName": "kuberay-tpu-webhook",
"image": "us-docker.pkg.dev/ai-on-gke/kuberay-tpu-webhook/tpu-webhook:v1.2.1-gke.0"
},
"message": "container \"kuberay-tpu-webhook\" in Deployment \"kuberay-tpu-webhook\" has an image \"us-docker.pkg.dev/ai-on-gke/kuberay-tpu-webhook/tpu-webhook:v1.2.1-gke.0\" that does not have a valid digest.",
"policyName": "imagefreshness",
"resourceKey": {
"group": "apps",
"kind": "Deployment",
"name": "kuberay-tpu-webhook",
"namespace": "ray-system",
"version": "v1"
}
}
]
@@ -0,0 +1,18 @@
[
{
"details": {
"@type": "type.googleapis.com/google.internal.kubernetes.security.validation.v1.ContainerDetails",
"image": "us-docker.pkg.dev/ai-on-gke/kuberay-tpu-webhook/tpu-webhook:v1.2.1-gke.0",
"containerName": "kuberay-tpu-webhook"
},
"message": "container \"kuberay-tpu-webhook\" in Deployment \"kuberay-tpu-webhook\" has an image \"us-docker.pkg.dev/ai-on-gke/kuberay-tpu-webhook/tpu-webhook:v1.2.1-gke.0\" with an invalid path. See go/gke-shipshape#imagepath for valid image paths.",
"policyName": "imagepath",
"resourceKey": {
"group": "apps",
"kind": "Deployment",
"name": "kuberay-tpu-webhook",
"namespace": "ray-system",
"version": "v1"
}
}
]
@@ -0,0 +1,17 @@
[
{
"details": {
"@type": "type.googleapis.com/google.internal.kubernetes.security.validation.v1.ContainerDetails",
"containerName": "kuberay-tpu-webhook"
},
"message": "container \"kuberay-tpu-webhook\" in Deployment \"kuberay-tpu-webhook\" does not set readOnlyRootFilesystem: true in its securityContext. This setting is encouraged because it can prevent attackers from writing malicious binaries into runnable locations in the container filesystem.",
"policyName": "readonlyrootfs",
"resourceKey": {
"group": "apps",
"kind": "Deployment",
"name": "kuberay-tpu-webhook",
"namespace": "ray-system",
"version": "v1"
}
}
]
@@ -0,0 +1,17 @@
[
{
"details": {
"@type": "type.googleapis.com/google.internal.kubernetes.security.validation.v1.ContainerDetails",
"containerName": "kuberay-tpu-webhook"
},
"message": "container \"kuberay-tpu-webhook\" in Deployment \"kuberay-tpu-webhook\" is running as root. Update the container to run as non-root. See go/gke-shipshape#rootless for more details",
"policyName": "rootless",
"resourceKey": {
"group": "apps",
"kind": "Deployment",
"name": "kuberay-tpu-webhook",
"namespace": "ray-system",
"version": "v1"
}
}
]
@@ -0,0 +1,13 @@
[
{
"message": "pod in Deployment \"kuberay-tpu-webhook\" must set securityContext.seccompProfile.type to value RuntimeDefault",
"policyName": "seccompprofile",
"resourceKey": {
"group": "apps",
"kind": "Deployment",
"name": "kuberay-tpu-webhook",
"namespace": "ray-system",
"version": "v1"
}
}
]
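
Several of the allowlisted kuberay-tpu-webhook findings above point at specific securityContext fields. The sketch below restates those messages as checks over a pod spec represented as a Python dict. It is an illustration only: the function name missing_hardening is hypothetical, the real policies may check more than this, and treating runAsNonRoot as sufficient for the rootless policy is an assumption.

```python
def missing_hardening(pod_spec):
    """Return policy names whose settings appear absent from a pod spec dict."""
    findings = []
    pod_sc = pod_spec.get("securityContext", {})
    if pod_sc.get("seccompProfile", {}).get("type") != "RuntimeDefault":
        findings.append("seccompprofile")
    for container in pod_spec.get("containers", []):
        sc = container.get("securityContext", {})
        if sc.get("allowPrivilegeEscalation") is not False:
            findings.append("allowprivilegeescalation")
        if "ALL" not in sc.get("capabilities", {}).get("drop", []):
            findings.append("capabilities")
        if sc.get("readOnlyRootFilesystem") is not True:
            findings.append("readonlyrootfs")
        # Assumption: runAsNonRoot at container or pod level satisfies "rootless".
        if not (sc.get("runAsNonRoot") or pod_sc.get("runAsNonRoot")):
            findings.append("rootless")
    return findings


# A pod spec that would pass every check in this sketch.
hardened = {
    "securityContext": {
        "runAsNonRoot": True,
        "seccompProfile": {"type": "RuntimeDefault"},
    },
    "containers": [
        {
            "name": "kuberay-tpu-webhook",
            "securityContext": {
                "allowPrivilegeEscalation": False,
                "capabilities": {"drop": ["ALL"]},
                "readOnlyRootFilesystem": True,
            },
        }
    ],
}
```

Remediating the Deployment along these lines would let the corresponding allowlist entries be removed over time.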
45 changes: 45 additions & 0 deletions violation_scan_helm.yaml
@@ -0,0 +1,45 @@
substitutions:
  _IMAGE: us-docker.pkg.dev/k8ssecurityvalidation-agent/k8ssecurityvalidation-agent/k8ssecurityvalidation-agent@sha256:7eaedb4153841b814e6b5367e63214318cb3318f902427b9214474e1256d0b37

steps:
- name: 'ubuntu'
  id: 'Copy Folder Locally'
  entrypoint: 'bash'
  args:
  - '-c'
  - |
    mkdir -p security_test/scan_target/ && find . -mindepth 1 -maxdepth 1 -type d ! -name "security_test" -exec cp -r {} security_test/scan_target/ \;
- name: 'ubuntu'
  id: 'Copy metadata'
  entrypoint: 'bash'
  args:
  - '-c'
  - |
    mkdir -p /workspace/security_test/scan_target
    # Exclude /workspace/security_test from the copy to avoid recursive issue
    find . -mindepth 1 -maxdepth 1 ! -path "./security_test" -exec cp -r {} /workspace/security_test/scan_target/ \;
    chown -R 65532:65532 /workspace/security_test/scan_target
- name: 'gcr.io/cloud-builders/docker'
  id: 'Run Docker Image'
  args:
  - 'run'
  - '--network=cloudbuild'
  - '--rm'
  - '-v'
  - '/workspace/security_test/allowlist:/workspace/security_test/allowlist'
  - '-v'
  - '/workspace/security_test/scan_target:/workspace/security_test/scan_target'
  - '${_IMAGE}'
  - '--mode=helm'
  - '--allowlist_folder=/workspace/security_test/allowlist'
  - '--scan_path=/workspace/security_test/scan_target'
  - '--max_wait_duration=60'


# Fail the build if there are any violations
timeout: '12000s'

options:
  logging: CLOUD_LOGGING_ONLY
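
For local experimentation, the copy steps in this config can be approximated in Python. The sketch below mirrors the find/cp logic of the 'Copy metadata' step: every top-level entry except security_test is copied into security_test/scan_target. The function name stage_scan_target is hypothetical, the chown to 65532:65532 is intentionally left out, and the sketch assumes Python 3.8 or later for dirs_exist_ok.

```python
import shutil
from pathlib import Path


def stage_scan_target(repo_root, excluded="security_test"):
    """Copy every top-level entry except `excluded` into <excluded>/scan_target."""
    root = Path(repo_root)
    target = root / excluded / "scan_target"
    target.mkdir(parents=True, exist_ok=True)
    for entry in root.iterdir():
        if entry.name == excluded:
            continue  # skipping this avoids copying scan_target into itself
        dest = target / entry.name
        if entry.is_dir():
            shutil.copytree(entry, dest, dirs_exist_ok=True)
        else:
            shutil.copy2(entry, dest)
    return target
```

Excluding the security_test directory matters because the destination lives inside it; copying it would nest the scan target within itself.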
