Integrating Shipshape with ai-on-gke: Helm Scan Integration #918
This PR initiates the integration of Shipshape security scans into the ai-on-gke repository, starting with Phase 1: Helm Scan Onboarding.
Purpose:
This PR introduces a Cloud Build workflow to automatically perform Helm scans on the ai-on-gke repository using the Shipshape validation service. This initial integration focuses on scanning Helm charts for the iap and kuberay-tpu-webhook components. These components were selected because they are owned and fully controlled by the ai-on-gke team and do not require a cluster scan.
Implementation:
A new Cloud Build configuration file violation_scan_helm.yaml is added to trigger Helm scans on pull requests.
The workflow utilizes a Docker image from the validation-service-agent repository to execute the scans.
The scan targets all Helm charts within the repository.
Initial allowlist files are included to manage accepted policy exceptions.
The build will fail if any violations are found outside the allowlist.
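The allowlist gate described above can be sketched as follows. This is a minimal illustration, not the Shipshape implementation: the `Violation` type, the `blocking_violations` and `exit_code` helpers, and the policy names are all hypothetical, and the real allowlist file format is not shown in this PR description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    """Hypothetical record of one policy violation in one Helm chart."""
    chart: str
    policy: str


def blocking_violations(found, allowlist):
    """Return the violations that are not covered by the allowlist."""
    allowed = set(allowlist)
    return [v for v in found if v not in allowed]


def exit_code(found, allowlist):
    """Return a non-zero code (failing the build) if any violation is not allowlisted."""
    return 1 if blocking_violations(found, allowlist) else 0


if __name__ == "__main__":
    # Hypothetical policy names, used only for illustration.
    found = [
        Violation("iap", "run-as-root"),
        Violation("kuberay-tpu-webhook", "host-network"),
    ]
    allowlist = [Violation("iap", "run-as-root")]
    for v in blocking_violations(found, allowlist):
        print(f"blocking: {v.chart}: {v.policy}")
    raise SystemExit(exit_code(found, allowlist))
```

Under these assumptions, an allowlisted violation is reported as accepted and does not affect the build, while any other violation produces a non-zero exit status, which a Cloud Build step treats as a failure.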
Next Steps:
Phase 2: Identify and onboard focus components based on ownership established in collaboration with the ai-eco team.
Phase 3: Implement a component ignore feature and transition to a secure-by-default model for all remaining components.
Continuously update the allowlist in collaboration with the ai-eco team and establish a prioritized remediation strategy.
Track and report on success metrics, including the number of violations found, PRs blocked, and violations fixed.
Related Issues:
b/377714818
b/378933059
b/382726583
Future Considerations:
Integrate cluster scans for comprehensive security analysis.
Synchronize violations with the Shipshape dashboard for improved visualization and tracking.
Set up office hours with GKE Security experts for consultation and guidance on addressing violations.
This PR is a first step toward improving the security and compliance of the ai-on-gke project by identifying potential vulnerabilities in its Kubernetes configurations at pull request time.