Kubespot (AWS)

AWS EKS Setup for PCI-DSS, SOC2, HIPAA

Kubespot is AWS EKS customized to add security postures around SOC2, HIPAA, and PCI compliance. It is distributed as an open source Terraform module, allowing you to run it within your own AWS account without lock-in. Kubespot has been developed over half a decade, evolving with the AWS EKS distribution and, before that, kops. It is in use at multiple startups that have scaled from a couple of founders in an apartment to billion-dollar unicorns; Kubespot let them meet the technical requirements for compliance while continuing to deploy software quickly.

Kubespot is a light wrapper around AWS EKS. The primary changes included in Kubespot are:

  • Locked down with security groups, private subnets, and other compliance-related requirements.
  • Locked-down RDS and ElastiCache, if needed.
  • All requests pass through a single Load Balancer, which reduces costs.
  • KEDA is used for scaling on event metrics such as queue sizes, user requests, CPU, memory, or anything else KEDA supports (see the sketch after this list).
  • Karpenter is used for node autoscaling.
  • Instances are locked down with encryption, and a regular node cycle rate is set.
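
As an illustration of the KEDA-based scaling mentioned above, a ScaledObject ties a workload to one or more triggers. This is a minimal sketch, assuming a Deployment named web already exists in the target namespace; the CPU trigger is just one example of the event sources KEDA supports:

kubectl apply -f - <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: web-scaler
spec:
  scaleTargetRef:
    name: web           # Deployment to scale (hypothetical example)
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: cpu
      metricType: Utilization
      metadata:
        value: "60"     # target average CPU utilization in percent
EOF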

Tools & Setup

brew install kubectl kubernetes-helm awscli terraform

Cluster Usage

If the infrastructure uses the opsZero infrastructure-as-code template, you can access the resources as follows.

Add your IAM credentials to ~/.aws/credentials:

[profile_name]
aws_access_key_id=<access_key>
aws_secret_access_key=<secret_key>
region=us-west-2

Then generate the kubeconfig for the environment:

cd environments/<nameofenv>
make kubeconfig
export KUBECONFIG=./kubeconfig # add to a .zshrc
kubectl get pods
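
If the profile above is not your default, export AWS_PROFILE before running make kubeconfig so the AWS CLI and the Makefile targets that call it pick it up. A minimal sketch, assuming the profile name profile_name from the example above:

export AWS_PROFILE=profile_name
aws sts get-caller-identity   # confirm the credentials resolve to the expected account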

Autoscaler

Kubespot uses Karpenter as the default autoscaler. To configure it, create a karpenter.yml file with a NodePool and EC2NodeClass like the ones below and apply it:

kubectl apply -f karpenter.yml

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: "karpenter.k8s.aws/instance-category"
          operator: In
          values: ["t", "c", "m"]
        - key: "kubernetes.io/arch"
          operator: In
          values: ["amd64"]
        - key: "karpenter.k8s.aws/instance-cpu"
          operator: In
          values: ["1", "2", "4", "8", "16"]
        - key: "karpenter.k8s.aws/instance-hypervisor"
          operator: In
          values: ["nitro"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
      nodeClassRef:
        name: default
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: 2h # maximum node lifetime before the node is recycled (the upstream example default is 720h = 30 * 24h)
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: Bottlerocket # or AL2 for Amazon Linux 2
  role: "Karpenter-opszero" # Karpenter node IAM role; replace "opszero" with the name of your cluster
  subnetSelectorTerms:
    - tags:
        Name: opszero-public
  securityGroupSelectorTerms:
    - tags:
        Name: eks-cluster-sg-opszero-1249901478
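
Once applied, you can confirm that Karpenter accepted the NodePool and is launching capacity. A minimal check, assuming Karpenter runs in the karpenter namespace and carries the standard Helm chart labels (adjust if Kubespot installs it elsewhere):

kubectl get nodepools,ec2nodeclasses
kubectl get nodeclaims                 # one NodeClaim per node Karpenter has launched
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter --tail=50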

Knative

brew install knative/client/kn
brew tap knative-extensions/kn-plugins

kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.13.1/serving-crds.yaml
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.13.1/serving-core.yaml
kubectl apply -f https://github.com/knative/net-kourier/releases/download/knative-v1.13.0/kourier.yaml

kubectl patch configmap/config-network --namespace knative-serving --type merge --patch '{"data":{"ingress-class":"kourier.ingress.networking.knative.dev"}}'
kubectl patch configmap/config-domain --namespace knative-serving --type merge --patch '{"data":{"fn.opszero.com":""}}'
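
The config-domain patch above makes fn.opszero.com the default domain for Knative services, so a wildcard DNS record for *.fn.opszero.com needs to point at Kourier's load balancer. A quick way to look up the address, assuming the kourier-system namespace created by the release manifest:

kubectl get svc kourier -n kourier-system \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'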

kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.13.1/serving-hpa.yaml
kubectl apply -f https://github.com/knative/net-certmanager/releases/download/knative-v1.13.0/release.yaml

kubectl edit configmap config-network -n knative-serving
# Enable TLS by setting the following keys under data:
#   external-domain-tls: Enabled
#   http-protocol: Redirected

# Optionally provision wildcard certificates for selected namespaces by adding
# the namespace-wildcard-cert-selector key under data:
kubectl edit --namespace knative-serving configmap config-network

namespace-wildcard-cert-selector:
  matchExpressions:
    - key: "kubernetes.io/metadata.name"
      operator: "In"
      values: ["my-namespace", "my-other-namespace"]


# Point net-certmanager at the ClusterIssuer defined below:
kubectl edit configmap config-certmanager -n knative-serving

# apiVersion: v1
# kind: ConfigMap
# metadata:
#   name: config-certmanager
#   namespace: knative-serving
#   labels:
#     networking.knative.dev/certificate-provider: cert-manager
# data:
#   issuerRef: |
#     kind: ClusterIssuer
#     name: letsencrypt-http01-issuer

Apply the following ClusterIssuer:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-http01-issuer
spec:
  acme:
    privateKeySecretRef:
      name: letsencrypt
    server: https://acme-v02.api.letsencrypt.org/directory
    solvers:
    - http01:
        ingress:
          class: kourier.ingress.networking.knative.dev
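
With the issuer in place, a test service is an easy way to verify the whole chain (routing, autoscaling, and certificates). This is a minimal sketch; the service name hello, the sample helloworld-go image from the Knative docs, and the default namespace are all assumptions:

kn service create hello \
  --image ghcr.io/knative/helloworld-go:latest \
  --env TARGET=Kubespot

kn service describe hello              # shows the generated URL and readiness
kubectl get certificates -n default    # cert-manager Certificate created for the route
curl https://hello.default.fn.opszero.com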

Cluster Setup

# Create the service-linked role required before EC2 Spot Instances can be requested (once per account)
aws iam create-service-linked-role --aws-service-name spot.amazonaws.com
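
If the role already exists, the command typically fails with an InvalidInput error, which is safe to ignore. To confirm the role is in place (AWSServiceRoleForEC2Spot is the name AWS assigns to this service-linked role):

aws iam get-role --role-name AWSServiceRoleForEC2Spot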

CIS Kubernetes Benchmark

Note: PodSecurityPolicy (PSP) is deprecated, and the Pod Security admission controller is the new standard. The CIS Benchmark still references PSP, so we have converted the PSP recommendations to their equivalents under the new standard.
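
For reference, the Pod Security admission controller is enabled per namespace with labels rather than PSP objects. A minimal sketch of the restricted profile; the namespace name myapp is only an example:

kubectl label namespace myapp \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/warn=restricted \
  pod-security.kubernetes.io/audit=restricted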

| Control | Recommendation | Level | Status | Description |
| --- | --- | --- | --- | --- |
| 1 | Control Plane Components | | | |
| 2 | Control Plane Configuration | | | |
| 2.1 | Logging | | | |
| 2.1.1 | Enable audit logs | L1 | Active | cluster_logging is configured |
| 3 | Worker Nodes | | | |
| 3.1 | Worker Node Configuration Files | | | |
| 3.1.1 | Ensure that the kubeconfig file permissions are set to 644 or more restrictive | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.1.2 | Ensure that the kubelet kubeconfig file ownership is set to root:root | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.1.3 | Ensure that the kubelet configuration file has permissions set to 644 or more restrictive | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.1.4 | Ensure that the kubelet configuration file ownership is set to root:root | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2 | Kubelet | | | |
| 3.2.1 | Ensure that the Anonymous Auth is Not Enabled | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.2 | Ensure that the --authorization-mode argument is not set to AlwaysAllow | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.3 | Ensure that a Client CA File is Configured | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.4 | Ensure that the --read-only-port is disabled | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.5 | Ensure that the --streaming-connection-idle-timeout argument is not set to 0 | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.6 | Ensure that the --protect-kernel-defaults argument is set to true | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.7 | Ensure that the --make-iptables-util-chains argument is set to true | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.8 | Ensure that the --hostname-override argument is not set | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.9 | Ensure that the --eventRecordQPS argument is set to 0 or a level which ensures appropriate event capture | L2 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.10 | Ensure that the --rotate-certificates argument is not present or is set to true | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.2.11 | Ensure that the RotateKubeletServerCertificate argument is set to true | L1 | Won't Fix | Use NodeGroups or Fargate |
| 3.3 | Container Optimized OS | | | |
| 3.3.1 | Prefer using a container-optimized OS when possible | L2 | Active | Bottlerocket container OS is used |
| 4 | Policies | | | |
| 4.1 | RBAC and Service Accounts | | | |
| 4.1.1 | Ensure that the cluster-admin role is only used where required | L1 | Active | Default Configuration |
| 4.1.2 | Minimize access to secrets | L1 | Active | iam_roles pass limited RBAC |
| 4.1.3 | Minimize wildcard use in Roles and ClusterRoles | L1 | Manual | terraform-kubernetes-rbac Set role |
| 4.1.4 | Minimize access to create pods | L1 | Manual | terraform-kubernetes-rbac Limit role with pod create |
| 4.1.5 | Ensure that default service accounts are not actively used | L1 | Manual | kubectl patch serviceaccount default -p $'automountServiceAccountToken: false' |
| 4.1.6 | Ensure that Service Account Tokens are only mounted where necessary | L1 | Active | tiphys Default set to false |
| 4.1.7 | Avoid use of system:masters group | L1 | Active | Must manually add users and roles to system:masters |
| 4.1.8 | Limit use of the Bind, Impersonate and Escalate permissions in the Kubernetes cluster | L1 | Manual | Limit users with system:masters role |
| 4.2 | Pod Security Policies | | | |
| 4.2.1 | Minimize the admission of privileged containers | L1 | Active | tiphys defaultSecurityContext.allowPrivilegeEscalation=false |
| 4.2.2 | Minimize the admission of containers wishing to share the host process ID namespace | L1 | Active | tiphys hostPID defaults to false |
| 4.2.3 | Minimize the admission of containers wishing to share the host IPC namespace | L1 | Active | tiphys hostIPC defaults to false |
| 4.2.4 | Minimize the admission of containers wishing to share the host network namespace | L1 | Active | tiphys hostNetwork defaults to false |
| 4.2.5 | Minimize the admission of containers with allowPrivilegeEscalation | L1 | Active | tiphys defaultSecurityContext.allowPrivilegeEscalation=false |
| 4.2.6 | Minimize the admission of root containers | L2 | Active | tiphys defaultSecurityContext.[runAsNonRoot=true,runAsUser=1001] |
| 4.2.7 | Minimize the admission of containers with added capabilities | L1 | Active | tiphys defaultSecurityContext.allowPrivilegeEscalation=false |
| 4.2.8 | Minimize the admission of containers with capabilities assigned | L1 | Active | tiphys defaultSecurityContext.capabilities.drop: ALL |
| 4.3 | CNI Plugin | | | |
| 4.3.1 | Ensure CNI plugin supports network policies | L1 | Manual | calico_enabled=true |
| 4.3.2 | Ensure that all Namespaces have Network Policies defined | L1 | Manual | Add Network Policy manually (example below) |
| 4.4 | Secrets Management | | | |
| 4.4.1 | Prefer using secrets as files over secrets as environment variables | L2 | Active | tiphys writes secrets to file |
| 4.4.2 | Consider external secret storage | L2 | Manual | Pull secrets using AWS Secrets Manager |
| 4.5 | Extensible Admission Control | | | |
| 4.6 | General Policies | | | |
| 4.6.1 | Create administrative boundaries between resources using namespaces | L1 | Manual | tiphys deploys on different namespaces |
| 4.6.2 | Apply Security Context to Your Pods and Containers | L2 | Active | tiphys defaultSecurityContext is set |
| 4.6.3 | The default namespace should not be used | L2 | Active | tiphys selects the namespace |
| 5 | Managed services | | | |
| 5.1 | Image Registry and Image Scanning | | | |
| 5.1.1 | Ensure Image Vulnerability Scanning using Amazon ECR image scanning or a third party provider | L1 | Active | Example |
| 5.1.2 | Minimize user access to Amazon ECR | L1 | Active | terraform-aws-mrmgr |
| 5.1.3 | Minimize cluster access to read-only for Amazon ECR | L1 | Active | terraform-aws-mrmgr with OIDC |
| 5.1.4 | Minimize Container Registries to only those approved | L2 | Active | terraform-aws-mrmgr |
| 5.2 | Identity and Access Management (IAM) | | | |
| 5.2.1 | Prefer using dedicated EKS Service Accounts | L1 | Active | terraform-aws-mrmgr with OIDC |
| 5.3 | AWS EKS Key Management Service | | | |
| 5.3.1 | Ensure Kubernetes Secrets are encrypted using Customer Master Keys (CMKs) managed in AWS KMS | L1 | Active | |
| 5.4 | Cluster Networking | | | |
| 5.4.1 | Restrict Access to the Control Plane Endpoint | L1 | Active | Set cluster_public_access_cidrs |
| 5.4.2 | Ensure clusters are created with Private Endpoint Enabled and Public Access Disabled | L2 | Active | Set cluster_private_access = true and cluster_public_access = false |
| 5.4.3 | Ensure clusters are created with Private Nodes | L1 | Active | Set enable_nat = true and nodes_in_public_subnet = false |
| 5.4.4 | Ensure Network Policy is Enabled and set as appropriate | L1 | Manual | calico_enabled=true |
| 5.4.5 | Encrypt traffic to HTTPS load balancers with TLS certificates | L2 | Active | terraform-helm-kubespot |
| 5.5 | Authentication and Authorization | | | |
| 5.5.1 | Manage Kubernetes RBAC users with AWS IAM Authenticator for Kubernetes | L2 | Active | iam_users use AWS IAM Authenticator |
| 5.6 | Other Cluster Configurations | | | |
| 5.6.1 | Consider Fargate for running untrusted workloads | L1 | Active | Set the fargate_selector |
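
For controls 4.3.2 and 5.4.4 above, network policies are left as a manual step. One common starting point, once Calico (calico_enabled=true) is providing policy enforcement, is a default-deny ingress policy per namespace; the namespace name myapp is only an example:

kubectl apply -n myapp -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}        # selects every pod in the namespace
  policyTypes:
    - Ingress            # deny all ingress unless another policy allows it
EOF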