Lab documentation for testing Tetragon Security Capabilities
Run the below eksctl command to create a minimal one-node EKS cluster:
eksctl create cluster --name nigel-eks-cluster --node-type t3.xlarge --nodes=1 --nodes-min=0 --nodes-max=3 --max-pods-per-node 58
This process can take a good 10 minutes to complete.
Alternatively, run the below script to create a one-node cluster from an eksctl config file. It defaults to the Amazon Linux 2 AMI; the Bottlerocket AMI, a minimal container-optimized OS that pairs well with eBPF tooling, can be enabled by swapping the commented amiFamily line.
#!/usr/bin/env bash
set -euxo pipefail

export CLUSTER_NAME="nigel-eks-cluster"
export AWS_DEFAULT_REGION="eu-west-1"
export KUBECONFIG="/tmp/kubeconfig-${CLUSTER_NAME}.conf"
export TAGS="Owner=C00292053@itcarlow.ie Environment=staging"

cat > "/tmp/eksctl-${CLUSTER_NAME}.yaml" << EOF
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: ${CLUSTER_NAME}
  region: ${AWS_DEFAULT_REGION}
  tags: &tags
$(echo "${TAGS}" | sed "s/ /\\n    /g; s/^/    /g; s/=/: /g")
iam:
  withOIDC: true
managedNodeGroups:
  - name: managed-ng-1
    amiFamily: AmazonLinux2
    # amiFamily: Bottlerocket
    instanceType: t3a.medium
    desiredCapacity: 1
    privateNetworking: true
    minSize: 0
    maxSize: 3
    volumeSize: 20
    volumeType: gp3
    maxPodsPerNode: 100
    tags:
      <<: *tags
      compliance:na:defender: eks-node
      # compliance:na:defender: bottlerocket
    volumeEncrypted: false
    disableIMDSv1: true
    taints:
      - key: "node.cilium.io/agent-not-ready"
        value: "true"
        effect: "NoSchedule"
EOF

eksctl create cluster --config-file "/tmp/eksctl-${CLUSTER_NAME}.yaml" --kubeconfig "${KUBECONFIG}"

sleep 30
cilium install --helm-set cluster.name="${CLUSTER_NAME}"
cilium status --wait
# cilium connectivity test

echo -e "*****\n export KUBECONFIG=${KUBECONFIG} \n*****"
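The sed pipeline inside the heredoc is the trickiest part of the script: it turns the space-separated ${TAGS} string into indented YAML key/value pairs. A standalone sketch of the same trick (the indentation widths here are illustrative):

```shell
#!/usr/bin/env bash
# Expand a space-separated "key=value" tag string into indented YAML,
# mirroring the heredoc substitution in the cluster script above.
TAGS="Owner=C00292053@itcarlow.ie Environment=staging"
# 1) space -> newline plus indent  2) indent the first line  3) "=" -> ": "
echo "${TAGS}" | sed "s/ /\\n    /g; s/^/    /g; s/=/: /g"
```

This prints each tag as an indented `key: value` line, ready to sit under the `tags:` anchor in the ClusterConfig.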
To install and deploy Tetragon, run the following commands:
helm repo add cilium https://helm.cilium.io
helm repo update
helm install tetragon cilium/tetragon -n kube-system
kubectl rollout status -n kube-system ds/tetragon -w
kubectl apply -f https://raw.githubusercontent.com/nigeldouglas-itcarlow/Tetragon-Lab/main/privileged-pod.yaml
Enable visibility of capability and namespace changes via the configmap by setting enable-process-cred and enable-process-ns to true:
kubectl edit cm -n kube-system tetragon-config
Restart the Tetragon daemonset to enforce those changes:
kubectl rollout restart -n kube-system ds/tetragon
After the restart, the Tetragon agent pod shows a fresh age (here, only 21 seconds):
Let's then open a new terminal window to monitor the events from the overly-permissive pod:
kubectl logs -n kube-system -l app.kubernetes.io/name=tetragon -c export-stdout -f | tetra getevents -o compact --namespace default --pod test-pod-1
We are immediately alerted that our pod holds elevated admin privileges (CAP_SYS_ADMIN).
In the first window, terminal shell into the overly-permissive pod:
kubectl exec -it nigel-app -- bash
We receive a flood of process activity after shelling into the pod. However, the data is not very useful in its current state.
TracingPolicy is a user-configurable Kubernetes custom resource that allows users to trace arbitrary events in the kernel and optionally define actions to take on a match. We can enable one by running the below command:
I started by killing any process that attempts to open a file in the /tmp directory:
kubectl apply -f https://raw.githubusercontent.com/nigeldouglas-itcarlow/Tetragon-Lab/main/sigkill-example.yaml
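For reference, a policy of roughly this shape produces that behaviour. This is an illustrative sketch, not necessarily the exact contents of sigkill-example.yaml; it assumes the policy hooks sys_openat and matches paths by prefix:

```yaml
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "tmp-read-file-sigkill"
spec:
  kprobes:
  - call: "sys_openat"        # open attempts go through openat on modern kernels
    syscall: true
    args:
    - index: 1                # second argument of openat: the file path
      type: "string"
    selectors:
    - matchArgs:
      - index: 1
        operator: "Prefix"    # match anything under /tmp/
        values:
        - "/tmp/"
      matchActions:
      - action: Sigkill       # kill the offending process on match
```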
If you don't need the policy that detects cat attempts on files in /tmp, you can delete it:
kubectl delete tracingpolicies tmp-read-file-sigkill
Output if the Policy is applied successfully:
Download the cryptomining xmrig scenario for now:
curl -LO https://raw.githubusercontent.com/nigeldouglas-itcarlow/Tetragon-Lab/main/TracingPolicies/mining-binary-sigkill.yaml
Confirm the file is formatted correctly:
cat mining-binary-sigkill.yaml
Apply the file:
kubectl apply -f mining-binary-sigkill.yaml
Download the xmrig binary from the official GitHub repository:
curl -OL https://github.com/xmrig/xmrig/releases/download/v6.16.4/xmrig-6.16.4-linux-static-x64.tar.gz
We can see the activity in real time. However, the only context is that the process started and then exited:
Extract the tarball to access the malicious files:
tar -xvf xmrig-6.16.4-linux-static-x64.tar.gz
Move to the directory holding the miner:
cd xmrig-6.16.4
For the purpose of testing, run chmod to set the setuid bit and trigger the SetUID/SetGID bit detection:
chmod u+s xmrig
Searching for files with the setuid/setgid bits set should also trigger the detection, even though nothing else has actually changed:
find / -perm /6000 -type f
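As a sanity check outside the cluster, this small sketch shows what `-perm /6000` actually matches: any file with the setuid (4000) or setgid (2000) bit set:

```shell
#!/usr/bin/env bash
set -eu
# Create a scratch file, set the setuid bit, and confirm that the
# same find expression used above now matches it.
tmpfile="$(mktemp)"
chmod u+s "${tmpfile}"
find "${tmpfile}" -perm /6000 -type f   # prints the path, so the bit is set
stat -c '%a' "${tmpfile}"               # octal mode now starts with 4 (setuid)
rm -f "${tmpfile}"
```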
Run the cryptominer (this likely won't show anything in your shell, as the process is killed immediately):
./xmrig --donate-level 8 -o xmr-us-east1.nanopool.org:14433 -u 422skia35WvF9mVq9Z9oCMRtoEunYQ5kHPvRqpH1rGCv1BzD5dUY4cD8wiCMp4KQEYLAN1BuawbUEJE99SNrTv9N9gf2TWC --tls --coin monero
After running ./xmrig, the sigkill action is performed on the binary:
To view TCP connect events, apply the example TCP connect TracingPolicy:
kubectl apply -f https://raw.githubusercontent.com/cilium/tetragon/main/examples/tracingpolicy/tcp-connect.yaml
We can see the miner connections were tracked in Tetragon:
If you no longer need network observability, the policy can be removed:
kubectl delete -f https://raw.githubusercontent.com/cilium/tetragon/main/examples/tracingpolicy/tcp-connect.yaml
The goal was to prevent users from shelling into an overly-permissive workload (CAP_SYS_ADMIN privileges). It works in the sense that I get an internal error, but it does not perform the sigkill action correctly.
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "kubectl-exec-sigkill"
spec:
  kprobes:
  - call: "sys_execve"
    syscall: true
    args:
    - index: 0
      type: "string"
    selectors:
    - matchArgs:
      # Indices 1 and 2 are matched below but never declared under args;
      # the argv entries of execve are also not plain strings, which
      # likely causes the internal error mentioned above.
      - index: 0
        operator: "Equal"
        values:
        - "kubectl"
      - index: 1
        operator: "Equal"
        values:
        - "exec"
      - index: 2
        operator: "Equal"
        values:
        - "-it"
      matchCapabilities:
      - type: Effective
        operator: In
        values:
        - "CAP_SYS_ADMIN"
    # This second "selectors" key duplicates the one above; in YAML the
    # later key wins, so only the bare Sigkill selector is loaded and
    # every execve gets killed, matching the behaviour described below.
    selectors:
    - matchActions:
      - action: Sigkill
In fact, I get inconsistent results. In one case, I could shell in immediately; sigkill was then performed on every command except moving between directories (cd is a shell builtin, so it never triggers execve). I was also unable to shell back into the running workload after I had exited it.
Run the miner again, this time using the Stratum protocol:
./xmrig -o stratum+tcp://xmr.pool.minergate.com:45700 -u lies@lies.lies -p x -t 2
I then re-applied the policy that kills any process attempting to open a file in the /tmp directory:
kubectl apply -f https://raw.githubusercontent.com/nigeldouglas-itcarlow/Tetragon-Lab/main/sigkill-example.yaml
This totally worked!!
I used this TracingPolicy as the foundation for my Tetragon SigKill rule:
https://gist.github.com/henrysachs/1975a8fe862216b4301698c8c3135e85
Naturally, I don't need this TracingPolicy in the real world, so I deleted it:
kubectl delete -f https://raw.githubusercontent.com/nigeldouglas-itcarlow/Tetragon-Lab/main/sigkill-example.yaml
I also wanted to grep for only cases where the SigKill was successful. The rest is just noise in testing:
kubectl logs -n kube-system -l app.kubernetes.io/name=tetragon -c export-stdout -f | tetra getevents -o compact --namespace default --pod nigel-app | grep exit
Install a suspicious networking tool like telnet:
yum install telnet telnet-server -y
If this fails (the original CentOS mirrors have been retired), point yum at the vault repositories instead:
cd /etc/yum.repos.d/
sed -i 's/mirrorlist/#mirrorlist/g' /etc/yum.repos.d/CentOS-*
sed -i 's|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-*
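To see what these substitutions do, here is the rewrite applied to a typical (illustrative) CentOS-Base.repo line:

```shell
#!/usr/bin/env bash
# Sample line as it appears in a stock CentOS repo file (single quotes
# keep the $releasever/$basearch placeholders literal).
line='#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/'
echo "${line}" | sed 's|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g'
# -> baseurl=http://vault.centos.org/centos/$releasever/os/$basearch/
```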
Update the installed packages via yum:
yum update -y
Now, try to install telnet and telnet-server from the package manager:
yum install telnet telnet-server -y
Also install bind-utils, which provides nslookup:
yum install bind-utils
In general, you can find which package provides the nslookup command using yum provides:
yum provides '*bin/nslookup'
Just to generate the detection, run nslookup or telnet:
nslookup ebpf.io
telnet
Let's also test tcpdump to prove the policy is working:
yum install tcpdump -y
tcpdump -D
tcpdump --version
tcpdump -nnSX port 443
Redeploy the overly-permissive pod and shell back into it:
kubectl apply -f https://raw.githubusercontent.com/nigeldouglas-itcarlow/Tetragon-Lab/main/privileged-pod.yaml
kubectl exec -it nigel-app -- bash
wget https://raw.githubusercontent.com/nigeldouglas-itcarlow/Tetragon-Lab/main/TracingPolicies/multi-binary-sigkill.yaml
cat multi-binary-sigkill.yaml
After you've run kubectl apply -f on the above TracingPolicy, we can exec back into the nigel-app pod. From here we can download and run the second miner binary, minerd.
curl -LO https://github.com/pooler/cpuminer/releases/download/v2.5.1/pooler-cpuminer-2.5.1-linux-x86_64.tar.gz
tar -xf pooler-cpuminer-2.5.1-linux-x86_64.tar.gz
./minerd -a scrypt -o stratum+tcp://pool.example.com:3333 -u johnsmith.worker1 -p mysecretpassword
Working on this new TracingPolicy - but I am unable to kill read/write activity on sensitive files listed here:
https://github.com/falcosecurity/rules/blob/c558fc7d2d02cc2c2edc968fe5770d544f1a9d55/rules/falco_rules.yaml#L307C1-L308C82
kubectl apply -f https://raw.githubusercontent.com/nigeldouglas-itcarlow/Tetragon-Lab/main/TracingPolicies/read-sensitive-files.yaml
In this tracepoint configuration for the raw_syscalls subsystem, the values "59" and "322" are the syscall IDs monitored on the sys_exit event. Every system call in Linux is assigned a unique numeric ID, and that ID is how the tracepoint identifies which call is being executed. By listing these two IDs, the policy captures and logs an event whenever either system call exits.
spec:
  tracepoints:
  - subsystem: raw_syscalls
    event: sys_exit
    # args: syscall id
    args:
    - index: 4
      type: int64
    selectors:
    - matchArgs:
      - index: 4
        operator: Equal
        values:
        - "59"
        - "322"
The args section declares the tracepoint argument to read: for raw_syscalls/sys_exit the syscall ID sits at index 4 and is an int64. Declaring it ensures the correct value is captured and compared when the selector matches.
Note that syscall numbers are architecture-specific rather than distribution-specific: on x86_64, ID 59 is execve and ID 322 is execveat, so this policy fires whenever a process finishes an execve or execveat call. Check the syscall table for your architecture (for example /usr/include/asm/unistd_64.h) to confirm the IDs in your environment.
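A quick shell sketch makes the ID-to-name mapping explicit (the names below are hard-coded for x86_64):

```shell
#!/usr/bin/env bash
# Hard-coded x86_64 syscall table entries for the two IDs in the policy.
for entry in "59:execve" "322:execveat"; do
  id="${entry%%:*}"     # text before the colon
  name="${entry#*:}"    # text after the colon
  echo "syscall ${id} -> ${name}"
done
# syscall 59 -> execve
# syscall 322 -> execveat
```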
Confirm that the AWS CLI is connected to your EKS cluster so you can work from the terminal:
aws configure --profile nigel-aws-profile
export AWS_PROFILE=nigel-aws-profile
aws sts get-caller-identity --profile nigel-aws-profile
aws eks update-kubeconfig --region eu-west-1 --name nigel-eks-cluster
Remember to scale the cluster down to 0 nodes, or delete the cluster, when it's not in use:
eksctl get cluster
eksctl get nodegroup --cluster nigel-eks-cluster
eksctl scale nodegroup --cluster nigel-eks-cluster --name ng-6194909f --nodes 0
eksctl delete cluster --name nigel-eks-cluster