add_session_metadata process DB can grow to 20k+ entries, OOMing machine #42317

Open
fearful-symmetry opened this issue Jan 15, 2025 · 1 comment · May be fixed by #42398
Labels
Auditbeat, bug, needs_team (indicates that the issue/PR needs a Team:* label)

Comments

@fearful-symmetry
Contributor

We have at least one report of Auditbeat OOMing a machine with the add_session_metadata processor enabled.

After a bit of tinkering, I can reproduce this with the following config:

- module: auditd
  # Load audit rules from separate files. Same format as audit.rules(7).
  audit_rule_files: [ '${path.config}/audit.rules.d/*.conf' ]
  audit_rules: |
    -a exit,always -F arch=b64 -F euid=0 -S execve -k rootact
    -a exit,always -F arch=b32 -F euid=0 -S execve -k rootact
    -a always,exit -F arch=b64 -S connect -F a2=16 -F success=1 -F key=network_connect_4
    -a always,exit -F arch=b64 -F exe=/bin/bash -F success=1 -S connect -k "remote_shell"
    -a always,exit -F arch=b64 -F exe=/usr/bin/bash -F success=1 -S connect -k "remote_shell"
    -a always,exit -F arch=b64 -S exit_group
    -a always,exit -F arch=b64 -S setsid
    -a always,exit -F arch=b64 -S execve,execveat -k exec

processors:
  - add_session_metadata:
      backend: "procfs"

I instrumented the processor to dump the entire process DB used by the hostfs provider. Just running some SSH commands in a loop is enough to get the DB up to 30k+ entries within a few minutes, before the reaper cleans any of them up. Even then, the process count sitting in the DB is still 12k+ after a few minutes. On high-load systems, the real count is probably much higher.

I'm not entirely sure what's going on here, but there's a massive amount of log spam suggesting that there's something up with the PID values coming from auditd:

10:41:54 alexk@motmot auditbeat-8.17.1-linux-x86_64 ±|8.17 ✗|→ sudo grep -rn "get process info from proc" logs/ | wc -l
23433
10:49:03 alexk@motmot auditbeat-8.17.1-linux-x86_64 ±|8.17 ✗|→ sudo grep -rn "could not insert exit" logs/ | wc -l
4576

The majority of the processes in the database are also missing metadata, suggesting they're processes that failed a PID lookup:

10:53:57 alexk@motmot auditbeat-8.17.1-linux-x86_64 ±|8.17 ✗|→ cat /tmp/procdb.json | jq -c '.[] | .Argv' | wc -l
12051
10:57:15 alexk@motmot auditbeat-8.17.1-linux-x86_64 ±|8.17 ✗|→ cat /tmp/procdb.json | jq -c '.[] | .Argv' | grep -v null | wc -l
262
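
The same tally can be done without jq. This is only a sketch: it assumes the dump at /tmp/procdb.json is either a JSON array of process entries or an object keyed by some ID, and that each entry exposes an `Argv` field as in the jq filter above. The helper names (`load_dump`, `summarize_argv`) are made up for illustration and are not part of Auditbeat.

```python
import json
from typing import Any, Iterable, Tuple


def load_dump(path: str) -> Any:
    """Load the instrumented process DB dump (path and layout are assumptions)."""
    with open(path) as fh:
        return json.load(fh)


def iter_entries(db: Any) -> Iterable[dict]:
    """Yield entries from a dump that is either a JSON array or an object of entries."""
    if isinstance(db, dict):
        return db.values()
    return db


def summarize_argv(db: Any) -> Tuple[int, int]:
    """Return (total entries, entries whose Argv is present and not null)."""
    total = 0
    with_argv = 0
    for entry in iter_entries(db):
        total += 1
        # A missing key and an explicit null both count as "no metadata",
        # matching what `jq '.Argv'` would print for them.
        if entry.get("Argv") is not None:
            with_argv += 1
    return total, with_argv
```

Calling `summarize_argv(load_dump("/tmp/procdb.json"))` should give roughly the same two numbers as the jq pipelines above, which makes it easy to track the ratio over time while the SSH loop is running.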

I wonder if the values we expect to be PIDs/TGIDs at various points are just TIDs instead?
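
One way to test that hunch is to take an ID from an event and compare it against the `Tgid` field in /proc/<id>/status. On Linux, /proc/<tid> can be opened directly for a non-leader thread even though it does not show up in a directory listing of /proc, and its `Tgid` line will then differ from the ID that was looked up. The sketch below is a standalone check, not code from the processor, and the function names are hypothetical.

```python
import os
from typing import Optional


def parse_tgid(status_text: str) -> Optional[int]:
    """Extract the Tgid field from the contents of a /proc/<id>/status file."""
    for line in status_text.splitlines():
        if line.startswith("Tgid:"):
            return int(line.split()[1])
    return None


def is_thread_id(task_id: int, proc_root: str = "/proc") -> Optional[bool]:
    """Return True if task_id is a non-leader thread, False if it is a TGID,
    or None if the task is gone or its status file cannot be read."""
    try:
        with open(os.path.join(proc_root, str(task_id), "status")) as fh:
            tgid = parse_tgid(fh.read())
    except OSError:
        return None
    if tgid is None:
        return None
    # For a thread-group leader the looked-up ID equals Tgid;
    # for any other thread in the group it does not.
    return tgid != task_id
```

The check is racy for short-lived tasks such as the SSH commands in the repro, so a `None` result only means the task had already exited by the time of the lookup.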

@botelastic botelastic bot added the needs_team label Jan 15, 2025
@botelastic

botelastic bot commented Jan 15, 2025

This issue doesn't have a Team:<team> label.
