Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Solving overlapping syscall names issues #2443

Closed
jasondellaluce opened this issue Mar 10, 2023 · 10 comments
Closed

[BUG] Solving overlapping syscall names issues #2443

jasondellaluce opened this issue Mar 10, 2023 · 10 comments
Assignees
Labels
Milestone

Comments

@jasondellaluce
Copy link
Contributor

jasondellaluce commented Mar 10, 2023

This acts both as a bug report and as a solution proposal. Solving this problem inevitably involves some breaking changes in the Falco UX, so the goal here is to find an agreement on the direction we want to go for.

Problem

In Falco rules, events can be matches by type through the evt.type filter field, both in rules' conditions and outputs (e.g. evt.type=open, evt.type in (openat, close), etc). The string values extracted through event type with evt.type come right from libsinsp, libscap and the drivers, and as for today have two distinct origins:

  • For generic events only PPME_GENERIC_E and PPM_GENERIC_X (never supported by Falco as of the latest stable version, but to-be-supported in Falco 0.35+, and available to other libs consumers), the name information associated with the PPM_SC enumeration, which is our platform/architecture-agnostic representation of syscalls and linux-kernel events
  • For every other event, the event table containing all metadata about any events supported by the library, including the ones not related to the linux kernel (e.g. plugins, metadata enrichment, etc.)

In the libraries, we currently have an design flaw where some syscalls (each with their own distinct platform-agnostic PPM_SC representation) are mapped to generate the same event at the kernel-level in the drivers. The end result is that a given evt.type comparison would unexpectedly match more than one syscall. For example, accept4 and accept are mapped to both produce the same event, and so the filter evt.type=accept matches both accept4 and accept syscalls at runtime. This can be ambiguous and misleading.

This issue is tracked in detail in falcosecurity/libs#911.

Secondary problems

Since starting the development for the syscall selection feature (#2371, #2373), Falco started relying on the PPM_SC constants to configure the kernel drivers about which syscalls need to be collected at runtime. Considering the problem above, we have some secondary issues:

  • We had to introduce APIs in libsinsp for converting sets of syscalls names and event names to a set of PPM_SC (and a set of event codes to for evaluating the Falco rules at runtime). The ambiguity above causes the conversions to be flaky and to require some on-the-fly patches such as the ones introduced in cleanup(libsinsp): Extension of ppm sc API for corner cases (e.g. event_set_to_names and names_to_sc_set) libs#915. The workaround so far consists in using "event names" (related to the event table) instead of "syscalls names" (related to actual syscalls), which works for now but is dirty, ambiguous, and suboptimal for our libsinsp API stability goals.
  • Since the set of selected system calls is now selective based on the loaded rules, rulesets with conditions such as evt.type=accept would cause the selection of more than one event, thus losing control and granularity. This also happens when selecting rules through the newly-introduced base_syscalls set ([New Feature]: User input option base_syscalls overriding default sinsp state enforcement #2373).

Proposed solution

The discussions involving me, @Andreagit97, @FedeDP, and @incertum converged in two proposals:

  • Decouple the overlapping events and create ad-hoc events for all the duplicates (tracked in [Feature] implement new event pairs for a restricted set of syscalls libs#911). The ideal choice would be to change all of them in bulk within the same Falco/libs release. The proposed targeted release is Falco 0.35.0 (end of May).
  • Decouple the ambiguous the names <-> syscall/event codes conversion APIs in libsinsp to not be ambiguous, thus supporting distinctly:
    • "event names to syscall codes set" (event_names_to_sc_set(...)) (will be used by Falco)
    • "syscall names to syscall codes set" (sc_names_to_sc_set(...))
    • The reverse-conversions

Breaking changes

Solving this issue will involve a breaking change in the Falco rules. Considering the example of a rule with a condition containing evt.type = accept, we would now need to change it into evt.type in (accept, accept4), because after the fix the accept4 syscall would have its own event. Not doing so would cause accept4 syscall events to be ignored by the rules. This applies for all the events listed in falcosecurity/libs#911, however from my personal research accept seems to be the only used one in rulesets (including our default one).

In all cases, the migration plan consists only of changes such as evt.type = accept -> evt.type in (accept, accept4).

Action items

cc @falcosecurity/falco-maintainers @falcosecurity/libs-maintainers

I'm opening this issue for everyone's visibility and because we need to agree on:

  • Proceeding with fixing the issue
  • Deciding by when the fixes need to be applied in falcosecurity/libs

Looking forward to having feedback! 🤗 If there will be no contrary opinions on those points, we'll proceed with the proposed solution.

@incertum
Copy link
Contributor

incertum commented Mar 13, 2023

Intention is to re-iterate aspects Jason already mentioned as well as provide a different view / summary of falcosecurity/libs#911 in order to derive a more concrete / unambiguous plan.

2023-03-15: Edited the setuid, setuid32 Row based on community call conclusion.

syscall string names new rule syntax requirement (Falco >= 0.35.x) current event codes current syscall codes notes
accept, accept4 evt.type in (accept, accept4) PPME_SOCKET_ACCEPT_5_E, PPME_SOCKET_ACCEPT_5_X, PPME_SOCKET_ACCEPT4_5_E, PPME_SOCKET_ACCEPT4_5_X PPM_SC_ACCEPT, PPM_SC_ACCEPT4 Overlapping event name "accept" in Falco <= 0.34.x, but already unique events, can be easily corrected
inotify_init, inotify_init1 evt.type in (inotify_init, inotify_init1) PPME_SYSCALL_INOTIFY_INIT_E, PPME_SYSCALL_INOTIFY_INIT_X PPM_SC_INOTIFY_INIT, PPM_SC_INOTIFY_INIT1 Create a new PPME event for inotify_init1? Or re-use same event type but require rules to call exact syscall string name? Same for pipe, pipe2 or eventfd, eventfd2 or signalfd, signalfd4 pairs
umount, umount evt.type in (umount, umount2) PPME_SYSCALL_UMOUNT_E, PPME_SYSCALL_UMOUNT_X, PPME_SYSCALL_UMOUNT_1_E, PPME_SYSCALL_UMOUNT_1_X PPM_SC_UMOUNT, PPM_SC_UMOUNT2 Plan is to rename the event names, else no issues with new requirements similar to accept, accept4 case
init_module evt.type = init_module PPME_GENERIC_E, PPME_GENERIC_X PPM_SC_INIT_MODULE Falco <= 0.34.x could not tap into system calls that are internally tracked as generic events. Starting with Falco >= 0.35.x all system calls Falco supports can be used in rules. Is the plan to keep PPME_GENERIC_ for these cases or exploding out a unique event type for each generic system call?
setuid, setuid32 evt.type = setuid PPME_SYSCALL_SETUID_E, PPME_SYSCALL_SETUID_X PPM_SC_SETUID, PPM_SC_SETUID32 Same applies to all other cases such as fcntl64, fcntl or ugetrlimit, getrlimit etc. Difference here is just the architecture, we decided during the 2023-03-15 community call to keep this overlap for now.

Re syscall extraction from each rule (this means evt.type corresponds to a system call event) ... do we continue to use names_to_sc_set / sc_names_to_sc_set ppm sc API or do I understand correctly we want to cut over to event_names_to_sc_set?

@jasondellaluce
Copy link
Contributor Author

Thanks @incertum!

Is the plan to keep PPME_GENERIC_ for these cases or exploding out a unique event type for each generic system call?

I think removing PPME_GENERIC_ would be too much of a structural change for this effort, plus in my opinion having it it's the right approach to support all the syscalls that don't have a specific filler in our drivers.

do we continue to use names_to_sc_set / sc_names_to_sc_set ppm sc API or do I understand correctly we want to cut over to event_names_to_sc_set?

My proposal is for to rename the current names_to_sc_set into event_names_to_sc_set, because that's its actual semantics as for now, and keep using it in Falco (so no UX change from the Falco perspective). Then, I'm also proposing adding sc_names_to_sc_set into the libsinsp API for parity, but not using it in Falco.

@incertum
Copy link
Contributor

Is the plan to keep PPME_GENERIC_ for these cases or exploding out a unique event type for each generic system call?

I think removing PPME_GENERIC_ would be too much of a structural change for this effort, plus in my opinion having it it's the right approach to support all the syscalls that don't have a specific filler in our drivers.

Glad to hear we keep PPME_GENERIC_ thank you!

do we continue to use names_to_sc_set / sc_names_to_sc_set ppm sc API or do I understand correctly we want to cut over to event_names_to_sc_set?

My proposal is for to rename the current names_to_sc_set into event_names_to_sc_set, because that's its actual semantics as for now, and keep using it in Falco (so no UX change from the Falco perspective). Then, I'm also proposing adding sc_names_to_sc_set into the libsinsp API for parity, but not using it in Falco.

Works! Re implementation details: The new event_names_to_sc_set would still directly call sc_names_to_sc_set in order to resolve the rules evt.type that correspond to a system call? We would have to right? Because even if we fix all other syscall related event names (meaning they are unique and correspond to the actual system call string name as well) we continue to have the generic event type and the associated information loss.

@jasondellaluce
Copy link
Contributor Author

That's right, we'll still rely on the syscall names when dealing with generic events.

@Andreagit97
Copy link
Member

I'm late to the party...

I think removing PPME_GENERIC_ would be too much of a structural change for this effort, plus in my opinion having it it's the right approach to support all the syscalls that don't have a specific filler in our drivers.

Agree with @jasondellaluce here, removing PPME_GENERIC_ would mean implementing all the missing syscalls, so quite a huge effort :_(

Works! Re implementation details: The new event_names_to_sc_set would still directly call sc_names_to_sc_set in order to resolve the rules evt.type that correspond to a system call? We would have to right? Because even if we fix all other syscall related event names (meaning they are unique and correspond to the actual system call string name as well) we continue to have the generic event type and the associated information loss.

Yep the end logic should be exactly the same, I would change just the name since it is quite misleading.

Then, I'm also proposing adding sc_names_to_sc_set into the libsinsp API for parity, but not using it in Falco.

Yeah I like the idea of parity if we need to add just one specular API, it's ok, but I would avoid adding tons of new APIs just to have specular methods, I would implement new methods only when we have a real need

@incertum
Copy link
Contributor

incertum commented Mar 16, 2023

Community call summary 2023-03-15:

No objection, we proceed with this proposal. In addition, we clarified that these changes accomplish a smoother and clearer end user experience.

Tracking libs event types <-> syscall refactors:

  • accept, accept4
  • umount, umount2
  • pipe, pipe2
  • inotify_init, inotify_init1
  • eventfd, eventfd2
  • signalfd, signalfd4

!!! We decided to keep the overlap for cases like setuid, setuid32 (for now) as it is merely an architecture distinction, see also @Andreagit97 comment in falcosecurity/libs#911 (comment)

Additional Action Items:

  • Finalize ppm sc API with proposed minor changes to improve API clarity
  • Update default rules (accept, accept4 case ...)
  • Blog post + Falco 0.35 release note mention
  • Create stronger or new upstream rules, e.g. Linux Kernel Module Injection Detected should be augmented with init_module system call condition, and a new detection around suspicious umount usage could be a possibility

@leogr
Copy link
Member

leogr commented Mar 16, 2023

/milestone 0.35.0

@poiana poiana added this to the 0.35.0 milestone Mar 16, 2023
@Andreagit97 Andreagit97 self-assigned this Mar 20, 2023
incertum added a commit to incertum/rules that referenced this issue Mar 31, 2023
…names deprecated

A major refactor of Falco now exposes each syscall Falco's libs
supports to the end user :)

At the same time we deprecate overlapping syscall names,
see falcosecurity/falco#2443 (comment)

Official support starts with Falco 0.35.0

Co-authored-by: Jason Dellaluce <jasondellaluce@gmail.com>
Co-authored-by: Andrea Terzolo <andrea.terzolo@polito.it>
Co-authored-by: Federico Di Pierro <nierro92@gmail.com>
Signed-off-by: incertum <melissa.kilby.oss@gmail.com>
poiana pushed a commit to falcosecurity/rules that referenced this issue Apr 3, 2023
…names deprecated

A major refactor of Falco now exposes each syscall Falco's libs
supports to the end user :)

At the same time we deprecate overlapping syscall names,
see falcosecurity/falco#2443 (comment)

Official support starts with Falco 0.35.0

Co-authored-by: Jason Dellaluce <jasondellaluce@gmail.com>
Co-authored-by: Andrea Terzolo <andrea.terzolo@polito.it>
Co-authored-by: Federico Di Pierro <nierro92@gmail.com>
Signed-off-by: incertum <melissa.kilby.oss@gmail.com>
@incertum
Copy link
Contributor

incertum commented Apr 4, 2023

All items completed, @jasondellaluce feel free to close this issue.
Blogpost is scheduled as well -> tracked in community repo going forward.

@jasondellaluce
Copy link
Contributor Author

/close

🥳

@poiana poiana closed this as completed Apr 4, 2023
@poiana
Copy link
Contributor

poiana commented Apr 4, 2023

@jasondellaluce: Closing this issue.

In response to this:

/close

🥳

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants