Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(driver): remove some inconsistencies in our event tables #936

Merged
merged 5 commits into from
Mar 7, 2023

Conversation

Andreagit97
Copy link
Member

What type of PR is this?

/kind bug

Any specific area of the project related to this PR?

/area driver-kmod
/area driver-bpf
/area driver-modern-bpf

Does this PR require a change in the driver versions?

What this PR does / why we need it:

Working on #889 we discovered some inconsistencies in our tables, this PR tries to address them:

  • use specific instrumentation with some syscalls that were wrongly treated as generic. We already had events and fillers for these syscalls but for some reason, we considered them generic :/ More in detail syscalls that changed behavior are:
    • umount
    • lstat64
    • sendv
    • recv
  • assign to all unknown events the same name NA
  • add some forgotten EF_OLD_VERSION flags

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

fix(driver): remove some inconsistencies in our event tables

@@ -270,7 +270,7 @@ const struct ppm_event_info g_event_info[] = {
[PPME_SYSCALL_PPOLL_X] = {"ppoll", EC_WAIT | EC_SYSCALL, EF_WAITS, 2, {{"res", PT_ERRNO, PF_DEC}, {"fds", PT_FDLIST, PF_DEC} } },
[PPME_SYSCALL_MOUNT_E] = {"mount", EC_FILE | EC_SYSCALL, EF_MODIFIES_STATE, 1, {{"flags", PT_FLAGS32, PF_HEX, mount_flags} } },
[PPME_SYSCALL_MOUNT_X] = {"mount", EC_FILE | EC_SYSCALL, EF_MODIFIES_STATE, 4, {{"res", PT_ERRNO, PF_DEC}, {"dev", PT_CHARBUF, PF_NA}, {"dir", PT_FSPATH, PF_NA}, {"type", PT_CHARBUF, PF_NA} } },
[PPME_SYSCALL_UMOUNT_E] = {"umount", EC_FILE | EC_SYSCALL, EF_MODIFIES_STATE, 1, {{"flags", PT_FLAGS32, PF_HEX, umount_flags} } },
[PPME_SYSCALL_UMOUNT_E] = {"umount", EC_FILE | EC_SYSCALL, EF_MODIFIES_STATE, 1, {{"flags", PT_FLAGS32, PF_HEX, umount_flags} } }, // we need to create separate events for umount and umount2, umount doesn't have the flag parameter!!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks yes that was my initial feedback when we chatted on slack.

Copy link
Contributor

@incertum incertum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Andreagit97 my only feedback would on the umount case which I am not sure about. I understand it is the only corner case with some sort of overloaded name "umount" that spans both a generic (old umount) and a defined event umount2 (CC @jasondellaluce). On the other hand umount2 was introduced in kernel 2.2 and seems to be the one used. For instance, none of my machines and kernels have the old umount defined, so how would we test it? Is it worth the trouble or are we ok just keeping umount as generic?

Renaming can definitely be done in this PR for umount2 just like you intended to

PPME_SYSCALL_UMOUNT_E -> PPME_SYSCALL_UMOUNT2_E
PPME_SYSCALL_UMOUNT_X -> PPME_SYSCALL_UMOUNT2_X

while however keeping the overloaded event name "umount" to not break Falco rules and stay consistent with the "accept" case which points to both accept, accept4 syscalls.

Btw this is something we can consider to break, but this would need to be timed with a breaking Falco release and we can discuss if Falco 0.35 should be the one. Personally would vote to break and refactor it down the road, so that Falco rules always have to use the exact syscall names, but from my perspective it does not need to happen now.

Rest of the changes all look good, thanks for catching them.

@hbrueckner
Copy link
Contributor

while however keeping the overloaded event name "umount" to not break Falco rules and stay consistent with the "accept" case which points to both accept, accept4 syscalls.

Btw this is something we can consider to break, but this would need to be timed with a breaking Falco release and we can discuss if Falco 0.35 should be the one. Personally would vote to break and refactor it down the road, so that Falco rules always have to use the exact syscall names, but from my perspective it does not need to happen now.

@incertum What I like to better understand is whether the flags are actually consumed in the upper stack. I did a grep in libs, falco, rules, and did not find any consumer. Maybe, the break comes finally down to just re-define the event definition and have an event w/o flags for it while introducing the umount2 with the existing filler implementation.

@Andreagit97
Copy link
Member Author

Andreagit97 commented Mar 2, 2023

Btw this is something we can consider to break, but this would need to be timed with a breaking Falco release and we can discuss if Falco 0.35 should be the one. Personally would vote to break and refactor it down the road, so that Falco rules always have to use the exact syscall names, but from my perspective it does not need to happen now.

I definitely agree with that, we need to address all these issues one time for all. Here #911 i've tracked all the inconsistencies we have found right now...
Not sure what is the best way to address them... I already have another branch to address some of them and changes are not just a few lines probably so probably I would merge these fixes and then iterate over them, WDYT? (just for visibility this is the branch https://github.com/falcosecurity/libs/compare/master...Andreagit97:support_missing_syscalls?expand=1)

@incertum What I like to better understand is whether the flags are actually consumed in the upper stack. I did a grep in libs, falco, rules, and did not find any consumer. Maybe, the break comes finally down to just re-define the event definition and have an event w/o flags for it while introducing the umount2 with the existing filler implementation.

I didn't catch which flags we are talking about here... btw yes agree with you that we need to redefine it, not sure if we want to do that here

@incertum
Copy link
Contributor

incertum commented Mar 2, 2023

@incertum What I like to better understand is whether the flags are actually consumed in the upper stack. I did a grep in libs, falco, rules, and did not find any consumer. Maybe, the break comes finally down to just re-define the event definition and have an event w/o flags for it while introducing the umount2 with the existing filler implementation.

Correct Falco has no upstream rule atm that uses umount as evt.type, when I once checked it wasn't working, but that was a good time ago, probably before we introduced umount2? I think requiring umount2 in Falco rules is a good choice, but it's a big breaking paradigm change in general that goes beyond umount and also applies to other cases, so would say everyone should be on board with this, especially the Falco repo maintainers and it wouldn't be a change for this PR.

Decision for this PR seems to be: Do we want to keep umount as generic syscall or go with the change proposed here plus @hbrueckner as you pointed out finalize the complete change so we wrap it up? Would be fine either way as it seems the old umount is not relevant on most systems, but for completeness wanted to bring it up. @Andreagit97 feel free to make a call.

@incertum
Copy link
Contributor

incertum commented Mar 2, 2023

Btw this is something we can consider to break, but this would need to be timed with a breaking Falco release and we can discuss if Falco 0.35 should be the one. Personally would vote to break and refactor it down the road, so that Falco rules always have to use the exact syscall names, but from my perspective it does not need to happen now.

I definitely agree with that, we need to address all these issues one time for all. Here #911 i've tracked all the inconsistencies we have found right now... Not sure what is the best way to address them... I already have another branch to address some of them and changes are not just a few lines probably so probably I would merge these fixes and then iterate over them, WDYT? (just for visibility this is the branch https://github.com/falcosecurity/libs/compare/master...Andreagit97:support_missing_syscalls?expand=1)

Very much on board, love it, as said above @jasondellaluce and other Falco repo maintainers should definitely be on board with it given the broader end user implications.

@FedeDP
Copy link
Contributor

FedeDP commented Mar 6, 2023

/milestone 0.11.0

@poiana poiana added this to the 0.11.0 milestone Mar 6, 2023
Copy link
Contributor

@jasondellaluce jasondellaluce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks Andre!

Very much on board, love it, as said above @jasondellaluce and other Falco repo maintainers should definitely be on board with it given the broader end user implications.

We will introduce a potential breaking change every time we fix any of the cases where 2+ PPM_SC point to the same non-generic event table entry (e.g. pipe/pipe2, umount/umount2, accept/accept4, etc...). Of those, I think the most impactful for Falco rules will be accept/accept4 that is used in some default rules, and likely used in custom private rules as well.

Given so, my personal take would be to prefer fixing these all together instead of one-by-one and add a clear notice in the release changelog. Now, since in 0.35.0 we are breaking-revisiting the whole syscall selection and adaptiveness notions, I think 0.35.0 would a good candidate for this kind of breaking changes too (but I'm open to also postpone it if someone is not onboard).

@Andreagit97
Copy link
Member Author

Given so, my personal take would be to prefer fixing these all together instead of one-by-one and add a clear notice in the release changelog.

The only issue in doing a one-shot is that the PR would be huge, this branch https://github.com/falcosecurity/libs/compare/master...Andreagit97:support_missing_syscalls?expand=1 addresses just 4 of them and it is already huge :/ (there are also some tests inside)

Now, since in 0.35.0 we are breaking-revisiting the whole syscall selection and adaptiveness notions, I think 0.35.0 would a good candidate for this kind of breaking changes too (but I'm open to also postpone it if someone is not onboard).

Agree with this, since we are working on all those tables refactors and breaking changes this seems the right moment to address them

@jasondellaluce
Copy link
Contributor

The only issue in doing a one-shot is that the PR would be huge, this branch https://github.com/falcosecurity/libs/compare/master...Andreagit97:support_missing_syscalls?expand=1 addresses just 4 of them and it is already huge :/ (there are also some tests inside)

Of course I didn't mean to squash all the changes in a single PR, but I would suggest including them all in a single libs release.

@jasondellaluce
Copy link
Contributor

cc @falcosecurity/libs-maintainers for visibility.

@incertum
Copy link
Contributor

incertum commented Mar 6, 2023

@jasondellaluce and @Andreagit97 leaving another comment to re-iterate that I would be all for implementing all these changes for Falco 0.35, afterwards we should stabilize :) Therefore, explicitly approving to require each system call (including variants) to be called by the official syscall string name in the Falco rules.

Re pre-release timing for testing, let's finish the umount refactor here in this PR, but keep the shared umount event name for both umount and umount2 for now, after that we should bump libs in Falco so we can move forward with testing. Then we start the breaking change libs refactors?

@jasondellaluce
Copy link
Contributor

Re pre-release timing for testing, let's finish the umount refactor here in this PR, but keep the shared umount event name for both umount and umount2 for now, after that we should bump libs in Falco so we can move forward with testing. Then we start the breaking change libs refactors?

The plan seems reasonable to me.

@poiana poiana added size/XL and removed size/L labels Mar 6, 2023
@Andreagit97
Copy link
Member Author

I've pushed 2 new commits:

  • the first one creates a new event pair for umount syscall -> PPME_SYSCALL_UMOUNT_1_E/X. With this commit, umount2 syscall is still associated with the old event pair and the event name is "umount"
  • the second one creates a new event pair also for umount2 syscall -> PPME_SYSCALL_UMOUNT2_E/X. This second commit should solve our issues with "umount family" syscalls.

and i have to rebase :(

…T64`

before this commit these two syscalls were considered "generic" but they had an associate event in the event table. Now these two syscalls use specific events and are no more generic. Please note that we need to craft new events for `PPM_SC_UMOUNT` and `PPM_SC_UMOUNT2` since `PPM_SC_UMOUNT` cannot use `PPME_SYSCALL_UMOUNT_E`, this is just a tmp patch.

Signed-off-by: Andrea Terzolo <andrea.terzolo@polito.it>
Before this commit we just had event pairs to instrument these syscalls but not a real code to identify them, for this reason their event pairs were associated to `PPM_SC_UNKNOWN`.

Signed-off-by: Andrea Terzolo <andrea.terzolo@polito.it>
Signed-off-by: Andrea Terzolo <andrea.terzolo@polito.it>
Signed-off-by: Andrea Terzolo <andrea.terzolo@polito.it>
@Andreagit97
Copy link
Member Author

rebased :)

@Andreagit97
Copy link
Member Author

Re pre-release timing for testing, let's finish the umount refactor here in this PR, but keep the shared umount event name for both umount and umount2 for now, after that we should bump libs in Falco so we can move forward with testing. Then we start the breaking change libs refactors?

Not sure why we want to keep the single name umount since we have to address also this before the final release 🤔 BTW I left it as a last commit in this way we can use the previous commit to bump libs in Falco

@@ -400,6 +400,8 @@ const struct ppm_event_info g_event_info[] = {
[PPME_SYSCALL_FCHOWNAT_X] = {"fchownat", EC_FILE | EC_SYSCALL, EF_NONE, 6, {{"res", PT_ERRNO, PF_DEC}, {"dirfd", PT_FD, PF_DEC}, {"pathname", PT_FSRELPATH, PF_NA, DIRFD_PARAM(1)}, {"uid", PT_UINT32, PF_DEC}, {"gid", PT_UINT32, PF_DEC}, {"flags", PT_FLAGS32, PF_HEX, fchownat_flags}} },
[PPME_SYSCALL_UMOUNT_1_E] = {"umount", EC_FILE | EC_SYSCALL, EF_MODIFIES_STATE, 0},
[PPME_SYSCALL_UMOUNT_1_X] = {"umount", EC_FILE | EC_SYSCALL, EF_MODIFIES_STATE, 2, {{"res", PT_ERRNO, PF_DEC}, {"name", PT_FSPATH, PF_NA} } },
[PPME_SYSCALL_UMOUNT2_E] = {"umount2", EC_FILE | EC_SYSCALL, EF_MODIFIES_STATE, 1, {{"flags", PT_FLAGS32, PF_HEX, umount_flags} } },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ty @Andreagit97 only feedback would be to still keep it "umount" event name in this PR in order to have one clean near future PR where we cut over to breaking Falco changes assuming we all agree on it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the end, I completely removed the last commit for umount2, I couldn't simply change the name here, because I have set also the old versions on old UMOUNT_E/UMOUNT_X events. Instead of reverting 2 times, it would be less error-prone to address the umount2 change in a separate PR :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Called it the "special snowflake" case earlier and it surely holds true ... ok let's try to vote and come to an agreement soon re renaming event names throughout and then get that PR going if we agree on it for Falco 0.35.

@incertum
Copy link
Contributor

incertum commented Mar 6, 2023

Not sure why we want to keep the single name umount since we have to address also this before the final release thinking BTW I left it as a last commit in this way we can use the previous commit to bump libs in Falco

Because of the way we currently resolve Falco rules evt.type strings to syscalls etc, hence also why all the tests are failing. See #936 (comment)

remove `PPME_CONTAINER_X` event from `event_set_to_names_no_generic_events1` test since after this fix it is associated with `NA` name instead of `container`

Signed-off-by: Andrea Terzolo <andrea.terzolo@polito.it>
Copy link
Contributor

@incertum incertum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@poiana
Copy link
Contributor

poiana commented Mar 7, 2023

LGTM label has been added.

Git tree hash: 1dcd7fea5b32a867124e96275c1379ebfd8ae0f1

Copy link
Contributor

@jasondellaluce jasondellaluce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@poiana
Copy link
Contributor

poiana commented Mar 7, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Andreagit97, incertum, jasondellaluce

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [Andreagit97,incertum,jasondellaluce]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants