skip libc code for statically compiled binaries #468

Open
k4lizen opened this issue May 22, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@k4lizen

k4lizen commented May 22, 2024

fish: Job 1, 'cwe_checker calc' terminated by signal SIGKILL (Forced quit)
dmesg:
[...] Out of memory: Killed process 348748 (cwe_checker) [...]
Skipping libc code would allow people to run cwe_checker on statically compiled binaries, which is currently very hard because the analysis requires a large amount of memory.

@Enkelmann
Contributor

Well, to skip the libc functions we first have to reliably detect them in the binary code. As far as I know, that would be a complex research problem of its own. So, unless there is a reliable detection solution that we could just integrate into the cwe_checker for that task, the necessary amount of work and research is more than what the project can handle. And I am not aware of such a solution existing.

@k4lizen
Author

k4lizen commented May 29, 2024

Understood. It feels like it should be possible by doing something like

  1. Detect the glibc version (how? By checking strings in the binary, by trying steps 2 and 3 for multiple versions, or from user input?)
  2. Statically compile a dummy program with that glibc
  3. Compare the assembly of the functions of the dummy program with the program being checked

But there may be non-obvious pitfalls, I guess, and I am also not aware of an existing tool that does this.
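
For illustration, steps 1 and 3 above could be sketched roughly as follows. This is only a minimal sketch under strong assumptions: it assumes the glibc version banner string is still present in the static binary, and that function bytes have already been extracted by some disassembler into plain dictionaries. The names `detect_glibc_version`, `fingerprint`, and `match_library_functions` are hypothetical and not part of cwe_checker. Exact-byte hashing will also miss any function whose code contains address-dependent operands, which is likely one of the non-obvious pitfalls.

```python
import hashlib
import re
from typing import Dict, Optional

# glibc embeds a banner of the form
# "GNU C Library (...) stable release version 2.35."
# Assumption: this string survives in the statically linked binary.
GLIBC_BANNER = re.compile(rb"GNU C Library [^\n]*?release version (\d+\.\d+)")


def detect_glibc_version(blob: bytes) -> Optional[str]:
    """Step 1: look for the glibc banner in the raw bytes of the binary."""
    match = GLIBC_BANNER.search(blob)
    return match.group(1).decode() if match else None


def fingerprint(code: bytes) -> str:
    """Hash the raw bytes of one function (no normalization of operands)."""
    return hashlib.sha256(code).hexdigest()


def match_library_functions(
    reference: Dict[str, bytes],  # name -> bytes, taken from the dummy program
    target: Dict[int, bytes],     # address -> bytes, taken from the checked binary
) -> Dict[int, str]:
    """Step 3: map target addresses to libc names where the bytes are identical."""
    known = {fingerprint(code): name for name, code in reference.items()}
    matches = {}
    for address, code in target.items():
        name = known.get(fingerprint(code))
        if name is not None:
            matches[address] = name
    return matches
```

A more realistic variant would mask out relocated operands (call targets, RIP-relative offsets) before hashing, which is roughly where the binary code similarity literature starts.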

@vobst
Collaborator

vobst commented Jun 3, 2024

There is quite a wealth of literature on the topic of binary code similarity, and function recognition as a particular subproblem, that would be relevant here. I recall that Qasem et al. [1] had an okayish overview of the topic up until 2020, which may be a good starting point before attempting to re-invent the wheel.

In general, integrating such an approach into the cwe_checker's analysis would require shipping (or providing a setup to generate) some sort of (usually huge) database of function "fingerprints" that would underpin the analysis. I don't think that we should go down the route of shipping such an analysis in the cwe_checker binary.

What I'd suggest is implementing support for importing annotations from an external source in some well-defined format. That way users can run whatever tool they want, convert the results, and provide the results to the cwe_checker, probably via a new command line flag.
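
To make the suggestion concrete, here is a rough sketch of what consuming such external annotations could look like. Everything in it is an assumption for illustration: cwe_checker does not define this format in this thread, and the JSON layout (`functions`, `address`, `name`, `origin`) as well as the helpers `load_skip_addresses` and `functions_to_analyze` are hypothetical.

```python
import json
from typing import Iterable, List, Set

# Hypothetical annotation file, produced by whatever external tool the user prefers:
# {"functions": [{"address": "0x401000", "name": "strlen", "origin": "libc"}]}


def load_skip_addresses(
    annotation_json: str, skip_origins: Iterable[str] = ("libc",)
) -> Set[int]:
    """Collect addresses of functions whose origin marks them as library code."""
    document = json.loads(annotation_json)
    origins = set(skip_origins)
    return {
        int(entry["address"], 16)
        for entry in document["functions"]
        if entry.get("origin") in origins
    }


def functions_to_analyze(all_functions: List[int], skip: Set[int]) -> List[int]:
    """Drop annotated library functions, keeping the original order."""
    return [address for address in all_functions if address not in skip]
```

Keeping the recognition step outside the analyzer means the fingerprint database never has to ship with it; only the small, tool-agnostic annotation file crosses the boundary.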

vobst added the enhancement label Jun 3, 2024
@k4lizen
Author

k4lizen commented Jun 16, 2024

Something that might be of interest:
https://github.com/google/bindiff
https://github.com/google/binexport
