Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

more software signatures #1301

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

more software signatures #1301

wants to merge 2 commits into from

Conversation

jstucke
Copy link
Collaborator

@jstucke jstucke commented Nov 18, 2024

  • added some software signatures (mbed TLS, file, opkg)

@jstucke jstucke self-assigned this Nov 18, 2024
@codecov-commenter
Copy link

codecov-commenter commented Nov 18, 2024

Codecov Report

Attention: Patch coverage is 66.66667% with 7 lines in your changes missing coverage. Please review.

Project coverage is 92.01%. Comparing base (058e49f) to head (2fce538).
Report is 2 commits behind head on master.

Files with missing lines Patch % Lines
...is/software_components/code/software_components.py 61.11% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1301      +/-   ##
==========================================
- Coverage   92.42%   92.01%   -0.42%     
==========================================
  Files         379      378       -1     
  Lines       23661    21469    -2192     
==========================================
- Hits        21869    19754    -2115     
+ Misses       1792     1715      -77     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@maringuu
Copy link
Collaborator

Can you document somewhere (commit message/code comments), why these signatures are correct?
I think it is important to not only have signatures, but also be able to verify why they are correct.
The most preferable proof would we permalinks to source code, but I'm happy with anything that allows me to understand why the signatures are as they are.

@jstucke jstucke force-pushed the software-signatures branch from 7696794 to 4e854ef Compare December 4, 2024 16:36
@jstucke
Copy link
Collaborator Author

jstucke commented Dec 4, 2024

Can you document somewhere (commit message/code comments), why these signatures are correct? I think it is important to not only have signatures, but also be able to verify why they are correct. The most preferable proof would we permalinks to source code, but I'm happy with anything that allows me to understand why the signatures are as they are.

I tried to find the corresponding places in the source code but I also found that in libmagic the version is stored as an integer instead of a string. The FSR currently does not have the capability to extract integers, so I added it.

@jstucke jstucke force-pushed the software-signatures branch from 4e854ef to 08eb5b6 Compare December 5, 2024 08:17
@maringuu
Copy link
Collaborator

Does this need a rebase after #1263?

and also added a substitution option for software signatures to change the format (e.g. add dots)
@jstucke jstucke force-pushed the software-signatures branch from 08eb5b6 to 2fce538 Compare December 18, 2024 10:38
@jstucke
Copy link
Collaborator Author

jstucke commented Dec 18, 2024

Does this need a rebase after #1263?

I don't think so, since the PR didn't change any internals, but I did one anyway.

Copy link
Collaborator

@maringuu maringuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments otherwise lgtm!

Comment on lines +63 to +73
def _convert_version_str(self, version_str: str, meta_dict: dict):
"""
The metadata entry "_sub_regex" can be used to change the version string if it does not have the expected
format (e.g. add dots). The entry should contain a regex and replacement for `re.sub()` as JSON string
"""
try:
sub_regex, replacement = json.loads(meta_dict['_sub_regex'])
return re.sub(sub_regex, replacement, version_str)
except json.JSONDecodeError:
logging.warning(f'[{self.NAME}]: signature has invalid substitution regex: {meta_dict}')
return ''
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function should not take the meta_dict but rather the regex as parameter.
Also storing a python regex as json encoded array in a yara file seems a bit too complex to me (As can be seen by the amount of backslashes).
What do you think about formatting it in two separate fields:

rule file_libmagic {
	meta:
		software_name = "file"
		// versions are stored as decimal int with three digits
		// (first digit: major version, remaining two digits: minor version)
		version_regex = "\\d{3}"
		_version_function = "magic_version"
                _sub_regex = "(\\d)(\\d{2})"
                _sub_replacement = "\\1.\\2"
                ...
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, what is the difference between the version_regex and the _sub_regex?
Could (should?) they be merged?

Copy link
Collaborator Author

@jstucke jstucke Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

version_regex is for reading in the version from the matched string (if it doesn't have the default 1.2.3 form) and _sub_regex (short for substitution regex) is something new that I had to think of, because file/libmagic came with a version in the form XYY (e.g. 524) and there is no way to read this in as X.YY without it. _sub_regex and _sub_replacement are the inputs for re.sub()

Why is XYY not enough in this case? Because it is stored as X.YY in the CVE data and we cannot match it otherwise

Comment on lines +355 to +368
if function is not None:
body = function.getBody()
instruction_iterator = currentProgram.getListing().getInstructions(body, True)

for instruction in instruction_iterator:
for i in range(instruction.getNumOperands()):
for operand in instruction.getOpObjects(i):
try:
value = operand.getValue()
except AttributeError:
continue
if value is not None and isinstance(value, (int, long)):
constants.append(str(value))
return constants
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I personally prefer less indentation, which in this case can be archived by adding:

if function is None:
    return []

Also: Why can function be none if we did not get an exception in the call to getGlobalFunctions?!
I think it cannot.

"""
try:
function = getGlobalFunctions(function_name)[0]
except (IndexError, TypeError):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this raise a type error?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants