-
-
Notifications
You must be signed in to change notification settings - Fork 919
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unhandled IndexError when calling .read() on a malformed config file #1887
Comments
On a related note, do the maintainers of this project have any interest in bringing the fuzzing test code into this repo? I searched the repository for prior discussions about it but didn't come across anything. I'm happy to:
|
Thanks a lot for this wonderfully complete bug report, it's much appreciated. Personally, I think the config-parser implementation here is probably very broken and non-conforming, and the fuzzer will have a very good time with it. A problem I see is that reproductions are usually hidden behind a login, so it requires extra work to make these available here so the community has a chance to fix them. However, I am open to give it a shot and see how it goes. I definitely welcome you to help with maintaining the fuzzer integration, and would appreciate if that would also entail the maintenance of the issues it finds such that all information for reproduction would be contained within them so it's not only maintainers who can fix them. |
Thanks for the prompt reply 🙂 Silly me for not realizing earlier, but I see I think it probably is worth opening up a discussion thread to continue the fuzzing convo for the sake of keeping this issue on topic as I do have some follow-up questions for you related to what you think a fix for this might look like (if any). I went ahead and replied to you here: #1889 (comment) |
Migrates the OSS-Fuzz tests and setup scripts from the OSS-Fuzz repository to GitPython's repo as discussed here: gitpython-developers#1887 (comment) These files include the changes that were originally proposed in: google/oss-fuzz#11763 Additional changes include: - A first pass at documenting the contents of the fuzzing set up in a dedicated README.md - Adding the dictionary files to this repo for improved visibility. Seed corpra zips are still located in an external repo pending further discussion regarding where those should live in the long term.
I started to take a closer look at this and now have a better idea of what's going on here as well as a proof-of-concept fix, but I wanted to get your thoughts on this @Byron, before I put it into PR. CC @EliahKagan in case you have thoughts here as well. Initially, my instinct was to refer to the git-scm.com docs on Git config syntax and use that to inform the fix here, but (ignoring the fact that there isn't much guidance on value formats relevant to this issue) this portion of @Byron's comment above caught my attention and made re-think things a bit:
Specifically, the "non-conforming" part caught my eye. If So instead, I think a more considerate and generally simpler fix would be to maintain the existing parsing behavior, and just make its checks a little more robust to eliminate the risk of the unhandled The diff below shows the change I am considering which would:
Here's the diff: diff --git a/git/config.py b/git/config.py
index 3ce9b123..91b69531 100644
--- a/git/config.py
+++ b/git/config.py
@@ -452,7 +452,7 @@ class GitConfigParser(cp.RawConfigParser, metaclass=MetaParserBuilder):
e = None # None, or an exception.
def string_decode(v: str) -> str:
- if v[-1] == "\\":
+ if v and v[-1] == "\\":
v = v[:-1]
# END cut trailing escapes to prevent decode error
@@ -508,6 +508,8 @@ class GitConfigParser(cp.RawConfigParser, metaclass=MetaParserBuilder):
if len(optval) > 1 and optval[0] == '"' and optval[-1] != '"':
is_multi_line = True
optval = string_decode(optval[1:])
+ elif len(optval) > 1 and optval[0] == '"' and optval[-1] == '"':
+ optval = string_decode(optval[1:-1]) # safely decode fully quoted value
# END handle multi-line
# Preserves multiple values for duplicate optnames.
cursect.add(optname, optval)
@@ -524,8 +526,10 @@ class GitConfigParser(cp.RawConfigParser, metaclass=MetaParserBuilder):
is_multi_line = False
line = line[:-1]
# END handle quotations
+ if line: # Ensure line is not empty before decoding
+ line = string_decode(line)
optval = cursect.getlast(optname)
- cursect.setlast(optname, optval + string_decode(line))
+ cursect.setlast(optname, optval + line)
# END parse section or option
# END while reading What do you think? Does this sound like a reasonable approach here, or is there some context I might be missing? If this does sound like something worth introducing, let me know and:
|
The above hits home, and perfectly describes how I imagine handling this. It's an improvement, without going overboard and try to fix something that is systematically broken - a conforming parser can't be made based on an INI parser.
This sounds like a plan! Thanks again. |
Improve the guarding `if` check in `GitConfigParser`'s `string_decode` function to safely handle empty strings and prevent `IndexError`s when accessing string elements. This resolves an IndexError in the `GitConfigParser`'s `.read()` method when the config file contains a quoted value ending with a new line. Fixes: gitpython-developers#1887
Improve the guarding `if` check in `GitConfigParser`'s `string_decode` function to safely handle empty strings and prevent `IndexError`s when accessing string elements. This resolves an IndexError in the `GitConfigParser`'s `.read()` method when the config file contains a quoted value containing a trailing new line. Fixes: gitpython-developers#1887
PR with the fix: #1908 Only the changes to |
Improve the guarding `if` check in `GitConfigParser`'s `string_decode` function to safely handle empty strings and prevent `IndexError`s when accessing string elements. This resolves an IndexError in the `GitConfigParser`'s `.read()` method when the config file contains a quoted value containing a trailing new line. Fixes: gitpython-developers#1887
Hi, I noticed the fuzzing tests that OSS-Fuzz runs on this project are broken and while I was working on fixing them I believe I came across a minor bug:
The Bug
An
Uncaught Python exception:
IndexError: string index out of range
can be triggered if when trying to call.read()
onGitConfigParser
if it was initialized with a malformed config file.Current Behavior
It's easiest to demonstrate, so please consider this example:
Assuming that code is in
/path/to/example/config_indexerror_reproduction.py
thenproduces something akin to:
My Reproduction Environment Details
The reproduction code above was tested on:
And the fuzzer environment was:
So, if I'm reading the source correct, it seems like the combination of some header section (
[-]
above) followed by a key/value assignment that has a value consisting of a double quoted string with a new line inside it confuses the check here which strips the"
on the new line:GitPython/git/config.py
Lines 521 to 528 in c7675d2
and passes an empty string to
string_decode
which isn't expecting that when it indexes into it's arg:GitPython/git/config.py
Lines 454 to 455 in c7675d2
Expected Behavior
I'd expect an explicitly raised
ParsingError
similar to how it's handled a little further up:GitPython/git/config.py
Lines 514 to 518 in c7675d2
The text was updated successfully, but these errors were encountered: