Skip to content

Commit

Permalink
Replace the suboptimal fuzz_tree harness with a better alternative
Browse files Browse the repository at this point in the history
As discussed in the initial fuzzing integration PR[^1], `fuzz_tree.py`'s
implementation was not ideal in terms of coverage and its reading/writing to
hard-coded paths inside `/tmp` was problematic as (among other concerns), it
causes intermittent crashes on ClusterFuzz[^2] when multiple workers execute
the test at the same time on the same machine.

The changes here replace `fuzz_tree.py` completely with a completely new
`fuzz_repo.py` fuzz target which:

- Uses `tempfile.TemporaryDirectory()` to safely manage tmpdir creation and
  tear down, including during multi-worker execution runs.
- Retains the same feature coverage as `fuzz_tree.py`, but it also adds
  considerably more from much smaller data inputs and with less memory consumed
  (and it doesn't even have a seed corpus or target specific dictionary yet.)
- Can likely be improved further in the future by exercising additional features
  of `Repo` to the harness.

Because `fuzz_tree.py` was removed and `fuzz_repo.py` was not derived from it,
the Apache License call outs in the docs were also updated as they only apply to
the singe `fuzz_config.py` file now.

[^1]: gitpython-developers#1901 (comment)
[^2]: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=68355
  • Loading branch information
DaveLak committed Apr 29, 2024
1 parent 797009d commit c84e643
Show file tree
Hide file tree
Showing 5 changed files with 57 additions and 90 deletions.
4 changes: 2 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -240,8 +240,8 @@ Please have a look at the [contributions file][contributing].

[3-Clause BSD License](https://opensource.org/license/bsd-3-clause/), also known as the New BSD License. See the [LICENSE file][license].

Two files exclusively used for fuzz testing are subject to [a separate license, detailed here](./fuzzing/README.md#license).
These files are not included in the wheel or sdist packages published by the maintainers of GitPython.
One file exclusively used for fuzz testing is subject to [a separate license, detailed here](./fuzzing/README.md#license).
This file is not included in the wheel or sdist packages published by the maintainers of GitPython.

[contributing]: https://github.com/gitpython-developers/GitPython/blob/main/CONTRIBUTING.md
[license]: https://github.com/gitpython-developers/GitPython/blob/main/LICENSE
16 changes: 8 additions & 8 deletions fuzzing/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -225,14 +225,14 @@ to [the official OSS-Fuzz documentation][oss-fuzz-docs].
## LICENSE

All files located within the `fuzzing/` directory are subject to [the same license](../LICENSE)
as [the other files in this repository](../README.md#license) with two exceptions:

Two files located in this directory, [`fuzz_config.py`](./fuzz-targets/fuzz_config.py)
and [`fuzz_tree.py`](./fuzz-targets/fuzz_tree.py), have been migrated here from the OSS-Fuzz project repository where
they were originally created. As such, these two files retain their original license and copyright notice (Apache
License, Version 2.0 and Copyright 2023 Google LLC respectively.) Each file includes a notice in their respective header
comments stating that they have been modified. [LICENSE-APACHE](./LICENSE-APACHE) contains the original license used by
the OSS-Fuzz project repository at the time they were migrated.
as [the other files in this repository](../README.md#license) with one exception:

[`fuzz_config.py`](./fuzz-targets/fuzz_config.py) was migrated to this repository from the OSS-Fuzz project's repository
where it was originally created. As such, [`fuzz_config.py`](./fuzz-targets/fuzz_config.py) retains its original license
and copyright notice (Apache License, Version 2.0 and Copyright 2023 Google LLC respectively) as in a header
comment, followed by a notice stating that it has have been modified contributors to GitPython.
[LICENSE-APACHE](./LICENSE-APACHE) contains the original license used by the OSS-Fuzz project repository at the time the
file was migrated.

[oss-fuzz-repo]: https://github.com/google/oss-fuzz

Expand Down
13 changes: 0 additions & 13 deletions fuzzing/dictionaries/fuzz_tree.dict

This file was deleted.

47 changes: 47 additions & 0 deletions fuzzing/fuzz-targets/fuzz_repo.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
import atheris
import io
import sys
import os
import tempfile

if getattr(sys, "frozen", False) and hasattr(sys, "_MEIPASS"):
path_to_bundled_git_binary = os.path.abspath(os.path.join(os.path.dirname(__file__), "git"))
os.environ["GIT_PYTHON_GIT_EXECUTABLE"] = path_to_bundled_git_binary

with atheris.instrument_imports():
import git


def TestOneInput(data):
fdp = atheris.FuzzedDataProvider(data)

with tempfile.TemporaryDirectory() as temp_dir:
repo = git.Repo.init(path=temp_dir)

# Generate a minimal set of files based on fuzz data to minimize I/O operations.
file_paths = [os.path.join(temp_dir, f"File{i}") for i in range(min(3, fdp.ConsumeIntInRange(1, 3)))]
for file_path in file_paths:
with open(file_path, "wb") as f:
# The chosen upperbound for count of bytes we consume by writing to these
# files is somewhat arbitrary and may be worth experimenting with if the
# fuzzer coverage plateaus.
f.write(fdp.ConsumeBytes(fdp.ConsumeIntInRange(1, 512)))

repo.index.add(file_paths)
repo.index.commit(fdp.ConsumeUnicodeNoSurrogates(fdp.ConsumeIntInRange(1, 80)))

fuzz_tree = git.Tree(repo, git.Tree.NULL_BIN_SHA, 0, "")

try:
fuzz_tree._deserialize(io.BytesIO(data))
except IndexError:
return -1


def main():
atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()


if __name__ == "__main__":
main()
67 changes: 0 additions & 67 deletions fuzzing/fuzz-targets/fuzz_tree.py

This file was deleted.

0 comments on commit c84e643

Please sign in to comment.