Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[parallel] If not num_threads, use queue instead of tail recursion #306

Merged

Conversation

mpsijm
Copy link
Collaborator

@mpsijm mpsijm commented Sep 13, 2023

Previously, running bt fuzz --jobs 0 would crash with a stack overflow after 244 iterations (±4 stack frames per iteration). This happened because after every GeneratorTask, a new task was started by calling Parallel.put, which in turn called its lambda, which called the new GeneratorTask, which called Parallel.put again in finish_task, etc.

I've rewritten the task handling in the case of not num_threads by moving it all to Parallel.done. Parallel.put now puts the task in the same priority queue as in the case where we do want parallelism (num_threads > 0). The fuzzer should now be able to run more than 244 iterations again 😄

I created a PR instead of simply pushing to master, since I haven't touched all of this parallel stuff that often, so some extra eyes would be nice 🙂

To quickly reproduce the bug with `bt fuzz -j0`, I added a minimal generator to the test problem "hellowholeworld", which makes 244 iterations take about 2 minutes:
diff --git a/test/problems/hellowholeworld/generators/gen.py b/test/problems/hellowholeworld/generators/gen.py
new file mode 100644
index 0000000..829afc9
--- /dev/null
+++ b/test/problems/hellowholeworld/generators/gen.py
@@ -0,0 +1,6 @@
+#!/usr/bin/env python3
+import random
+import sys
+
+random.seed(sys.argv[1])
+print(random.randint(1, 100))
diff --git a/test/problems/hellowholeworld/generators/generators.yaml b/test/problems/hellowholeworld/generators/generators.yaml
new file mode 100644
index 0000000..2f90cce
--- /dev/null
+++ b/test/problems/hellowholeworld/generators/generators.yaml
@@ -0,0 +1,9 @@
+solution: /submissions/accepted/test-hello.py
+data:
+  sample:
+    data:
+      "1":
+        in: "1"
+  secret:
+    data:
+      - "": gen.py {seed}

bin/parallel.py Outdated Show resolved Hide resolved
bin/fuzz.py Outdated Show resolved Hide resolved
bin/parallel.py Outdated Show resolved Hide resolved
bin/parallel.py Show resolved Hide resolved
@mpsijm
Copy link
Collaborator Author

mpsijm commented Feb 3, 2024

Better late than never 😄

While trying to resolve your feedback, I came to the conclusion that having one class that's doing very different things in the sequential vs. the parallel case is not very maintainable. So, I decided to split the two functionalities into two subclasses of a common abstract superclass.

The consequence of this is that @mzuenni's fix for the fuzzer in #327 is no longer necessary: the user of parallel.create can always transparently treat it as if it was a parallel queue, even though it's secretly a sequential queue when args.jobs == 0 😄

Copy link
Owner

@RagnarGrootKoerkamp RagnarGrootKoerkamp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM; just some small comments.

Still not entirely convinced we actually need this but happy to merge either way.

bin/fuzz.py Outdated Show resolved Hide resolved
bin/fuzz.py Outdated Show resolved Hide resolved
bin/parallel.py Show resolved Hide resolved
@mpsijm mpsijm force-pushed the fix-parallel-without-threads branch from 50c2d1d to 896b747 Compare February 10, 2024 16:34
Previously, running `bt fuzz --jobs 0` would crash with a stack overflow
after 244 iterations (±4 stack frames per iteration). This happened
because after every GeneratorTask, a new task was started by calling
Parallel.put, which in turn called its lambda, which called the new
GeneratorTask, which called Parallel.put again in finish_task, etc.

I've rewritten the task handling in the case of `not num_threads`
by moving it all to Parallel.done. Parallel.put now puts the task in
the same priority queue as in the case where we _do_ want parallelism
(num_threads > 0).
The fuzzer should now be able to run more than 244 iterations again 😄
This reverts commit 97ab1d7.
@mpsijm mpsijm force-pushed the fix-parallel-without-threads branch from 896b747 to f348174 Compare February 16, 2024 07:15
@mpsijm mpsijm merged commit d502dcd into RagnarGrootKoerkamp:master Feb 16, 2024
1 check passed
@mpsijm mpsijm deleted the fix-parallel-without-threads branch February 16, 2024 09:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants