Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use sys.monitoring for scrutineer on 3.12+ #3776

Merged
merged 14 commits into from
Nov 4, 2023

Conversation

tybug
Copy link
Member

@tybug tybug commented Oct 26, 2023

Closes #3773.

This is an RFC - please pick apart my api design decisions here! Self-review to follow.

Here's a micro-benchmark:

from hypothesis import *
from hypothesis.strategies import *
import time

N = 100

t = time.time()
for _ in range(N):
    @given(integers(), integers())
    def add(x, y):
        assert x != 0
    try:
        add()
    except Exception:
        pass

print((time.time() - t) / N)
master: 0.208
this pr: 0.166

Roughly 1.25x speed!

MONITORING_EVENTS = {sys.monitoring.events.LINE: "trace_line"}


def settracer(tracer: Tracer):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is effectively formalizing Tracer as an abstraction over sys.settrace and sys.monitoring. An alternative is to make this even more formal, with an abstract class that fully implements the functionality of both, and is inherited by Tracer. This could allow hypofuzz to reuse said abstract class (Zac-HD/hypofuzz#9), depending on the kind of reuse you were thinking about.

If you think this is a good direction, I'd be open to implementing it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above for the context manger note - I think having two implementations we can use in the same way would be useful, but since I don't ever expect for there to be a third it seems overkill to use an abstract class.

hypothesis-python/src/hypothesis/internal/scrutineer.py Outdated Show resolved Hide resolved
sys.settrace(None)
return

sys.monitoring.free_tool_id(MONITORING_TOOL_ID)
Copy link
Member Author

@tybug tybug Oct 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEP 669 claims that use_tool_id, free_tool_id, and set_events "should be regarded as slow". This PR currently calls all of these once per test execution. I haven't profiled these, but if this has an impact, we could consider registering once at the global level instead, and enabling/disabling the callbacks per test execution to avoid unwanted traces.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I think we should just ignore this for now and do the simple thing, and we can complicate things iff profiling shows we need to later.

Copy link
Member

@Zac-HD Zac-HD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First things first - thanks so much for taking this on! I'm really excited to pick up the speedup and any qualitative improvements that will enable down the line (e.g.: if it's fast, can we always trace?). Other quick thoughts:

  • having a clean context-manager abstraction seems like a good way to factor this out. Unfortunately, I think that performance considerations suggest we should not share any code, since a function call will lead to measurable slowdowns when we do it so often.
  • it looks like this is already super close, and would already be a serious win for users on Python 3.12, so let's aim to ship it soon!

hypothesis-python/src/hypothesis/internal/scrutineer.py Outdated Show resolved Hide resolved
self._previous_location = current_location
self.trace_line(frame.f_code, frame.f_lineno)

def trace_line(self, code: types.CodeType, line_number: int):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that for sys.monitoring, we actually want to trace_branch - we're using lines in the existing tracer because branches aren't natively available.

However, this will also require some adjustment of the reporting and maybe analysis logic, so I suggest we get this working with line-monitoring first to get the big performance win, and then refactor to branch analysis in a follow-up.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I can take that on after this PR! (unless you want to adjust it yourself).

MONITORING_EVENTS = {sys.monitoring.events.LINE: "trace_line"}


def settracer(tracer: Tracer):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above for the context manger note - I think having two implementations we can use in the same way would be useful, but since I don't ever expect for there to be a third it seems overkill to use an abstract class.

sys.settrace(None)
return

sys.monitoring.free_tool_id(MONITORING_TOOL_ID)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I think we should just ignore this for now and do the simple thing, and we can complicate things iff profiling shows we need to later.

@tybug
Copy link
Member Author

tybug commented Oct 28, 2023

I've rewritten as a context manager and switched to a non-reserved tool id (3) as suggested!

Copy link
Member

@Zac-HD Zac-HD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @tybug! (and sorry it took so long; it's been a crazy week)

I spent a while trying to work out how to use the neat DISABLE functionality before realizing that we can't actually do that until we start observing branches directly instead of lines... ah well, that's the follow-up project 🙂

A few tiny tweaks below, but this looks great to me and I'm looking forward to merging it!

hypothesis-python/src/hypothesis/internal/scrutineer.py Outdated Show resolved Hide resolved
hypothesis-python/RELEASE.rst Outdated Show resolved Hide resolved
@tybug
Copy link
Member Author

tybug commented Nov 4, 2023

no need to apologize! 🙂

I'll take a look at branch coverage next week, if you don't beat me to it.

Zac-HD and others added 4 commits November 4, 2023 06:00
The packaging toolchain has now had python-requires support for long enough that I think it's no longer worth carrying this check and (very slightly) slowing down imports just for the sake of a different error message in a very very rare case.
@Zac-HD Zac-HD enabled auto-merge November 4, 2023 06:15
@Zac-HD Zac-HD merged commit a04236d into HypothesisWorks:master Nov 4, 2023
46 checks passed
@Zac-HD
Copy link
Member

Zac-HD commented Nov 4, 2023

@tybug, I've really appreciated your recent PRs - both because they're significant technical contributions for our users, and because it's been a pleasure collaborating with you on them 😍

I've therefore extended you an offer to join the Hypothesis team. If you accept, there's no obligation to continue contributing of course (we're all volunteers!); but if you choose to review PRs you'll also have the ability to approve and merge them once you feel they're ready. Thanks again - and I hope we'll keep working together!

@tybug
Copy link
Member Author

tybug commented Nov 4, 2023

I'm honored! I expect nothing will change for the immediate future, as I still have nowhere near the experience required with the Hypothesis codebase to feel comfortable merging anything but the simplest PRs. But maybe my 2c will be useful on occasion.

It's been a joy working with you as well, and I intend to continue contributing to Hypothesis as I'm able to. I gladly accept 😀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use sys.monitoring (PEP-669) for reduced overhead in the explain phase on Python 3.12+
2 participants