[experimental] Run crosshair in CI #4034
@Zac-HD your triage above is SO great. I am investigating.
Knocked out a few of these in 0.0.60.
More soon.
Ah - the
Most/all of the "expected x, got symbolic" errors are symptoms of an underlying error in my experience (often an operation on a symbolic while not tracing). In this case running with
Ah-ha, seems like we might want some #4029-style 'don't cache on backends with avoid_realize=True' logic.
Still here and excited about this! I am on a detour of doing a real symbolic implementation of the
Triaging a pile of the
So I've tried de-nesting those, which seems to work nicely and even makes things a bit faster by default. When CI finishes we'll see how much it helps on crosshair 🤞
Most of the currently failing tests look like they might be crosshair issues, cc @pschanely:
and I'll skip the database test - crosshair just finishes exploring sooner than that test expects 😁
@pschanely huge progress from recent updates! The

At this point my only real reason to avoid merging is that crosshair updates often cause a fair bit of churn, causing some tests to start failing and some to start xpassing - it's net-good, but would be toil in our CI.

I feel like we've crossed from an alpha version which is a neat proof of concept, to a beta version which is still early but already both useful and clearly on a path to stability and wider adoption. Incredibly excited about this ✨

If you want to pull out Crosshair issues,
So great.
Frankly, I'm not sure it makes sense to block hypothesis on a crosshair-related failure, even in a very distant, stable future. Would love your ideas on making the integration more "eventually" correct. Maybe a dedicated testing repo that pulls the hypothesis source and has these pytest markers externally applied? (Or submodules? But those scare me.)
Always. Thanks for the commit breakdown. More updates soon!
For clarity, "blocking" would mean 'when we update our pinned dependencies, if Crosshair has changed we'll update the xfail markers accordingly and report any issues upstream, or maybe add a
In practice I expect I'll just keep updating this PR for now, and you can grab a local copy of the branch if you want to run the tests before a Crosshair release 😁 (and note the test-selection tips at the top of the PR!)
Fair enough! I was concerned about how much churn in CrossHair pass/fails you'll see for unrelated hypothesis changes, but it's also true that I want to know about what you see. Current plan SGTM.
Yup! I've been doing this a little already; works for me.
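The "pytest markers externally applied" idea floated in this thread could be sketched roughly as follows. This is only an illustration under stated assumptions: the `EXTERNAL_XFAILS` table, its placeholder entries, and the `xfail_reason_for` helper are hypothetical and are not part of Hypothesis or CrossHair. Only `pytest_collection_modifyitems`, `item.add_marker`, and `pytest.mark.xfail` are real pytest APIs.

```python
# conftest.py in a separate testing repo (hypothetical sketch).
import fnmatch

# Hypothetical table: node-id glob pattern -> reason for the expected failure.
# The entries are placeholders, not real Hypothesis test names.
EXTERNAL_XFAILS = {
    "tests/cover/test_example.py::test_placeholder": "tracked upstream",
    "tests/nocover/test_other.py::*": "symbolic not supported",
}


def xfail_reason_for(nodeid, table=EXTERNAL_XFAILS):
    """Return the xfail reason for a pytest node id, or None if it is not listed."""
    for pattern, reason in table.items():
        if fnmatch.fnmatchcase(nodeid, pattern):
            return reason
    return None


def pytest_collection_modifyitems(config, items):
    """Standard pytest hook: attach xfail markers to the externally listed tests."""
    import pytest  # imported lazily so the lookup helper is usable without pytest

    for item in items:
        reason = xfail_reason_for(item.nodeid)
        if reason is not None:
            # strict=False: an unexpected pass is reported as xpass, not as a failure.
            item.add_marker(pytest.mark.xfail(reason=reason, strict=False))
```

Keeping the table outside the Hypothesis source tree would let the markers change on CrossHair's release schedule without touching Hypothesis itself, at the cost of the table drifting when tests are renamed.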
See #3914
To reproduce this locally, you can run `make check-crosshair-cover/nocover/niche` for the same command as in CI, but I'd recommend `pytest --hypothesis-profile=crosshair hypothesis-python/tests/{cover,nocover,datetime} -m xf_crosshair --runxfail` to select and run only the xfailed tests.

Hypothesis' problems
- `Flaky: Inconsistent results from replaying a failing test...` - mostly backend-specific failures; we've both
- `"hypothesis/internal/conjecture/data.py", line 2277, in draw_boolean`, `assert p > 2 ** (-64)`, fixed in 1f845e0 (#4049)
- `@given`, fixed in 3315be6
- `target()`, fixed in 85712ad (#4049)
- `typing_extensions` when crosshair depends on it
- `@xfail_on_crosshair(...)`.
- `.too_slow` and `.filter_too_much`, and skip remaining affected tests under crosshair.
- `-k 'not decimal'` once we're closer
- `PathTimeout`; see "Rare `PathTimeout` errors in `provider.realize(...)`" (pschanely/hypothesis-crosshair#21) and "Stable support for symbolic execution" (#3914 (comment))
- Add `BackendCannotProceed` to improve integration #4092

Probably Crosshair's problems
- `Duplicate type "<class 'array.array'>" registered` from repeated imports? (pschanely/hypothesis-crosshair#17)
- `RecursionError`, see "`RecursionError` in `_issubclass`" (pschanely/CrossHair#294)
- `unsupported operand type(s) for -: 'float' and 'SymbolicFloat'` in `test_float_clamper`
- `TypeError: descriptor 'keys' for 'dict' objects doesn't apply to a 'ShellMutableMap' object` (or `'values'` or `'items'`). Fixed in "Implement various fixes for hypothesis integration" (pschanely/CrossHair#269)
- `TypeError: _int() got an unexpected keyword argument 'base'`
- `hashlib` requires the buffer protocol, which symbolics bytes don't provide (pschanely/CrossHair#272)
- `typing.get_type_hints()` raises `ValueError`, see "`typing.get_type_hints()` raises `ValueError` when used inside Crosshair" (pschanely/CrossHair#275)
- `TypeError` in bytes regex, see "`TypeError` in bytes regex" (pschanely/CrossHair#276)
- `provider.draw_boolean()` inside `FeatureStrategy`, see "Invalid combination of arguments to `draw_boolean(...)`" (pschanely/hypothesis-crosshair#18)
- `dict(name=value)`, see "Support named `dict` init syntax" (pschanely/CrossHair#279)
- `PurePath` constructor, see "`PurePath(LazyIntSymbolicStr)` error" (pschanely/CrossHair#280)
- `zlib.compress()` not symbolic, see "a bytes-like object is required, not `SymbolicBytes` when calling `zlib.compress(b'')`" (pschanely/CrossHair#286)
- `int.from_bytes(map(...), ...)`, see "Accept `map()` object - or any iterable - in `int.from_bytes()`" (pschanely/CrossHair#291)
- `base64.b64encode()` and friends (pschanely/CrossHair#293)
- `TypeError: conversion from SymbolicInt to Decimal is not supported`; see also snan below
- `TypeVar` problem, see "`z3.z3types.Z3Exception: b'parser error'` from interaction with `TypeVar`" (pschanely/CrossHair#292)
- `RecursionError` inside Lark, see "Weird failures using sets" (pschanely/CrossHair#297)
- Error in `operator.eq(Decimal('sNaN'), an_int)`
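Several of the items above (`hashlib`, `zlib.compress()`, `int.from_bytes(map(...), ...)`) come down to which argument types a C-implemented standard-library function accepts. The sketch below illustrates that in plain CPython only, without CrossHair; `NotABuffer` is a hypothetical stand-in for a bytes-like value that can be converted to `bytes` but does not implement the buffer protocol, and it is not how CrossHair's symbolic bytes are actually implemented.

```python
import hashlib
import zlib


class NotABuffer:
    """Hypothetical stand-in: convertible to bytes, but not a buffer."""

    def __init__(self, data):
        self._data = data

    def __bytes__(self):
        return self._data


value = NotABuffer(b"abc")

# Functions that need the buffer protocol reject the object with TypeError.
rejected = []
for name, func in (("hashlib.sha256", hashlib.sha256), ("zlib.compress", zlib.compress)):
    try:
        func(value)
    except TypeError as err:
        rejected.append(name)
        print(f"{name} rejected the object: {err}")

# Converting to a concrete bytes object first makes both calls work.
digest = hashlib.sha256(bytes(value)).hexdigest()
roundtrip = zlib.decompress(zlib.compress(bytes(value)))

# int.from_bytes() accepts any iterable of ints in range(256), including map().
n = int.from_bytes(map(int, [1, 0]), "big")
print(n)  # 256
```

A wrapper type that wants to pass through such calls therefore has to either implement the buffer protocol or be converted to real `bytes` before the call.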

Cases where crosshair doesn't find a failing example but Hypothesis does
Seems fine, there are plenty of cases in the other direction. Tracked with `@xfail_on_crosshair(Why.undiscovered)` in case we want to dig in later.

Nested use of the Hypothesis engine (e.g. given-inside-given)
This is just explicitly unsupported for now. Hypothesis should probably offer some way for backends to declare that they don't support this, and then raise a helpful error message if you try anyway.