make `PyErrState` thread-safe #4671

davidhewitt · 2024-10-30T19:56:06Z

This PR resolves the thread-safety challenges of #4584 so that we can at least ship 0.23.

I don't love the complexity that this lazy state creates inside error-handling pathways, so I think in the future I will proceed with #4669 and further steps to remove the lazy state. But 0.23 is already breaking enough; users don't need more changes, and this should be a drop-in replacement.

ngoldbaum · 2024-10-31T15:05:07Z

I noticed clippy was failing so I just pushed a fix. I'll try to get the CI green on this if there are any more issues.

ngoldbaum · 2024-10-31T15:35:49Z

src/err/err_state.rs

-            match self_state {
-                Some(PyErrStateInner::Normalized(n)) => n,
-                _ => unreachable!(),
+            let normalized_state = PyErrStateInner::Normalized(state.normalize(py));


I think the only spot where there might be a deadlock is here, if `normalize` somehow leads to arbitrary Python code execution.

Is that possible? If not I think it deserves a comment explaining why.

If it can deadlock, I'm not sure what we can do, since at this point we haven't actually constructed any Python objects yet and we only have a handle to an `FnOnce` that knows how to construct them.

Great observation; I've added a wrapping call to `py.allow_threads` before potentially blocking on the `Once`, which I think avoids the deadlock (I pushed a test which did deadlock before that change).
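
A minimal sketch of why releasing the GIL before blocking on the `Once` avoids the deadlock discussed above. It uses only the standard library: a plain `Mutex` stands in for the GIL, and dropping the guard plays the role of `py.allow_threads`. The function `run_demo` and every name in it are hypothetical, and this is not PyO3's actual implementation.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, Once};
use std::thread;

/// Runs `threads` workers that all try to "normalize" the same state.
/// Returns how many times the initializer actually ran.
fn run_demo(threads: usize) -> usize {
    // Stand-in for the GIL (assumption: a plain mutex, not the real interpreter lock).
    let gil = Arc::new(Mutex::new(()));
    let once = Arc::new(Once::new());
    let runs = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let (gil, once, runs) = (Arc::clone(&gil), Arc::clone(&once), Arc::clone(&runs));
            thread::spawn(move || {
                // The caller arrives holding the "GIL".
                let held = gil.lock().unwrap();
                // Analogue of `py.allow_threads`: release before blocking on the `Once`.
                // Without this, a waiter would hold the lock the initializer needs.
                drop(held);
                once.call_once(|| {
                    // The initializer reacquires the "GIL" to do its work.
                    let _held = gil.lock().unwrap();
                    runs.fetch_add(1, Ordering::SeqCst);
                });
                // Reacquire on the way out, as `allow_threads` would.
                let _held = gil.lock().unwrap();
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    runs.load(Ordering::SeqCst)
}

fn main() {
    // The initializer runs exactly once, and no thread blocks forever.
    assert_eq!(run_demo(4), 1);
}
```

If the `drop(held)` line were removed, a thread waiting in `call_once` would keep the lock that the running initializer is trying to take, which is the deadlock shape described in this thread.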

ngoldbaum · 2024-10-31T15:36:42Z

The algorithm makes sense to me; I agree that this ensures that normalizing an error state can't be done simultaneously in two threads.

codspeed-hq · 2024-10-31T21:13:43Z

CodSpeed Performance Report

Merging #4671 will degrade performances by 25.38%

Comparing `davidhewitt:threadsafe-err` (f5fa452) with `main` (5464f16)

Summary

❌ 2 regressions
✅ 81 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| | Benchmark | `main` | `davidhewitt:threadsafe-err` | Change |
| --- | --- | --- | --- | --- |
| ❌ | `enum_from_pyobject` | 19 µs | 24.7 µs | -23.22% |
| ❌ | `not_a_list_via_extract_enum` | 13.5 µs | 18 µs | -25.38% |

ngoldbaum · 2024-10-31T21:17:28Z

Huh, I can reproduce the test failure happening on CI. It's flaky, but you can trigger it by running `cargo test --no-default-features --features "multiple-pymethods abi3-py37 full" --test "test_declarative_module"` in a while loop.

ngoldbaum · 2024-10-31T21:24:11Z

src/err/err_state.rs

-                .expect("Cannot normalize a PyErr while already normalizing it.")
-        };
+        // avoid deadlock of `.call_once` with the GIL
+        py.allow_threads(|| {


I guess dropping the GIL somehow allows a race condition to happen where multiple threads try to simultaneously create a module...

Yeah, I think it's a combination with `GILOnceCell` in `test_declarative_module`; we allow racing in `GILOnceCell` when the GIL is switched, so this module does actually attempt to get created multiple times. I think it's a bug in using `GILOnceCell` for that test, but this also just makes me dislike this lazy stuff even more...

I guess this is just a fundamental issue with `GILOnceCell` being racy if the code it wraps ever drops the GIL.

EDIT: jinx!

I've opened #4676; if I apply that patch on this branch, the problem goes away.
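
A minimal sketch of the race described in this thread, assuming "first writer wins" semantics for the cell. `RacyOnceCell` is a hypothetical, single-threaded stand-in and not PyO3's `GILOnceCell`; the nested `get_or_init` call simulates another thread getting in while the outer initializer has released the GIL.

```rust
use std::cell::{Cell, RefCell};

/// Hypothetical simplified cell: nothing is locked while `init` runs,
/// so several initializers can run, and only the first stored value is kept.
struct RacyOnceCell<T> {
    value: RefCell<Option<T>>,
}

impl<T: Clone> RacyOnceCell<T> {
    fn new() -> Self {
        Self { value: RefCell::new(None) }
    }

    fn get_or_init(&self, init: impl FnOnce() -> T) -> T {
        // Fast path: already initialized.
        let existing = self.value.borrow().clone();
        if let Some(v) = existing {
            return v;
        }
        // No borrow is held here, mirroring an initializer that drops the GIL.
        let candidate = init();
        // Store the candidate only if nobody else stored a value in the meantime.
        let mut slot = self.value.borrow_mut();
        slot.get_or_insert(candidate).clone()
    }
}

/// Returns (value finally stored, number of times an initializer ran).
fn demo() -> (u32, u32) {
    let cell = RacyOnceCell::new();
    let init_calls = Cell::new(0u32);
    let value = cell.get_or_init(|| {
        init_calls.set(init_calls.get() + 1);
        // Simulates a second thread initializing the cell while this
        // initializer is paused with the GIL released.
        cell.get_or_init(|| {
            init_calls.set(init_calls.get() + 1);
            1
        });
        2
    });
    (value, init_calls.get())
}

fn main() {
    // Two initializers ran, but the value from the first one to finish is kept.
    assert_eq!(demo(), (1, 2));
}
```

In this model the work inside the initializer (module creation, in the failing test) happens twice even though only one result is kept, which matches the symptom reported above.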

davidhewitt and others added 2 commits October 30, 2024 19:54

make PyErrState thread-safe

08beaa5

fix clippy

8421034

davidhewitt added 2 commits October 31, 2024 20:52

add test of reentrancy, fix deadlock

b4d3a94

newsfragment

4a30dde

fix MSRV

f5fa452

davidhewitt mentioned this pull request Oct 31, 2024

add sync::OnceExt and sync::OnceLockExt traits #4676

Open
