-
Notifications
You must be signed in to change notification settings - Fork 2.2k
fix dangling thread state issue #5870
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@b-pass @oremanj This is essentially a one-line change, could you please look? — I looked myself, but I'm not sufficiently familiar with the thread handling mechanisms to be sure this is correct. Where I'm getting lost:
@daltairwalter how did you convince yourself that this change actually helps? Asking the same question from a different angle: Do you have a unit test we could add here? I also don't understand: What is the exact sequence of events that 1. assigns to |
Without my change, python is getting the current PyThreadState from whatever it currently is when a module is loaded. This thread state is not guaranteed to be valid in the future. If you want an exact sequence, it would be:
The simplest test case requires an embedded interpreter that is not started by pybind11. In the past, the pybind11 community has been very opposed to this being a supported use case, I am not sure where the community is on this now. Either way, this will also fail in a complex use case with a well written/valid non-pybind11 pyd and a pybind11 pyd being used together. There is a code sample that fails in the ticket #2888 - I updated the test case slightly for newer pybind11 at the end of the ticket, but it is essentially the same. I am not sure if setting tstate to nullptr is actually meaningful or would otherwise be the default, the primary point of my change is to stop setting tstate = cur_tstate; |
Thanks for the explanation. That's making things a little clearer for me. Did you consider building with
My vote: If it's wrong (like an actual dangling pointer), we should fix it. But:
|
Despite the small code footprint of the fix, this is extremely subtle and I will need more time to review it. There are some comments in subinterpreter.h suggesting that the logic there is relying on the internals to be created with the current tstate initialized. |
Yes, this makes sense to me. I think this line was added accidentally when cleaning up the internals constructor, the code pre-cleanup used a local with the same name as the member. This field is only used by |
It's used in |
Ah I see, it's setting it for the benefit of |
OK, I stared at this a bunch and cross-referenced with CPython and I believe it's safe. We could go further: On 3.12 and later, I believe we don't need
TL;DR CPython is now smart enough to handle the caching that we once needed to do to work around the previous limitations of the GILState APIs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
include/pybind11/detail/internals.h
Outdated
|
||
// The PyThreadState that is returned by PyThreadState_Get is not owned by us and may be | ||
// destroyed by the owner at some point in the future. Saving this PyThreadState here can | ||
// lead to dangling pointer undefined behavior in complex multi-threaded scenarios. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment only makes sense in the context of this PR, but will be very confusing in the future, because we're not saving "this PyThreadState" here anymore.
WDYT about:
- Deleting this comment entirely.
- Adding a comment above like this:
tstate.set(nullptr); // See PR #5870
While at it, could you please also change this line in pybind11/subinterpreter.h
?
// save this in internals for scoped_gil calls (see also: PR #5870)
detail::get_internals().tstate = tstate_;
(super-easy way to make this context more discoverable for future maintainers)
Thanks @daltairwalter @b-pass @oremanj! Merging. |
@daltairwalter I just noticed: Could you please fill in the "Suggested changelog entry" in the PR description? (I just put back the template; usually we just have a sentence or two.) |
Thank you all for making this process go so smoothly. |
Description
fix #2888
This fixes undefined behavior in Gil handling associated with storing a dangling pointer. Saving this pointer is not functionally needed, because if the TLS tstate lookup returns a nullptr in gil_scoped_acquire(), the next bit of code does tstate=PyGILState_GetThisThreadState(). In the cases where the tstate wasn't dangling the results will be the same with one extra TLS lookup (minor penalty for correctness), in the cases where the saved tstate was dangling the correct one will be loaded from PyGILState_GetThisThreadState().
Suggested changelog entry:
Fix undefined behavior that occurred when importing pybind11 modules from non-main threads created by C API modules or embedded python interpreters.