Array export and mass incref/decref #15
Comments
Can you sketch the batching export API? Separately, what are the use cases for the mass incref/decref operations? When does one encounter an array of |
See the linked discussion for use cases & possible solutions. I might summarize when I have more spare cycles.
If we add batch export for containers so users can avoid a call per item, we need batch decref for the same reason. If we need decref, IMO we should add incref simply for symmetry. The use cases I saw are pretty niche, but there's not much point arguing their validity :) |
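To make the call-count argument concrete, here is a minimal sketch in plain C. It uses a toy refcounted struct, not CPython's PyObject, and every name in it (ToyObject, toy_incref, toy_decref, toy_incref_array, toy_decref_array) is hypothetical rather than an existing or proposed API. It only shows how one batch call can stand in for n per-item calls.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a refcounted object; NOT CPython's PyObject. */
typedef struct {
    long refcnt;
} ToyObject;

/* Hypothetical per-item calls, modelling out-of-line incref/decref. */
void toy_incref(ToyObject *op)
{
    op->refcnt++;
}

void toy_decref(ToyObject *op)
{
    assert(op->refcnt > 0);
    op->refcnt--;  /* a real implementation would deallocate at zero */
}

/* Hypothetical batch variants: one function call covers n items. */
void toy_incref_array(ToyObject **items, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        items[i]->refcnt++;
    }
}

void toy_decref_array(ToyObject **items, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(items[i]->refcnt > 0);
        items[i]->refcnt--;
    }
}
```

With this shape, a caller that received n strong references from a batch export releases them with a single toy_decref_array(items, n) call instead of n calls to toy_decref.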
While holding a strong reference is appealing in terms of safety, I fear that it can have a big performance overhead. In issue python/cpython#106593 I propose a different approach: the proposed PyObject_AsObjectArray() gives "read-only" access to an array of objects with borrowed references (the sequence holds the references, not the view), while the proposed PyResource API makes sure that the sequence remains alive while the view is being used. The implementation only increments the sequence's reference count (Py_INCREF) while the view is in use, and decrements it afterwards. This is similar to the idea of a Py_HOLD_REF() macro, which keeps an object valid while it's being accessed: issue python/cpython#99481. But here the PyResource API is more generic and gives the type more freedom in how it makes sure that the view remains valid while being used (it can copy memory, increment the refcount of items, etc.).
Proposed PyObject_AsObjectArray() creates a whole view at once to then gives a raw access to a Other approaches are discussed in issue python/cpython#106593 |
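As a rough illustration of the "keep the sequence alive while the view is used" idea, here is a toy model in plain C. It is not the PyResource or PyObject_AsObjectArray() proposal itself: the struct layout and all names (ToyObject, ToyList, ToyResource, toy_list_as_array, toy_resource_close) are assumptions made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; NOT CPython types. */
typedef struct {
    long refcnt;
} ToyObject;

typedef struct {
    long refcnt;
    ToyObject **items;
    size_t size;
} ToyList;

/* Hypothetical resource handle: the exporter decides how a view is closed. */
typedef struct {
    void (*close_func)(void *data);
    void *data;
} ToyResource;

static void toy_release_list(void *data)
{
    ToyList *list = (ToyList *)data;
    assert(list->refcnt > 0);
    list->refcnt--;
}

/* Expose borrowed item pointers and keep the list alive until the
   resource is closed. Item refcounts are left untouched. */
void toy_list_as_array(ToyList *list, ToyResource *res,
                       ToyObject ***items, size_t *size)
{
    list->refcnt++;
    res->close_func = toy_release_list;
    res->data = list;
    *items = list->items;
    *size = list->size;
}

/* Close the view; closing twice is a no-op. */
void toy_resource_close(ToyResource *res)
{
    if (res->close_func != NULL) {
        res->close_func(res->data);
        res->close_func = NULL;
    }
}
```

A different exporter could plug in a close function that frees a copied buffer or drops item references instead. Note that this toy only keeps the list object alive; it does nothing to stop the list from being resized while the view is open.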
Resizable containers need to reallocate the underlying buffer, so keeping the container alive isn't enough. |
If we provide a read-only view, whether it covers the whole sequence or just a short slice, we must provide a way to guarantee that the view is safe to access under some conditions. One condition can be that the loop consuming the view must not modify the sequence, directly or indirectly. If there is a risk that the sequence is modified, the safest option is to copy the sequence. The constraint is that this API is used for efficiency, so we need some kind of tradeoff for performance. Obviously, there is the safest implementation: always copy the sequence. But then you lose the performance benefit. Another safe option is to iterate over the list with PyIter_Next(), but what is wanted here is direct access to an array of objects, to let the compiler do its magic optimizations like vectorization, loop unrolling, etc. |
Let's add the safe API first. If we need a maximum-efficiency API with raw access, we should add it as an alternative. |
No single API fits all use cases: #9. In python/cpython#106593 I propose having an unsafe but efficient view by default, and a safe but slow one as an opt-in option. In the majority of use cases, the unsafe option is safe in practice. And that's why people write C code: for performance. |
I agree that batching is the way forward to reconcile performance, safety and abstraction. However I would suggest that batching be done on a caller-specified slice as in this example. |
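One way to picture batching on a caller-specified slice is the toy sketch below, in plain C. It is not the example linked above and not a proposed signature: ToyObject, ToyList, toy_list_export, toy_release and toy_list_sum are all hypothetical names. The caller supplies a start index and a fixed-size buffer, receives strong references for that slice, and releases them in one batch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; NOT CPython types. */
typedef struct {
    long refcnt;
    int value;
} ToyObject;

typedef struct {
    ToyObject **items;
    size_t size;
} ToyList;

/* Copy up to `cap` items starting at `start` into `out`, taking a strong
   reference to each. Returns the number of items written (0 at the end). */
size_t toy_list_export(const ToyList *list, size_t start,
                       ToyObject **out, size_t cap)
{
    if (start >= list->size) {
        return 0;
    }
    size_t n = list->size - start;
    if (n > cap) {
        n = cap;
    }
    for (size_t i = 0; i < n; i++) {
        out[i] = list->items[start + i];
        out[i]->refcnt++;
    }
    return n;
}

/* Drop the strong references taken by toy_list_export(). */
void toy_release(ToyObject **items, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(items[i]->refcnt > 0);
        items[i]->refcnt--;
    }
}

/* Consume the list in chunks of 4: two calls per chunk, not per item. */
long toy_list_sum(const ToyList *list)
{
    ToyObject *chunk[4];
    long total = 0;
    size_t start = 0;
    size_t n;
    while ((n = toy_list_export(list, start, chunk, 4)) > 0) {
        for (size_t i = 0; i < n; i++) {
            total += chunk[i]->value;
        }
        toy_release(chunk, n);
        start += n;
    }
    return total;
}
```

Because each chunk holds its own strong references in caller-owned memory, the inner loop stays valid even if the list's buffer is reallocated between exports. The price is one incref and one decref per item, amortized over two function calls per chunk.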
For a stable ABI -- of any flavor -- incref and decref will need to be functions, and so they'll be relatively expensive. We should allow minimizing their usage.
We're missing ways to:
PyObject*
(strong references)PyObject*
Discussion in https://discuss.python.org/t/15993 (confusingly mixed with other topics -- please make a new topic rather than reply)