Performance Tip for calling py object's method #2960
noahkim0723
started this conversation in
Ideas
I would argue that this (caching attribute look-ups) is a common technique when optimizing pure Python code as well, i.e. not specific to PyO3 or even native extensions. Also note that this changes observable behaviour in a dynamic language like Python, e.g. the objects could theoretically change their method attribute to point to different code with each invocation which would not work with the optimized variant.
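To make the behaviour change concrete, here is a minimal pure-Python sketch. The `Toggler` class and the two helper functions are hypothetical names made up for illustration; the object rebinds its own `step` attribute on every call, so a cached look-up keeps calling the method it captured first.

```python
class Toggler:
    """Hypothetical object that swaps its own `step` method on every call."""

    def __init__(self):
        self.step = self._ping

    def _ping(self):
        self.step = self._pong  # the next look-up of `step` finds _pong
        return "ping"

    def _pong(self):
        self.step = self._ping  # the next look-up of `step` finds _ping
        return "pong"


def call_fresh(obj, n):
    # Looks up `step` before every call, so it follows each rebinding.
    return [obj.step() for _ in range(n)]


def call_cached(obj, n):
    # Looks up `step` once; later rebindings are never observed.
    step = obj.step
    return [step() for _ in range(n)]


fresh = call_fresh(Toggler(), 4)
cached = call_cached(Toggler(), 4)
print(fresh)
print(cached)
```

The two lists differ, so caching is only safe when the objects are known not to rebind the method between calls.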
Let me share a tip for improving performance when calling a Python object's method in particular cases.
I found this while benchmarking PyO3 against a C++ SWIG extension for Python at my firm.
Let's say we want to define a caller class that receives multiple Python objects and then calls their methods, which all share the same signature, as follows:
As can be seen from the method **call_methods_unopt**, the logic is very straightforward.
Given K Python objects and N iterations, the time complexity of call_methods_unopt is O(K * N).
However, we can optimize call_methods_unopt further; we can even double its calling throughput by simply changing it as follows:
The difference between the two methods above does not look significant, but the performance gap is huge (almost 70% throughput difference, not counting the time spent inside each Python object's method).
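As a rough pure-Python analogue of the two variants (not the PyO3 code from this post), the pattern could be sketched like this. `Worker`, `Caller`, `run`, `call_methods_opt` and `time_both` are hypothetical names chosen for illustration; only `call_methods_unopt` comes from the post. Timings depend on the machine and Python version, so none are quoted here.

```python
import timeit


class Worker:
    """Hypothetical stand-in for the Python objects handed to the caller."""

    def __init__(self):
        self.count = 0

    def run(self, x):
        self.count += x


class Caller:
    def __init__(self, objs):
        self.objs = list(objs)

    def call_methods_unopt(self, n):
        # Looks up "run" on every object in every iteration: K * N look-ups.
        for _ in range(n):
            for obj in self.objs:
                getattr(obj, "run")(1)

    def call_methods_opt(self, n):
        # Looks up "run" once per object, then reuses the bound methods.
        methods = [getattr(obj, "run") for obj in self.objs]
        for _ in range(n):
            for method in methods:
                method(1)


def time_both(k=8, n=10_000, repeat=5):
    """Return (unoptimised, optimised) wall-clock seconds for the same work."""
    caller = Caller(Worker() for _ in range(k))
    t_unopt = timeit.timeit(lambda: caller.call_methods_unopt(n), number=repeat)
    t_opt = timeit.timeit(lambda: caller.call_methods_opt(n), number=repeat)
    return t_unopt, t_opt


if __name__ == "__main__":
    t_unopt, t_opt = time_both()
    print(f"unoptimised: {t_unopt:.3f}s  optimised: {t_opt:.3f}s")
```

Both variants still make K * N calls; the only thing that changes is how often the attribute look-up happens.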
Why is that?
I noticed this when I looked into the actual implementations of `call_method` and `call` in PyO3. You may notice it as well if you look at those two implementations.
Did you spot it?
The answer is this line:
let ptr = ffi::PyObject_GetAttr(self.as_ptr(), name);
In CPython, everything in Python (functions, objects, lists, etc.) is handled through a pointer. When you call a Python object's method named "foo", the attribute "foo" is first looked up on the object (the same look-up that `getattr` performs in Python), and only then is the resulting callable invoked.
So `call_method` actually involves two calls in sequence: the first is the `getattr` look-up (this corresponds to `ffi::PyObject_GetAttr(self.as_ptr(), name);` above) and the second is the call to the actual target method. Therefore, we need to extract the target method pointers before executing them, to avoid repeating the `getattr` look-up on every call.
Looking at the optimised version again, it first extracts all the target method pointers up front and only then starts executing them.
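The same two-step behaviour can be observed from plain Python. The sketch below is an illustration under assumptions, not PyO3 code: `Counted` is a hypothetical class whose `__getattribute__` counts every look-up of `foo`, which shows that the look-up count drops from K * N to K once the bound methods are extracted first.

```python
lookups = 0  # module-level counter of look-ups of the attribute "foo"


class Counted:
    """Hypothetical object that records each look-up of `foo`."""

    def __getattribute__(self, name):
        global lookups
        if name == "foo":
            lookups += 1
        return object.__getattribute__(self, name)

    def foo(self):
        return 42


K, N = 3, 100
objs = [Counted() for _ in range(K)]

# Unoptimised: every call performs a look-up followed by the actual call.
lookups = 0
for _ in range(N):
    for obj in objs:
        obj.foo()
unopt_lookups = lookups

# Optimised: look up each bound method once, then only perform calls.
lookups = 0
methods = [obj.foo for obj in objs]
for _ in range(N):
    for method in methods:
        method()
opt_lookups = lookups

print(unopt_lookups, opt_lookups)
```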
Of course, in real cases the actual bottleneck often lies on the Python side, so this effect may not be significant. Still, it is a large performance gain for a very small code change.