gh-126703: Add freelists for iterators and range, method and builtin_function_or_method objects #128368
I don't think we should share the freelists for iterators. We're not using that much memory and it's really bug-prone to share them.
I agree with you. I am experimenting a bit to see whether it is possible at all to do this with different types (maybe for PyType_GenericAlloc, or some size-based freelist), but for the iterators I will probably split it again.
The results are excellent! 1% faster geomean. Great work and congrats Pieter!
I am not sure it's worth adding a free list in every case when the gain is small (<3-5%).
pycfunctionobject / pycmethodobject / class_method / shared_iters may be worth adding.
Benchmark results show a consistent 1% geomean speedup on pyperformance. That seems well worth it (for comparison, the entire types optimizer in the JIT is only a 1% speedup and is way more code). Though you're probably right that not all of them are worth it. I'm thinking the method and list/tuple iters are most worth it.
I made PRs for the individual components that are worthwhile (based on the stats). The ones that do not have a PR yet (because the implementation would be more complex) are generators, StopIteration (or exceptions more generally) and ints of small size (e.g. 2 or 3 digits). I will close this PR as it is superseded by the others.
In this PR we add freelists for the most frequently allocated objects (measured using the pyperformance benchmark suite). Some frequently allocated objects that have not yet been added: ints with 2 or 3 digits, exceptions (StopIteration, IndexError) and generators.
If the freelists increase performance, the PR should probably be split into multiple ones.
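For readers unfamiliar with the idea, here is a minimal sketch of what a freelist does. It is a toy model in Python, not the code in this PR: CPython's freelists are implemented in C, and the names `FreeList`, `acquire`, `release` and `Node` are hypothetical and exist only for this illustration. The point is just that a released object is parked in a bounded cache and handed out again on the next allocation instead of going through the allocator.

```python
class FreeList:
    """Toy free list: caches released objects for reuse, up to maxsize.

    Hypothetical illustration only; CPython's real freelists live in C.
    """

    def __init__(self, factory, maxsize=10):
        self._factory = factory    # called when the cache is empty
        self._maxsize = maxsize    # upper bound on cached objects
        self._items = []           # released objects awaiting reuse

    def acquire(self):
        # Reuse a cached object if one is available, otherwise allocate.
        if self._items:
            return self._items.pop()
        return self._factory()

    def release(self, obj):
        # Keep the object for reuse unless the cache is already full.
        # Returns True if the object was cached, False if it was dropped.
        if len(self._items) < self._maxsize:
            self._items.append(obj)
            return True
        return False


class Node:
    """Stand-in for a small, frequently allocated object."""
    __slots__ = ("value",)


if __name__ == "__main__":
    freelist = FreeList(Node, maxsize=2)
    first = freelist.acquire()
    freelist.release(first)
    second = freelist.acquire()
    print(first is second)  # the released object was handed out again
```

The bound on the cache size is what keeps the memory cost small, and keeping one freelist per type avoids handing out an object of the wrong type.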
Microbenchmarks:
The list, float and int freelists are already in main, so we don't expect an improvement there. The iterator benchmarks show a modest improvement. bench_builtin_or_method shows an improvement, but it is a somewhat artificial benchmark.
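To give an idea of what such microbenchmarks exercise, here is a rough sketch using only the standard library `timeit` module. This is not the benchmark script from this PR; the function names `bench_list_iter`, `bench_range_iter`, `bench_bound_builtin` and `run` are made up for illustration, and the loop counts are arbitrary. Each function repeatedly creates and immediately discards one kind of short-lived object, which is the pattern a freelist is meant to speed up.

```python
import timeit


def bench_list_iter(n=10_000):
    # Create and immediately drop a list iterator on every pass.
    items = [1, 2, 3]
    for _ in range(n):
        iter(items)


def bench_range_iter(n=10_000):
    # Create and drop both a range object and its iterator.
    for _ in range(n):
        iter(range(3))


def bench_bound_builtin(n=10_000):
    # Each attribute access creates a new builtin_function_or_method
    # object bound to the list; it is discarded without being called.
    items = []
    for _ in range(n):
        items.append


def run(func, repeat=5, number=20):
    # Best-of-N wall-clock time in seconds for `number` calls of func.
    return min(timeit.repeat(func, repeat=repeat, number=number))


if __name__ == "__main__":
    for func in (bench_list_iter, bench_range_iter, bench_bound_builtin):
        print(f"{func.__name__}: {run(func) * 1e3:.2f} ms")
```

Timings from a script like this are noisy and depend on the machine and build, so they should be compared between a baseline build and a patched build rather than read as absolute numbers.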
Benchmark script