Simplify stdlib code by using itertools.batched()
#126317
Comments
I'm not sure we want to refactor
There is no great need for such a change, but it makes the code slightly smaller, so I have no objections. Please show the results of microbenchmarks for pickling a large list, dict, and set. Even a small speedup would be an additional argument for this change. A slowdown would be a sign that there is something wrong with itertools.batched().
```python
import pickle

import pyperf

large_list = list(range(10**6))                 # list with 1,000,000 items
large_dict = {str(i): i for i in range(10**6)}  # dict with 1,000,000 items
large_set = set(range(10**6))                   # set with 1,000,000 items

def pickle_save(data, filename):
    with open(filename, "wb") as f:
        pickle.dump(data, f)

def pickle_load(filename):
    with open(filename, "rb") as f:
        return pickle.load(f)

def run_benchmarks():
    runner = pyperf.Runner()

    # Pickle dump tests
    runner.bench_func("pickle_save_large_list", pickle_save, large_list, "large_list.pkl")
    runner.bench_func("pickle_save_large_dict", pickle_save, large_dict, "large_dict.pkl")
    runner.bench_func("pickle_save_large_set", pickle_save, large_set, "large_set.pkl")

    # Pickle load tests
    runner.bench_func("pickle_load_large_list", pickle_load, "large_list.pkl")
    runner.bench_func("pickle_load_large_dict", pickle_load, "large_dict.pkl")
    runner.bench_func("pickle_load_large_set", pickle_load, "large_set.pkl")

if __name__ == "__main__":
    run_benchmarks()
```

Result
The result of running the benchmark using pyperf. I'm sorry, my results are a little different from my earlier runs; I'm guessing there's a difference in configuration. If you get a different result, I'd appreciate it if you could let me know.
#98364 introduced itertools.batched(), but since it's pretty new it's still rarely used inside the standard library. pickle.py includes a custom version of it multiple times, all of which could be replaced by itertools.batched() for improved readability. This probably has a negligible performance impact (at the very least it won't be slower) but should make the code easier to understand.
Let me know if you're interested in a contribution for this or whether you'd rather not have the extra churn for a minor change like this.
Linked PRs