
Allow users to specify Packer's internal buf_size #547

Closed · wants to merge 1 commit

Conversation

@rtobar (Contributor) commented Jul 18, 2023

Giving this flexibility to users means that internal reallocations can be avoided if the buffer size is good enough, at the expense of potentially allocating more memory than needed. This, together with reusing a Packer object, means that multiple serialisations can end up requiring no memory allocations other than the initial buffer creation, which can be a big win in some situations.

The default value is still 1MB, making this backwards compatible.
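
To make the proposal concrete, here is a minimal sketch of the intended usage. It is hypothetical: it assumes the new keyword is spelled buf_size, as in the PR title, and that everything else keeps its current behaviour.

import msgpack

# Hypothetical usage of the proposed keyword: pre-size the internal
# buffer so that payloads up to ~200MB never trigger a reallocation.
# Without buf_size, the Packer starts at the current 1MB default and
# grows (reallocates) the buffer as needed.
packer = msgpack.Packer(buf_size=200 * 1024 * 1024)
payload = packer.pack({"data": list(range(1_000_000))})  # bytes, as usual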

@rtobar (Contributor, Author) commented Jul 28, 2023

@methane a gentle ping to bring attention to this fairly simple PR I opened about 10 days ago. I saw that you have been contributing the most to this file lately, so I imagine you are a good person to review it.

@methane (Member) commented Jul 28, 2023

> a gentle ping to bring attention to this fairly simple PR

Adding a public API is not "simple". I need to consider many aspects of it.

If you want to help review, please show a benchmark result that demonstrates how important reallocation is.

> This, together with reusing a Packer object, means that multiple serialisations can end up requiring no memory allocations other than the initial buffer creation, which can be a big win in some situations.

Reusing Packer objects also reduces buffer reallocation even without this change, so it doesn't make a big win.

@rtobar (Contributor, Author) commented Jul 28, 2023

@methane thanks for the fast answer, much appreciated.

> Adding a public API is not "simple". I need to consider many aspects of it.

True, I think my previous statement was an oversimplification. I was mostly hoping that, since it was a well-constrained and backwards-compatible change, it wouldn't be too much of a burden to include it. Of course you will have a better view of what else adding this would entail.

> Reusing Packer objects also reduces buffer reallocation even without this change, so it doesn't make a big win.

True as well. I was thinking this would be more of a win for users of msgpack.packb, although combining the two would give you a zero-allocation scenario, as sketched below.
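
To sketch what I mean by the combination (a hypothetical example: it assumes buf_size as proposed in this PR; Packer.reset() and Packer.bytes() are existing APIs):

import msgpack

# One up-front allocation: autoreset=False keeps the buffer between
# calls, and a buf_size large enough for the biggest payload means
# pack() never has to grow it.
packer = msgpack.Packer(autoreset=False, buf_size=200 * 1024 * 1024)

def serialise(obj):
    packer.reset()         # rewind the write position, keep the buffer
    packer.pack(obj)       # writes into the pre-allocated buffer
    return packer.bytes()  # copy the packed bytes out

# msgpack.packb, by contrast, creates a fresh Packer (and buffer) on
# every call, so it would benefit from buf_size alone but still
# allocates once per call.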

> If you want to help review, please show a benchmark result that demonstrates how important reallocation is.

This is a smaller version of the benchmark I was running. We are using xarrays, but you can simplify it further by using a plain numpy array. The version here doesn't include the buf_size=... argument, which I set to 200MB when testing. This was run on Linux x64 with Python 3.10 and 3.11. The 3.11 results basically don't show the big drop at ~64 MB; I'm assuming something changed in Python's memory allocators in 3.11 that removes that cliff. If you open the plots next to each other you'll see (noise aside) that packb performs slightly better when it can allocate only once. Timings use Timer.autorange, so the best/worst-case scenarios are probably a bit more extreme.

import msgpack
import msgpack_numpy as mnp
import numpy as np
import xarray as xa
import timeit
import functools
import seaborn as sns
import pandas as pd
import matplotlib.pyplot

# Serialisation pipelines: Dataset -> plain dict, then msgpack encoding,
# either via one-shot msgpack.packb or via a reused Packer object.
to_dict = lambda x: x.to_dict(data="array")
mser = lambda x: msgpack.packb(x, default=mnp.encode)
# When testing the patched version, buf_size=200 * 1024 * 1024 was also
# passed here.
packer = msgpack.Packer(default=mnp.encode, autoreset=False)
def mser_packer(x):
    packer.reset()  # reuse the Packer's internal buffer across calls
    packer.pack(x)
    return packer

all_benchmarks = {
    "write_only": {
        'Dataset -> dict + msgpack.packb': [to_dict, mser],
        'Dataset -> dict + msgpack.Packer': [to_dict, mser_packer],
    }
}

def benchmark_for_sizes(functions, nitems):
    """Time a pipeline of functions against datasets of varying size."""
    results = []
    for nitem in nitems:
        arr = np.random.rand(nitem)
        xarr = xa.Dataset({"x": arr, "y": arr + 30})
        size = xarr.nbytes
        print(f"  {nitem=}, {size=}")
        # Chain the pipeline functions over the dataset; autorange picks a
        # sensible number of repetitions for each size.
        timer = timeit.Timer('functools.reduce(lambda v, f: f(v), functions, xarr)',
                             setup="import functools", globals=locals())
        n_executions, total_duration = timer.autorange()
        duration = total_duration / n_executions
        results.append((size, duration))
    return results


def run_benchmarks(nitems, benchmarks):
    results = {}
    for name, functions in benchmarks.items():
        print(f"Benchmarking {name}")
        results[name] = benchmark_for_sizes(functions, nitems)
    # Flatten into a DataFrame with sizes in MB and throughput in MB/s
    flat_results = list((name, size, duration)
                        for name, values in results.items()
                        for size, duration in values)
    df = pd.DataFrame(flat_results, columns=("Benchmark", "Size", "Duration"))
    df['Size [MB]'] = df['Size'] / 1024 / 1024
    df['Speed [MB/s]'] = df['Size'] / 1024 / 1024 / df["Duration"]
    return df

def run_all():
    sns.set_theme()
    # Datasets from ~10M down to ~2M elements (two variables each)
    nitems = list(range(10000000, 2000000, -120000))
    dfs = []
    for group, benchmarks in all_benchmarks.items():
        df = run_benchmarks(nitems, benchmarks)
        df["Group"] = group
        dfs.append(df)

    df = pd.concat(dfs)
    sns.relplot(data=df, x='Size [MB]', y='Speed [MB/s]',
                kind='line', hue='Benchmark', col='Group')
    matplotlib.pyplot.savefig('benchmark_results.png')

if __name__ == '__main__':
    run_all()

Python 3.10: no_bufsize vs. bufsize
[plots: 310_nobufsize, 310_bufsize]

Python 3.11: no_bufsize vs. bufsize
[plots: 311_nobufsize, 311_bufsize]

@rtobar (Contributor, Author) commented Aug 3, 2023

@methane: a gentle ping to ask whether, given the evidence above and the hopefully small maintenance burden, you'd be willing to accept this change.

@methane (Member) commented Aug 3, 2023

Thank you for the detailed report.
I am leaning toward incorporating this feature, but will not do so now.
I will make the final decision when I start developing v1.1.

@methane (Member) commented Aug 3, 2023

There is no schedule for v1.1 yet.
At least one more release (v1.0.6) will come out when Python 3.12 becomes RC.

@rtobar (Contributor, Author) commented Aug 3, 2023

Thanks @methane, much appreciated

@methane (Member) commented May 5, 2024

Committed in #604.

methane closed this May 5, 2024
rtobar deleted the bufsize-argument branch May 6, 2024