Dask for template histogram construction #401

Open
alexander-held opened this issue Apr 19, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

Member

alexander-held commented Apr 19, 2023

With dask-awkward and integration with hist, it seems conceptually possible to make the template histogram creation fully Dask-driven. This could result in two significant improvements:

  • parallelization of template histogram construction,
  • optimization of data processing, e.g. avoiding duplicate data reads when filling many histograms that differ only by a weight.

The latter requires dask-awkward and dask-histogram, and the full task graph needs to be built before calling compute. This is a much bigger change than providing an interface that allows template construction (treated as a black-box function) to be distributed, e.g. via Dask.
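To illustrate why the full graph has to exist before compute is called, here is a minimal sketch using plain `dask.delayed` rather than dask-awkward or dask-histogram. The file name `ntuple.root`, the functions `read_events` and `fill_template`, and the weight names are all hypothetical stand-ins, not part of any existing API. Both histogram fills depend on the same read node, so the read runs once when everything is computed together.

```python
import dask
from dask import delayed

read_calls = []  # records every (hypothetical) file read, to show the read is shared


def read_events(path):
    """Stand-in for an expensive file read; returns toy observable values and weights."""
    read_calls.append(path)
    return {
        "observable": [0.5, 1.5, 1.7, 2.5],
        "weight_nominal": [1.0, 1.0, 1.0, 1.0],
        "weight_up": [2.0, 2.0, 2.0, 2.0],
    }


def fill_template(events, weight_name, edges):
    """Fill one weighted histogram in plain Python (no hist API is used here)."""
    counts = [0.0] * (len(edges) - 1)
    for x, w in zip(events["observable"], events[weight_name]):
        for i in range(len(counts)):
            if edges[i] <= x < edges[i + 1]:
                counts[i] += w
                break
    return counts


edges = [0.0, 1.0, 2.0, 3.0]
events = delayed(read_events)("ntuple.root")  # a single graph node for the read
templates = {
    name: delayed(fill_template)(events, name, edges)
    for name in ("weight_nominal", "weight_up")
}

# Build the whole graph first, then compute once: both fills reuse the same read node.
(results,) = dask.compute(templates)
```

Calling compute separately on each template would instead trigger one read per template, which is the duplication the graph-based approach is meant to avoid.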

alexander-held added the enhancement (New feature or request) label Apr 19, 2023
@alexander-held
Member Author

An intermediate solution that might also help is the option to run histogram production on a subset of all templates, selected via some filter, which would allow manual parallelization by calling the same function with different filters.
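A rough sketch of what such filter-based manual parallelization could look like, using only the standard library. The function `build_templates`, its `template_filter` argument, and the template dictionaries are hypothetical and only illustrate the idea; they do not correspond to an existing interface.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical list of templates to build; the keys are illustrative only.
TEMPLATES = [
    {"region": "SR", "sample": "signal"},
    {"region": "SR", "sample": "background"},
    {"region": "CR", "sample": "background"},
]


def build_templates(templates, template_filter=None):
    """Build only the templates accepted by the filter (all of them if no filter is given)."""
    selected = [t for t in templates if template_filter is None or template_filter(t)]
    # Placeholder for the actual histogram production.
    return [f"{t['region']}/{t['sample']}" for t in selected]


# One filter per region: each call handles a disjoint subset of the templates.
filters = [
    lambda t: t["region"] == "SR",
    lambda t: t["region"] == "CR",
]

with ThreadPoolExecutor(max_workers=2) as pool:
    chunks = list(pool.map(lambda f: build_templates(TEMPLATES, f), filters))

built = sorted(name for chunk in chunks for name in chunk)
```

The filters need to partition the templates so that nothing is built twice or skipped. A thread pool is used here only to keep the sketch short; for CPU-bound histogram production the same calls could instead be dispatched as separate processes or batch jobs.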
