Output npy of hdf5 file using the processor #475

Answered by kondratyevd
ico1036 asked this question in Q&A
Hi @ico1036,
if your processor can return its outputs as Pandas DataFrames, you should be able to use the Dask executor with the argument use_dataframes=True.

The output will then be a distributed Dask dataframe.
To keep working with it, or to print it as a single dataframe, call output.compute() after you retrieve the outputs from run_uproot_job. Otherwise, you can save the chunks of the output dataframe directly as Parquet files with dd.to_parquet(df=output, path=...), where path is the directory the files should be written to.
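
For reference, here is a rough sketch of the two options above. It assumes coffea 0.7.x (where run_uproot_job and dask_executor are available) and an already-connected dask.distributed client. The helper names run_with_dataframes and collect_or_save are made up for illustration, as are the fileset, treename and processor_instance arguments, which stand in for whatever you already pass to your job.

```python
from pathlib import Path


def run_with_dataframes(fileset, treename, processor_instance, client):
    """Run a coffea job whose per-chunk outputs are pandas DataFrames.

    Assumption: coffea 0.7.x. `client` is a dask.distributed.Client and
    processor_instance.process() returns a pandas DataFrame for each chunk.
    """
    # Imported lazily so collect_or_save below can be used without coffea.
    from coffea import processor

    return processor.run_uproot_job(
        fileset,
        treename=treename,
        processor_instance=processor_instance,
        executor=processor.dask_executor,
        executor_args={"client": client, "use_dataframes": True},
    )


def collect_or_save(output, parquet_dir=None):
    """Gather the distributed dataframe, or write it out as Parquet files."""
    if parquet_dir is None:
        # Pulls all partitions into a single in-memory pandas DataFrame.
        return output.compute()
    Path(parquet_dir).mkdir(parents=True, exist_ok=True)
    # Dask writes the partitions as separate Parquet files in this directory
    # (needs a Parquet engine such as pyarrow installed).
    output.to_parquet(str(parquet_dir))
    return None
```

Calling collect_or_save(output) only makes sense if the full result fits in memory. For larger outputs, passing a directory keeps the data distributed until it is on disk.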

Please let me know if you run into any issues, I will be happy to help.

Answer selected by lgray