Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix minor bug in event_model's rechunk_event_pages #297

Merged
merged 5 commits into from
Mar 28, 2024

Conversation

cjtitus
Copy link
Contributor

@cjtitus cjtitus commented Mar 6, 2024

rechunk_event_pages has a bug if you try to re-chunk an EventPage that has an empty "filled" key, because the dictionary comprehension to create the new "filled" dictionaries uses keys from "data". This means that you can't rechunk the EventPages that are emitted by the RunEngine.

Description

I have just changed one word -- instead of
"filled": {key: page["filled"][key][start:stop] for key in page["data"].keys()},
I have written
"filled": {key: page["filled"][key][start:stop] for key in page["filled"].keys()},
which correctly leaves filled empty, if it was empty to begin with.

Motivation and Context

It is necessary to fix this bug in order to re-chunk EventPages that are currently too large to go into Kafka after long fly-scans.

How Has This Been Tested?

I have tested this by manually calling rechunk_event_pages on some generated EventPages, and also by running the built-in tests.

@cjtitus
Copy link
Contributor Author

cjtitus commented Mar 6, 2024

Ahh, I spoke too soon! It turns out that rechunk_event_pages still fails if you increase the chunk size. So, if you input a list of EventPages and want to increase the chunk size to decrease the number of pages, the constructor for EventPages also needs to have a non-empty "filled" key.

I need to fix this as well, so don't merge yet.

@cjtitus cjtitus marked this pull request as draft March 6, 2024 21:04
@cjtitus
Copy link
Contributor Author

cjtitus commented Mar 6, 2024

Ok, now I can chunk up or down. I wonder if some tests should be implemented for this...

@cjtitus cjtitus marked this pull request as ready for review March 7, 2024 14:31
Copy link
Member

@mrakitin mrakitin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the fix, @cjtitus.

Please fix the formatting which became broken due to the fix.

If you could add a simple test to demonstrate the bug and its fix, it would be great!

@cjtitus
Copy link
Contributor Author

cjtitus commented Mar 7, 2024

Ok, formatting fixed, and I added parametrization to the rechunk_event_pages test so that it now tests documents that have filled: {}.

I have checked that this test fails on the master branch (and passes for this PR)

@tacaswell tacaswell requested a review from mrakitin March 26, 2024 16:41
@danielballan danielballan dismissed mrakitin’s stale review March 28, 2024 20:02

Issues were addressed in follow-up commits

@danielballan danielballan merged commit 707a108 into bluesky:main Mar 28, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants