Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add true position and E to hits dataset #127

Closed
wants to merge 6 commits into from
Closed

Conversation

YifanC
Copy link
Collaborator

@YifanC YifanC commented May 24, 2024

  • Added true drift position and true energy reconstructed hits.
  • Adapted to use configuration instead hardcoded numbers for hit reconstruction

@YifanC YifanC requested review from diaza, krwood and cuddandr May 24, 2024 19:14
Copy link
Member

@krwood krwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of these changes look good to me, except for the namesake of this PR (adding truth information to the hits dataset), but it would be good to see some validation plots before merging this in.

@@ -161,6 +161,33 @@ def run(self, source_name, source_slice, cache):

has_mc_truth = packet_seg_bt is not None

if has_mc_truth:
self.calib_hits_dtype = np.dtype(self.calib_hits_dtype.descr + [('x_true_seg_t', 'f8'), ('E_true_recomb_elife', 'f8')])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rational for this? I don't think we should be mixing truth and reco information in the same dataset. This is what the back tracking datasets are for, no? We should also try and keep the "reco" datasets consistent for mc and data..

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hits with true t0 position is required for MLreco label making. Storing all contributions with the new commit (also corrected a unit issue). Given our setup, the output recombination (close to 0.7), lifetime (close to 1), E values (~0.5) all make sense to me. The differences between x and x_true_seg_t, E and E_true_recomb_elife also fits the scale and is consistent with the making.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Truth information should go into the backtracking and truth datasets, not the reco datasets.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make more concrete suggestions how to structure this? The current proposal makes sense to me in a way that these are extension of what calib_prompt_hits have. Especially E_true_recomb_elife is half true half readout. Putting in mc_truth requires backtracking, not straightforward matching. It doesn't affect the processing of the dataset and the name of the variable is clear about that it uses truth information.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants