Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define API for reader and image_mapper #143

Draft
wants to merge 44 commits into
base: master
Choose a base branch
from
Draft

Conversation

TjarkMiener
Copy link
Member

This PR suggests APIs for the reader and image_mapper. The configuration system and component scheme of ctapipe is adopted. Besides we adopt astropy tables for the batch generation. Reading is working in monoscopic and stereoscopic mode for DL1 images and R1 waveforms. The code properly process data with different array and telescope (divergent) pointings.

A subclass designed for the adv. trigger system processing R0 waveforms will be added in a separated PR.

–––
Closes #31 #104

For looping over a given dl1 table and a single dl1 event (charges, peak times and mask), we can now retrieve the 2D images (input of the CNNs) without init and running the dl1dh reader.
also create separate function for the trigger patches on R0 data
remove also apply IM functions because it can be now replace by the internal .map_image() function
If prefix camera_frame is in the file, the user should add this prefix to the config file.
added batch generator

removed dl1dh transform

dl1dh provide now a static batch and we get the relevant information about the labels in the data loader of ctlearn
now stored in dl0 monitoring tree
last dimension of sample was missing
replaced by new design
Mainly astropy table operations are now used to retrieve the exmaple identifiers for the stereo reading mode. Code base is therefore heavily reduced and operations are more efficient.

Moved parameter settings outside the dl1dh. User can request to also the read dl1b  parameters by passing a list of column names in the batch_generation()

Removed init skip when pandas hdf5 with example identifiers is provided. It is not needed anymore since we are now fast and efficient with astropy tables and their operations.

split transformation into sub-functions for better readability.
this is removing redundant code

astopy table operations should be used a retrieved sum(), min or max etc.
Everything related to the selection of the subarray is done with ctapipe now

Whenever a new file is processed, it checks the consistency of the SubarrayDescription to the reference which is the first provided file; this ensures that all files have the subarray.
It defines a reading API with two childs for reading in mono and stereo mode. Then, the Image and Waveform childs inherits from both child classes (mono and stereo). Finally, the trigger child only inherits from the mono child
this is somehow needed because we modify the batch Table for the Trigger subclass

fix trigger subclass
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link
Collaborator

@Pablitinho Pablitinho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First revision round 🙂

dl1_data_handler/image_mapper.py Outdated Show resolved Hide resolved
dl1_data_handler/image_mapper.py Outdated Show resolved Hide resolved
self.image_shapes[camtype][1] + self.default_pad * 4,
self.image_shapes[camtype][2],
)
def _get_virtual_pixels(self, x_ticks, y_ticks, pix_x, pix_y):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General suggestion: Add parameters meainig in "Doc Strings". You can use tools to generate it automatically in order to speed up this tedious task. Nobody like documentation 😂

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not add the parameters for the internal functions of the ImageMapper class, since I thought it was not necessary. Only for the one (map_image()) that can be called from outside. If you are insisting, I can add them here as well ;-)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while you only need the public functions docstrings for user documentation, the dev documentation will greatly benefit of having members documented. I'm with @Pablitinho on this. Just use copilot, you will be surprised how smart it can be ;)

dl1_data_handler/image_mapper.py Outdated Show resolved Hide resolved
dl1_data_handler/image_mapper.py Outdated Show resolved Hide resolved
dl1_data_handler/image_mapper.py Show resolved Hide resolved
dl1_data_handler/reader.py Outdated Show resolved Hide resolved
dl1_data_handler/reader.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@Pablitinho Pablitinho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link

@mexanick mexanick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments inline. It is quite difficult to comment on such a huge PR.
I suggest an implementation of a proper testing suite instead of a test notebook in order to automatize testing. Also, would be great to see a class diagram for the refactored ImageMapper showing what functionality is inherited, what is overloaded and what is extended.

self.camera_type = self.geometry.name
self.n_pixels = self.geometry.n_pixels
# Rotate the pixel positions by the pixel to align
self.geometry.rotate(self.geometry.pix_rotation)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this? If I understand correctly, pix_rotation is an angle, at which every pixel is rotated, but not necessarily the camera, as for that one there's cam_rotation. From the comment above this line, perhaps cam_rotation shall be used instead?

self.image_shapes[camtype][1] + self.default_pad * 4,
self.image_shapes[camtype][2],
)
def _get_virtual_pixels(self, x_ticks, y_ticks, pix_x, pix_y):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while you only need the public functions docstrings for user documentation, the dev documentation will greatly benefit of having members documented. I'm with @Pablitinho on this. Just use copilot, you will be surprised how smart it can be ;)

self.geometry.pix_y.value, decimals=constants.decimal_precision
)

self.x_ticks = np.unique(self.pix_x).tolist()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I right you assume regularly spaced pixels for any kind of camera geometry? If not, did you test any geometry with shuffle step (e.g. square pixels where rows are shifted by e.g. 25%?)

self.pix_x, self.x_ticks = self._smooth_ticks(self.pix_x, self.x_ticks)
self.pix_y, self.y_ticks = self._smooth_ticks(self.pix_y, self.y_ticks)

# At the edges of the cameras some mapping methods run into issues.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this because your "ticks" maxes out at the maxima of pix_x, pix_y and do not take into account the pixel's area (border)?

def _create_virtual_hex_pixels(
self, first_ticks, second_ticks, first_pos, second_pos
):
"""Create virtual hexagonal pixels outside of the camera."""

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(even inline) will help

**kwargs,
)

if geometry.pix_type != PixelShape.HEXAGON:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why one can't oversample a square pixel grid?

**kwargs,
)

if geometry.pix_type != PixelShape.HEXAGON:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question as above

----------
quality_query : TableQualityQuery
An instance of TableQualityQuery to apply quality criteria to the data.
files : OrderedDict

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need an OrderedDict here? I don't see any use of specific features for it in the code. On the other hand, a standard dict since python 3.6 retains original order of items.

}
)
def _multiplicity_cut_tel_type(table, key_colnames):
self.min_telescopes_of_type.attach_subarray(self.subarray)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bad design, you modify here some objects that are out of scope for this function. Also you don't use key_colnames local variable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Astropy filter functionality requires a function with those two arguments. See here.

events = events.group_by(["obs_id", "event_id"])

def _multiplicity_cut_subarray(table, key_colnames):
return len(table) >= self.min_telescopes

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need key_colnames?

@TjarkMiener TjarkMiener marked this pull request as draft October 16, 2024 07:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

use more ctapipe functionality
3 participants