Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add calibration calculators #2609

Merged
merged 116 commits into from
Oct 24, 2024
Merged
Show file tree
Hide file tree
Changes from 115 commits
Commits
Show all changes
116 commits
Select commit Hold shift + click to select a range
58d868c
added stats extractor parent component
Apr 29, 2024
edfccc6
added stats extractor based on sigma clipping
Apr 29, 2024
d19168c
added cut of outliers
Apr 30, 2024
49f0f8a
update docs
May 2, 2024
9035fb4
formatted with black
May 2, 2024
8a0e89f
added pass for __call__ function
May 2, 2024
82b29e9
added unit tests
May 3, 2024
1805cb2
added changelog
May 3, 2024
a4cea6a
fix lint
May 3, 2024
16e5920
Small commit for prototyping
May 7, 2024
63d1774
Removed unneeded functions
May 10, 2024
5768ed7
Merge branch 'stats_extractor' of https://github.com/cta-observatory/…
May 23, 2024
e6ba177
I altered the class variables to th evariance statistics extractor
May 23, 2024
ddf342b
added a container for mean variance images and fixed docustring
May 24, 2024
0beb975
I changed the container type for the StarVarianceExtractor
May 27, 2024
b4df919
fix pylint
May 31, 2024
3483fd4
change __call__() to _extract()
May 31, 2024
5bbedda
minor renaming
May 31, 2024
14a0910
use pytest.fixture for Extractors
May 31, 2024
f126748
reduce duplicated code of the call function
May 31, 2024
0ec81b9
edit description of StatisticsContainer
TjarkMiener Jun 12, 2024
3910c22
added feature to shift the extraction sequence
TjarkMiener Jun 12, 2024
5c6595b
fix boundary case for the last chunk
TjarkMiener Jun 12, 2024
bc65984
I made prototypes for the CalibrationCalculators
Jun 12, 2024
d7a65a3
I made PSFModel a generic class
Jun 12, 2024
ba00bf3
fix tests
TjarkMiener Jun 12, 2024
d3aae32
fix ruff
TjarkMiener Jun 12, 2024
825b25b
I fixed some variable names
Jun 12, 2024
a5b878d
Added a method for calibrating variance images
Jun 13, 2024
e8ebdd8
Merge branch 'stats_extractor' into CalibrationCalculators
Jun 13, 2024
78481cc
Commit before push for tjark
Jun 14, 2024
41a4246
added StatisticsCalculator
TjarkMiener Jun 28, 2024
010aea6
make faulty_pixels_threshold and chunk_shift as traits
TjarkMiener Jul 5, 2024
26ff8a4
added stats extractor parent component
Apr 29, 2024
d142995
added stats extractor based on sigma clipping
Apr 29, 2024
0737b7f
added cut of outliers
Apr 30, 2024
225346b
update docs
May 2, 2024
74c72bb
formatted with black
May 2, 2024
faad542
added pass for __call__ function
May 2, 2024
9867431
Small commit for prototyping
May 7, 2024
3a85ffa
Removed unneeded functions
May 10, 2024
69ebd1c
added unit tests
May 3, 2024
96b2dbb
added changelog
May 3, 2024
a007773
fix lint
May 3, 2024
96fe563
I altered the class variables to th evariance statistics extractor
May 23, 2024
67d34c4
added a container for mean variance images and fixed docustring
May 24, 2024
f1429da
I changed the container type for the StarVarianceExtractor
May 27, 2024
c9ec02b
fix pylint
May 31, 2024
444e370
change __call__() to _extract()
May 31, 2024
1c4cadb
minor renaming
May 31, 2024
0ebb464
use pytest.fixture for Extractors
May 31, 2024
bbbd891
reduce duplicated code of the call function
May 31, 2024
5deef74
I made prototypes for the CalibrationCalculators
Jun 12, 2024
f09bbe1
I made PSFModel a generic class
Jun 12, 2024
b8f0a4f
I fixed some variable names
Jun 12, 2024
39fca21
Added a method for calibrating variance images
Jun 13, 2024
b114638
edit description of StatisticsContainer
TjarkMiener Jun 12, 2024
5d18f61
added feature to shift the extraction sequence
TjarkMiener Jun 12, 2024
4df8c94
fix boundary case for the last chunk
TjarkMiener Jun 12, 2024
86eb15c
fix tests
TjarkMiener Jun 12, 2024
6200653
fix ruff
TjarkMiener Jun 12, 2024
4f604c5
Commit before push for tjark
Jun 14, 2024
6dfce15
added StatisticsCalculator
TjarkMiener Jun 28, 2024
8e873d3
make faulty_pixels_threshold and chunk_shift as traits
TjarkMiener Jul 5, 2024
bf36b90
Merge branch 'CalibrationCalculators' of https://github.com/cta-obser…
Aug 7, 2024
f7d3223
solved merge conflicts
TjarkMiener Aug 7, 2024
91812da
I made prototypes for the CalibrationCalculators
Jun 12, 2024
326c202
I made PSFModel a generic class
Jun 12, 2024
fa029f7
I fixed some variable names
Jun 12, 2024
7c90e0f
Added a method for calibrating variance images
Jun 13, 2024
8c44a42
Commit before push for tjark
Jun 14, 2024
bff611a
added StatisticsCalculator
TjarkMiener Jun 28, 2024
d85dd50
make faulty_pixels_threshold and chunk_shift as traits
TjarkMiener Jul 5, 2024
c522434
adding interpolator and Merge branch 'CalibrationCalculators' of http…
Aug 8, 2024
9192790
Removed Pointing Calculator
Aug 20, 2024
aecc5dc
Removing PointingCalculator, PSF model and interpolators
Aug 20, 2024
e3113ab
implement the StatisticsCalculator
TjarkMiener Aug 23, 2024
b9dc929
removed the helper function to get the start and end slices
TjarkMiener Aug 24, 2024
ef920da
polish docstrings
TjarkMiener Aug 25, 2024
9bfc16a
further polishing of docstrings
TjarkMiener Aug 25, 2024
aee5c10
fix typo
TjarkMiener Aug 25, 2024
1484eca
move check of config settings before loading of data
TjarkMiener Aug 25, 2024
601cbb4
moved Interpolator outside
TjarkMiener Aug 25, 2024
0347153
removed Interpolator artifacts
TjarkMiener Aug 25, 2024
b2208d5
removed reading part with TableLoader
TjarkMiener Aug 25, 2024
61936e5
removed writing part
TjarkMiener Aug 25, 2024
52ecd8a
add unit tests for calculators
TjarkMiener Aug 26, 2024
4ffd39e
add unit tests for calculators
TjarkMiener Aug 26, 2024
c404a4a
split __call__ function into two function for the first and second pass
TjarkMiener Aug 26, 2024
a59664e
fix docs and add changelog
TjarkMiener Aug 26, 2024
0ed515d
removed check for any faulty chunk
TjarkMiener Aug 27, 2024
48c686b
removed base class CalibrationCalculator and only have one Statistics…
TjarkMiener Aug 27, 2024
3bae046
fix fstring in logging
TjarkMiener Aug 28, 2024
aa5666a
bug fix
TjarkMiener Aug 28, 2024
ef53ebb
removed print
TjarkMiener Aug 28, 2024
3ca7252
removed me from CODEOWNERS
TjarkMiener Sep 3, 2024
4af4e74
Merge branch 'main' into CalibrationCalculators
TjarkMiener Sep 23, 2024
1e684b1
renamed to PixelStatisticsCalculator
TjarkMiener Oct 8, 2024
0bd7702
remove option to pass stats aggregator in constructor
TjarkMiener Oct 15, 2024
83ee522
rename n_pix to n_pixels
TjarkMiener Oct 15, 2024
6a29808
factored out _find_outliers()
TjarkMiener Oct 15, 2024
55451d2
use fraction instead of percentage for faulty_pixels_threshold
TjarkMiener Oct 15, 2024
5b8ed0b
combine outlier mask for two gains in a more readable way
TjarkMiener Oct 15, 2024
6e0d00a
update tests to set everything from the config
TjarkMiener Oct 15, 2024
fea13eb
remove redundant import
TjarkMiener Oct 15, 2024
1a99eb4
use np.flatnonzero instead of np.where
TjarkMiener Oct 15, 2024
e8e7f37
polish docstrings
TjarkMiener Oct 15, 2024
bc4fac9
also store the outlier mask for each detector
TjarkMiener Oct 15, 2024
88977ed
fix ruff formatting
TjarkMiener Oct 15, 2024
9a0914d
validate outlier_detector_list trait
TjarkMiener Oct 16, 2024
f300846
rename faulty_pixels_threshold to faulty_pixels_fraction
TjarkMiener Oct 17, 2024
7cb7f5f
pass config into OutlierDetector component
TjarkMiener Oct 22, 2024
710e291
fix outlier tests
TjarkMiener Oct 22, 2024
4893529
remove copy() and pop() when iterating over the OutlierDetector dicts
TjarkMiener Oct 22, 2024
4a5f6e6
Break long line
maxnoe Oct 23, 2024
eeaaa05
correct docstring
TjarkMiener Oct 24, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 11 additions & 0 deletions docs/api-reference/monitoring/calculator.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
.. _calibration_calculator:

**********************
Calibration Calculator
**********************


Reference/API
=============

.. automodapi:: ctapipe.monitoring.calculator
3 changes: 2 additions & 1 deletion docs/api-reference/monitoring/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ Monitoring data are time-series used to monitor the status or quality of hardwar

This module provides some code to help to generate monitoring data from processed event data, particularly for the purposes of calibration and data quality assessment.

Currently, only code related to :ref:`stats_aggregator` and :ref:`outlier_detector` is implemented here.
Code related to :ref:`stats_aggregator`, :ref:`calibration_calculator`, and :ref:`outlier_detector` is implemented here.


Submodules
Expand All @@ -20,6 +20,7 @@ Submodules
:maxdepth: 1

aggregator
calculator
interpolation
outlier

Expand Down
1 change: 1 addition & 0 deletions docs/changes/2609.features.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Add calibration calculators which aggregates statistics, detects outliers, handles faulty data chunks.
1 change: 1 addition & 0 deletions src/ctapipe/calib/camera/calibrator.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
Definition of the `CameraCalibrator` class, providing all steps needed to apply
calibration and image extraction, as well as supporting algorithms.
"""

from functools import cache

import astropy.units as u
Expand Down
8 changes: 4 additions & 4 deletions src/ctapipe/monitoring/aggregator.py
Original file line number Diff line number Diff line change
Expand Up @@ -57,18 +57,18 @@ def __call__(
and call the relevant function of the particular aggregator to compute aggregated statistic values.
The chunks are generated in a way that ensures they do not overflow the bounds of the table.
- If ``chunk_shift`` is None, chunks will not overlap, but the last chunk is ensured to be
of size `chunk_size`, even if it means the last two chunks will overlap.
of size ``chunk_size``, even if it means the last two chunks will overlap.
- If ``chunk_shift`` is provided, it will determine the number of samples to shift between the start
of consecutive chunks resulting in an overlap of chunks. Chunks that overflows the bounds
of the table are not considered.

Parameters
----------
table : astropy.table.Table
table with images of shape (n_images, n_channels, n_pix)
and timestamps of shape (n_images, )
table with images of shape (n_images, n_channels, n_pixels), event IDs and
timestamps of shape (n_images, )
masked_pixels_of_sample : ndarray, optional
boolean array of masked pixels of shape (n_pix, ) that are not available for processing
boolean array of masked pixels of shape (n_channels, n_pixels) that are not available for processing
chunk_shift : int, optional
number of samples to shift between the start of consecutive chunks
col_name : string
Expand Down
337 changes: 337 additions & 0 deletions src/ctapipe/monitoring/calculator.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,337 @@
"""
Definition of the ``PixelStatisticsCalculator`` class, providing all steps needed to
calculate the montoring data for the camera calibration.
"""

import numpy as np
from astropy.table import Table, vstack

from ctapipe.core import TelescopeComponent
from ctapipe.core.traits import (
ComponentName,
Dict,
Float,
Int,
List,
TelescopeParameter,
TraitError,
)
from ctapipe.monitoring.aggregator import StatisticsAggregator
from ctapipe.monitoring.outlier import OutlierDetector

__all__ = [
"PixelStatisticsCalculator",
]


class PixelStatisticsCalculator(TelescopeComponent):
"""
Component to calculate statistics from calibration events.

The ``PixelStatisticsCalculator`` is responsible for calculating various statistics from
calibration events, such as pedestal and flat-field data. It aggregates statistics,
detects outliers, and handles faulty data periods.
This class holds two functions to conduct two different passes over the data with and without
overlapping aggregation chunks. The first pass is conducted with non-overlapping chunks,
while overlapping chunks can be set by the ``chunk_shift`` parameter for the second pass.
The second pass over the data is only conducted in regions of trouble with a high fraction
of faulty pixels exceeding the threshold ``faulty_pixels_fraction``.
"""

stats_aggregator_type = TelescopeParameter(
trait=ComponentName(
StatisticsAggregator, default_value="SigmaClippingAggregator"
),
default_value="SigmaClippingAggregator",
help="Name of the StatisticsAggregator subclass to be used.",
).tag(config=True)

outlier_detector_list = List(
maxnoe marked this conversation as resolved.
Show resolved Hide resolved
trait=Dict(),
maxnoe marked this conversation as resolved.
Show resolved Hide resolved
default_value=None,
allow_none=True,
help=(
"List of dicts containing the name of the OutlierDetector subclass to be used, "
TjarkMiener marked this conversation as resolved.
Show resolved Hide resolved
"the aggregated statistic value to which the detector should be applied, "
"and the validity range of the detector. "
"E.g. ``[{'apply_to': 'std', 'name': 'RangeOutlierDetector', 'validity_range': [2.0, 8.0]},]``."
),
).tag(config=True)

chunk_shift = Int(
default_value=None,
allow_none=True,
help=(
"Number of samples to shift the aggregation chunk for the calculation "
"of the statistical values. Only used in the second_pass(), since the "
"first_pass() is conducted with non-overlapping chunks (chunk_shift=None)."
),
).tag(config=True)

faulty_pixels_fraction = Float(
default_value=0.1,
allow_none=True,
help="Minimum fraction of faulty camera pixels to identify regions of trouble.",
).tag(config=True)

def __init__(
self,
subarray,
config=None,
parent=None,
**kwargs,
):
"""
Parameters
----------
subarray: ctapipe.instrument.SubarrayDescription
Description of the subarray. Provides information about the
camera which are useful in calibration. Also required for
configuring the TelescopeParameter traitlets.
config: traitlets.loader.Config
Configuration specified by config file or cmdline arguments.
Used to set traitlet values.
This is mutually exclusive with passing a ``parent``.
parent: ctapipe.core.Component or ctapipe.core.Tool
Parent of this component in the configuration hierarchy,
this is mutually exclusive with passing ``config``
"""
super().__init__(subarray=subarray, config=config, parent=parent, **kwargs)

# Initialize the instances of StatisticsAggregator
self.stats_aggregators = {}
for _, _, name in self.stats_aggregator_type:
self.stats_aggregators[name] = StatisticsAggregator.from_name(
name, subarray=self.subarray, parent=self
)

# Initialize the instances of OutlierDetector from the configuration
self.outlier_detectors, self.apply_to_list = [], []
if self.outlier_detector_list is not None:
for d, outlier_detector in enumerate(self.outlier_detector_list):
# Check if all required keys are present
missing_keys = {
"apply_to",
"name",
"config",
} - outlier_detector.keys()
if missing_keys:
raise TraitError(
f"Entry '{d}' in the ``outlier_detector_list`` trait"
f"is missing required key(s): {', '.join(missing_keys)}"
)
self.apply_to_list.append(outlier_detector["apply_to"])
self.outlier_detectors.append(
OutlierDetector.from_name(
outlier_detector["name"],
subarray=self.subarray,
parent=self,
**outlier_detector["config"],
)
)

def first_pass(
self,
table,
tel_id,
masked_pixels_of_sample=None,
col_name="image",
) -> Table:
"""
Calculate the monitoring data for a given set of events with non-overlapping aggregation chunks.

This method performs the first pass over the provided data table to calculate
various statistics for calibration purposes. The statistics are aggregated with
non-overlapping chunks (``chunk_shift`` set to None), and faulty pixels are detected
using a list of outlier detectors.


Parameters
----------
table : astropy.table.Table
DL1-like table with images of shape (n_images, n_channels, n_pixels), event IDs and
timestamps of shape (n_images, )
tel_id : int
Telescope ID for which the calibration is being performed
masked_pixels_of_sample : ndarray, optional
TjarkMiener marked this conversation as resolved.
Show resolved Hide resolved
Boolean array of masked pixels of shape (n_channels, n_pixels) that are not available for processing
col_name : str
Column name in the table from which the statistics will be aggregated

Returns
-------
astropy.table.Table
Table containing the aggregated statistics, their outlier masks, and the validity of the chunks
"""
# Get the aggregator
aggregator = self.stats_aggregators[self.stats_aggregator_type.tel[tel_id]]
# Pass through the whole provided dl1 table
aggregated_stats = aggregator(
table=table,
masked_pixels_of_sample=masked_pixels_of_sample,
col_name=col_name,
chunk_shift=None,
)
# Detect faulty pixels with multiple instances of ``OutlierDetector``
# and append the outlier masks to the aggregated statistics
self._find_and_append_outliers(aggregated_stats)
# Get valid chunks and add them to the aggregated statistics
aggregated_stats["is_valid"] = self._get_valid_chunks(
aggregated_stats["outlier_mask"]
)
return aggregated_stats

def second_pass(
self,
table,
valid_chunks,
tel_id,
masked_pixels_of_sample=None,
col_name="image",
) -> Table:
"""
Conduct a second pass over the data to refine the statistics in regions with a high percentage of faulty pixels.

This method performs a second pass over the data with a refined shift of the chunk in regions where a high percentage
of faulty pixels were detected during the first pass. Note: Multiple first passes of different calibration events are
performed which may lead to different identification of faulty chunks in rare cases. Therefore a joined list of faulty
chunks is recommended to be passed to the second pass(es) if those different passes use the same ``chunk_size``.

Parameters
----------
table : astropy.table.Table
DL1-like table with images of shape (n_images, n_channels, n_pixels), event IDs and timestamps of shape (n_images, ).
valid_chunks : ndarray
Boolean array indicating the validity of each chunk from the first pass.
Note: This boolean array can be a ``logical_and`` from multiple first passes of different calibration events.
tel_id : int
Telescope ID for which the calibration is being performed.
masked_pixels_of_sample : ndarray, optional
Boolean array of masked pixels of shape (n_channels, n_pixels) that are not available for processing.
col_name : str
Column name in the table from which the statistics will be aggregated.

Returns
-------
astropy.table.Table
Table containing the aggregated statistics after the second pass, their outlier masks, and the validity of the chunks.
"""
# Check if the chunk_shift is set for the second pass
if self.chunk_shift is None:
raise ValueError(
"chunk_shift must be set if second pass over the data is requested"
)
# Check if at least one chunk is faulty
if np.all(valid_chunks):
raise ValueError(
maxnoe marked this conversation as resolved.
Show resolved Hide resolved
"All chunks are valid. The second pass over the data is redundant."
)
# Get the aggregator
aggregator = self.stats_aggregators[self.stats_aggregator_type.tel[tel_id]]
# Conduct a second pass over the data
aggregated_stats_secondpass = []
faulty_chunks_indices = np.flatnonzero(~valid_chunks)
for index in faulty_chunks_indices:
# Log information of the faulty chunks
self.log.info(
"Faulty chunk detected in the first pass at index '%s'.", index
)
# Calculate the start of the slice depending on whether the previous chunk was faulty or not
slice_start = (
aggregator.chunk_size * index
if index - 1 in faulty_chunks_indices
else aggregator.chunk_size * (index - 1)
)
# Set the start of the slice to the first element of the dl1 table if out of bound
# and add one ``chunk_shift``.
slice_start = max(0, slice_start) + self.chunk_shift
# Set the end of the slice to the last element of the dl1 table if out of bound
# and subtract one ``chunk_shift``.
slice_end = min(len(table) - 1, aggregator.chunk_size * (index + 2)) - (
self.chunk_shift - 1
)
# Slice the dl1 table according to the previously calculated start and end.
table_sliced = table[slice_start:slice_end]
# Run the stats aggregator on the sliced dl1 table with a chunk_shift
# to sample the period of trouble (carflashes etc.) as effectively as possible.
# Checking for the length of the sliced table to be greater than the ``chunk_size``
# since it can be smaller if the last two chunks are faulty. Note: The two last chunks
# can be overlapping during the first pass, so we simply ignore them if there are faulty.
if len(table_sliced) > aggregator.chunk_size:
aggregated_stats_secondpass.append(
aggregator(
table=table_sliced,
masked_pixels_of_sample=masked_pixels_of_sample,
col_name=col_name,
chunk_shift=self.chunk_shift,
)
)
# Stack the aggregated statistics of each faulty chunk
aggregated_stats_secondpass = vstack(aggregated_stats_secondpass)
# Detect faulty pixels with multiple instances of OutlierDetector of the second pass
# and append the outlier mask to the aggregated statistics
self._find_and_append_outliers(aggregated_stats_secondpass)
aggregated_stats_secondpass["is_valid"] = self._get_valid_chunks(
aggregated_stats_secondpass["outlier_mask"]
)
return aggregated_stats_secondpass

def _find_and_append_outliers(self, aggregated_stats):
"""
Find outliers and append the masks in the aggregated statistics.

This method detects outliers in the aggregated statistics using the
outlier detectors defined in the configuration. Table containing the
aggregated statistics will be appended with the outlier masks for each
detector and a combined outlier mask.

Parameters
----------
aggregated_stats : astropy.table.Table
Table containing the aggregated statistics.

"""
outlier_mask = np.zeros_like(aggregated_stats["mean"], dtype=bool)
for d, (column_name, outlier_detector) in enumerate(
zip(self.apply_to_list, self.outlier_detectors)
):
aggregated_stats[f"outlier_mask_detector_{d}"] = outlier_detector(
aggregated_stats[column_name]
)
outlier_mask = np.logical_or(
outlier_mask,
aggregated_stats[f"outlier_mask_detector_{d}"],
)
aggregated_stats["outlier_mask"] = outlier_mask

def _get_valid_chunks(self, outlier_mask):
"""
Identify valid chunks based on the outlier mask.

This method processes the outlier mask to determine which chunks of data
are considered valid or faulty. A chunk is marked as faulty if the fraction
of outlier pixels exceeds a predefined threshold ``faulty_pixels_fraction``.

Parameters
----------
outlier_mask : numpy.ndarray
Boolean array indicating outlier pixels. The shape of the array should
match the shape of the aggregated statistics.

Returns
-------
numpy.ndarray
Boolean array where each element indicates whether the corresponding
chunk is valid (True) or faulty (False).
"""
# Check if the camera has two gain channels
if outlier_mask.shape[1] == 2:
# Combine the outlier mask of both gain channels
maxnoe marked this conversation as resolved.
Show resolved Hide resolved
outlier_mask = np.logical_or.reduce(outlier_mask, axis=1)
# Calculate the fraction of faulty pixels over the camera
faulty_pixels = (
np.count_nonzero(outlier_mask, axis=-1) / np.shape(outlier_mask)[-1]
)
# Check for valid chunks if the threshold is not exceeded
valid_chunks = faulty_pixels < self.faulty_pixels_fraction
return valid_chunks
Loading