A time clustering algorithm for image cleaning #2401

clara-escanuela · 2023-09-20T14:33:55Z

No description provided.

ctapipe/image/cleaning.py

maxnoe · 2023-09-20T15:15:52Z

ctapipe/image/cleaning.py

+
+    X = np.column_stack((time[precut_mask] / t_scale, pix_x, pix_y))
+
+    db = DBSCAN(eps=eps, min_samples=minpts).fit(X)


slightly shorter: labels = DBSCAN(...).fit_predict(X).

Shouldn't this also use sample_weights=image? Otherwise you are using image only for the initial cut

I tried sample_weight before. I used this weighting:
$\frac{1}{1+exp\left(-\frac{image+s}{d}\right)}$
It introduces two new parameters but I can add weights as an option

Why not simply weight with the image itself? I.e. cluster the photo electrons?

I have been thinking about this and I think we could introduce a fourth dimension in the cluster, which is 1/SNR, SNR is the signal-to-noise ratio, image/pedestal. Using just image does not work as the algorithm finds clusters with similar data points but if you plot 1/image, we find clusters:

It does not seem to make any difference though

maxnoe · 2023-09-22T12:52:13Z

Docs failure is fixed in main, please rebase

kosack · 2023-09-26T14:31:20Z

Not sure if it's interesting, but they just introduced an improved DBSCAN algorithm in scikit-learn

kosack · 2023-09-26T14:39:42Z

ctapipe/image/cleaning.py

+            n_noise=self.noise_cut.tel[tel_id],
+            d_scale=self.d_scale.tel[tel_id],
+            t_scale=self.t_scale.tel[tel_id],
+            pedestal=self.pedestal.tel[tel_id],


I think here you should really use the correct pedestal, not some hard-coded value. There is for example

ped = event.mon.tel[7].calibration.pedestal_per_sample

That is per channel, so you need to use the high/low gain switch mask. We probably should properly process simulated pedestals into DL1 information though, so you don't have to do this.

Maybe I should change the name but what I need is the variance of the reconstructed noise per pixel in units of pe. Is that available?

Yeah, i realized that just after posting... the pedestals themselves are not what you need. In the simulations currently we don't have the variances, but this is something we really need to fix. Either we should properly compute pedestals and not rely on the ones given in the sims (which are technically inputs, not outputs), or see if somewhere the variances are already included or ask for them to be included. That's not just or this PR, but for any "realistic" cleaning method, e.g. the Whipple-10m-style cleaning that is just tailcuts but with the thresholds expressed in pedestal variance units.

This should probably be a separate issue and PR. For now, what you can maybe do is just don't have these pedestal dispersions hard-coded in the class, but rather make them a TelescopeParameter that is configurable. That way we could adapt to multiple prods without having to edit the code. Once we have a "correct" way to read the pedestal dispersions for sims, we can then properly fix it.

You should definitely change the name, since "pedestal" is the mean value of the noise, not the dispersion. Here, you are assuming it's Gaussian, right (even though it's not)? And you are using it as a standard deviation, not variance, so maybe pedestal_std

it also occurs to me that this threshold could also just be a threshold in PE for now (like the tailcuts), which would work fine for simulations, where the pedestal variances don't vary across the FOV or in time. Then later we upgrade it to use pedestal variances once we have those properly implemented for simulations.

ctapipe/image/cleaning.py

clara-escanuela · 2024-01-15T16:49:49Z

This is the basic implementation of the method. I think we could finish it soon and then if we need to add some more details, which is possible, we could do so. I am just wondering about unit tests

codecov · 2024-01-17T11:00:18Z

Codecov Report

All modified and coverable lines are covered by tests ✅

❗ No coverage uploaded for pull request base (main@18832ea). Click here to learn what that means.

❗ Current head 1f94003 differs from pull request most recent head afc1704. Consider uploading reports for the commit afc1704 to get more accurate results

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #2401   +/-   ##
=======================================
  Coverage        ?   92.48%           
=======================================
  Files           ?      234           
  Lines           ?    20005           
  Branches        ?        0           
=======================================
  Hits            ?    18502           
  Misses          ?     1503           
  Partials        ?        0

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

ctapipe/image/cleaning.py

maxnoe · 2024-02-05T15:17:27Z

ctapipe/image/cleaning.py

+    """
+
+    space_scale_m = FloatTelescopeParameter(
+        default_value=0.25, help="Pixel space scaling parameter in m"


Can we determine the scale parameters from the information in CameraDescription?

I am not sure. That parameter needs optimization, it could 2x or 3x the pixel spacing of the camera and teh optimized number depends on the telescope

Tobychev · 2024-02-22T16:11:06Z

@kosack @maxnoe This PR seems to have picked up quite a few unrelated changes, presumably from an attempt to rebase on main, what is the preferred way to deal with this?

maxnoe · 2024-02-22T16:57:40Z

I am not sure what happened here.

@clara-escanuela What git commands did you run?

I see issues like this one from time to time but never could figure out what mistake leads to these.

The proper way of updating a branch is (assuming upstream points to cta-observatory/ctapipe and origin points to clara-escanuela/ctapipe)

$ git fetch upstream # get latest changes from main repo
$ git rebase upstream/main
# fix any conflicts that might occure by editing and then running git add / git rebase --continue

I think could locally salvage this branch here by cherry-picking the commits that seem to be originally from @clara-escanuela on this branch after resetting the branch hard to upstream/main.

$ git reset --hard upstream/main
$ git cherry-pick 37e5ea864 55cf8bafb 7e8daf8d4 895ff9774 a9f81e99d 357d02b51 722dfcf17 3c799aaee b9ab65e67 afc170480

clara-escanuela added algorithm cleaning labels Sep 20, 2023

maxnoe reviewed Sep 20, 2023

View reviewed changes

ctapipe/image/cleaning.py Show resolved Hide resolved

maxnoe reviewed Sep 20, 2023

View reviewed changes

ctapipe/image/cleaning.py Show resolved Hide resolved

maxnoe reviewed Sep 20, 2023

View reviewed changes

clara-escanuela force-pushed the time_clustering branch from be9949d to de7b550 Compare September 22, 2023 15:07

kosack reviewed Sep 26, 2023

View reviewed changes

kosack requested changes Sep 26, 2023

View reviewed changes

ctapipe/image/cleaning.py Outdated Show resolved Hide resolved

ctapipe/image/cleaning.py Outdated Show resolved Hide resolved

clara-escanuela requested review from maxnoe and kosack November 2, 2023 09:01

clara-escanuela force-pushed the time_clustering branch from 4856259 to 1f94003 Compare January 17, 2024 10:49

kosack previously approved these changes Jan 18, 2024

View reviewed changes

clara-escanuela requested a review from Tobychev January 31, 2024 16:27

Tobychev reviewed Feb 5, 2024

View reviewed changes

ctapipe/image/cleaning.py Outdated Show resolved Hide resolved

ctapipe/image/cleaning.py Outdated Show resolved Hide resolved

ctapipe/image/cleaning.py Show resolved Hide resolved

ctapipe/image/cleaning.py Show resolved Hide resolved

maxnoe reviewed Feb 5, 2024

View reviewed changes

ctapipe/image/cleaning.py Outdated Show resolved Hide resolved

maxnoe reviewed Feb 5, 2024

View reviewed changes

ctapipe/image/cleaning.py Show resolved Hide resolved

maxnoe reviewed Feb 5, 2024

View reviewed changes

clara-escanuela dismissed kosack’s stale review via 173c482 February 15, 2024 10:20

maxnoe and others added 8 commits February 15, 2024 11:33

Add more nitpick ignores to fix docs build after traitlets update

0d9eb7c

Add ctapipe-train-disp-reconstructor config to quickstart tool

bdf210e

Add changelog

02b3589

Add note about performance of provided ml configs

d06051e

Changed some numpy calls following the numpy 2.0 migration guide

56a22cd

Changelog

4fe8f7e

Fix import order

1541808

Fix import order

9ed2d29

Tobychev and others added 25 commits February 15, 2024 11:40

Trying new magic conditional, see if this works as intended.

c70d797

More tests of how to formulate conditional

3547162

Think I understand the problem now

ad2127c

Rebasing on main

8f7d639

Messing about with syntax

60c406e

Fix conflict with main

5c68368

Adding skip to more stages

87374fc

Changed the label triggering skipping tests

7638195

Think I understand the problem now

3a57132

Fix conflict with main

3aa8a73

Rebasing on main

065c3d2

Change label

f51466b

Update ci.yml

c743b04

Update ci.yml

5fd6952

Propose cep3: remove image parameters in camera frame

f83615e

added time clustering algortihm

37e5ea8

minpts is integer

55cf8ba

added unit tests and references

7e8daf8

added unit tests and references

895ff97

add docs

a9f81e9

style changes

357d02b

improved documentation

722dfcf

hard cut on pe

3c799aa

added more units tests

b9ab65e

updated time cleaning

afc1704

clara-escanuela force-pushed the time_clustering branch from 173c482 to afc1704 Compare February 15, 2024 10:43

clara-escanuela requested a review from HealthyPear as a code owner February 15, 2024 10:43

burmist-git mentioned this pull request Aug 22, 2024

Add DBScan-based trigger algorithm #2606

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

A time clustering algorithm for image cleaning #2401

A time clustering algorithm for image cleaning #2401

clara-escanuela commented Sep 20, 2023

maxnoe Sep 20, 2023

maxnoe Sep 20, 2023

clara-escanuela Sep 20, 2023

maxnoe Sep 22, 2023

clara-escanuela Sep 26, 2023

clara-escanuela Sep 26, 2023

maxnoe commented Sep 22, 2023

kosack commented Sep 26, 2023

kosack Sep 26, 2023

clara-escanuela Sep 27, 2023 •

edited

Loading

kosack Sep 27, 2023 •

edited

Loading

kosack Sep 27, 2023 •

edited

Loading

kosack Sep 28, 2023

clara-escanuela commented Jan 15, 2024

codecov bot commented Jan 17, 2024 •

edited

Loading

maxnoe Feb 5, 2024 •

edited

Loading

clara-escanuela Feb 15, 2024

Tobychev commented Feb 22, 2024

maxnoe commented Feb 22, 2024


		X = np.column_stack((time[precut_mask] / t_scale, pix_x, pix_y))

		db = DBSCAN(eps=eps, min_samples=minpts).fit(X)

A time clustering algorithm for image cleaning #2401

Are you sure you want to change the base?

A time clustering algorithm for image cleaning #2401

Conversation

clara-escanuela commented Sep 20, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

maxnoe commented Sep 22, 2023

kosack commented Sep 26, 2023

Choose a reason for hiding this comment

clara-escanuela Sep 27, 2023 • edited Loading

Choose a reason for hiding this comment

kosack Sep 27, 2023 • edited Loading

Choose a reason for hiding this comment

kosack Sep 27, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

clara-escanuela commented Jan 15, 2024

codecov bot commented Jan 17, 2024 • edited Loading

Codecov Report

maxnoe Feb 5, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Tobychev commented Feb 22, 2024

maxnoe commented Feb 22, 2024

clara-escanuela Sep 27, 2023 •

edited

Loading

kosack Sep 27, 2023 •

edited

Loading

kosack Sep 27, 2023 •

edited

Loading

codecov bot commented Jan 17, 2024 •

edited

Loading

maxnoe Feb 5, 2024 •

edited

Loading