From 76c9b5bd92b66b14eab2f5ec59501b526eba6434 Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Date: Wed, 24 Jan 2024 15:49:18 +0100 Subject: [PATCH 01/19] Update index.rst --- docs/lstchain_api/datachecks/index.rst | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/docs/lstchain_api/datachecks/index.rst b/docs/lstchain_api/datachecks/index.rst index 068b8150b6..0a3aed9516 100644 --- a/docs/lstchain_api/datachecks/index.rst +++ b/docs/lstchain_api/datachecks/index.rst @@ -9,7 +9,10 @@ Datachecks (`datachecks`) Introduction ============ -Module containing functions producing the LST datachecks. Currently reaching DL1 level. +Module containing functions producing the LST datachecks. Currently the checks are done at the DL1 level. + +Using the datacheck files for selecting good-quality data +========================================================= Reference/API ============= From 772a63a838cf822824886b2fbf059a9e28f6b22f Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Olaizola Date: Wed, 24 Jan 2024 20:15:48 +0100 Subject: [PATCH 02/19] Improved docs --- docs/lstchain_api/datachecks/index.rst | 33 ++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) diff --git a/docs/lstchain_api/datachecks/index.rst b/docs/lstchain_api/datachecks/index.rst index 0a3aed9516..30889aede9 100644 --- a/docs/lstchain_api/datachecks/index.rst +++ b/docs/lstchain_api/datachecks/index.rst @@ -10,6 +10,39 @@ Introduction ============ Module containing functions producing the LST datachecks. Currently the checks are done at the DL1 level. +The DL1 datacheck files are produced using the following scripts: + +* :py:obj:`lstchain.scripts.lstchain_check_dl1` + + This takes as input subrun-wise DL1 files, e.g.: + + .. code-block:: bash + + lstchain_check_dl1 --input-file dl1_LST-1.1.Run01881.0000.h5 --output-dir OUTPUT_DIR --omit-pdf + + + and produces a subrun-wise file, ``datacheck_dl1_LST-1.Run01881.0000.h5`` which contains many quantities that can be + used to judge the quality of the data (see class :py:obj:`lstchain.datachecks.containers.DL1DataCheckContainer`) + + +* :py:obj:`lstchain.scripts.lstchain_check_dl1` + + The same script is used (by providing as input the subrun-wise datacheck files produced above) to produce a run-wise + datacheck file. It also needs to know where the subrun-wise .fits files (produced in the R0 to DL1 analysis step) + containing the muon ring information are stored ("MUONS_DIR"): + + .. code-block:: bash + + lstchain_check_dl1 --input-file "datacheck_dl1_LST-1.Run01881.*.h5" --output-dir OUTPUT_DIR --muons-dir MUONS_DIR + + + The output is now a run-wise file, ``datacheck_dl1_LST-1.Run01881.h5`` which contains the information from the + subrun-wise files. It also produces a .pdf file ``datacheck_dl1_LST-1.Run01881.pdf`` with many plots of the quantities + stored in the DL1DataCheckContainer objects. 
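A quick way to see which quantities such a datacheck file actually holds is to list its HDF5 nodes. The sketch below is illustrative only (the file name is taken from the example above, and the exact group layout depends on the lstchain version):

.. code-block:: python

    import h5py

    # Print every group and dataset stored in a subrun-wise datacheck file
    with h5py.File("datacheck_dl1_LST-1.Run01881.0000.h5", "r") as f:
        f.visit(print)
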
+ +* b +* c + Using the datacheck files for selecting good-quality data ========================================================= From de4bd0017cebe9ac23d5bf40d122849908f1b01f Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Olaizola Date: Thu, 25 Jan 2024 20:24:42 +0100 Subject: [PATCH 03/19] Added more explanations --- docs/lstchain_api/datachecks/index.rst | 61 ++++++++++++++++++++------ 1 file changed, 47 insertions(+), 14 deletions(-) diff --git a/docs/lstchain_api/datachecks/index.rst b/docs/lstchain_api/datachecks/index.rst index 30889aede9..15e3690d2c 100644 --- a/docs/lstchain_api/datachecks/index.rst +++ b/docs/lstchain_api/datachecks/index.rst @@ -10,38 +10,71 @@ Introduction ============ Module containing functions producing the LST datachecks. Currently the checks are done at the DL1 level. -The DL1 datacheck files are produced using the following scripts: +The DL1 datacheck files are produced by running the following scripts sequentially: * :py:obj:`lstchain.scripts.lstchain_check_dl1` - This takes as input subrun-wise DL1 files, e.g.: + This takes as input a DL1 file (including DL1a information, i.e. camera images & times) from a data subrun, e.g.: .. code-block:: bash - lstchain_check_dl1 --input-file dl1_LST-1.1.Run01881.0000.h5 --output-dir OUTPUT_DIR --omit-pdf + lstchain_check_dl1 --input-file dl1_LST-1.1.Run14619.0000.h5 --output-dir OUTPUT_DIR --omit-pdf - and produces a subrun-wise file, ``datacheck_dl1_LST-1.Run01881.0000.h5`` which contains many quantities that can be - used to judge the quality of the data (see class :py:obj:`lstchain.datachecks.containers.DL1DataCheckContainer`) + The script produces a data check file for the subrun, ``datacheck_dl1_LST-1.Run14619.0000.h5`` which contains many + quantities that can be used to judge the quality of the data (see class :py:obj:`~lstchain.datachecks.containers.DL1DataCheckContainer`) +| * :py:obj:`lstchain.scripts.lstchain_check_dl1` - The same script is used (by providing as input the subrun-wise datacheck files produced above) to produce a run-wise - datacheck file. It also needs to know where the subrun-wise .fits files (produced in the R0 to DL1 analysis step) - containing the muon ring information are stored ("MUONS_DIR"): + The same script is run again, but now providing as input the subrun-wise datacheck files produced above (all those of + a given run must be provided). It also needs to know where the subrun-wise .fits files (produced in the R0 to DL1 + analysis step) which contain the muon ring information are stored ("MUONS_DIR"): .. code-block:: bash - lstchain_check_dl1 --input-file "datacheck_dl1_LST-1.Run01881.*.h5" --output-dir OUTPUT_DIR --muons-dir MUONS_DIR + lstchain_check_dl1 --input-file "datacheck_dl1_LST-1.Run14619.*.h5" --output-dir OUTPUT_DIR --muons-dir MUONS_DIR - The output is now a run-wise file, ``datacheck_dl1_LST-1.Run01881.h5`` which contains the information from the - subrun-wise files. It also produces a .pdf file ``datacheck_dl1_LST-1.Run01881.pdf`` with many plots of the quantities - stored in the DL1DataCheckContainer objects. + The output is now a run-wise file, ``datacheck_dl1_LST-1.Run14619.h5`` which contains all the information from the + subrun-wise files. It also produces a .pdf file ``datacheck_dl1_LST-1.Run14619.pdf`` with various plots of the + quantities stored in the DL1DataCheckContainer objects, plus others obtained from the muon ring analysis. Note that + the muon ring information is not copied to the run-wise datacheck files, it is just used for the plotting. 
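If the muon-ring quantities themselves are of interest, the subrun-wise .fits files can be opened directly with astropy. A minimal sketch follows; the file name pattern and the available columns are assumptions to be checked against the files produced by your lstchain version:

.. code-block:: python

    from astropy.table import Table

    # Inspect one subrun-wise muon file; column names vary between lstchain versions
    muons = Table.read("muons_LST-1.Run14619.0000.fits")
    print(muons.colnames)
    print(len(muons), "muon-ring candidates in this subrun")
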
+ +| + +* :py:obj:`lstchain.scripts.lstchain_longterm_dl1_check` + + This merges the run-wise datacheck files of (typically) one night, stored in INPUT_DIR, and produces a single .h5 file + as output (e.g. ``DL1_datacheck_20230920.h5``). The file contains a (run-wise) summarized version of the information in the + input files, including the muon ring .fits files. + + .. code-block:: bash + + lstchain_longterm_dl1_check --input-dir INPUT_DIR --muons-dir MUONS_DIR --output-file DL1_datacheck_20230920.h5 --batch + + It also creates an .html file (with the same name, except for the extension, as the .h5 file) which can be + opened with any web browser and which contains various interactive plots which allow to make a quick check of the data + of a night. + +| + +* :py:obj:`lstchain.scripts.lstchain_cherenkov_transparency` + + This script analyzes image intensity histograms in the run-wise datacheck files (which must be stored under the + standard location ``/fefs/aswg/data/real/DL1/YYYYMMDD/INPUT_DIR``, where INPUT_DIR is a path provided as a command-line + argument). + + .. code-block:: bash + + lstchain_cherenkov_transparency --update_datacheck_file DL1_datacheck_20230920.h5 --input_dir v0.10/tailcut84/datacheck + + The script updates the night-wise datacheck .h5 file ``DL1_datacheck_20230920.h5`` with a new table containing parameters + related to the image intensity spectra for cosmic ray events (i.e., a Cherenkov-transparency - like approach, see e.g. + https://arxiv.org/abs/1310.1639). + -* b -* c Using the datacheck files for selecting good-quality data From 586e52a9086767a6006c2e31c20714d5635e0d9c Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Olaizola Date: Fri, 26 Jan 2024 16:42:08 +0100 Subject: [PATCH 04/19] Improved docs --- docs/lstchain_api/datachecks/index.rst | 40 +++++++++++++++----------- 1 file changed, 24 insertions(+), 16 deletions(-) diff --git a/docs/lstchain_api/datachecks/index.rst b/docs/lstchain_api/datachecks/index.rst index 15e3690d2c..4cd4c18500 100644 --- a/docs/lstchain_api/datachecks/index.rst +++ b/docs/lstchain_api/datachecks/index.rst @@ -9,7 +9,12 @@ Datachecks (`datachecks`) Introduction ============ -Module containing functions producing the LST datachecks. Currently the checks are done at the DL1 level. +Module containing functions for checking the quality of the LST data. + +DL1 data checks +=============== + +Currently the checks are done at the DL1 level. The DL1 datacheck files are produced by running the following scripts sequentially: * :py:obj:`lstchain.scripts.lstchain_check_dl1` @@ -29,40 +34,43 @@ The DL1 datacheck files are produced by running the following scripts sequential * :py:obj:`lstchain.scripts.lstchain_check_dl1` The same script is run again, but now providing as input the subrun-wise datacheck files produced above (all those of - a given run must be provided). It also needs to know where the subrun-wise .fits files (produced in the R0 to DL1 - analysis step) which contain the muon ring information are stored ("MUONS_DIR"): + a given run must be provided). It also needs to know where the subrun-wise ``muons_LST-1.*.fits files`` (produced in + the R0 to DL1 analysis step) which contain the muon ring information are stored ("MUONS_DIR"): .. code-block:: bash lstchain_check_dl1 --input-file "datacheck_dl1_LST-1.Run14619.*.h5" --output-dir OUTPUT_DIR --muons-dir MUONS_DIR - The output is now a run-wise file, ``datacheck_dl1_LST-1.Run14619.h5`` which contains all the information from the - subrun-wise files. 
It also produces a .pdf file ``datacheck_dl1_LST-1.Run14619.pdf`` with various plots of the - quantities stored in the DL1DataCheckContainer objects, plus others obtained from the muon ring analysis. Note that - the muon ring information is not copied to the run-wise datacheck files, it is just used for the plotting. + The output is now a data check file for the whole run, ``datacheck_dl1_LST-1.Run14619.h5`` which contains all the + information from the subrun-wise files. It also produces a .pdf file ``datacheck_dl1_LST-1.Run14619.pdf`` with + various plots of the quantities stored in the DL1DataCheckContainer objects, plus others obtained from the muon ring + analysis. Note that the muon ring information is not propagated to the run-wise datacheck files, it is just used for + the plotting. | * :py:obj:`lstchain.scripts.lstchain_longterm_dl1_check` This merges the run-wise datacheck files of (typically) one night, stored in INPUT_DIR, and produces a single .h5 file - as output (e.g. ``DL1_datacheck_20230920.h5``). The file contains a (run-wise) summarized version of the information in the - input files, including the muon ring .fits files. + for the whole night as output (e.g. ``DL1_datacheck_20230920.h5``). The "longterm" in the script name is a bit of an + overstatement - in principle it can be run over data from many nights, but the output html (see below) becomes too + heavy and some of the interactive features work poorly. .. code-block:: bash lstchain_longterm_dl1_check --input-dir INPUT_DIR --muons-dir MUONS_DIR --output-file DL1_datacheck_20230920.h5 --batch - It also creates an .html file (with the same name, except for the extension, as the .h5 file) which can be - opened with any web browser and which contains various interactive plots which allow to make a quick check of the data - of a night. + The output .h5 file contains a (run-wise) summarized version of the information in the input files, including the muon + ring .fits files. It also creates an .html file (e.g. ``DL1_datacheck_20230920.html``) which can be opened with any + web browser and which contains various interactive plots which allow to make a quick check of the data of a night. See + an example `here `_ (password protected). | * :py:obj:`lstchain.scripts.lstchain_cherenkov_transparency` - This script analyzes image intensity histograms in the run-wise datacheck files (which must be stored under the + This script analyzes image intensity histograms in the **run-wise** datacheck files (which must be stored under the standard location ``/fefs/aswg/data/real/DL1/YYYYMMDD/INPUT_DIR``, where INPUT_DIR is a path provided as a command-line argument). @@ -70,9 +78,9 @@ The DL1 datacheck files are produced by running the following scripts sequential lstchain_cherenkov_transparency --update_datacheck_file DL1_datacheck_20230920.h5 --input_dir v0.10/tailcut84/datacheck - The script updates the night-wise datacheck .h5 file ``DL1_datacheck_20230920.h5`` with a new table containing parameters - related to the image intensity spectra for cosmic ray events (i.e., a Cherenkov-transparency - like approach, see e.g. - https://arxiv.org/abs/1310.1639). + The script updates the **night-wise** datacheck .h5 file ``DL1_datacheck_20230920.h5`` with a new table containing + parameters related to the image intensity spectra for cosmic ray events (i.e., a Cherenkov-transparency - like + approach, see e.g. https://arxiv.org/abs/1310.1639). 
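For orientation only, the idea behind such a "Cherenkov transparency"-like monitoring can be sketched with a toy intensity spectrum: fit a power law to the cosmic-ray image-intensity histogram well above the trigger threshold and track the fitted slope and the rate above a reference intensity from subrun to subrun. This is not the script's actual implementation, just an illustration with made-up numbers:

.. code-block:: python

    import numpy as np

    rng = np.random.default_rng(0)
    # Toy cosmic-ray image intensities roughly following a power law (in p.e.)
    intensity = 30. * (1. + rng.pareto(1.7, size=100_000))

    bin_edges = np.logspace(np.log10(30.), np.log10(3000.), 41)
    counts, _ = np.histogram(intensity, bins=bin_edges)
    centers = np.sqrt(bin_edges[:-1] * bin_edges[1:])
    widths = np.diff(bin_edges)

    # Power-law fit of dN/dI vs I, restricted to intensities well above threshold
    sel = (centers > 80.) & (counts > 0)
    slope, _ = np.polyfit(np.log10(centers[sel]),
                          np.log10(counts[sel] / widths[sel]), 1)

    elapsed_time_s = 19.0  # made-up effective time of a subrun
    rate_above_300 = counts[centers > 300.].sum() / elapsed_time_s
    print(f"slope: {slope:.2f}   rate(>300 p.e.): {rate_above_300:.1f} /s")
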
From e5ebb11f8b8589363ee3c775e967481008ebc915 Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Olaizola Date: Mon, 29 Jan 2024 15:50:37 +0100 Subject: [PATCH 05/19] Improvements in documentation Solved issue with lstchain_tune_nsb.py (gave error if 'image_modifier' key did not exist in input config file!) --- docs/lst_analysis_workflow.rst | 30 ++++++++++++++++++--- docs/lstchain_api/datachecks/index.rst | 36 +++++++++++++++----------- lstchain/scripts/lstchain_tune_nsb.py | 6 ++++- 3 files changed, 53 insertions(+), 19 deletions(-) diff --git a/docs/lst_analysis_workflow.rst b/docs/lst_analysis_workflow.rst index 13fd3484e9..61bf678c11 100644 --- a/docs/lst_analysis_workflow.rst +++ b/docs/lst_analysis_workflow.rst @@ -51,6 +51,17 @@ Files and configuration The DL1 files to use obviously depend on the source you want to analyse. Unless you have a good reason, the latest version of the DL1 files should be used. +Selection of DL1 files +---------------------- + +The selection of the DL1 files (run-wise, i.e. those produced by lstosa by merging the subrun-wise DL1 files) for a +specific analysis (i.e., a given source and time period) can be performed using the notebook +``cta_lstchain/notebooks/data_quality.ipynb``. The selection also takes into account the quality of the data, mostly in +terms of atmospheric conditions - evaluated using the rate of cosmic-ray showers as a proxy. Data taken under poor +conditions will not be included in the list of selected runs. Instructions and further details can be found inside the +notebook. + + RF models --------- @@ -65,11 +76,24 @@ The RF models are stored in the following directory: Tuning of DL1 files and RF models --------------------------------- -In case of high NSB in the data, it is possible to tune the DL1 files and the RF models to improve the performance of the analysis. +The default MC production is generated with a level of noise in the images which corresponds to the level of diffuse +night-sky background ("NSB") in a "dark" field of view (i.e. for observations with moon below the horizon, at not-too-low +galactic latitudes and not affected by other sources of noise, like the zodiacal light). In general, observations of +**extragalactic** sources in dark conditions can be properly analyzed with the default MC (i.e. with the standard RF models). + +The median of the standard deviation of the pixel charges recorded in interleaved pedestal events (in which a camera +image is recorded in absence of a physics trigger) is a good measure of the NSB level in a given data run. This is computed +by the data selection notebook ``cta_lstchain/notebooks/data_quality.ipynb`` (see above). For data with an NSB level +significantly higher than the "dark field" one, it is possible to tune (increase) the noise in the MC files, and produce +from them RF models (and "test MC" for computing instrument response functions) which improve the performance of the +analysis (relative to using the default, low-NSB MC). + This is done by changing the `config` file of the RF models and producing new DL1 files and training new RF models. -To produce a config tuned to the data you want to analyse, you may run ``lstchain_tune_nsb`` function that will produce a ``tuned_nsb_config.json`` file. +To produce a config tuned to the data you want to analyse, you have to run the script +``cta_lstchain/scripts/lstchain_tune_nsb.py`` (link: :py:obj:`~lstchain.scripts.lstchain_tune_nsb`) that will produce a +``tuned_nsb_config.json`` file. 
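It can be useful to check what the script actually wrote before going further. A minimal sketch (the ``image_modifier`` section is where ``lstchain_tune_nsb`` stores the NSB-tuning parameters, e.g. ``extra_noise_in_dim_pixels``; the file name is the one produced above):

.. code-block:: python

    import json

    # Inspect the noise-tuning settings written by lstchain_tune_nsb
    with open("tuned_nsb_config.json") as f:
        cfg = json.load(f)

    print(json.dumps(cfg.get("image_modifier", {}), indent=2))
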
-To request a new production of RF models, you can open a pull-request on the lstmcpipe repository, producing a complete MC config, using: +To request a **new production of RF models**, you can open a pull-request on the lstmcpipe repository, producing a complete MC config, using: .. code-block:: diff --git a/docs/lstchain_api/datachecks/index.rst b/docs/lstchain_api/datachecks/index.rst index 4cd4c18500..e5ef12e683 100644 --- a/docs/lstchain_api/datachecks/index.rst +++ b/docs/lstchain_api/datachecks/index.rst @@ -33,8 +33,8 @@ The DL1 datacheck files are produced by running the following scripts sequential * :py:obj:`lstchain.scripts.lstchain_check_dl1` - The same script is run again, but now providing as input the subrun-wise datacheck files produced above (all those of - a given run must be provided). It also needs to know where the subrun-wise ``muons_LST-1.*.fits files`` (produced in + The same script is run again, but now providing as input the **subrun-wise datacheck files** produced earlier (all subruns + of a given run must be provided). It also needs to know where the subrun-wise ``muons_LST-1.*.fits files`` (produced in the R0 to DL1 analysis step) which contain the muon ring information are stored ("MUONS_DIR"): .. code-block:: bash @@ -43,7 +43,7 @@ The DL1 datacheck files are produced by running the following scripts sequential The output is now a data check file for the whole run, ``datacheck_dl1_LST-1.Run14619.h5`` which contains all the - information from the subrun-wise files. It also produces a .pdf file ``datacheck_dl1_LST-1.Run14619.pdf`` with + information from the subrun-wise files. The script also produces a .pdf file ``datacheck_dl1_LST-1.Run14619.pdf`` with various plots of the quantities stored in the DL1DataCheckContainer objects, plus others obtained from the muon ring analysis. Note that the muon ring information is not propagated to the run-wise datacheck files, it is just used for the plotting. @@ -52,10 +52,10 @@ The DL1 datacheck files are produced by running the following scripts sequential * :py:obj:`lstchain.scripts.lstchain_longterm_dl1_check` - This merges the run-wise datacheck files of (typically) one night, stored in INPUT_DIR, and produces a single .h5 file - for the whole night as output (e.g. ``DL1_datacheck_20230920.h5``). The "longterm" in the script name is a bit of an - overstatement - in principle it can be run over data from many nights, but the output html (see below) becomes too - heavy and some of the interactive features work poorly. + This script merges the run-wise datacheck files of (typically) one night, stored in INPUT_DIR, and produces **a + single .h5 file for the whole night** as output (e.g. ``DL1_datacheck_20230920.h5``). The "longterm" in the script + name is a bit of an overstatement - in principle it can be run over data from many nights, but the output html (see + below) becomes too heavy and some of the interactive features work poorly. .. code-block:: bash @@ -64,23 +64,23 @@ The DL1 datacheck files are produced by running the following scripts sequential The output .h5 file contains a (run-wise) summarized version of the information in the input files, including the muon ring .fits files. It also creates an .html file (e.g. ``DL1_datacheck_20230920.html``) which can be opened with any web browser and which contains various interactive plots which allow to make a quick check of the data of a night. See - an example `here `_ (password protected). + an example of the .html file `here `_ + (password protected). 
| * :py:obj:`lstchain.scripts.lstchain_cherenkov_transparency` - This script analyzes image intensity histograms in the **run-wise** datacheck files (which must be stored under the - standard location ``/fefs/aswg/data/real/DL1/YYYYMMDD/INPUT_DIR``, where INPUT_DIR is a path provided as a command-line - argument). + This script analyzes the image intensity histograms (one per subrun) stored in the **run-wise** datacheck files (which + must exist in INPUT_DIR) .. code-block:: bash - lstchain_cherenkov_transparency --update_datacheck_file DL1_datacheck_20230920.h5 --input_dir v0.10/tailcut84/datacheck + lstchain_cherenkov_transparency --update-datacheck-file DL1_datacheck_20230920.h5 --input-dir INPUT_DIR - The script updates the **night-wise** datacheck .h5 file ``DL1_datacheck_20230920.h5`` with a new table containing - parameters related to the image intensity spectra for cosmic ray events (i.e., a Cherenkov-transparency - like - approach, see e.g. https://arxiv.org/abs/1310.1639). + The script updates the **night-wise** datacheck .h5 file ``DL1_datacheck_20230920.h5`` with a new table (with one entry + per subrun) containing parameters related to the image intensity spectra for cosmic ray events (i.e., a + Cherenkov-transparency - like approach, see e.g. https://arxiv.org/abs/1310.1639). @@ -88,6 +88,12 @@ The DL1 datacheck files are produced by running the following scripts sequential Using the datacheck files for selecting good-quality data ========================================================= +The night-wise datacheck .h5 files, ``DL1_datacheck_YYYYMMDD.h5`` can be used to select a subsample of good quality data +from a large sample of observations. The files are relatively light, 6 MB per night in average. A large sample of them +can be processed with the notebook ``cta_lstchain/notebooks/data_quality.ipynb`` (instructions can be found inside the +notebook) + + Reference/API ============= diff --git a/lstchain/scripts/lstchain_tune_nsb.py b/lstchain/scripts/lstchain_tune_nsb.py index 8ab069d870..92cd8d230e 100644 --- a/lstchain/scripts/lstchain_tune_nsb.py +++ b/lstchain/scripts/lstchain_tune_nsb.py @@ -93,7 +93,11 @@ def main(): if args.output_file: cfg = read_configuration_file(args.config) - cfg['image_modifier'].update(dict_nsb) + if 'image_modifier' in cfg: + cfg['image_modifier'].update(dict_nsb) + else: + cfg['image_modifier'] = dict_nsb + dump_config(cfg, args.output_file, overwrite=args.overwrite) From 29428a6426f3764859cd600e53efda04c5b35b30 Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Olaizola Date: Mon, 29 Jan 2024 16:31:57 +0100 Subject: [PATCH 06/19] Improved description of NSB tuning procedure --- docs/lst_analysis_workflow.rst | 45 ++++++++++++++++++++++++---------- 1 file changed, 32 insertions(+), 13 deletions(-) diff --git a/docs/lst_analysis_workflow.rst b/docs/lst_analysis_workflow.rst index 61bf678c11..dd17243a7c 100644 --- a/docs/lst_analysis_workflow.rst +++ b/docs/lst_analysis_workflow.rst @@ -73,8 +73,8 @@ The RF models are stored in the following directory: ``/fefs/aswg/data/models/...`` -Tuning of DL1 files and RF models ---------------------------------- +Tuning of MC DL1 files and RF models +------------------------------------ The default MC production is generated with a level of noise in the images which corresponds to the level of diffuse night-sky background ("NSB") in a "dark" field of view (i.e. 
for observations with moon below the horizon, at not-too-low @@ -88,24 +88,43 @@ significantly higher than the "dark field" one, it is possible to tune (increase from them RF models (and "test MC" for computing instrument response functions) which improve the performance of the analysis (relative to using the default, low-NSB MC). -This is done by changing the `config` file of the RF models and producing new DL1 files and training new RF models. -To produce a config tuned to the data you want to analyse, you have to run the script -``cta_lstchain/scripts/lstchain_tune_nsb.py`` (link: :py:obj:`~lstchain.scripts.lstchain_tune_nsb`) that will produce a -``tuned_nsb_config.json`` file. +This is done by changing the configuration file for the MC processing, producing new DL1(b) files, and training new RF models. +To produce a config tuned to the data you want to analyse, you first have to obtain the standard analysis configuration +file (for MC) for the desired version of lstchain (= the version with which the real DL1 files you will use were produced). +This can be done with the script :py:obj:`~lstchain.scripts.lstchain_dump_config`: -To request a **new production of RF models**, you can open a pull-request on the lstmcpipe repository, producing a complete MC config, using: +.. code-block:: + + lstchain-dump-config --mc --output-file standard_lstchain_config.json + +Now you have to update the file with the parameters needed to increase the NSB level. For this you need a simtel.gz MC +file from the desired production (any will do, it can be either a gamma or a proton file), and a "typical" subrun DL1 file +from your **real data** sample. "Typical" means one in which the NSB level is close to the median value for the sample +to be analyzed. The data selection notebook ``cta_lstchain/notebooks/data_quality.ipynb`` (see above) provides a list of +a few such subruns for your selected sample - you can use any of them. To update the config file you have to use the +script :py:obj:`~lstchain.scripts.lstchain_tune_nsb` , e.g. : .. code-block:: - lstchain-dump-config --mc --update-with tuned_nsb_config.json --output-file PATH_TO_OUTPUT_FILE + lstchain_tune_nsb.py --config standard_lstchain_config.json \ + --input-mc .../simtel_corsika_theta_6.000_az_180.000_run10.simtel.gz \ + --input-data .../dl1_LST-1.Run10032.0069.h5 \ + -o tuned_nsb_lstchain_config.json + +To request a **new production of RF models**, you can open a pull-request on the lstmcpipe repository, providing +the .json configuration file produced following the steps above. + + +Keeping track of lstchain configurations +---------------------------------------- +The lstchain configuration file used to process the simulations and produce the RF models of a given MC production is +provided in the lstmcpipe repository, as well as in the models directory. -lstchain config ---------------- +It is important that the software version, and the configuration used for processing real data and MC are the same. For a +given lstchain version, this should be guaranteed by following the procedure above which makes use of +:py:obj:`~lstchain.scripts.lstchain_dump_config`. -The lstchain config used to produce the RF models of a production is provided in the lstmcpipe repository, as well as in the models directory. -It is a good idea to use the same config for the data analysis. -You can also produce a config using `lstchain-dump-config`. 
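For reference, the NSB proxy mentioned above (the median standard deviation of the pixel charge in interleaved pedestal events) can be computed directly from a real-data DL1 file, using the same monitoring tables that ``calculate_noise_parameters`` in ``lstchain/image/modifier.py`` reads. This is a simplified sketch (bad-pixel masking is omitted; the file name is the one used in the example above):

.. code-block:: python

    import numpy as np
    from ctapipe.io import read_table

    dl1_file = "dl1_LST-1.Run10032.0069.h5"

    calib = read_table(dl1_file, '/dl1/event/telescope/monitoring/calibration')
    pedestal = read_table(dl1_file, '/dl1/event/telescope/monitoring/pedestal')

    # High-gain conversion factors and pedestal charge std devs; entry 0 holds
    # the original calibration run, entries 1: come from interleaved events
    dc_to_pe = calib['dc_to_pe'][:, 0, :]
    ped_std = pedestal['charge_std'][1:, 0, :]
    calibration_id = pedestal['calibration_id'][1:]

    ped_std_pe = np.array([std * dc_to_pe[cid]
                           for std, cid in zip(ped_std, calibration_id)])
    ped_std_pe = ped_std_pe.mean(axis=0)  # average over interleaved calibrations

    print(f"Median pedestal charge std dev: {np.nanmedian(ped_std_pe):.3f} p.e.")
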
DL3/IRF config files -------------------- From 37e03cf2f48c3491d1c4628ee5ada1c136f127aa Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Olaizola Date: Mon, 29 Jan 2024 16:53:32 +0100 Subject: [PATCH 07/19] Remove link to not-yet existing script so that docs compilation works --- docs/lstchain_api/datachecks/index.rst | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/docs/lstchain_api/datachecks/index.rst b/docs/lstchain_api/datachecks/index.rst index e5ef12e683..31e17d9ba6 100644 --- a/docs/lstchain_api/datachecks/index.rst +++ b/docs/lstchain_api/datachecks/index.rst @@ -69,7 +69,10 @@ The DL1 datacheck files are produced by running the following scripts sequential | -* :py:obj:`lstchain.scripts.lstchain_cherenkov_transparency` +.. Heading below be replaced by this once script is merged! :py:obj:`lstchain.scripts.lstchain_cherenkov_transparency` + +* ``lstchain_cherenkov_transparency`` + This script analyzes the image intensity histograms (one per subrun) stored in the **run-wise** datacheck files (which must exist in INPUT_DIR) From db73e8b73144cf4086bbb1ee6be3c20904d0c892 Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Olaizola Date: Mon, 29 Jan 2024 20:47:13 +0100 Subject: [PATCH 08/19] Compute and show median of FF pixel charge (for crosscheck) --- lstchain/image/modifier.py | 23 +++++++++++++++++++---- 1 file changed, 19 insertions(+), 4 deletions(-) diff --git a/lstchain/image/modifier.py b/lstchain/image/modifier.py index 59b0622933..0fa2f2d9dc 100644 --- a/lstchain/image/modifier.py +++ b/lstchain/image/modifier.py @@ -214,6 +214,18 @@ def calculate_noise_parameters(simtel_filename, data_dl1_filename, # HG adc to pe conversion factors from interleaved calibrations: data_HG_dc_to_pe = data_dl1_calibration['dc_to_pe'][:, 0, :] + + # Mean HG charge in interleaved FF events, to spot possible issues: + data_HG_FF_mean = data_dl1_flatfield['charge_mean'][1:, 0, :] + dummy = [] + # indices which connect each FF calculation to a given calibration: + calibration_id = data_dl1_flatfield['calibration_id'][1:] + for i, x in enumerate(data_HG_FF_mean[:, ]): + dummy.append(x * data_HG_dc_to_pe[calibration_id[i],]) + dummy = np.array(dummy) + # Average for all interleaved calibrations (in case there are more than one) + data_HG_FF_mean_pe = np.mean(dummy, axis=0) # one value per pixel + # Pixel-wise pedestal standard deviation (for an unbiased extractor), # in adc counts: data_HG_ped_std = data_dl1_pedestal['charge_std'][1:, 0, :] @@ -224,7 +236,6 @@ def calculate_noise_parameters(simtel_filename, data_dl1_filename, for i, x in enumerate(data_HG_ped_std[:, ]): dummy.append(x * data_HG_dc_to_pe[calibration_id[i],]) dummy = np.array(dummy) - # Average for all interleaved calibrations (in case there are more than one) data_HG_ped_std_pe = np.mean(dummy, axis=0) # one value per pixel @@ -232,12 +243,16 @@ def calculate_noise_parameters(simtel_filename, data_dl1_filename, # the average diffuse NSB across the camera data_median_std_ped_pe = np.nanmedian(data_HG_ped_std_pe) data_std_std_ped_pe = np.nanstd(data_HG_ped_std_pe) - log.info(f'Real data: median across camera of good pixels\' pedestal std ' + log.info(f'Real data:') + log.info(f' Median of FF pixel charge: ' + f'{np.nanmedian(data_HG_FF_mean_pe)} p.e.') + log.info(f' Median across camera of good pixels\' pedestal std ' f'{data_median_std_ped_pe:.3f} p.e.') brightness_limit = data_median_std_ped_pe + 3 * data_std_std_ped_pe too_bright_pixels = (data_HG_ped_std_pe > brightness_limit) - log.info(f'Number of pixels beyond 3 
std dev of median: ' - f'{too_bright_pixels.sum()}, (above {brightness_limit:.2f} p.e.)') + log.info(f' Number of pixels beyond 3 std dev of median: ' + f' {too_bright_pixels.sum()}, (above {brightness_limit:.2f} ' + f'p.e.)') ped_mask = data_dl1_parameters['event_type'] == 2 # The charges in the images below are obtained with the extractor for From 9e95c436ffe44de64bce06fc1b1a57a5afda4878 Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Olaizola Date: Mon, 29 Jan 2024 20:49:23 +0100 Subject: [PATCH 09/19] Added forgotten read_table --- lstchain/image/modifier.py | 2 ++ 1 file changed, 2 insertions(+) diff --git a/lstchain/image/modifier.py b/lstchain/image/modifier.py index 0fa2f2d9dc..d53a6ec910 100644 --- a/lstchain/image/modifier.py +++ b/lstchain/image/modifier.py @@ -194,6 +194,8 @@ def calculate_noise_parameters(simtel_filename, data_dl1_filename, '/dl1/event/telescope/monitoring/calibration') data_dl1_pedestal = read_table(data_dl1_filename, '/dl1/event/telescope/monitoring/pedestal') + data_dl1_flatfield = read_table(data_dl1_filename, + '/dl1/event/telescope/monitoring/flatfield') data_dl1_parameters = read_table(data_dl1_filename, '/dl1/event/telescope/parameters/LST_LSTCam') data_dl1_image = read_table(data_dl1_filename, From d993fce153b983b4960075190130e9e0e04d3502 Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Olaizola Date: Mon, 29 Jan 2024 20:53:29 +0100 Subject: [PATCH 10/19] good pixels filter for FF median --- lstchain/image/modifier.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lstchain/image/modifier.py b/lstchain/image/modifier.py index d53a6ec910..34b1dee4a6 100644 --- a/lstchain/image/modifier.py +++ b/lstchain/image/modifier.py @@ -247,7 +247,7 @@ def calculate_noise_parameters(simtel_filename, data_dl1_filename, data_std_std_ped_pe = np.nanstd(data_HG_ped_std_pe) log.info(f'Real data:') log.info(f' Median of FF pixel charge: ' - f'{np.nanmedian(data_HG_FF_mean_pe)} p.e.') + f'{np.nanmedian(data_HG_FF_mean_pe[good_pixels]):.3f} p.e.') log.info(f' Median across camera of good pixels\' pedestal std ' f'{data_median_std_ped_pe:.3f} p.e.') brightness_limit = data_median_std_ped_pe + 3 * data_std_std_ped_pe From 525661882bbd1012c7422bd637540ba2f55b8541 Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Olaizola Date: Tue, 30 Jan 2024 00:22:04 +0100 Subject: [PATCH 11/19] Use good_pixels for printed average quantities --- lstchain/image/modifier.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lstchain/image/modifier.py b/lstchain/image/modifier.py index 34b1dee4a6..9ba2ad79ce 100644 --- a/lstchain/image/modifier.py +++ b/lstchain/image/modifier.py @@ -243,8 +243,8 @@ def calculate_noise_parameters(simtel_filename, data_dl1_filename, # Identify noisy pixels, likely containing stars - we want to adjust MC to # the average diffuse NSB across the camera - data_median_std_ped_pe = np.nanmedian(data_HG_ped_std_pe) - data_std_std_ped_pe = np.nanstd(data_HG_ped_std_pe) + data_median_std_ped_pe = np.nanmedian(data_HG_ped_std_pe[good_pixels]) + data_std_std_ped_pe = np.nanstd(data_HG_ped_std_pe[good_pixels]) log.info(f'Real data:') log.info(f' Median of FF pixel charge: ' f'{np.nanmedian(data_HG_FF_mean_pe[good_pixels]):.3f} p.e.') From 089994fd53f83324e9e806bf87ffa0c3d83b1b3c Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Olaizola Date: Tue, 30 Jan 2024 00:24:46 +0100 Subject: [PATCH 12/19] Print number of bad pixels --- lstchain/image/modifier.py | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git 
a/lstchain/image/modifier.py b/lstchain/image/modifier.py index 9ba2ad79ce..758bf40a2e 100644 --- a/lstchain/image/modifier.py +++ b/lstchain/image/modifier.py @@ -245,7 +245,8 @@ def calculate_noise_parameters(simtel_filename, data_dl1_filename, # the average diffuse NSB across the camera data_median_std_ped_pe = np.nanmedian(data_HG_ped_std_pe[good_pixels]) data_std_std_ped_pe = np.nanstd(data_HG_ped_std_pe[good_pixels]) - log.info(f'Real data:') + log.info(f'\nReal data:') + log.info(f' Number of bad pixels (from calibration): {bad_pixels.sum()}') log.info(f' Median of FF pixel charge: ' f'{np.nanmedian(data_HG_FF_mean_pe[good_pixels]):.3f} p.e.') log.info(f' Median across camera of good pixels\' pedestal std ' From 4e226a71a0ef7534c82cb93dfdbf65567877de9c Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Olaizola Date: Tue, 30 Jan 2024 00:34:39 +0100 Subject: [PATCH 13/19] Check array dimension to avoid occasional error --- lstchain/image/modifier.py | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/lstchain/image/modifier.py b/lstchain/image/modifier.py index 758bf40a2e..0accb78074 100644 --- a/lstchain/image/modifier.py +++ b/lstchain/image/modifier.py @@ -205,8 +205,9 @@ def calculate_noise_parameters(simtel_filename, data_dl1_filename, # Locate pixels with HG declared unusable either in original calibration or # in interleaved events: bad_pixels = unusable[0][0] # original calibration - for tf in unusable[1:][0]: # calibrations with interleaveds - bad_pixels = np.logical_or(bad_pixels, tf) + if unusable.shape[0] > 1: + for tf in unusable[1:][0]: # calibrations with interleaveds + bad_pixels = np.logical_or(bad_pixels, tf) good_pixels = ~bad_pixels # First index: 1,2,... = values from interleaveds (0 is for original From 83d6d4220e7fffa415dbf21592ad5e2b4f32fbb1 Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Olaizola Date: Tue, 30 Jan 2024 00:56:30 +0100 Subject: [PATCH 14/19] Check existence of interleaved events info in monitoring table --- lstchain/image/modifier.py | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/lstchain/image/modifier.py b/lstchain/image/modifier.py index 0accb78074..e56be69b80 100644 --- a/lstchain/image/modifier.py +++ b/lstchain/image/modifier.py @@ -218,11 +218,22 @@ def calculate_noise_parameters(simtel_filename, data_dl1_filename, # HG adc to pe conversion factors from interleaved calibrations: data_HG_dc_to_pe = data_dl1_calibration['dc_to_pe'][:, 0, :] + if data_dl1_flatfield['charge_mean'].shape[0] < 2: + logging.error('Could not find interleaved FF calibrations in ' + 'monitoring table!') + return np.nan, np.nan, np.nan + + if data_dl1_pedestal['charge_std'].shape[0] < 2 : + logging.error('Could not find interleaved pedestal info in ' + 'monitoring table!') + return np.nan, np.nan, np.nan + # Mean HG charge in interleaved FF events, to spot possible issues: data_HG_FF_mean = data_dl1_flatfield['charge_mean'][1:, 0, :] dummy = [] # indices which connect each FF calculation to a given calibration: calibration_id = data_dl1_flatfield['calibration_id'][1:] + for i, x in enumerate(data_HG_FF_mean[:, ]): dummy.append(x * data_HG_dc_to_pe[calibration_id[i],]) dummy = np.array(dummy) From 8163ea5b31afd3e09ac2bf23295e768c1c4ff09f Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Olaizola Date: Tue, 30 Jan 2024 00:59:17 +0100 Subject: [PATCH 15/19] Exit in case of problems with monitoring table --- lstchain/image/modifier.py | 4 ++-- lstchain/scripts/lstchain_tune_nsb.py | 3 +++ 2 files changed, 5 insertions(+), 2 deletions(-) 
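With these guards in place, a caller can simply test the first returned value. A sketch of the intended calling pattern (file names are placeholders), essentially what ``lstchain_tune_nsb`` does in the patches below — first with ``np.isnan``, later simplified to an ``is None`` check:

.. code-block:: python

    import logging
    import sys

    from lstchain.image.modifier import calculate_noise_parameters

    a, b, c = calculate_noise_parameters(
        "simtel_corsika_theta_6.000_az_180.000_run10.simtel.gz",  # MC file
        "dl1_LST-1.Run10032.0069.h5",                             # real-data DL1
        "standard_lstchain_config.json")

    if a is None:  # monitoring tables lacked interleaved pedestal/FF info
        logging.error('Could not compute NSB tuning parameters. Exiting!')
        sys.exit(1)
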
diff --git a/lstchain/image/modifier.py b/lstchain/image/modifier.py index e56be69b80..7dc40efed2 100644 --- a/lstchain/image/modifier.py +++ b/lstchain/image/modifier.py @@ -219,12 +219,12 @@ def calculate_noise_parameters(simtel_filename, data_dl1_filename, data_HG_dc_to_pe = data_dl1_calibration['dc_to_pe'][:, 0, :] if data_dl1_flatfield['charge_mean'].shape[0] < 2: - logging.error('Could not find interleaved FF calibrations in ' + logging.error('\nCould not find interleaved FF calibrations in ' 'monitoring table!') return np.nan, np.nan, np.nan if data_dl1_pedestal['charge_std'].shape[0] < 2 : - logging.error('Could not find interleaved pedestal info in ' + logging.error('\nCould not find interleaved pedestal info in ' 'monitoring table!') return np.nan, np.nan, np.nan diff --git a/lstchain/scripts/lstchain_tune_nsb.py b/lstchain/scripts/lstchain_tune_nsb.py index 92cd8d230e..17669ac9e8 100644 --- a/lstchain/scripts/lstchain_tune_nsb.py +++ b/lstchain/scripts/lstchain_tune_nsb.py @@ -79,6 +79,9 @@ def main(): a, b, c = calculate_noise_parameters(args.input_mc, args.input_data, args.config) + if np.isnan(a): + logging.error('Could not compute NSB tuning parameters. Exiting!') + sys.exit(1) dict_nsb = {"increase_nsb": True, "extra_noise_in_dim_pixels": round(a, 3), From af1f458612291e81f0b05b0b6b9c563b82bb4b5f Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Olaizola Date: Tue, 30 Jan 2024 01:01:15 +0100 Subject: [PATCH 16/19] Return None's instead of nans --- lstchain/image/modifier.py | 4 ++-- lstchain/scripts/lstchain_tune_nsb.py | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/lstchain/image/modifier.py b/lstchain/image/modifier.py index 7dc40efed2..e73f4e358b 100644 --- a/lstchain/image/modifier.py +++ b/lstchain/image/modifier.py @@ -221,12 +221,12 @@ def calculate_noise_parameters(simtel_filename, data_dl1_filename, if data_dl1_flatfield['charge_mean'].shape[0] < 2: logging.error('\nCould not find interleaved FF calibrations in ' 'monitoring table!') - return np.nan, np.nan, np.nan + return None, None, None if data_dl1_pedestal['charge_std'].shape[0] < 2 : logging.error('\nCould not find interleaved pedestal info in ' 'monitoring table!') - return np.nan, np.nan, np.nan + return None, None, None # Mean HG charge in interleaved FF events, to spot possible issues: data_HG_FF_mean = data_dl1_flatfield['charge_mean'][1:, 0, :] diff --git a/lstchain/scripts/lstchain_tune_nsb.py b/lstchain/scripts/lstchain_tune_nsb.py index 17669ac9e8..5f512cdd40 100644 --- a/lstchain/scripts/lstchain_tune_nsb.py +++ b/lstchain/scripts/lstchain_tune_nsb.py @@ -79,7 +79,7 @@ def main(): a, b, c = calculate_noise_parameters(args.input_mc, args.input_data, args.config) - if np.isnan(a): + if a is None: logging.error('Could not compute NSB tuning parameters. Exiting!') sys.exit(1) From 22c8e8cd9218166c242b9007bf5932bf8ff4678e Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Olaizola Date: Tue, 30 Jan 2024 01:11:16 +0100 Subject: [PATCH 17/19] Improved docs --- docs/lst_analysis_workflow.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/lst_analysis_workflow.rst b/docs/lst_analysis_workflow.rst index dd17243a7c..92f8393138 100644 --- a/docs/lst_analysis_workflow.rst +++ b/docs/lst_analysis_workflow.rst @@ -51,8 +51,8 @@ Files and configuration The DL1 files to use obviously depend on the source you want to analyse. Unless you have a good reason, the latest version of the DL1 files should be used. 
-Selection of DL1 files ----------------------- +Selection of the real-data DL1 files +------------------------------------ The selection of the DL1 files (run-wise, i.e. those produced by lstosa by merging the subrun-wise DL1 files) for a specific analysis (i.e., a given source and time period) can be performed using the notebook From 498637f3b89a66519e3c9c95404858f8b7d3f521 Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Olaizola Date: Tue, 30 Jan 2024 01:15:41 +0100 Subject: [PATCH 18/19] Fix message string --- lstchain/image/modifier.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lstchain/image/modifier.py b/lstchain/image/modifier.py index e73f4e358b..84f64b8078 100644 --- a/lstchain/image/modifier.py +++ b/lstchain/image/modifier.py @@ -257,7 +257,7 @@ def calculate_noise_parameters(simtel_filename, data_dl1_filename, # the average diffuse NSB across the camera data_median_std_ped_pe = np.nanmedian(data_HG_ped_std_pe[good_pixels]) data_std_std_ped_pe = np.nanstd(data_HG_ped_std_pe[good_pixels]) - log.info(f'\nReal data:') + log.info('\nReal data:') log.info(f' Number of bad pixels (from calibration): {bad_pixels.sum()}') log.info(f' Median of FF pixel charge: ' f'{np.nanmedian(data_HG_FF_mean_pe[good_pixels]):.3f} p.e.') From e05ffe58e7242e5bd6459555fce880d0076ddf66 Mon Sep 17 00:00:00 2001 From: Abelardo Moralejo Olaizola Date: Tue, 30 Jan 2024 16:51:40 +0100 Subject: [PATCH 19/19] typo --- docs/lst_analysis_workflow.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/lst_analysis_workflow.rst b/docs/lst_analysis_workflow.rst index 92f8393138..52667c4fd0 100644 --- a/docs/lst_analysis_workflow.rst +++ b/docs/lst_analysis_workflow.rst @@ -95,7 +95,7 @@ This can be done with the script :py:obj:`~lstchain.scripts.lstchain_dump_config .. code-block:: - lstchain-dump-config --mc --output-file standard_lstchain_config.json + lstchain_dump_config --mc --output-file standard_lstchain_config.json Now you have to update the file with the parameters needed to increase the NSB level. For this you need a simtel.gz MC file from the desired production (any will do, it can be either a gamma or a proton file), and a "typical" subrun DL1 file