Running Artus

C++ Executable

The Higgs analysis code provides a single C++ executable, which takes as options only a JSON configuration and a log level setting.

HiggsToTauTauAnalysis -h

Calls to this executable are wrapped by a Python script which provides additional functionality for convenience. The most important use case for calling the C++ executable directly is debugging, e.g.

gdb --args HiggsToTauTauAnalysis <config.json>

See C++ tutorial for more information.

Only in very rare cases is it necessary to change this executable directly. Development usually consists of changes to the Artus processors, see Software Packages.

Python wrapper

A Python wrapper, built on top of a more general Artus version, is provided. Again, this script itself only needs changes/developments in very rare cases.

HiggsToTauTauAnalysis.py -h

The C++ call HiggsToTauTauAnalysis <config.json> is equivalent to

HiggsToTauTauAnalysis.py -c <config.json> -i <input>

Here, <input> denotes input ROOT files, file lists of ROOT files, or paths to directories of ROOT files (or a mixture of these).
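
For illustration, all of the following calls are valid in principle (paths are placeholders, and it is assumed here that -i accepts several space-separated inputs, as implied by the possible mixture):

HiggsToTauTauAnalysis.py -c <config.json> -i /path/to/file1.root /path/to/file2.root
HiggsToTauTauAnalysis.py -c <config.json> -i HiggsAnalysis/KITHiggsToTauTau/data/Samples/<list.txt>
HiggsToTauTauAnalysis.py -c <config.json> -i /path/to/directory/with/root/files/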

In practice, however, an analysis configuration is constructed using Python configuration files, which create the JSON config needed by the C++ executable. The wrapper provides the argument --channels, which lets you choose the channels to run (et, mt, em, tt, mm, gen), and --systematics, which lets you choose the systematic pipelines to be included. Running the nominal pipeline for et and tt is done with the following command:

HiggsToTauTauAnalysis.py --channels et tt --systematics nominal -i HiggsAnalysis/KITHiggsToTauTau/data/Samples/<list.txt>

It is possible to run different combinations of channels and systematics by specifying --channels and --systematics multiple times. The following command runs em with the eleEs uncertainty and et, mt, tt with the tauEsOneProng, tauEsOneProngPiZeros and tauEsThreeProng uncertainties:

HiggsToTauTauAnalysis.py --channels em --systematics eleEs --channels et mt tt --systematics tauEsOneProng tauEsOneProngPiZeros tauEsThreeProng -i HiggsAnalysis/KITHiggsToTauTau/data/Samples/<list.txt>

In practice you will probably use the ready-made ArtusWrapperConfig files, which let you run Artus for certain pipelines. The config files for the 2016 CP analysis can be found in HiggsAnalysis/KITHiggsToTauTau/data/ArtusWrapperConfigs/python_configs_CP/.

A specific example

The 2016 CP nominal analysis is performed using the following command:

HiggsToTauTauAnalysis.py @HiggsAnalysis/KITHiggsToTauTau/data/ArtusWrapperConfigs/python_configs_CP/Run2CPStudies_Nominal.cfg -i HiggsAnalysis/KITHiggsToTauTau/data/Samples/XROOTD_collection_SM_Run2Analysis_Summer16_plusHToTauTauM110-140.txt [-f 1 -e 1000]

When you specify -f 1, Artus processes only one file from the file list; with -e 1000, only 1000 events are processed by Artus.

The printed output contains the following steps:

  1. The JSON config is constructed and temporarily saved:

    Saved JSON config "/tmp/artus_83d5ad095bbf0f7888357c90b107b314.json" for temporary usage.
    

    The filename of the temporary configuration file is unique for each Artus run. This means that input configuration files can safely be edited while Artus is running (see the example after this list).

  2. The C++ executable is called and loads the temporary configuration file:

    Execute "HiggsToTauTauAnalysis /tmp/artus_83d5ad095bbf0f7888357c90b107b314.json".
    Loading Config file from "/tmp/artus_83d5ad095bbf0f7888357c90b107b314.json".
    
  3. The first input file is loaded and the requested branches are accessed:

    Loading ... dcap://dcache-cms-dcap.desy.de//pnfs/desy.de/cms/tier2/store/user/rcaspart/higgs-kit/skimming/2015-11-20/SUSYGluGluToHToTauTau_M-160_TuneCUETP8M1_13TeV-pythia8/crab_SUSYGluGluToHToTauTauM160_RunIISpring15MiniAODv2_74X_13TeV_MINIAOD_pythia8/151123_104114/0000/kappa_SUSYGluGluToHToTauTauM160_RunIISpring15MiniAODv2_74X_13TeV_MINIAOD_pythia8_1.root
    
  4. The event processing starts, printing the current progress.
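
If you suspect a problem with the generated configuration, the temporary JSON file printed in step 1 can be pretty-printed and syntax-checked with standard tools, for instance (the filename is the one from the example output above):

python -m json.tool /tmp/artus_83d5ad095bbf0f7888357c90b107b314.json | less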

Running Batch Jobs

Make sure the grid-control executable go.py is in the $PATH

which go.py

and that you use the correct version of grid-control. This should normally be ensured by using the default checkout recipe and by sourcing the ini script.

The Artus wrapper is able to set up grid-control configurations for various batch systems. The most commonly used system is the NAF batch system, called BIRD. When you are logged in to a NAF portal, you can access this system by simply specifying

HiggsToTauTauAnalysis.py -b [naf] <arguments as used interactively>

Confirm the grid-control initialisation and wait a few minutes while the CMSSW environment is packed and sent to the batch system. After that you will see a job overview table.

-----------------------------------------------------------------
REPORT SUMMARY:                                    GCb6849faf7353
---------------
Total number of jobs:        6     Successful jobs:       0    0%
Jobs assigned to WMS:        6        Failing jobs:       0    0%

Detailed Status Information:
Jobs       INIT:       0    0%     Jobs  SUBMITTED:       0    0%
Jobs   DISABLED:       0    0%     Jobs      READY:       0    0%
Jobs    WAITING:       0    0%     Jobs     QUEUED:       3   50%
Jobs    ABORTED:       0    0%     Jobs    RUNNING:       3   50%
Jobs  CANCELLED:       0    0%     Jobs       DONE:       0    0%
Jobs     FAILED:       0    0%     Jobs    SUCCESS:       0    0%
-----------------------------------------------------------------

The output is written to a subdirectory of $ARTUS_WORK_BASE (or of the directory specified via -w) containing a date/time stamp; this subdirectory is called the work directory. After all jobs are finished, the output directory is printed:

Output is written to directory <path/to/work directory>/output

The directory <path/to/work directory>/output contains both the ROOT output files (<nick>/<nick>_job_<number>_output.root) and debug level log files (<nick>/<nick>_job_<number>_log.txt).
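
As a quick check, the outputs of a single sample can, for example, be listed and opened interactively (nicknames, job numbers and paths are placeholders):

ls <path/to/work directory>/output/<nick>/
root -l <path/to/work directory>/output/<nick>/<nick>_job_<number>_output.root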

A specific example

The 2016 CP analysis can be run with:

HiggsToTauTauAnalysis.py [-b --work <project name>] -i HiggsAnalysis/KITHiggsToTauTau/data/Samples/XROOTD_collection_SM_Run2Analysis_Summer16_plusHToTauTauM110-140.txt @HiggsAnalysis/KITHiggsToTauTau/data/ArtusWrapperConfigs/python_configs_CP/Run2CPStudies_Nominal.cfg

Beware that the above example runs on a (very!) large number of input files and has a complicated structure, which is driven by the complexity of the analysis.

Commands are proposed for merging the outputs into a single file per processed sample:

artusMergeOutputs.py [-n <number of parallel processes>] <path/to/work directory>

This creates a new directory <path/to/work directory>/merged. Both the outputs in <path/to/work directory>/output and the ones in <path/to/work directory>/merged can be used in the same way for the following post-processing/plotting steps. It is recommended to use the merged files, as they allow for faster processing later on.
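
The merging is conceptually similar to running hadd per sample yourself; a rough, hypothetical equivalent (the exact layout of the merged directory may differ) would be:

hadd <path/to/work directory>/merged/<nick>/<nick>.root <path/to/work directory>/output/<nick>/<nick>_job_*_output.root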

The batch mode of the Artus wrapper allows the simultaneous processing of different samples, which is forbidden in the interactive mode. Each batch job then processes inputs of only one sample. A sample is defined by a nickname.

Running jobs in Aachen

Since the default batch system for grid-control is the NAF, in order to run in Aachen you need to explicitly specify the batch system. In Aachen you have two possibilities, which are used slightly differently:

  1. -b rwthcondor for running on the desktop cluster. This works without additional settings.

  2. -b rwthtier2condor for running on the upgraded Tier2 SL7 machines using the HTCondor setup (since Sep 2020). In this case, you need to specify the dCache output path, because the outputs cannot be written to the default value of the -w parameter (which is a local path). For the RWTH dCache, please use @Artus/Configuration/data/ArtusWrapperConfigs/rwth_dcache.cfg.
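
For example, writing to the RWTH dCache would then presumably look like this (analogous to the DESY dCache command below):

HiggsToTauTauAnalysis.py <your-usual-arguments> -b rwthtier2condor @Artus/Configuration/data/ArtusWrapperConfigs/rwth_dcache.cfg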

If you want to write to the DESY dCache instead, use:

HiggsToTauTauAnalysis.py <your-usual-arguments> -b rwthtier2 @Artus/Configuration/data/ArtusWrapperConfigs/desy_dcache.cfg

If you use rwthtier2 and submit from an SL7 PC in Aachen:

  1. you have to compile inside the Singularity container in which you have set up KITHiggs for SL6
  2. you have to submit your jobs outside the Singularity container on SL7, after setting up your CMSSW for SL6 in SL7 with the command above

If you run into difficulties, ask in one of the Mattermost chats.

It is recommended to use the DESY dCache rather than the RWTH dCache, because it works more reliably. For accessing dCache you need a valid grid proxy. Just appending desy_dcache.cfg works properly only if your CERN username is the same as your local one, because of the $USER in the path in desy_dcache.cfg. If this is not the case, you can either create your own version of this file, or do the following:

HiggsToTauTauAnalysis.py <your-usual-arguments> -b rwthtier2 -w srm://dcache-se-cms.desy.de:8443/srm/managerv2?SFN=/pnfs/desy.de/cms/tier2/store/user/<your-cern-user>/artus

where you manually set your CERN username. Merging of the outputs can be done in two ways. The way proposed by the Artus wrapper, using artusMergeOutputsWithGC.py, is straightforward. In this case, hadd reads its inputs directly from dCache, which can be very slow. Alternatively, one can first download the files and then merge them locally, which is in many cases much faster:

se_output_download.py -l -m -o <path/to/work directory>/output/ <path/to/work directory>/grid-control_config.conf
artusMergeOutputs.py <path/to/work directory> [-n 8]

In order to add read access for your colleagues to the Artus outputs, do

setfacl -R -m g:inst3b:rx $ARTUS_WORK_BASE
setfacl -d -R -m g:inst3b:rx $ARTUS_WORK_BASE

or

chmod -R a+rx $ARTUS_WORK_BASE/

If someone still cannot access the files, do

setfacl -m g:inst3b:rx /net/scratch_cms3b/<your user name>

Do not forget to set access rights also for the folder above your $ARTUS_WORK_BASE, if that is a personal folder, too. It is recommended that everybody uses the same directory structure in order to ease sharing of Artus outputs.

Singularity is needed to submit jobs from SL7 desktops to SL6 nodes.

  1. In SL7 do

    cd $CMSSW_BASE/src
    scram b clean
  2. Start Singularity

    singularity shell /cvmfs/singularity.opensciencegrid.org/bbockelm/cms\:rhel6
  3. In Singularity do

    cd $CMSSW_BASE/src
    scram b clean # to increase your luck
    scram b [-j 8]
  4. Make sure a small interactive test run of Artus works in Singularity (see the example after this list).

  5. Leave Singularity and submit jobs with GC on SL7
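
A small interactive test run (step 4) could, for instance, reuse the quick-test options from the example above:

HiggsToTauTauAnalysis.py @HiggsAnalysis/KITHiggsToTauTau/data/ArtusWrapperConfigs/python_configs_CP/Run2CPStudies_Nominal.cfg -i HiggsAnalysis/KITHiggsToTauTau/data/Samples/XROOTD_collection_SM_Run2Analysis_Summer16_plusHToTauTauM110-140.txt -f 1 -e 1000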

Trouble-shooting

General hints

Access to files via XRootD does not work as reliably as via dcap, although dcap access is no longer available for the DESY dCache from outside the NAF. It might help to first download remote files locally and access them from there, which is done with the option --copy-remote-files. This is the default for batch jobs anyway.
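
For example, appended to an otherwise unchanged interactive call:

HiggsToTauTauAnalysis.py <your-usual-arguments> --copy-remote-files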

Test files

Single test files can be retrieved by tail -n 1 $CMSSW_BASE/src/HiggsAnalysis/KITHiggsToTauTau/data/Samples/<protocol>_sample_<nick>.txt. The output of this can be used in various applications, e.g.

  • tBrowser.sh `tail -n 1 $CMSSW_BASE/src/HiggsAnalysis/KITHiggsToTauTau/data/Samples/<protocol>_sample_<nick>.txt`
  • higgsplot.py -i `tail -n 1 $CMSSW_BASE/src/HiggsAnalysis/KITHiggsToTauTau/data/Samples/<protocol>_sample_<nick>.txt` ...
  • gfal-copy `tail -n 1 $CMSSW_BASE/src/HiggsAnalysis/KITHiggsToTauTau/data/Samples/<protocol>_sample_<nick>.txt` ./

Especially the last case can save some time for interactive Artus tests: first copy the input file to a local directory and then use this file as input for Artus (with the -i option).
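
A typical quick-test workflow combining the commands above might then look like this (the downloaded filename is simply whatever gfal-copy produces, and the channel/systematics choice is only an example):

gfal-copy `tail -n 1 $CMSSW_BASE/src/HiggsAnalysis/KITHiggsToTauTau/data/Samples/<protocol>_sample_<nick>.txt` ./
HiggsToTauTauAnalysis.py --channels mt --systematics nominal -i <downloaded file.root> -f 1 -e 1000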

Interactive mode

  • Artus is wrongly configured

    Most probably you will get a warning or an error printed out indicating the wrong configuration. In most cases this is due to wrong JSON syntax (e.g. commas after the last item of lists or dictionaries). In case there is a warning/error from Artus itself, you should grep for its location in the code and look for the reason of a possible misconfiguration.

    For more complicated problems it is recommended to use the debug log level: --log-level debug.

  • There is a bug/segmentation fault in the C++ code

    It might help to recompile the C++ code with debug flags by

     cd $CMSSW_BASE/src/Artus && scram b clean
     cd $CMSSW_BASE/src/HiggsAnalysis/KITHiggsToTauTau && scram b clean
     cd $CMSSW_BASE/src && scram b -j 4 USER_CXXFLAGS="-g"

    to facilitate more advanced debugging of the code. Find more information (e.g. about GDB) here.

Batch mode

  1. First assumption: Artus is wrongly configured or there is a bug in Artus code.

    In general you will find the Artus log in <path/to/work directory>/workdir/output/job_<number>/gc.stdout after the line

    > Starting HiggsToTauTauAnalysis.py $@ with arguments:
    

    Check for this file in case job number <number> fails and look for possible problems by reading its last lines. In most cases, you will find a direct hint pointing to the problem.

    In case you don't run your Artus jobs on a remote batch system (e.g. in the Grid), Artus can directly write out the log files to local directories. Use the --log-to-se option of your wrapper for that. 
    
  2. Second assumption: Problems with the batch system.

    In this case it is useful to check the error codes of grid-control. You should have a look at the log outputs of grid-control located in <path/to/work directory>/workdir/output/job_<number> in case job number <number> fails. Most probably you will find a hint at the end of gc.stdout (or in gc.stderr).

    When running in the Grid, a common problem is that libraries are not accessible due to symlinks to absolute paths that do not exist on the Grid nodes. In these cases, grid-control jobs fail with error code 127. The problem is solved by re-running the ini script. Make sure that ${CMSSW_BASE}/external/${SCRAM_ARCH}/lib/libKappa.so points to a relative path. Unfortunately, the complete GC task is lost when this problem occurs, and you need to start by re-initialising GC again.
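
    A quick way to check this symlink is, for example (standard shell commands, for illustration only; the printed link target should be a relative path):

    ls -l ${CMSSW_BASE}/external/${SCRAM_ARCH}/lib/libKappa.so
    readlink ${CMSSW_BASE}/external/${SCRAM_ARCH}/lib/libKappa.so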

In case some jobs fail, they can be collected by getFailedJobOutputs.py into a single directory to ease digesting the errors:

getFailedJobOutputs.py <project directory>

If you want to resubmit failed jobs, use the following script:

go.py <project directory> -d all

Some useful tools

JSON configurations (*.json, *.root, "{...}") can be compared by artusConfigDiff.py:

artusConfigDiff.py <config 1> <config 2>
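
Since artusConfigDiff.py also accepts ROOT files (see above), one can for example compare a saved JSON config against the configuration embedded in an Artus output file (placeholders as before):

artusConfigDiff.py /tmp/artus_<hash>.json <path/to/work directory>/output/<nick>/<nick>_job_<number>_output.root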

The (committed) status of two different Artus runs can be compared using artusRepositoryDiff.py:

cd <repository to compare>
artusRepositoryDiff.py <config 1> <config 2>

Using json files as configuration

The old way to create config files is to use JSON config files directly. This is no longer supported, but it can be useful for comparison. You can run with JSON files using the following command:

HiggsToTauTauAnalysis.py [-b --work <project name>] -i HiggsAnalysis/KITHiggsToTauTau/data/Samples/XROOTD_<input_files> @HiggsAnalysis/KITHiggsToTauTau/data/ArtusWrapperConfigs/Run2CPStudies_Nominal.cfg --use-json