reanahub/reana-demo-cms-reco

REANA example - CMS Reconstruction

About

This REANA reproducible analysis example demonstrates the reconstruction procedure of the CMS collaboration from raw data to Analysis Object Data (AOD), for the year 2011 and the data set DoubleElectron.

The workflow consists of the steps needed for sample reconstruction, as taken from the CMS legacy validation repo.

Reconstruction procedure

1 & 2. Input data and Analysis code

Any raw input data from the CERN open data platform should be valid for reconstruction. In this example, the input is taken from: root://eospublic.cern.ch//eos/opendata/cms/Run2011A/DoubleElectron/RAW/v1/000/160/433/C046161E-0D4E-E011-BCBA-0030487CD906.root
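Before launching a reconstruction it can help to confirm that the input file is reachable. The following is a minimal sketch, assuming the XRootD command-line client (`xrdfs`) may or may not be installed; the variable names are illustrative and not part of this repository.

```shell
# Input file URL taken from the example above.
input_url='root://eospublic.cern.ch//eos/opendata/cms/Run2011A/DoubleElectron/RAW/v1/000/160/433/C046161E-0D4E-E011-BCBA-0030487CD906.root'

# Strip the scheme, then split the remainder into server and file path.
without_scheme="${input_url#root://}"
server="${without_scheme%%/*}"
path="/${without_scheme#*//}"

echo "server: ${server}"
echo "path:   ${path}"

# Only query the server when the XRootD client is available.
if command -v xrdfs >/dev/null 2>&1; then
    xrdfs "root://${server}" stat "${path}" || echo "stat failed for ${path}" >&2
fi
```

If `xrdfs` is absent the snippet only prints the two parts of the URL, which is still useful for checking that a different input file has been pasted correctly.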

The reconstruction step can be repeated either with an existing configuration file that matches the analyzed data (e.g. this example) or with our own configuration file created in a CMS VM, in which case the script has to be changed accordingly:

cmsDriver.py reco -s RAW2DIGI,L1Reco,RECO,USER:EventFilter/HcalRawToDigi/hcallaserhbhehffilter2012_cff.hcallLaser2012Filter --data --conditions FT_R_53_LV5::All --eventcontent AOD --customise Configuration/DataProcessing/RecoTLR.customisePrompt --no_exec --python reco_cmsdriver2011.py

3. Compute environment

In order to be able to rerun the analysis even several years in the future, we need to "encapsulate the current compute environment", for example to freeze the software package versions our analysis is using. We shall achieve this by preparing a Docker container image for our analysis steps.

This analysis example runs within the CMSSW analysis framework that was packaged for Docker in cmsopendata. The different images correspond to data sets taken in different years. Instructions can be found in this repo.

Moreover, the re-reconstruction task needs run-time access to the condition database. Inside a CMS VM, this is achieved with the commands:

$ ln -sf /cvmfs/cms-opendata-conddb.cern.ch/FT_53_LV5_AN1_RUNA FT_53_LV5_AN1
$ ln -sf /cvmfs/cms-opendata-conddb.cern.ch/FT_53_LV5_AN1_RUNA.db FT_53_LV5_AN1_RUNA.db

For REANA, the condition database on CVMFS can be accessed from any container. The only requirement is that the user specifies the necessary CVMFS volumes to be live-mounted in the resources section of reana.yaml, as described here.
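As an illustration of the CVMFS requirement, the sketch below writes a small example specification using a shell heredoc. The `resources.cvmfs` list follows the REANA specification format; the file name `reana-cvmfs-example.yaml`, the image name `cmsopendata/cmssw_5_3_32` and the single `ls` step are assumptions made for this example and are not the contents of this repository's reana.yaml.

```shell
# Write an illustrative specification to a separate file so that an
# existing reana.yaml is not overwritten.
cat > reana-cvmfs-example.yaml <<'EOF'
workflow:
  type: serial
  resources:
    cvmfs:
      # Repository holding the CMS open data condition database
      - cms-opendata-conddb.cern.ch
  specification:
    steps:
      # Image name is an assumption; use the one matching your data-taking year
      - environment: 'cmsopendata/cmssw_5_3_32'
        commands:
          - ls -l /cvmfs/cms-opendata-conddb.cern.ch
EOF

echo "wrote reana-cvmfs-example.yaml"
```

Listing the repository in the first step is a cheap way to confirm the mount before any reconstruction command depends on it.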

4. Analysis workflow

First, we have to set up the environment variables for CMSSW. Although this is already done in the Docker image, REANA overrides them, so they need to be set again. This is done by copying the commands from the CMS entrypoint.sh script:

$ source /opt/cms/cmsset_default.sh
$ scramv1 project CMSSW CMSSW_5_3_32
$ cd CMSSW_5_3_32/src
$ eval `scramv1 runtime -sh`

The actual commands that are needed to carry out the analysis in the CMS specific environment are then:

$ ln -sf /cvmfs/cms-opendata-conddb.cern.ch/FT_53_LV5_AN1_RUNA FT_53_LV5_AN1
$ ln -sf /cvmfs/cms-opendata-conddb.cern.ch/FT_53_LV5_AN1_RUNA.db FT_53_LV5_AN1_RUNA.db
$ ls -l
$ ls -l /cvmfs/
$ cmsRun reco_cmsdriver2011.py
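Because `ln -sf` succeeds even when its target does not exist, a missing CVMFS mount only shows up later as a failure inside `cmsRun`. One way to fail early is a small guard such as the sketch below; `require_paths` is a hypothetical helper written for this example, not something provided by CMSSW or REANA.

```shell
# Hypothetical helper: succeed only if every given path exists.
# Symlinks are followed, so a dangling link is reported as missing.
require_paths() {
    missing=0
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "missing: $p" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Intended use inside the job, before starting the reconstruction:
#   require_paths FT_53_LV5_AN1 FT_53_LV5_AN1_RUNA.db reco_cmsdriver2011.py \
#       && cmsRun reco_cmsdriver2011.py
```

The guard reports every missing path rather than stopping at the first one, so a single failed run shows everything that needs fixing.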

This demo represents a "workflow factory" script that will produce REANA workflows for given parameters for the CMS RAW to AOD reconstruction procedure.

Following successful tests (see other branches), we know that REANA is able to run CMS reconstruction for a variety of RAW samples (e.g. dataset SingleMu) and data-taking years (e.g. 2011).

Example

Before running the example, you might want to install the necessary packages:

$ # create new virtual environment
$ virtualenv ~/.virtualenvs/myreana
$ source ~/.virtualenvs/myreana/bin/activate
$ # install reana-commons and reana-client
$ pip install git+https://github.com/reanahub/reana-demo-cms-reco.git@master#egg=cms-reco

Afterwards, the following commands generate the workflow for a given record ID, with its metadata retrieved using the COD client. The workflow is created in an output directory that contains the reana.yaml file together with all necessary inputs:

$ cernopendata-client get-record --recid 39 | tee cms-reco-config.json
$ # use the values from the 'cms-reco-config.json' file
$ cms-reco --create-workflow
    Created `cms-reco-SingleElectron-2011` directory.
$ cd cms-reco-SingleElectron-2011
$ reana-client run