Beta version 2.0

@lcorcodilos lcorcodilos released this 15 Jun 21:53

Change log

This is a bigger update than I intended, but I think it justifies the release of Beta 2.0. It incorporates PRs #54, #58, #59, #61, #62, #64, and #68. Changes are listed in reverse chronological order within each section.

General

  • Add option to "Take" with SubCollection() method.
  • Remove SubCollection() being made during CalibrateVars().
  • Various robustness and consistency changes.
  • Reduce jdl_template default CPU and memory requests for condor.
  • Remove TIMBER collection structs from Prefire_weight for speed.
  • Drop createAllCollections option from analyzer.
  • Add option to save Runs branch in analyzer.Snapshot() (default is to save it)
  • In PrintNodeTree(), drop subcollection definitions by default.
  • Deduce the extension for the image saved by CompareShapes.
  • Add item_meta to the Group class to make lazy template histograms possible.
  • Change the Tpt weight module to drop the alpha variation since it is only a normalization effect. Switch the beta class method to eval and drop corr (eval now computes both the nominal value and the variations).
  • Change __ prefix on private variables to _ for consistency.
  • Create CollectionOrganizer and implement it. Does not create any user-facing changes but provides infrastructure for future features.
  • Add hardware::Open option (default to "READ") with inTIMBER option for internal and external paths.
  • Add hardware::LoadHisto to load histogram into memory with inTIMBER option for internal and external paths.
  • Make Correction/ModuleWorker constructor arguments more logical: pass a correctly typed variable instead of a string representation of that variable.
  • Add MakeWeightCols() correlation option to correlate uncertainties that are tied together but had to be calculated separately (ex. two top jet tag SFs being applied).
  • Remove repeated clang parsing when cloning ModuleWorker/Correction.
  • Change lhaid type from str to int.
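
The idea behind the MakeWeightCols() correlation option can be sketched in plain Python. This is only an illustration of what "correlated" means for two scale factors that are calculated separately, not TIMBER's implementation; the function name, the dictionary keys, and the numbers are all hypothetical.

```python
import math

def combine_scale_factors(nominals, ups, downs, correlated):
    """Combine per-object scale factors into event weights (hypothetical helper).

    nominals, ups, downs: equal-length lists with one entry per scale factor.
    correlated: if True, shift every scale factor up (or down) together;
                otherwise shift one at a time and keep the others nominal.
    """
    weights = {"nominal": math.prod(nominals)}
    if correlated:
        # One up/down pair for the whole set of scale factors.
        weights["corr_up"] = math.prod(ups)
        weights["corr_down"] = math.prod(downs)
    else:
        # One up/down pair per scale factor.
        for i in range(len(nominals)):
            shifted_up = nominals[:i] + [ups[i]] + nominals[i + 1:]
            shifted_down = nominals[:i] + [downs[i]] + nominals[i + 1:]
            weights[f"sf{i}_up"] = math.prod(shifted_up)
            weights[f"sf{i}_down"] = math.prod(shifted_down)
    return weights

# Toy values for two tagging scale factors applied to the same event.
correlated_weights = combine_scale_factors([0.9, 0.8], [1.0, 0.9], [0.8, 0.7], True)
uncorrelated_weights = combine_scale_factors([0.9, 0.8], [1.0, 0.9], [0.8, 0.7], False)
```

In the correlated case there is a single up/down pair for the event, while the uncorrelated case yields one pair per scale factor.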

New Features

  • Nodes now have unique hashes which keep them unique in the analyzer so that Nodes of the same name can be tracked. This is useful in the case where the processing has forked and you'd like to keep node naming consistent across processing branches.
  • HistGroup.Add() now treats the name argument as optional. If it is not specified, the name is derived from the histogram (via TObject.GetName()). Note, however, that this will initiate the RDataFrame loop!
  • Change genEventCount calculation to genEventSumw (for simulation).
  • Add extraNominal argument to MakeWeightCols(), which scales all weight values (ex. xsec*lumi/genEventSumw).
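
As a rough illustration of why the sum of generator weights matters, and of the kind of factor one might pass as extraNominal, here is a minimal sketch in plain Python. The function name and the toy numbers are hypothetical; only the formula xsec*lumi/genEventSumw comes from the notes above.

```python
def normalization_factor(xsec_pb, lumi_invpb, gen_weights):
    """Return xsec * lumi / genEventSumw for a simulated sample.

    xsec_pb:     cross section in pb (toy value below)
    lumi_invpb:  integrated luminosity in pb^-1 (toy value below)
    gen_weights: per-event generator weights, which may be negative
    """
    gen_event_sumw = sum(gen_weights)
    return xsec_pb * lumi_invpb / gen_event_sumw

# Toy sample with one negative-weight event.
gen_weights = [1.0, 1.0, -1.0, 1.0]
gen_event_count = len(gen_weights)   # raw number of generated events
gen_event_sumw = sum(gen_weights)    # sum of weights, smaller than the count here

extra_nominal = normalization_factor(2.0, 1000.0, gen_weights)
```

When a generator produces negative weights, the event count and the sum of weights differ, so normalizing by the count would give a different factor for the same sample.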

New Additions

  • MemoryFile class to store a string in memory to mimic a file one would open().
  • DictToMarkdownTable() method to convert python dictionary to markdown table (uses MemoryFile).
  • TIMBER/Tools/AutoPU.py added to automate (in pieces or as a whole) the processes of making a pileup weight and applying it.
  • Common.GenerateHash() added for Node hashes.
  • analyzer.GetColumnNames() returns a list of all column names that currently exist.
  • hardware::MultiHadamardProduct added for non-nested vectors.
  • GenMatching tools updated to be better optimized and to take advantage of the new AoS feature.
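
To illustrate the idea of writing a markdown table into an in-memory, file-like object, here is a minimal sketch using the standard library's io.StringIO. It is not the TIMBER MemoryFile or DictToMarkdownTable() code; the function name and the assumed dictionary layout (header mapped to a list of cell values) are hypothetical.

```python
import io

def dict_to_markdown_table(columns):
    """Render a dict of {header: [cell, ...]} as a markdown table string.

    Assumes every list has the same length (hypothetical input layout).
    """
    buffer = io.StringIO()  # in-memory object with a file-like write() method
    headers = list(columns)
    buffer.write("| " + " | ".join(headers) + " |\n")
    buffer.write("| " + " | ".join("---" for _ in headers) + " |\n")
    # zip(*values) walks the columns row by row.
    for row in zip(*columns.values()):
        buffer.write("| " + " | ".join(str(cell) for cell in row) + " |\n")
    return buffer.getvalue()

table = dict_to_markdown_table({"cut": ["pt", "eta"], "events": [120, 95]})
print(table)
```

Writing to an in-memory buffer keeps the table-building code identical to code that writes to a real file, without touching the disk.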

CMS Algos

  • Update luminosity golden jsons.
  • Add DeepAK8 CSV reader and top tagging SF module (note that there have been crashes in some instances for this module that are currently being studied).
  • HEM and Prefire correction modules added.
  • Add JME data tarball information in readme.
  • Add W and top tagging scale factor modules (only tau21 and tau32+subjet btag supported, respectively).

Pileup

  • Add WeightCalculatorFromHistogram (from NanoAOD-tools).
  • Add C++ pileup module with "auto" mode to grab npvx distribution from memory.
  • Add pileup data files and information on where they are from (+script to get them).
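
A common way to build pileup weights from two histograms is to take the bin-by-bin ratio of the normalized data and simulation distributions. The sketch below shows that calculation in plain Python as an illustration only; it is not the WeightCalculatorFromHistogram or TIMBER pileup module code, and the function name, the zero-weight convention for empty simulation bins, and the toy histograms are assumptions.

```python
def pileup_weights(data_hist, mc_hist):
    """Return per-bin weights (data / MC) after normalizing both to unit area.

    data_hist, mc_hist: equal-length lists of bin contents.
    Bins that are empty in the simulation get weight 0.0 (an assumption of
    this sketch, chosen to avoid dividing by zero).
    """
    data_total = sum(data_hist)
    mc_total = sum(mc_hist)
    weights = []
    for data_bin, mc_bin in zip(data_hist, mc_hist):
        if mc_bin == 0:
            weights.append(0.0)
        else:
            weights.append((data_bin / data_total) / (mc_bin / mc_total))
    return weights

# Toy pileup distributions with three bins.
weights = pileup_weights([10, 30, 60], [25, 50, 25])
```

Each simulated event would then be weighted by the entry for the bin that its pileup value falls into.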

Bug fixes

  • Do not try to get Runs branch if it doesn't exist.
  • Fix bug when making new collections using CollectionOrg.AddBranch().
  • Clean up plotting in Plot.py to be more consistent and documentation-ready.
  • Fix setup.sh, which had back-ticks that caused unintended executions. It also now uses return, which is more suitable than exit.
  • Return the index from Pythonic::InList rather than a bool.
  • If ModuleWorker looks in a TIMBER .cc for a function (typically eval) and can't find it, look for it in the equivalent .h (since that's where templates live).