Common CP Algorithms Introduction

Last update: 19 Nov 2024

The combined performance (CP) groups provide recommendations for use in analysis. These range from data quality and pileup reweighting to calibrations and uncertainties for physics objects. To ensure that CP recommendations can be implemented easily and consistently in analyses regardless of which framework is used, each group provides tools that you can use in your analysis. These tools are wrapped in “CP algorithms” that run all the tools you need in the right sequence and provide a consistent way to configure them.

You can find all recommendations from CP groups on their corresponding twiki pages. All of these pages can easily be accessed from the main AtlasPhysics twiki page.

Tip: Some existing analyses may be using older methods, but the approach presented here is the recommendation for new analyses moving forward.

CP algorithms run at the start of your analysis job and make corrected objects available as though they came directly from the input file. It is also possible to produce ntuples directly using CP algorithms, without any user analysis code. However, for this tutorial, we will still make use of user analysis code to give you a hands-on understanding of how information is stored and accessed in files in ATLAS.

The basic design of the CP algorithms provides a lot of flexibility to the user. Some examples of this are:

  • CP algorithms can be run in either EventLoop or Athena

  • Users can change the configuration of any CP algorithm/tool at will, re-order them and run only those that are needed

  • The algorithms can run multiple working points for a single object type (e.g. muons) and apply decorations for all of them on the same object

  • A basic preselection can be performed on objects/events and the CP algorithms can be applied only to those objects/events that pass the preselection (to save computing time)

  • Ntuples produced this way can be 1-2 orders of magnitude smaller than those produced using older frameworks, which also translates into shorter runtimes when producing histograms

The configuration of the CP algorithms for a specific ntuple-making job is done with a .yaml file: a plain-text file written in YAML, a human-readable data-serialization format that expresses structure through indentation and key-value pairs rather than through markup tags. Entries are added to the configuration file to call an algorithm and specify its attributes. Each entry in the config file refers to a CP algorithm “block” (a set of tools and algorithms for a single purpose, e.g. for a specific physics object).

Since each CP algorithm block is a self-contained piece of code, each has its own scope, much like a function or loop in ordinary code. Objects can be passed from block to block as long as they are specified consistently between blocks, just as objects are passed between functions. However, each block still needs its properties set so that it handles these objects the way we want.
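The function analogy can be made concrete with a toy sketch in plain Python. This is not the CP algorithm API: the function names, the `store` dictionary, the container names (`Muons`, `AnaMuons`), and the selection names and thresholds are all hypothetical, chosen only to show how one block's output becomes the next block's input when both refer to the same name.

```python
# Toy model only: each "block" is a function that reads and writes
# named containers in a shared store. All names here are hypothetical.

def calibrate_block(store, input_name, output_name, scale):
    """Toy 'calibration' block: reads one container, writes a corrected copy."""
    store[output_name] = [
        {**obj, "pt": obj["pt"] * scale} for obj in store[input_name]
    ]


def selection_block(store, container_name, selection_name, min_pt):
    """Toy 'selection' block: decorates each object with a pass/fail flag."""
    for obj in store[container_name]:
        obj["select_" + selection_name] = obj["pt"] > min_pt


# The "input file" holds uncorrected muons.
store = {"Muons": [{"pt": 10.0}, {"pt": 50.0}]}

# The calibration block writes "AnaMuons"; the selection blocks must be
# given that same name to pick up the corrected objects.
calibrate_block(store, "Muons", "AnaMuons", scale=2.0)
selection_block(store, "AnaMuons", "loose", min_pt=15.0)
selection_block(store, "AnaMuons", "tight", min_pt=50.0)

for muon in store["AnaMuons"]:
    print(muon)
```

In this sketch two selections decorate the same corrected objects side by side, and the input container is left untouched. If the selection step were given a name that no earlier step had written, the lookup would fail, which is the toy equivalent of objects not being specified consistently between blocks.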

Properties of configuration blocks are set by taking advantage of the level of indentation in YAML: each unindented line in the configuration file defines a new block. Each indented line beneath it defines a sub-block attached to that block, which is mostly used to set the block's properties. Further indentation defines sub-blocks of sub-blocks.
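To see how indentation turns into structure, here is a minimal sketch that reads a small configuration-like text into nested Python dictionaries. The block and property names in `CONFIG_TEXT` (`Muons`, `Jets`, `containerName`, `WorkingPoint`, `selectionName`) are illustrative assumptions rather than a verified CP algorithm configuration, and `parse_indented_config` is a deliberately simplified stand-in that handles only `key:` and `key: value` lines, not full YAML.

```python
# Hypothetical configuration text: unindented lines start a block,
# indented lines set properties or open sub-blocks.
CONFIG_TEXT = """\
Muons:
  containerName: AnaMuons
  WorkingPoint:
    selectionName: medium
Jets:
  containerName: AnaJets
"""


def parse_indented_config(text):
    """Parse 'key: value' lines into nested dicts based on indentation."""
    root = {}
    # Stack of (indent, mapping); the root sits below any real indent level.
    stack = [(-1, root)]
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].rstrip()  # drop comments
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip(" "))
        key, _, value = line.strip().partition(":")
        value = value.strip()
        # Climb back up to the mapping that owns this indentation level.
        while stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1]
        if value:
            parent[key] = value          # a property of the current block
        else:
            parent[key] = {}             # a new block or sub-block
            stack.append((indent, parent[key]))
    return root


config = parse_indented_config(CONFIG_TEXT)
for block_name, properties in config.items():
    print(block_name, "->", properties)
```

The two unindented lines become the two top-level blocks, the lines indented once become their properties, and the line indented twice becomes a property of the `WorkingPoint` sub-block. A real job would rely on a proper YAML parser, which also handles lists, quoting, and other features ignored here.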

Tip: The configuration of the CP algorithms is still under development and the tutorial may be updated from time to time to reflect any changes.

Where to find CP algorithm sequence blocks

The various tools provided by CP groups for use in analysis are spread throughout the Athena codebase, but the algorithms and algorithm blocks that call these tools are centrally located for ease of use. You can find the available CP algorithm sequence blocks in PhysicsAnalysis/Algorithms. Each algorithm block lives in the python subdirectory of its respective directory, in a file named *AnalysisConfig.py. One exception is the Good Runs List selector algorithm, which is currently kept in DataQuality/GoodRunsLists, but will soon be moved to the same directory as the others.
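If you have a local checkout of the Athena repository, the layout described above can be searched with a few lines of Python. This is a minimal sketch: the helper name `find_config_blocks` and the checkout location `athena` are assumptions for illustration, while the glob pattern follows the directory structure described in the text.

```python
from pathlib import Path


def find_config_blocks(athena_root):
    """Return the *AnalysisConfig.py files under PhysicsAnalysis/Algorithms."""
    algorithms_dir = Path(athena_root) / "PhysicsAnalysis" / "Algorithms"
    # One directory per package, each with a python/ subdirectory.
    return sorted(algorithms_dir.glob("*/python/*AnalysisConfig.py"))


# "athena" is an assumed path to a local checkout; adjust it as needed.
for path in find_config_blocks("athena"):
    print(path)
```

The search returns an empty list if the path does not point at a checkout, and it does not cover the Good Runs List selector while that remains in DataQuality/GoodRunsLists.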

An example of the implementation of most available algorithm sequence blocks can be found in FullCPAlgorithmsTest.py, in the function makeSequenceBlocks(). The ntuples that will be used in later tutorial steps have been produced using a customized version of the code that mostly uses the same blocks.