A Basic AnaAlgorithm

Last update: 16 Aug 2024

The Layout of an Algorithm

The analysis code is built around an algorithm class. An algorithm is the basic unit of code that EventLoop or Athena knows about. As a beginner, you will typically have a single algorithm in your job that holds your analysis code, and this is perfectly fine until you become more familiar with the setup. For larger and more complex projects that involve collaboration with other developers, a multi-algorithm setup is generally preferred (if not required).

The MyAnalysis package contains an empty algorithm called MyxAODAnalysis. This could be named anything, but if you change its name you will need to modify the tutorial code accordingly (so for the first time it is probably best to stick with this name).

The header for the algorithm is MyAnalysis/MyxAODAnalysis.h and the source code is in Root/MyxAODAnalysis.cxx. The MyxAODAnalysis class is derived from the EL::AnaAlgorithm class.

The class contains four methods: the constructor and three methods inherited from EL::AnaAlgorithm. The details of each method will be explained later in the tutorial, but a brief overview is as follows:

  • MyxAODAnalysis(): This is the constructor. Put any code here for the basic initialization of variables (such as setting pointers to nullptr), and declare all properties of the algorithm here, which is explained here.

  • initialize(): You should put everything here that needs to be executed once at the very beginning, such as creating histograms and output trees. The return type is StatusCode which is discussed here.

  • execute(): This is executed for every single event, so the majority of your analysis code, such as retrieving variables, applying cuts, and filling histograms, goes here. The return type is StatusCode.

  • finalize(): This is the mirror of initialize() and is called after all events are processed. It is somewhat rare to put any code here. The return type is StatusCode.
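
The call order described above can be illustrated with a small, self-contained sketch. It does not use the real framework: the sketch::StatusCode enum, the sketch::AlgorithmBase class and the sketch::runJob driver are hypothetical stand-ins, introduced only so that the example compiles without an ATLAS release. In the actual package the class derives from EL::AnaAlgorithm, the constructor signature differs, and EventLoop or Athena drives the calls.

```cpp
#include <string>
#include <vector>

namespace sketch {

// Hypothetical stand-in for the framework's StatusCode (illustration only).
enum class StatusCode { SUCCESS, FAILURE };

// Hypothetical stand-in for EL::AnaAlgorithm: it only provides the three hooks.
class AlgorithmBase {
public:
  explicit AlgorithmBase(const std::string& name) : m_name(name) {}
  virtual ~AlgorithmBase() = default;

  virtual StatusCode initialize() { return StatusCode::SUCCESS; }
  virtual StatusCode execute() { return StatusCode::SUCCESS; }
  virtual StatusCode finalize() { return StatusCode::SUCCESS; }

  const std::string& name() const { return m_name; }

private:
  std::string m_name;
};

// Same layout as the tutorial class: a constructor plus three overridden methods.
class MyxAODAnalysis : public AlgorithmBase {
public:
  // Constructor: basic initialization of member variables only.
  explicit MyxAODAnalysis(const std::string& name) : AlgorithmBase(name) {}

  // Called once before the first event (histogram/tree booking would go here).
  StatusCode initialize() override {
    m_calls.push_back("initialize");
    return StatusCode::SUCCESS;
  }

  // Called once per event (the analysis logic would go here).
  StatusCode execute() override {
    ++m_eventsSeen;
    m_calls.push_back("execute");
    return StatusCode::SUCCESS;
  }

  // Called once after the last event.
  StatusCode finalize() override {
    m_calls.push_back("finalize");
    return StatusCode::SUCCESS;
  }

  int eventsSeen() const { return m_eventsSeen; }
  const std::vector<std::string>& calls() const { return m_calls; }

private:
  int m_eventsSeen = 0;
  std::vector<std::string> m_calls;  // records the order of the calls
};

// Toy driver mimicking what the framework does: initialize once,
// execute once per event, finalize once; stop at the first failure.
StatusCode runJob(AlgorithmBase& alg, int nEvents) {
  if (alg.initialize() != StatusCode::SUCCESS) return StatusCode::FAILURE;
  for (int i = 0; i < nEvents; ++i) {
    if (alg.execute() != StatusCode::SUCCESS) return StatusCode::FAILURE;
  }
  return alg.finalize();
}

}  // namespace sketch
```

Running sketch::runJob on a sketch::MyxAODAnalysis instance with three events calls initialize() once, execute() three times and finalize() once, in that order. Each method reports success or failure through its return value, which is why the driver checks it after every call.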

Dictionary For The Algorithm (for EventLoop)

Tip: Strictly speaking, this is only needed when working in EventLoop (AnalysisBase). You can also create it when working with Athena (AthAnalysis), but it adds compilation overhead, so you can skip it there.

The files MyAnalysis/MyAnalysis/MyAnalysisDict.h and MyAnalysis/MyAnalysis/selection.xml together create the dictionary that allows your algorithm to be used in EventLoop. For this tutorial, the content isn’t important, but you can look at them here and here.

Factory For The Algorithm (for Athena)

Tip: This is only needed when working with Athena (AthAnalysis); EventLoop (AnalysisBase) does not use it. If you know that you will never work in Athena, you can leave this out (or add it later).

The file MyAnalysis/src/components/MyAnalysis_entries.cxx creates a factory that allows the algorithm to be used in Athena. For this tutorial, the content isn’t important, but you can look at it here.

The Package CMakeLists.txt File

There is a package-level CMakeLists.txt file in the repository you checked out. Unlike the standard ATLAS top-level file discussed previously, this contains information necessary for compiling your package.

There are three blocks of code in CMakeLists.txt. The main block of code that is relevant for development is:

# Add the shared library:
atlas_add_library (MyAnalysisLib
  MyAnalysis/*.h Root/*.cxx
  PUBLIC_HEADERS MyAnalysis
  LINK_LIBRARIES AnaAlgorithmLib)

This builds the shared library for your package and links the additional libraries needed for compilation. Throughout this tutorial, you will need to add further libraries to the LINK_LIBRARIES line. To do so, simply append their names, separated by spaces.
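
As a sketch of what such an addition could look like, the block below extends the LINK_LIBRARIES line with one extra entry. The name xAODEventInfo is used purely as an illustration of an additional library. Which libraries you actually need depends on the code you add to the algorithm.

```cmake
# Add the shared library.
# Illustrative assumption: xAODEventInfo stands in for any extra library
# that your algorithm code ends up depending on.
atlas_add_library (MyAnalysisLib
  MyAnalysis/*.h Root/*.cxx
  PUBLIC_HEADERS MyAnalysis
  LINK_LIBRARIES AnaAlgorithmLib xAODEventInfo)
```

Further libraries are added to the same line in the same way, each separated from the previous one by a space.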

Tip: Whenever CMakeLists.txt is modified, you just need to call make again and the changes will be picked up. If you want to be thorough, you can also call cmake ../source after each change to make sure nothing is missed. However, this is generally unnecessary, because the build detects changes to the CMake configuration automatically, and forcing cmake to run each time is rather time-consuming.