An Introduction to the CP Algorithm Text Configuration

Last update: 26 Nov 2024

The Text Configuration

There is a configuration file in your MyAnalysis/data directory called config.yaml. This is a text file written in YAML, a human-readable data-serialization language, and it is used to create and configure the algorithms in an analysis. It lists all of the specifications needed for an algorithm in an organized way, using indentation (spaces, since YAML does not allow tabs for indentation), dashes, and new lines, to configure the algorithm as desired by the user.

Let’s start by looking at one of the basic “overhead” commands necessary for any analysis, which is already in the file. Loading and running systematic variations is configured as a part of CommonServices.

# Common (global) services to run.
CommonServices:
    # Turn on/off systematics
    runSystematics: True

Among other functions, this takes care of registering all the systematics to run on, so they can be propagated to subsequent algorithms without configuring them one-by-one. If you do not need to use systematics, you can set the runSystematics option to False. Running systematics increases the processing time and the size of the output ntuple significantly. For this tutorial, we should limit both. Set runSystematics to False. Note that even if you have a large number of algorithms for various object types with many different systematics, you will only ever need one block for CommonServices. The individual algorithms are smart enough to only run for systematics that actually affect them.

tip If systematics are included, some fun regex (regular expression) terms can be added to restrict the systematics used, e.g.:

CommonServices:
    runSystematics: False
    # Keep only systematics whose names do not contain "PseudoData"
    filterSystematics: '^(?:(?!PseudoData).)*$'

For now, leave those systematics disabled.
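
To see what the filter pattern above selects, it can be tried out on its own. The following is a minimal sketch using Python's re module. The systematic names in the list are illustrative only, and the framework may apply the pattern with a different regex engine than the one used here.

```python
import re

# Pattern from the configuration above: a negative lookahead is checked at
# every character, so a name matches only if "PseudoData" never appears in it.
SYSTEMATICS_FILTER = r"^(?:(?!PseudoData).)*$"

# Hypothetical systematic names, for illustration only.
names = [
    "EG_SCALE_ALL__1up",
    "MUON_ID__1down",
    "JET_JER_PseudoData__1up",
    "PseudoData_only",
]

pattern = re.compile(SYSTEMATICS_FILTER)

# Keep the names that the pattern accepts.
kept = [name for name in names if pattern.match(name)]
print(kept)
```

In this sketch, the two names containing "PseudoData" are rejected and the other two are kept.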

Throughout the tutorial, additional text blocks will be added to config.yaml in a format similar to the code example above.

tip The order of text blocks is flexible in the config.yaml file. Blocks can be placed in any order, as long as any container that is referenced in more than one block has exactly the same name and spelling in every block. A mismatch will cause a crash.

Add YAML configuration to steering macro

In order to run the CP algorithms in your job, it is necessary to create a sequence from your steering macro. This is done using a method called makeSequence that is defined in MyAnalysis/python/MyAnalysisAlgorithms.py. Up to this point, the sequence has not been used, but we will rely heavily on it moving forward.

tip The following line in MyAnalysis/CMakeLists.txt installs the MyAnalysisAlgorithms Python module for use:

atlas_install_python_modules( python/*.py )

The basic sequence is already set up to add these blocks for you “under the hood”. To use the sequence, add the following to your steering macro or jobOptions before your algorithm is added to the job but after the job is defined:

# Create an algSequence from the YAML configuration file
from MyAnalysis.MyAnalysisAlgorithms import makeSequence
algSeq = makeSequence (options.configPath, dataType=dataType)
print( algSeq ) # For debugging
algSeq.addSelfToJob( job )

tip The makeSequence method takes an argument indicating whether a data or MC sample is used as input. This is defined in the steering macro:

# Set data type to MC
dataType = "mc"

You will need to change this later in the tutorial.

Running your code

ATestRun_eljob.py accepts a --config-path argument that points to the configuration text file (you can see at the top of the macro how it does that). Try running with the new command:

ATestRun_eljob.py --config-path=../source/MyAnalysis/data/config.yaml

tip The macro also allows you to point to a specific submission directory using e.g. --submission-dir=submitDir.

tip Remember that in Athena, you need a lone - on the command line before any options that are meant to be recognized by your jobOptions.

In general, the path to the text file should be how the file is accessed from the run directory (hence the ../source/... syntax seen above).
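
For reference, command-line handling of this kind can be sketched as follows. This is not the macro's actual code: it assumes argparse (the macro may use a different parser), the attribute name configPath is taken from the options.configPath usage shown earlier, and the submissionDir attribute name and both default values are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the two options mentioned in this tutorial."""
    parser = argparse.ArgumentParser(description="Steering macro options (sketch)")
    parser.add_argument(
        "--config-path",
        dest="configPath",  # stored as options.configPath
        default=None,  # assumed default; the real macro may set one
        help="path to the YAML configuration, as seen from the run directory",
    )
    parser.add_argument(
        "--submission-dir",
        dest="submissionDir",  # hypothetical attribute name
        default="submitDir",  # assumed default
        help="submission directory for the job",
    )
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
options = build_parser().parse_args(
    ["--config-path=../source/MyAnalysis/data/config.yaml"]
)
print(options.configPath)
print(options.submissionDir)
```

Because the path is passed through unchanged, it is resolved relative to the directory the job is launched from, which is why the run-directory-relative form is used.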

Test and commit your changes

Run your code to be sure that it is working properly.

If it is working correctly, you should see additional output that looks like:

Configuring CommonServices
    runSystematics: False
    filterSystematics: None
/*** AlgSequence/AlgSequence ***************************************************
| /*** PythonConfig AsgService/CP::SystematicsSvc/SystematicsSvc *****************
| \--- (End of PythonConfig AsgService/CP::SystematicsSvc/SystematicsSvc) --------
| /*** PythonConfig AsgService/CP::SelectionNameSvc/SelectionNameSvc *************
| \--- (End of PythonConfig AsgService/CP::SelectionNameSvc/SelectionNameSvc) ----
\--- (End of AlgSequence/AlgSequence) ------------------------------------------

When you are happy that it is running correctly, commit and push your changes.