An Introduction to the CP Algorithm Text Configuration

Last update: 11 Mar 2024

The Text Configuration

There is a configuration file called config.yaml in your MyAnalysis/data directory. It is a plain-text file written in YAML, a human-readable data-serialization language, and it is used to create and configure the algorithms in an analysis. The file lists all of the settings an algorithm needs in an organized way, using indentation (spaces only, since YAML does not allow tab characters for indentation), dashes, and new lines, so that the algorithm is configured as the user wants.

Let’s start by looking at one of the basic “overhead” blocks needed for any analysis, which is already in the file. Loading and running systematic variations is configured as part of CommonServices.

# Common (global) services to run.
CommonServices:
    # Turn on/off systematics
    runSystematics: True

Among other functions, this takes care of registering all the systematics to run on, so they can be propagated to subsequent algorithms without configuring them one-by-one. If you do not need systematics, you can set the runSystematics option to False. Running systematics increases both the processing time and the size of the output ntuple, and for this tutorial we want to limit both, so set runSystematics to False.

Note that even if you have a large number of algorithms for various object types with many different systematics, you only ever need one CommonServices block. Each algorithm runs only for the systematics that actually affect it.
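To make the structure concrete, the sketch below shows the nested mapping that a YAML loader would be expected to hand to Python for the CommonServices block above, with runSystematics switched off as suggested for this tutorial. The helper systematics_enabled and its fallback value are hypothetical, chosen only for illustration. They are not part of the CP algorithm framework.

```python
# Illustrative only: the Python-side shape of the CommonServices YAML block.
# A YAML loader turns nested, indented keys into nested dictionaries and the
# literals True/False into Python booleans.
config = {
    "CommonServices": {
        "runSystematics": False,  # switched off for this tutorial
    },
}


def systematics_enabled(cfg):
    """Hypothetical helper: read the flag, treating a missing entry as off.

    The fallback is a choice made for this sketch, not the framework default.
    """
    return bool(cfg.get("CommonServices", {}).get("runSystematics", False))


print(systematics_enabled(config))  # prints False
```

Because the option sits one level below CommonServices, a wrong indentation in the YAML file would change where runSystematics ends up in this mapping.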

We won’t worry about how to use systematic uncertainties for now and will cover that later in the tutorial.

Throughout the tutorial, additional text blocks will be added to config.yaml in a format similar to the code example above.

tip

The order of text blocks in config.yaml is flexible. Blocks can be placed in any order, as long as any container referenced in more than one block is given the same name, with identical spelling, in every block. A mismatch will cause a crash.

Add YAML configuration to steering macro

In order to run the CP algorithms in your job, it is necessary to create a sequence from your steering macro. This is done using a method called makeSequence that is defined in python/MyAnalysisAlgorithms.py. Up to this point, the sequence has not been used, but we will rely heavily on it moving forward.

tip The following line in MyAnalysis/CMakeLists.txt enables the use of the sequence:

atlas_install_python_modules( python/*.py )

The basic sequence is already set up to add these blocks for you “under the hood”. To use the sequence, add the following to your steering macro or jobOptions after your algorithm is added to the job but before the job is run using a driver:

# Create an algSequence from the YAML configuration file
from MyAnalysis.MyAnalysisAlgorithms import makeSequence
algSeq = makeSequence( options.configPath, dataType )
print( algSeq ) # For debugging
algSeq.addSelfToJob( job )

tip The makeSequence method takes an argument indicating whether a data or MC sample is used as input. This is defined in the steering macro:

# Set data type to MC
dataType = "mc"

You will need to change this later in the tutorial.

Running your code

In addition to the --submission-dir argument you have been using to run your jobs, ATestRun_eljob.py also accepts a --config-path argument that points to the text configuration file. Try running with the new command:

ATestRun_eljob.py --config-path=../source/MyAnalysis/data/config.yaml --submission-dir=submitDir

In general, the path to the text file should be written as the file is reached from the run directory, which is where the job is launched (hence the ../source/... syntax seen above).
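How a relative path is interpreted from the run directory can be sketched as follows. This is a minimal sketch, assuming the steering macro reads its options with Python's standard argparse module and stores --config-path under the name configPath (to match options.configPath used earlier). The actual option handling in ATestRun_eljob.py may differ. The helper resolve_config_path and the run directory /home/user/tutorial/run are hypothetical.

```python
import argparse
import os


def build_parser():
    """Parser with the two options used in this tutorial (sketch only)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--submission-dir", dest="submission_dir",
                        default="submitDir",
                        help="directory the job output is written to")
    # dest="configPath" is an assumption, chosen to match options.configPath.
    parser.add_argument("--config-path", dest="configPath",
                        default="config.yaml",
                        help="path to the YAML text configuration")
    return parser


def resolve_config_path(path, run_dir=None):
    """Hypothetical helper: interpret a relative path from the run directory."""
    run_dir = os.getcwd() if run_dir is None else run_dir
    return os.path.normpath(os.path.join(run_dir, path))


# Parse the same arguments as in the command shown above.
options = build_parser().parse_args([
    "--config-path=../source/MyAnalysis/data/config.yaml",
    "--submission-dir=submitDir",
])

# The ../ prefix steps out of the run directory and into the source tree.
print(resolve_config_path(options.configPath, run_dir="/home/user/tutorial/run"))
```

Launching the job from a different directory changes what the same relative path resolves to, which is a common reason for the configuration file not being found.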

Test and commit your changes

Run your code to be sure that it is working properly.

If it is working correctly, you should see additional output that looks like:

Configuring CommonServices
    runSystematics: False
    filterSystematics: None
/*** AlgSequence/AlgSequence ***************************************************
| /*** PythonConfig AsgService/CP::SystematicsSvc/SystematicsSvc *****************
| \--- (End of PythonConfig AsgService/CP::SystematicsSvc/SystematicsSvc) --------
| /*** PythonConfig AsgService/CP::SelectionNameSvc/SelectionNameSvc *************
| \--- (End of PythonConfig AsgService/CP::SelectionNameSvc/SelectionNameSvc) ----
\--- (End of AlgSequence/AlgSequence) ------------------------------------------
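If the job output has been saved to a file, this check can be scripted. The following is a minimal sketch, assuming the terminal output was captured as text (for example by shell redirection). The function name and the choice of marker lines are illustrative and simply mirror the expected output shown above.

```python
# Marker strings taken from the expected output shown above.
EXPECTED_MARKERS = (
    "Configuring CommonServices",
    "runSystematics: False",
    "CP::SystematicsSvc/SystematicsSvc",
)


def missing_markers(log_text, markers=EXPECTED_MARKERS):
    """Return the markers that do not appear anywhere in log_text."""
    return [marker for marker in markers if marker not in log_text]


# A shortened copy of the expected output, used here as stand-in log text.
sample_log = """Configuring CommonServices
    runSystematics: False
    filterSystematics: None
| /*** PythonConfig AsgService/CP::SystematicsSvc/SystematicsSvc *****************
"""

print(missing_markers(sample_log))  # prints []
```

An empty list means every marker was found. Any strings returned point to the part of the configuration that was not picked up.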

When you are happy that it is running correctly, commit and push your changes.