Configuring Systematics Algorithm Sequences

Last update: 04 Aug 2020

Configuring an analysis algorithm sequence is somewhat more complicated than configuring a regular algorithm sequence, because the individual algorithms are not configured independently: the configuration of later algorithms depends on the configuration of algorithms earlier in the sequence.

Each algorithm in the sequence has some meta-configuration associated with it, and in a post-configuration step that meta-configuration is converted into actual property values. If you followed the beginner’s tutorial, that is what the algSeq.configure(...) call does. For more details on the post-configuration, see the beginner’s tutorial; the rest of this page focuses on the meta-configuration.

The reason for attaching the meta-configuration to the sequence, instead of setting the properties directly, is that it allows the sequence to be manipulated after it has been assembled (but before the post-configuration), e.g. algorithms can be added, removed or reordered.

You can get a better feel of how this works by looking at some existing examples, e.g.:

Systematics Handle Meta-Configuration

One of the basic ideas of configuring analysis sequences is that the whole sequence usually works with one object type (e.g. muons) and the output of each algorithm is used as the input of the next algorithm. To that end, the name of the property holding the input container is set via inputPropName, and the name of the property holding the output container is set via outputPropName.

This generally matches nicely with the SysCopyHandle that most CP algorithms use internally, which has one property for the input container and one for the output container. Note, though, that outputPropName can be omitted if the algorithm shouldn’t make a copy internally (or if it doesn’t use a copy handle).

Some algorithms (like MET and overlap removal) have multiple inputs/outputs, in which case these meta-properties are configured using a dictionary instead of a single string.
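To make the chaining of containers more concrete, here is a toy model in plain Python. It is only a sketch and not the real sequence implementation: ToyAlg, ToySequence, their methods and the naming scheme for intermediate containers are hypothetical stand-ins, and only the single-string form of inputPropName and outputPropName is covered (not the dictionary form).

```python
class ToyAlg:
    """Stand-in for a configurable algorithm: a name plus a property dict."""

    def __init__(self, name):
        self.name = name
        self.props = {}


class ToySequence:
    """Hypothetical sequence that stores meta-configuration, not property values."""

    def __init__(self):
        self.entries = []  # (algorithm, inputPropName, outputPropName)

    def append(self, alg, inputPropName, outputPropName=None):
        self.entries.append((alg, inputPropName, outputPropName))

    def remove(self, name):
        self.entries = [e for e in self.entries if e[0].name != name]

    def configure(self, inputName, outputName):
        """Post-configuration: derive container names from the meta-configuration."""
        # Algorithms without an outputPropName do not create a new container.
        writers = [alg for alg, _, outProp in self.entries if outProp is not None]
        current = inputName
        for alg, inProp, outProp in self.entries:
            alg.props[inProp] = current
            if outProp is not None:
                if alg is writers[-1]:
                    current = outputName  # last copy becomes the sequence output
                else:
                    current = f"{outputName}_{alg.name}"  # intermediate container
                alg.props[outProp] = current


calib, select, eff = ToyAlg("Calib"), ToyAlg("Select"), ToyAlg("Eff")
seq = ToySequence()
seq.append(calib, inputPropName="muons", outputPropName="muonsOut")
seq.append(select, inputPropName="muons")  # works in place, no copy
seq.append(eff, inputPropName="muons", outputPropName="muonsOut")
seq.configure(inputName="Muons", outputName="AnalysisMuons")

for alg in (calib, select, eff):
    print(alg.name, alg.props)
```

Because the container names are only worked out in configure(), an algorithm can still be removed or reordered beforehand in this toy model, and the remaining algorithms end up wired together consistently.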

Since each algorithm should only run for the systematics that affect it instead of all systematics (e.g. we don’t need to redo jet energy calibration to evaluate the electron scale factor systematics), we also need to set the systematics that affect each algorithm. This is done via the configuration option affectingSystematics which holds a regular expression that matches all the systematics affecting this algorithm.

Some important caveats:

  • The regular expression is checked against the systematics of the tool in question, and if it misses one of them an error is generated at runtime.
  • The regular expression should only match the systematics for this algorithm; any upstream systematics will automatically be included by the post-configuration.
  • This doesn’t specify the actual list of systematics (that happens via SysListLoaderAlg), but acts as a filter on that list. So if the user decides not to run a systematic, it can just be omitted in SysListLoaderAlg without any changes here.
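The filtering described above can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not the actual implementation: the function, the algorithm names and the systematic names are made up for illustration, the pattern is assumed to have to match the whole systematic name, and the runtime check for missed systematics from the first caveat is not modelled.

```python
import re


def systematics_per_algorithm(entries, requested):
    """Map each algorithm name to the requested systematics it has to run for.

    entries   : ordered list of (algorithm name, affectingSystematics pattern)
    requested : systematics the user asked for (the role SysListLoaderAlg plays)
    """
    result = {}
    accumulated = []  # systematics introduced by this or any upstream algorithm
    for name, pattern in entries:
        regex = re.compile(pattern)
        for sys_name in requested:
            # Assumption: the pattern has to match the whole systematic name.
            if regex.fullmatch(sys_name) and sys_name not in accumulated:
                accumulated.append(sys_name)
        result[name] = list(accumulated)
    return result


# Illustrative patterns and systematic names only.
entries = [
    ("MuonCalibAlg", r"MUON_(ID|MS|SCALE).*"),
    ("MuonEffAlg", r"MUON_EFF_.*"),
]
requested = ["MUON_ID__1up", "MUON_EFF_RECO__1up", "JET_JER__1up"]
per_alg = systematics_per_algorithm(entries, requested)
for name, systs in per_alg.items():
    print(name, systs)
```

In this sketch the efficiency algorithm only declares its own pattern but still picks up the upstream calibration systematic, while the jet systematic is ignored by both algorithms. Dropping an entry from requested changes the result without touching any pattern.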

dynConfig and metaConfig

There are some properties (like preselection) that are not involved in systematics handling but still depend on which algorithms came before the algorithm in question in the sequence. In the past we would just have maintained a Python variable while configuring the algorithm sequence. While that worked fine as long as you used the sequence as is, it made it very hard to add, remove or reorder algorithms.

So the idea is that instead of tracking these properties in a Python variable, we add meta-information to the algorithm sequence and then create the actual property value during post-configuration. In more practical terms, you use metaConfig to set the configuration values that you want to communicate to subsequent algorithms, and then dynConfig to declare the actual property to set. You can also add default/starting values to the entire sequence via addMetaConfigDefault.

Three of the finer points:

  • Since there are multiple variables to track, the mechanism generally relies on dictionaries and identifies the variables via names.
  • Since the variables are accumulated from multiple algorithms, they consist of lists that get concatenated (which is also how those variables were represented before).
  • Since the format of the variables being set can vary wildly, dynConfig uses a lambda function to convert the list of values into the property value.
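The interplay of the three points above can be illustrated with a small toy model in plain Python. It is a sketch rather than the real post-configuration: ToyAlg and post_configure are hypothetical, the key "selectionDecorNames" and the "&&" join are only illustrative choices, and it assumes that an algorithm’s own metaConfig only becomes visible to the algorithms after it.

```python
class ToyAlg:
    """Stand-in for a configurable algorithm: a name plus a property dict."""

    def __init__(self, name):
        self.name = name
        self.props = {}


def post_configure(entries, defaults=None):
    """Turn metaConfig/dynConfig into property values.

    entries  : ordered list of (algorithm, metaConfig dict, dynConfig dict)
    defaults : starting values, the role addMetaConfigDefault plays in the text
    """
    # Variables are identified by name and hold lists of values.
    meta = {key: list(values) for key, values in (defaults or {}).items()}
    for alg, metaConfig, dynConfig in entries:
        # Each dynConfig function converts the accumulated lists into a property.
        for prop, func in dynConfig.items():
            alg.props[prop] = func(meta)
        # Afterwards this algorithm's values are concatenated for later algorithms.
        for key, values in metaConfig.items():
            meta.setdefault(key, []).extend(values)


def join_selections(meta):
    return "&&".join(meta["selectionDecorNames"])


pt_cut, id_cut, scale_factor = ToyAlg("PtCut"), ToyAlg("IdCut"), ToyAlg("ScaleFactor")
entries = [
    (pt_cut, {"selectionDecorNames": ["pt_sel"]}, {}),
    (id_cut, {"selectionDecorNames": ["id_sel"]}, {"preselection": join_selections}),
    (scale_factor, {}, {"preselection": lambda meta: "&&".join(meta["selectionDecorNames"])}),
]
post_configure(entries, defaults={"selectionDecorNames": []})

for alg in (pt_cut, id_cut, scale_factor):
    print(alg.name, alg.props)
```

In this toy model the second algorithm is preselected on the first cut only, and the last one on both cuts. Removing or reordering entries before calling post_configure changes the resulting preselection strings without any manual bookkeeping.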