Job Steering

Last update: 17 May 2024

Job Transforms

The production Athena jobs are steered using so-called Job Transforms. Examples of those are, and more. They can usually be recognised by ending with

Each transform can consist of multiple substeps, depending on the configuration. Some arguments are “substep-aware”, which means that the name of a substep can be given as a prefix, e.g. --MyOpt 'Overlay:True' sets MyOpt only for the Overlay substep. A few helper prefixes are available:

  • all: applies a specific argument to all substeps (--preInclude 'all:Campaigns.MC20e')
  • default: applies a specific argument unless overridden explicitly by another argument using a substep name (--postInclude '')
  • first: applies a specific argument only to the first substep
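
How these prefixes might resolve for a given substep can be sketched as a toy model in Python. This is not the PyJobTransforms implementation. The function names, the rule that an unprefixed value counts as default, the precedence of an explicit substep name over first, and the MyPackage and OtherStep names are all assumptions made for illustration.

```python
def group_by_prefix(values):
    """Group 'prefix:value' strings by their prefix.

    Assumption: a value without a prefix is treated as 'default'.
    """
    grouped = {}
    for item in values:
        prefix, sep, rest = item.partition(":")
        if not sep:
            prefix, rest = "default", item
        grouped.setdefault(prefix, []).append(rest)
    return grouped


def values_for_substep(grouped, substep, is_first=False):
    """Return the values that apply to one substep in this toy model."""
    selected = list(grouped.get("all", []))   # 'all' applies to every substep
    if substep in grouped:                    # an explicit substep name wins
        selected += grouped[substep]
    elif is_first and "first" in grouped:     # 'first' only for the first substep
        selected += grouped["first"]
    else:                                     # otherwise fall back to 'default'
        selected += grouped.get("default", [])
    return selected


grouped = group_by_prefix([
    "all:Campaigns.MC20e",
    "default:MyPackage.defaultSetup",   # hypothetical name
    "RAWtoESD:MyPackage.recoSetup",     # hypothetical name
])
print(values_for_substep(grouped, "RAWtoESD"))
print(values_for_substep(grouped, "OtherStep"))  # hypothetical substep name
```

In this model the RAWtoESD substep receives the all value plus its own value, while any other substep receives the all value plus the default one.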

These options are meant to be combined, e.g.
--postInclude "all:PyJobTransforms.UseFrontier" "RAWtoESD:MyModifter
In this case, the Frontier DB will be used in all steps and, in addition, a custom file will be included in RAWtoESD to modify the setup. Note that there is no = after --postInclude. This allows --postInclude to consume all of the values that follow it (this is standard Python option-parsing behaviour).
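
The effect of the = sign can be reproduced with Python's argparse module. The sketch below assumes a parser that declares --postInclude with nargs="+". The parser itself and the MyPackage.MyModifier value are hypothetical and only illustrate the parsing behaviour.

```python
import argparse

# Hypothetical stand-in parser: one option that accepts one or more values.
parser = argparse.ArgumentParser()
parser.add_argument("--postInclude", nargs="+", default=[])


def parse_post_include(argv):
    """Return (values consumed by --postInclude, leftover arguments)."""
    namespace, leftovers = parser.parse_known_args(argv)
    return namespace.postInclude, leftovers


# Space-separated form: every following value is consumed by the option.
print(parse_post_include([
    "--postInclude",
    "all:PyJobTransforms.UseFrontier",
    "RAWtoESD:MyPackage.MyModifier",   # hypothetical value
]))

# '=' form: only the value attached to '=' is consumed; the rest is left over.
print(parse_post_include([
    "--postInclude=all:PyJobTransforms.UseFrontier",
    "RAWtoESD:MyPackage.MyModifier",   # hypothetical value
]))
```

With the = form, the second value is no longer attached to --postInclude, which is why the space-separated form is the one to use.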

There are many ways to steer job transforms. The --help argument lists all available properties (e.g. --help). In production, most of those properties are set by AMI tags. Any AMI tag can also be evaluated from the command line, e.g. using --AMIConfig q442. Note that inputs and outputs usually need to be specified explicitly, but in the case of this q-tag they are already part of the definition.

Modifying the Job Configuration

While higher-level job configuration can be steered using dedicated arguments, it is also possible to modify it at a lower level. The following arguments are used for that:

  • preExec: Can run arbitrary Python code on flags (only this object is exposed).
  • preInclude: Calls a function that accepts configuration flags as the only argument. Passed as an “import string” of the form <package>.<module>.<function> (or <package>.<function> if aliased in
  • postExec: Can run arbitrary Python code, but only flags and the top-level CA, called cfg, are exposed.
  • postInclude: Calls a function that accepts configuration flags and the main CA instance. If a function with only one mandatory argument (the flags) is passed, it is assumed to be a configuration fragment and is merged with the top-level CA.
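
The include mechanism described above can be sketched in plain Python. This is a simplified model, not the Athena code: ToyAccumulator, load_import_string, apply_post_include and the example functions are hypothetical names, and the flags object is a plain namespace standing in for the real configuration flags.

```python
import importlib
import inspect
from types import SimpleNamespace


def load_import_string(path):
    """Resolve '<package>.<module>.<function>' to the callable it names."""
    module_name, _, func_name = path.rpartition(".")
    return getattr(importlib.import_module(module_name), func_name)


def mandatory_positional_count(func):
    """Count positional parameters that have no default value."""
    params = inspect.signature(func).parameters.values()
    return sum(
        1 for p in params
        if p.default is p.empty
        and p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
    )


class ToyAccumulator:
    """Hypothetical stand-in for the top-level CA."""

    def __init__(self):
        self.components = []

    def merge(self, other):
        self.components += other.components


def apply_post_include(func, flags, cfg):
    """Call a postInclude; a flags-only function is merged as a fragment."""
    if mandatory_positional_count(func) == 1:
        cfg.merge(func(flags))
    else:
        func(flags, cfg)


def my_fragment(flags):          # flags only: returns a fragment to merge
    acc = ToyAccumulator()
    acc.components.append("FragmentAlg")
    return acc


def my_modifier(flags, cfg):     # flags and cfg: modifies cfg in place
    cfg.components.append("ModifierAlg")


flags = SimpleNamespace()
cfg = ToyAccumulator()
apply_post_include(my_fragment, flags, cfg)
apply_post_include(my_modifier, flags, cfg)
print(cfg.components)

# An import string is split at the last dot into module and function.
print(load_import_string("os.path.join") is __import__("os").path.join)
```

Inspecting the signature is one way to tell the two kinds of postInclude functions apart. The real transforms may make that distinction differently.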

Any of the described arguments can be applied to all trans

The order of execution can be transform-dependent, but it is usually as follows:

  1. Autoconfigure flags based on the type of the job and the input.
  2. Load preIncludes.
  3. Execute preExecs.
  4. Configure the main ComponentAccumulator.
  5. Load postIncludes.
  6. Execute postExecs.
  7. Run the job.
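
The consequences of this ordering can be illustrated with a toy pipeline in Python. Everything here is an assumption made for illustration: run_toy_job, the max_events flag and the dictionary standing in for the main ComponentAccumulator are hypothetical and do not correspond to Athena objects.

```python
from types import SimpleNamespace


def run_toy_job(pre_includes=(), pre_execs=(), post_includes=(), post_execs=()):
    """Walk through the usual order of execution with stand-in objects."""
    # 1. Autoconfigure flags (here: a single hypothetical flag).
    flags = SimpleNamespace(max_events=-1)
    # 2. Load preIncludes: functions that take the flags only.
    for include in pre_includes:
        include(flags)
    # 3. Execute preExecs: only 'flags' is exposed to the code string.
    for code in pre_execs:
        exec(code, {"flags": flags})
    # 4. Configure the main accumulator from the flags as they are now.
    cfg = {"max_events": flags.max_events, "components": ["MainAlg"]}
    # 5. Load postIncludes: functions that take the flags and cfg.
    for include in post_includes:
        include(flags, cfg)
    # 6. Execute postExecs: 'flags' and 'cfg' are exposed.
    for code in post_execs:
        exec(code, {"flags": flags, "cfg": cfg})
    # 7. Running the job is omitted; return the final state instead.
    return flags, cfg


flags, cfg = run_toy_job(
    pre_execs=["flags.max_events = 10"],
    post_execs=["flags.max_events = 99", "cfg['components'].append('Extra')"],
)
print(cfg["max_events"], flags.max_events, cfg["components"])
```

In this sketch a flag changed in a preExec is picked up when the main configuration is built, whereas a flag changed in a postExec arrives too late to affect it. Changes after step 4 have to act on cfg directly.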