Job Steering

Last update: 09 Nov 2021

Job Transforms

Production Athena jobs are steered using so-called Job Transforms. Examples of those are, and more. They can usually be recognised by ending with

Each transform can consist of multiple substeps, depending on the configuration. Some arguments are “substep-aware”, which means that the name of a substep can be set as a prefix, e.g. --CA 'Overlay:True' enables ComponentAccumulator-based configuration only for the overlay substep. There are a few helpers available:

  • all: applies a specific argument to all substeps (--preInclude 'all:Campaigns.MC20e')
  • default: applies a specific argument unless overridden explicitly by another argument using a substep name (--postInclude '')
  • first: applies a specific argument to the first substep only
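
The prefix helpers above can be illustrated with a small pure-Python sketch. It is not the transform framework's own code: the function names `parse_substep_values` and `values_for_substep`, the module name `MyPackage.overlayTweaks`, and the exact precedence (explicit substep name, then first, then default, with all always added) are assumptions made for illustration.

```python
def parse_substep_values(raw_values):
    """Group 'prefix:value' strings by prefix.

    Assumption: a value without any prefix is treated like 'default'.
    Values that themselves contain a colon are not handled by this sketch.
    """
    parsed = {}
    for raw in raw_values:
        prefix, separator, value = raw.partition(":")
        if not separator:
            prefix, value = "default", raw
        parsed.setdefault(prefix, []).append(value)
    return parsed


def values_for_substep(parsed, name, is_first=False):
    """Return the values that one substep would see.

    'all' values are always included. On top of that, the most specific
    match is used: explicit substep name, then 'first', then 'default'.
    """
    selected = list(parsed.get("all", []))
    if name in parsed:
        selected += parsed[name]
    elif is_first and "first" in parsed:
        selected += parsed["first"]
    elif "default" in parsed:
        selected += parsed["default"]
    return selected


if __name__ == "__main__":
    parsed = parse_substep_values([
        "all:Campaigns.MC20e",
        "Overlay:MyPackage.overlayTweaks",   # hypothetical module name
        "default:MyPackage.defaultTweaks",   # hypothetical module name
    ])
    print(values_for_substep(parsed, "Overlay"))
    print(values_for_substep(parsed, "SomeOtherStep"))
```

In this sketch the explicit Overlay entry replaces the default entry for the Overlay substep, while the all entry reaches every substep.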

Job transforms can be steered in many ways. The --help argument lists all available properties (e.g. --help). In production, most of those properties are steered by AMI tags. Any AMI tag can also be evaluated from the command line, e.g. using --AMIConfig q442. Note that inputs and outputs usually need to be specified explicitly, but for this q-tag they are already present in the definition.

Modifying the Job Configuration

While higher-level job configuration can be steered using different arguments, it is also possible to affect it at a lower level. The following arguments are used for that:

  • preExec: Executes arbitrary Python code; only ConfigFlags is exposed.
  • preInclude: Calls a function that accepts ConfigFlags as the only argument. Passed as an “import string” of the form <package>.<module>.<function> (or <package>.<function> if aliased in
  • postExec: Executes arbitrary Python code; only ConfigFlags and the top-level CA, called cfg, are exposed.
  • postInclude: Calls a function that accepts ConfigFlags and the main CA instance. If a function with only one mandatory argument (the flags) is passed, it is assumed to be a configuration fragment and is merged with the top-level CA.
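
As a rough illustration of the “import string” form and of the postInclude rule about functions with a single mandatory argument, the following sketch uses only the Python standard library. It is an assumption-laden stand-in rather than the actual transform implementation: `resolve_import_string`, `count_mandatory_args`, `apply_post_include` and `StandInAccumulator` are hypothetical names, and the stand-in class only records what would be merged.

```python
import importlib
import inspect


def resolve_import_string(import_string):
    """Turn '<package>.<module>.<function>' into the function object."""
    module_name, _, function_name = import_string.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


def count_mandatory_args(function):
    """Count positional parameters that have no default value."""
    positional = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    return sum(
        1
        for parameter in inspect.signature(function).parameters.values()
        if parameter.kind in positional
        and parameter.default is inspect.Parameter.empty
    )


class StandInAccumulator:
    """Hypothetical stand-in for the top-level CA; it only records merges."""

    def __init__(self):
        self.merged = []

    def merge(self, other):
        self.merged.append(other)


def apply_post_include(function, flags, cfg):
    """Mimic the postInclude rule described in the text above."""
    if count_mandatory_args(function) == 1:
        # Only the flags are required: treat the function as a
        # configuration fragment and merge its result into cfg.
        cfg.merge(function(flags))
    else:
        # Flags and the main CA are both passed to the function.
        function(flags, cfg)


if __name__ == "__main__":
    # A standard-library function is used here purely to show the lookup.
    square_root = resolve_import_string("math.sqrt")
    print(square_root(9.0))
```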

Any of the described arguments can be applied to all trans

The order of execution can depend on the transform, but it is usually as follows:

  1. Autoconfigure flags based on the type of the job and the input.
  2. Load preIncludes.
  3. Execute preExecs.
  4. Configure the main ComponentAccumulator.
  5. Load postIncludes.
  6. Execute postExecs.
  7. Run the job.
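
The usual order listed above can be sketched in plain Python. This is a simplified model under stated assumptions, not the real transform code: `run_job` and `StandInAccumulator` are hypothetical, the flags are a plain `SimpleNamespace`, and the autoconfiguration and run steps are reduced to log entries. The exec calls expose only the names that the text says are available (ConfigFlags for preExec, ConfigFlags and cfg for postExec).

```python
from types import SimpleNamespace


class StandInAccumulator:
    """Hypothetical stand-in for the main ComponentAccumulator."""

    def __init__(self):
        self.components = []


def run_job(pre_includes, pre_execs, post_includes, post_execs):
    log = []

    # 1. Autoconfigure flags (reduced to creating an empty namespace here).
    flags = SimpleNamespace()
    log.append("autoconfigure")

    # 2. preIncludes: functions that accept the flags as the only argument.
    for include in pre_includes:
        include(flags)
        log.append("preInclude")

    # 3. preExecs: arbitrary Python with only ConfigFlags exposed.
    for code in pre_execs:
        exec(code, {"ConfigFlags": flags})
        log.append("preExec")

    # 4. Configure the main accumulator.
    cfg = StandInAccumulator()
    log.append("configure")

    # 5. postIncludes: functions that accept the flags and the main CA.
    for include in post_includes:
        include(flags, cfg)
        log.append("postInclude")

    # 6. postExecs: arbitrary Python with ConfigFlags and cfg exposed.
    for code in post_execs:
        exec(code, {"ConfigFlags": flags, "cfg": cfg})
        log.append("postExec")

    # 7. Run the job (only logged in this sketch).
    log.append("run")
    return flags, cfg, log


if __name__ == "__main__":
    def set_events(flags):
        flags.max_events = 10

    def add_component(flags, cfg):
        cfg.components.append("extra")

    flags, cfg, log = run_job(
        [set_events],
        ["ConfigFlags.max_events *= 2"],
        [add_component],
        ["cfg.components.append(ConfigFlags.max_events)"],
    )
    print(log)
```

One consequence visible in the sketch is that a preExec can see changes made by a preInclude, and that neither can touch cfg because it does not exist until step 4.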