Steering the Job

Last update: 05 Jul 2023

Transforms

For the end of the tutorial we will look at the job transforms and how to modify their default behaviour. One of the simplest cases is the overlay transform, Overlay_tf.py.

HITS_File="/cvmfs/atlas-nightlies.cern.ch/repo/data/data-art/Tier0ChainTests/mc16_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad.simul.HITS.e6337_s3681/HITS.25836812._004813.pool.root.1"
RDO_BKG_File="/cvmfs/atlas-nightlies.cern.ch/repo/data/data-art/OverlayTests/PresampledPileUp/22.0/Run2/large/mc20_13TeV.900149.PG_single_nu_Pt50.digit.RDO.e8307_s3482_s3136_d1715/RDO.26811908._031801.pool.root.1"

Overlay_tf.py \
--CA \
--detectors Truth \
--inputHITSFile ${HITS_File} \
--inputRDO_BKGFile ${RDO_BKG_File} \
--outputRDOFile MC_plus_MC.RDO.pool.root \
--maxEvents 5 \
--conditionsTag OFLCOND-MC16-SDR-RUN2-09  \
--geometryVersion ATLAS-R2-2016-01-00-01

During the transition period, the --CA flag must be passed to run a ComponentAccumulator-based configuration in a transform. Note that not all transforms have been migrated yet.

The easiest way to familiarise yourself with any transform (and also many other utility scripts) is to use the --help argument, e.g. Overlay_tf.py --help. This will list all available command line arguments.

In this exercise you will try to modify the default job configuration using the preExec, preInclude and postExec arguments. To make the job run faster we will only run truth overlay using --detectors Truth.

The first step is to dump the job configuration to a pickle file. To do that, add the following postExec:

--postExec 'with open("Overlay.pkl", "wb") as f: cfg.store(f)'

As a reminder, a postExec can only access the global CA instance cfg and the flags configured earlier in the job, but it can execute any Python code on them.
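To make the scoping rule concrete, here is a minimal sketch of how a postExec string could be executed with only cfg and flags in scope. ToyAccumulator, run_post_exec and demo_post_exec are hypothetical names invented for this illustration; they are not part of Athena, and the real transform machinery may work differently.

```python
import os
import pickle
import tempfile
from types import SimpleNamespace


class ToyAccumulator:
    """Hypothetical stand-in for the global CA instance (not the Athena class)."""

    def __init__(self):
        self.components = {"AthAlgSeq": {"Members": []}}

    def store(self, outfile):
        # Serialise the toy configuration to an already opened binary file.
        pickle.dump(self.components, outfile)


def run_post_exec(code, cfg, flags):
    """Execute a postExec string that can see only `cfg` and `flags`."""
    exec(code, {"cfg": cfg, "flags": flags})


def demo_post_exec():
    """Run a postExec of the same shape as above and read the file back."""
    cfg = ToyAccumulator()
    flags = SimpleNamespace(maxEvents=5)
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "Overlay.pkl")
        # Same one-liner as in the tutorial, but writing to a temporary directory.
        run_post_exec(f'with open({path!r}, "wb") as f: cfg.store(f)', cfg, flags)
        with open(path, "rb") as f:
            return pickle.load(f)
```

Any other name used inside the string would raise a NameError in this sketch, which is the practical meaning of "can only access cfg and the flags".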

The postInclude has the same capability as postExec, but the instructions are read from a Python file. The argument of postInclude is interpreted as a fully qualified function name. For instance, --postInclude "A.B.someAlgCfg" means the function someAlgCfg from package A, file B.py. someAlgCfg must be either a regular CA generator or a function taking the flags and the CA. In the first case it is invoked with the flags, and the CA it produces is merged into the rest of the job configuration. In the second case the function is given the CA itself and can modify its content, e.g.:

# Assuming this function lives in myPostInclude.py in the working directory:
# --postInclude "myPostInclude.myModifier"
def myModifier(flags, acc):
    # Change a property of an algorithm that is already configured
    acc.getEventAlgo("CaloCellMaker").CaloCellsOutputName = "OtherCells"
    # Remove an algorithm from the job
    acc.dropEventAlgo("JetRec")
    acc.merge(...)  # add some other stuff instead
    # ...
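The two calling conventions can be sketched in plain Python. The code below is an illustration only: it assumes the dotted name is resolved with importlib and that the two cases are told apart by the number of required arguments, which may not match what the transforms actually do. ToyAccumulator, some_alg_cfg and my_modifier are hypothetical stand-ins, not Athena code.

```python
import importlib
import inspect


class ToyAccumulator:
    """Hypothetical stand-in for a ComponentAccumulator holding algorithm names."""

    def __init__(self, algs=()):
        self.algs = list(algs)

    def merge(self, other):
        self.algs.extend(other.algs)


def resolve_function(qualified_name):
    """Split 'package.module.function' and import the function it names."""
    module_name, _, function_name = qualified_name.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.function', got {qualified_name!r}")
    return getattr(importlib.import_module(module_name), function_name)


def apply_post_include(function, flags, cfg):
    """Call a generator (flags) or a modifier (flags, cfg), chosen by arity."""
    required = [
        p for p in inspect.signature(function).parameters.values()
        if p.default is inspect.Parameter.empty
    ]
    if len(required) == 1:
        cfg.merge(function(flags))  # generator: merge the CA it returns
    else:
        function(flags, cfg)        # modifier: edits cfg in place


def some_alg_cfg(flags):
    """Generator style: build and return a new accumulator."""
    return ToyAccumulator(["SomeAlg"])


def my_modifier(flags, cfg):
    """Modifier style: change the accumulator that is passed in."""
    cfg.algs.remove("JetRec")
```

With a toy job containing CaloCellMaker and JetRec, applying some_alg_cfg and then my_modifier through apply_post_include leaves CaloCellMaker and SomeAlg. resolve_function can be tried on any importable name, for example "os.path.basename".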

Next let’s look at preExec and preInclude. In production, preInclude is nowadays used extensively with so-called “Campaign Configurations”, where we set up a specific MC processing campaign using a common preInclude. For MC20 these are defined in Campaigns.MC20. To use the MC20e campaign setup, add the preInclude all:Campaigns.MC20e. The all keyword indicates that it should be used for all substeps (although in our example there is only one).
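The substep prefix can be pictured as a simple split of the argument string. The helper below is a hypothetical sketch, not the parser used by the transforms; in particular, treating an argument without a prefix as belonging to all substeps is an assumption made here for illustration.

```python
import re

# A short alphanumeric word followed by ':' is treated as the substep name.
_SUBSTEP = re.compile(r"(?P<substep>[A-Za-z0-9]+):(?P<value>.+)", re.DOTALL)


def split_substep(argument, default="all"):
    """Split 'substep:value' into (substep, value).

    Arguments without a prefix are assigned to `default` (an assumption).
    """
    match = _SUBSTEP.fullmatch(argument)
    if match is None:
        return default, argument
    return match.group("substep"), match.group("value")
```

Here split_substep("all:Campaigns.MC20e") gives the pair ("all", "Campaigns.MC20e"), and a bare "Campaigns.MC20e" is mapped to the same pair.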

Overlay_tf.py \
--CA \
--detectors Truth \
--inputHITSFile ${HITS_File} \
--inputRDO_BKGFile ${RDO_BKG_File} \
--outputRDOFile MC_plus_MC.RDO.pool.root \
--maxEvents 5 \
--conditionsTag OFLCOND-MC16-SDR-RUN2-09  \
--geometryVersion ATLAS-R2-2016-01-00-01 \
--preInclude 'all:Campaigns.MC20e' \
--postExec 'with open("OverlayMC20e.pkl", "wb") as f: cfg.store(f)'

If you compare the pickle files before and after, you will see that the configuration has indeed changed: our familiar algorithm, BeamSpotReweightingAlg, has been added.

confTool.py --diff Overlay.pkl OverlayMC20e.pkl
Step 1: reference file #components: 65
Step 2: file to check  #components: 66
Legend:
Differences in components Settings in 1st file Settings in 2nd file
 Component AthAlgSeq differ
    Members =  ['xAODMaker::EventInfoCnvAlg/EventInfoCnvAlg', 'xAODMaker::EventInfoOverlay/EventInfoOverlay', 'CopyMcEventCollection/CopyMcEventCollection', 'CopyJetTruthInfo/CopyInTimeAntiKt4JetTruthInfo', 'CopyJetTruthInfo/CopyOutOfTimeAntiKt4JetTruthInfo', 'CopyJetTruthInfo/CopyInTimeAntiKt6JetTruthInfo', 'CopyJetTruthInfo/CopyOutOfTimeAntiKt6JetTruthInfo', 'CopyPileupParticleTruthInfo/CopyPileupParticleTruthInfo', 'CopyTimings/CopyTimings', 'CopyTrackRecordCollection/CopyTrackRecordCollectionMuonExitLayer', 'CopyTrackRecordCollection/CopyTrackRecordCollectionMuonEntryLayer', 'CopyTrackRecordCollection/CopyTrackRecordCollectionCaloEntryLayer']  vs  ['xAODMaker::EventInfoCnvAlg/EventInfoCnvAlg', 'xAODMaker::EventInfoOverlay/EventInfoOverlay', 'CopyMcEventCollection/CopyMcEventCollection', 'CopyJetTruthInfo/CopyInTimeAntiKt4JetTruthInfo', 'CopyJetTruthInfo/CopyOutOfTimeAntiKt4JetTruthInfo', 'CopyJetTruthInfo/CopyInTimeAntiKt6JetTruthInfo', 'CopyJetTruthInfo/CopyOutOfTimeAntiKt6JetTruthInfo', 'CopyPileupParticleTruthInfo/CopyPileupParticleTruthInfo', 'CopyTimings/CopyTimings', 'CopyTrackRecordCollection/CopyTrackRecordCollectionMuonExitLayer', 'CopyTrackRecordCollection/CopyTrackRecordCollectionMuonEntryLayer', 'CopyTrackRecordCollection/CopyTrackRecordCollectionCaloEntryLayer', 'Simulation::BeamSpotReweightingAlg/BeamSpotReweightingAlg']   <<
      >>  only in 2nd file :   ['Simulation::BeamSpotReweightingAlg/BeamSpotReweightingAlg']

 Component  BeamSpotReweightingAlg  only in 2nd file

As a final example, you can again try changing the value of flags.Digitization.InputBeamSigmaZ to 50 using

--preExec 'flags.Digitization.InputBeamSigmaZ = 50'
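The reason a preExec can change the resulting configuration is ordering: it acts on the flags before the configuration is built from them. The toy below sketches that order. build_toy_config and run_toy_job are hypothetical, the mapping from the flag to the Input_beam_sigma_z property is a simplification, and the starting value of 42 is taken from the confTool.py output in this section.

```python
from types import SimpleNamespace


def build_toy_config(flags):
    """Hypothetical configuration step: copy a flag into an algorithm property."""
    sigma_z = float(flags.Digitization.InputBeamSigmaZ)
    return {"BeamSpotReweightingAlg": {"Input_beam_sigma_z": sigma_z}}


def run_toy_job(pre_exec=None):
    """Create flags, apply an optional preExec, then build the configuration."""
    flags = SimpleNamespace(Digitization=SimpleNamespace(InputBeamSigmaZ=42))
    if pre_exec:
        # The preExec sees only `flags`; no configuration exists yet.
        exec(pre_exec, {"flags": flags})
    return build_toy_config(flags)
```

Calling run_toy_job() keeps the starting value, while run_toy_job("flags.Digitization.InputBeamSigmaZ = 50") builds the toy configuration with the modified flag.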

Assuming the configuration of this job is again stored with a postExec, here as OverlayFlags.pkl, this results in the following configuration difference:

confTool.py --diff OverlayMC20e.pkl OverlayFlags.pkl
Step 1: reference file #components: 66
Step 2: file to check  #components: 66
Legend:
Differences in components Settings in 1st file Settings in 2nd file
 Component BeamSpotReweightingAlg differ
    Input_beam_sigma_z =  42.0  vs  50.0   <<
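The kind of comparison reported by confTool.py --diff can be sketched on plain dictionaries. This is not how confTool.py is implemented and it makes no assumption about the layout of the pickle written by cfg.store; diff_components is a hypothetical helper working on toy {component: {property: value}} dictionaries.

```python
def diff_components(reference, candidate):
    """List differences between two {component: {property: value}} dicts."""
    lines = []
    for name in sorted(set(reference) | set(candidate)):
        if name not in candidate:
            lines.append(f"Component {name} only in 1st file")
        elif name not in reference:
            lines.append(f"Component {name} only in 2nd file")
        else:
            ref, cand = reference[name], candidate[name]
            # Compare every property that appears on either side.
            for prop in sorted(set(ref) | set(cand)):
                if ref.get(prop) != cand.get(prop):
                    lines.append(
                        f"{name}.{prop} = {ref.get(prop)!r} vs {cand.get(prop)!r}"
                    )
    return lines


# Toy inputs mirroring the two values shown in the output above.
before = {"BeamSpotReweightingAlg": {"Input_beam_sigma_z": 42.0}}
after = {"BeamSpotReweightingAlg": {"Input_beam_sigma_z": 50.0}}
report = diff_components(before, after)
```

The same helper also flags components present on only one side, which is the situation seen earlier when BeamSpotReweightingAlg appeared only in the second file.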