TRUTH Derivations

Last update: 20 Nov 2024

Before an MC request can be submitted for central production, it must be validated to ensure that the output is as expected. The first step is to produce TRUTH derivations from the EVNT file created by the Gen_tf.py command.

Introduction to derivations

Derivations are in “xAOD” format and typically contain a subset of the available information as well as additional information compared to the AODs produced by the reconstruction software. In the case of TRUTH derivations, they include information about the truth record (simulated particles before they go through the detector simulation) for all events.

There are three supported types of TRUTH derivations, each containing a different amount of the truth record. The formats are as follows:

  • TRUTH0: An exact copy of the input EVNT file in xAOD format. This includes all particles and vertices and their corresponding links.
  • TRUTH1: The reduced containers from TRUTH3 plus the complete collection of particles from TRUTH0, as well as links to truth jet constituents and truth charged particle jets.
  • TRUTH3: This is the main truth analysis format. It holds the containers with the main truth information needed for analysis purposes. This is the same truth format that DAOD_PHYS and DAOD_PHYSLITE use.

For this tutorial, we will be using TRUTH1 for the MC validation. The format provides the complete truth record navigation necessary to validate the physics process.

Producing TRUTH derivations

Use a new shell for this part. Begin by creating the directory tutorial/MCTutorial/MCDerivation and moving into it:

mkdir -p tutorial/MCTutorial/MCDerivation
cd tutorial/MCTutorial/MCDerivation

Set up a recent Athena release:

setupATLAS
asetup Athena,24.0.62

Tip: If you just finished the MC Sample Generation step, use a fresh terminal (log out and log in again) so that your environment does not accidentally mix releases.

To produce a derivation from evgen.root, use the following command:

Derivation_tf.py \
    --inputEVNTFile=../MCGeneration/evgen.root \
    --outputDAODFile=mcval.pool.root \
    --formats TRUTH1

Tip: You can provide a path to evgen.root as the --inputEVNTFile argument or copy it into the same directory where the command is being run. Using a path reduces the number of redundant file copies and ensures you run over the latest input file.

Tip: You can restrict the number of events the derivation processes with, for example, --maxEvents=10. By default, the derivation runs over all the events in the input file, which is equivalent to --maxEvents=-1.
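As a minimal sketch of a short test run, the snippet below assembles the same Derivation_tf.py command with --maxEvents added and prints it before running. The variable names INPUT, OUTPUT, MAX_EVENTS and CMD are illustrative only, and the guard assumes the transform is on your PATH once the release is set up.

```shell
# Sketch of a short test run; INPUT, OUTPUT, MAX_EVENTS and CMD are
# illustrative variable names, not options of the transform itself.
INPUT="../MCGeneration/evgen.root"
OUTPUT="mcval.pool.root"
MAX_EVENTS=10   # use -1 to process every event in the input file

CMD="Derivation_tf.py --inputEVNTFile=$INPUT --outputDAODFile=$OUTPUT --formats TRUTH1 --maxEvents=$MAX_EVENTS"

# Show the full command so it can be inspected first.
echo "$CMD"

# Run it only if the transform is on the PATH (after asetup) and the input exists.
# $CMD is left unquoted on purpose so the shell splits it into arguments;
# this assumes the paths contain no spaces.
if command -v Derivation_tf.py >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    $CMD
else
    echo "Skipping run: Derivation_tf.py or $INPUT not found"
fi
```

Once the short run looks sensible, setting MAX_EVENTS to -1 (or dropping the option) processes the full input file.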

The Derivation_tf.py command produces a file in the DAOD_TRUTH1 format, named DAOD_TRUTH1.mcval.pool.root: the format is prepended to the name given to --outputDAODFile. This is the file we will use for the validation. The command also writes a log file named log.EVNTtoDAOD that can be useful for debugging.
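A quick sanity check after the job finishes might look like the sketch below. The helper check_outputs is hypothetical (it is not part of Athena), and searching for ERROR or FATAL message levels is only a rough heuristic: a match is a prompt to read the log, not proof that the job failed.

```shell
# check_outputs is a hypothetical helper, not an Athena tool.
#   $1: expected DAOD file, $2: transform log file
# Returns 1 if the DAOD is missing or empty, 2 if the log has ERROR/FATAL lines.
check_outputs() {
    out="$1"
    log="$2"
    if [ ! -s "$out" ]; then
        echo "missing or empty: $out"
        return 1
    fi
    if [ -f "$log" ] && grep -q -E 'ERROR|FATAL' "$log"; then
        echo "log contains ERROR or FATAL lines; inspect $log"
        return 2
    fi
    echo "ok: $out"
}

# File names as used in this tutorial; "|| true" keeps the shell going on failure.
check_outputs DAOD_TRUTH1.mcval.pool.root log.EVNTtoDAOD || true
```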

Tip: You can produce multiple derivations with a single command by passing several DAOD formats, separated by commas, to the --formats argument, e.g. --formats TRUTH1,TRUTH3
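When several formats are requested, it can help to list the files you expect to find afterwards. The sketch below assumes that each output follows the DAOD_<FORMAT>.<name> pattern seen for TRUTH1 in this tutorial; the variables FORMATS, STEM and EXPECTED are illustrative names.

```shell
# Assumption: each output is named DAOD_<FORMAT>.<name given to --outputDAODFile>,
# as observed for TRUTH1 in this tutorial.
FORMATS="TRUTH1,TRUTH3"
STEM="mcval.pool.root"
EXPECTED=""

# Split the comma-separated list and report which expected files are present.
for fmt in $(echo "$FORMATS" | tr ',' ' '); do
    file="DAOD_${fmt}.${STEM}"
    EXPECTED="$EXPECTED $file"
    if [ -s "$file" ]; then
        echo "found   $file"
    else
        echo "missing $file"
    fi
done
```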

You can check the content of your DAOD using:

checkxAOD.py DAOD_TRUTH1.mcval.pool.root

This prints information about all of the xAOD containers available in the file. The listing is useful later if a tool complains that it is missing input data, since you can check whether that information was available in the input file. It also helps if you would like to reduce the size of your derivation and need to know which containers and objects consume the most disk space.
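One way to make the listing easier to search is to save it to a file and look for a container by name, as in the sketch below. The helper has_container and the report file name are hypothetical, and TruthParticles is used only as an example container name; the search is a plain text match and makes no assumption about the exact layout of the checkxAOD.py output.

```shell
DAOD="DAOD_TRUTH1.mcval.pool.root"
REPORT="checkxAOD_report.txt"   # illustrative file name

# has_container is a hypothetical helper: it succeeds if the given name
# appears anywhere in a saved checkxAOD.py report.
#   $1: container name, $2: report file
has_container() {
    grep -qF -- "$1" "$2"
}

# Only run when checkxAOD.py is available (after asetup) and the DAOD exists.
if command -v checkxAOD.py >/dev/null 2>&1 && [ -f "$DAOD" ]; then
    checkxAOD.py "$DAOD" > "$REPORT"
    # TruthParticles is used here as an example container name.
    if has_container "TruthParticles" "$REPORT"; then
        echo "TruthParticles found"
    else
        echo "TruthParticles not listed in $REPORT"
    fi
fi
```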

Tip: Although it’s tempting, you cannot simply produce an analysis derivation format like DAOD_PHYSLITE by specifying PHYSLITE as an output format here. To produce formats like PHYS and PHYSLITE, the detector simulation, digitization, trigger simulation, and reconstruction must be run to produce an AOD file. Only from those AOD files can one make a PHYS or PHYSLITE derivation.