Writing NTuples and Trees using the Text Configuration

Last update: 19 Nov 2024

We previously saw how to fill and save histogram outputs. Many analysis workflows, including the one presented here, rely heavily on TTree outputs (often referred to as ntuples). If you are not familiar with the TTree interface, you should first look at the ROOT documentation and tutorials for TTree.

It is possible to build an output tree by hand, but we can make use of existing tools to store the outputs of the CP algorithms in a compact format that minimizes disk space requirements, especially when systematic variations are used. This ntupling step is done using the Output CP algorithm.

Create the output stream (EventLoop)

Unlike for histogram outputs, EventLoop does not automatically create an output file for the tree. You have to tell EventLoop explicitly that you want it to do so. The tree is then written to a different file than the one with the histograms.

You do this by adding the following into your job steering macro:

# Add output stream
job.outputAdd (ROOT.EL.OutputStream ('ANALYSIS'))

After running, you can find your created output file under submitDir/data-ANALYSIS. If you run the job now and open that file, you'll find that it's empty. That is expected: we haven't told the job what to write out yet.
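To check from Python that the stream directory was created and what it contains, a small helper along these lines can be used. This is a minimal sketch using only the standard library; find_output_files is a hypothetical helper name, and the default arguments assume the submitDir directory and the ANALYSIS stream name used above.

```python
from pathlib import Path


def find_output_files(submit_dir="submitDir", stream="ANALYSIS"):
    """List the ROOT files written for an output stream, with sizes in bytes."""
    # EventLoop places the files of a stream named X under <submit_dir>/data-X
    stream_dir = Path(submit_dir) / f"data-{stream}"
    return sorted((p.name, p.stat().st_size) for p in stream_dir.glob("*.root"))


# Prints one line per file; prints nothing if the directory does not exist yet.
for name, size in find_output_files():
    print(f"{name}: {size} bytes")
```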

Add Output to your Job

To add the desired output ntuple structure to your job, add the following code to your config.yaml file:

# Specify the name of the output tree and any variables associated
# with a container to save.
Output:
    treeName: 'analysis'
    # Variables associated with containers other than MET
    #   Syntax without systematics: '<Container>_NOSYS -> <branch name>'
    #   Syntax with systematics: '<Container>_%SYS% -> <branch name>'
    vars: []
    containers:
        '': 'EventInfo'

We have done a few things here. Firstly, we have specified that the tree containing our information in the ntuple will be named analysis. Secondly, we have specified that we want no additional variable branches in our ntuple, as indicated by the vars: [] line. Finally, we have asked for the EventInfo container to be written to the output. containers is a dictionary whose keys are used as prefixes for the variable names in the output tree and whose values are the container names. An empty string ('') indicates that no prefix should be used. Only events passing the selections in our analysis (e.g. GRL selection and event cleaning) will be saved.
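The prefix rule for containers can be illustrated with a few lines of Python. This is only a sketch that mimics the naming convention described above, not the implementation of the Output CP algorithm; the function branch_names and the short list of EventInfo variables are assumptions made for the example.

```python
def branch_names(containers, variables):
    """Build branch names as <prefix><variable> for every configured container.

    containers: maps a prefix (the dictionary key) to a container name
    variables:  maps a container name to the variable names it provides
    """
    names = []
    for prefix, container in containers.items():
        for variable in variables[container]:
            names.append(prefix + variable)
    return names


# Hypothetical subset of the EventInfo variables, for illustration only
event_info = {"EventInfo": ["runNumber", "eventNumber"]}

print(branch_names({"": "EventInfo"}, event_info))
# ['runNumber', 'eventNumber']
print(branch_names({"ei_": "EventInfo"}, event_info))
# ['ei_runNumber', 'ei_eventNumber']
```

With the empty-string key the branches carry the bare variable names, which is what the configuration above requests.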

Examining the CP Algorithms Output

Open the file submitDir/data-ANALYSIS/dataset.root and check its contents. You should see a TTree named analysis. Using the command analysis->Print() or using a TBrowser, check the contents of the tree. There should be branches named runNumber, eventNumber, mcChannelNumber and several weight branches.
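The same check can be scripted with PyROOT instead of the interactive prompt. The following is a minimal sketch, assuming the file path and tree name used above; list_branches and inspect_output are hypothetical helper names, and ROOT is imported only inside inspect_output so that the first helper can be used on its own.

```python
def list_branches(tree):
    """Return the branch names of a TTree (or any object with the same methods)."""
    return [branch.GetName() for branch in tree.GetListOfBranches()]


def inspect_output(path="submitDir/data-ANALYSIS/dataset.root", tree_name="analysis"):
    """Open the ntuple, print the entry count and branch names, return the names."""
    import ROOT  # imported here so list_branches stays usable without ROOT

    root_file = ROOT.TFile.Open(path)
    if not root_file or root_file.IsZombie():
        raise OSError(f"could not open {path}")
    try:
        tree = root_file.Get(tree_name)
        if not tree:
            raise KeyError(f"no object named {tree_name!r} in {path}")
        names = list_branches(tree)
        print(f"{tree_name}: {tree.GetEntries()} entries")
        for name in names:
            print("  " + name)
        return names
    finally:
        root_file.Close()
```

In an environment where ROOT is set up, calling inspect_output() after the job has run should list the branches of the analysis tree.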

Tip: These branches have the same names as the variables in the EventInfo container. If you had provided a prefix like ei_ as the key in the containers dictionary in the previous section, you would see branch names like ei_runNumber and ei_eventNumber.

Try drawing runNumber and eventNumber to see if they match the distributions you made in your output histograms.

Commit and push your changes when you are satisfied with what you see.