Writing NTuples and Trees using the Text Configuration

We previously saw how to fill and save histogram outputs. Many analysis workflows, including the one presented here, rely heavily on TTree outputs (often referred to as ntuples). If you are not familiar with the TTree interface, you should first look at the ROOT documentation and examples for TTree.

It is possible to build an output tree by hand, but we can make use of existing tools to store the outputs of the CP algorithms in a compact format that minimizes disk space requirements, especially when systematic variations are used. This ntupling step is done using the Output CP algorithm.

Add Output to your Job

To add the desired output ntuple structure to your job, add the following code to your config.yaml file:

# Specify the name of the output tree and any variables associated
# with a container to save.
Output:
    treeName: 'analysis'
    # Variables associated with containers other than MET
    #   Syntax without systematics: '<Container>_NOSYS -> <branch name>'
    #   Syntax with systematics: '<Container>_%SYS% -> <branch name>'
    vars: []
    containers:
        '': 'EventInfo'

We have done a few things here. First, we have specified that the tree holding our information in the ntuple will be named analysis. Second, the vars: [] line specifies that we want no additional variables written as extra branches. Finally, we have asked for the EventInfo container to be written to the output. containers is a dictionary whose keys are used as prefixes for the variable names in the output tree; an empty string ('') indicates that no prefix should be used. Only events passing the selections in our analysis (e.g. GRL selection and event cleaning) will be saved.

Examining the CP Algorithms Output

Open the file workDir/data-ANALYSIS/dataset.root and check its contents. You should see a TTree named analysis. Using the command analysis->Print() or using a TBrowser, check the contents of the tree. There should be branches named runNumber, eventNumber, mcChannelNumber and several weight branches.

These branches have the same names as the variables in EventInfo. If you had provided a prefix such as ei_ as the key in the containers dictionary in the previous section, you would instead see branch names like ei_runNumber and ei_eventNumber.
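
As a minimal sketch of the prefix case just described, the block below repeats the Output configuration from the previous section with only the containers key changed from '' to 'ei_'. The ei_ value is simply the illustrative prefix used in the text; everything else is copied from the earlier block.

```yaml
Output:
    treeName: 'analysis'
    vars: []
    containers:
        # The key is used as a prefix for the branch names of this container,
        # so runNumber would be written as ei_runNumber.
        'ei_': 'EventInfo'
```

With this change, analysis->Print() should list ei_runNumber, ei_eventNumber and so on in place of the unprefixed names.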

Try drawing runNumber and eventNumber to see if they match the distributions you made in your output histograms.

Commit and push your changes when you are satisfied with what you see.