Creating and Filling Trees/NTuples

Last update: 16 Nov 2022

In a typical analysis workflow, you will want to process information in an xAOD and write outputs to histograms or a TTree (ntuple) for further processing. In this section, we will show you how to create a TTree in your analysis algorithm and write some simple variables to it. If you are not familiar with the TTree interface, you should first look at the ROOT documentation and examples for it.

Note that the infrastructure allows you to create as many trees and output files as you want in your analysis, with some limitations. The grid has hard limits in place that prevent the creation of too many output files. Additionally, it is not optimal to produce a separate tree for each systematic variation: doing so uses disk space inefficiently, by one to two orders of magnitude.

The tutorial workflow creates ntuples from xAOD inputs, but your analysis may create histograms instead. We will provide details about creating histograms for completeness, but subsequent tutorial steps rely on ntuples.

In this exercise we will fill our output tree with information from xAOD::EventInfo.

C++ Code

In order to write variables into an output TTree, you need to declare them as member variables in your algorithm class. For this exercise add the following private members to your algorithm header (MyxAODAnalysis.h):

// At the top of the header file:
#include <TTree.h>

  // In the private section of the class:
  /// Output variables for the current event
  unsigned int m_runNumber = 0; ///< Run number
  unsigned long long m_eventNumber = 0; ///< Event number

NB: this is necessary for every variable you need to access across multiple functions in your algorithm.

The setup of the tree happens in the initialize() function of your algorithm. To set the tree up, add the following to the initialization function:

  ANA_CHECK (book (TTree ("analysis", "My analysis ntuple")));
  TTree* mytree = tree ("analysis");
  mytree->Branch ("RunNumber", &m_runNumber);
  mytree->Branch ("EventNumber", &m_eventNumber);

This brings us to one of the more inconvenient features of writing a TTree: "primitive" and "object" variables are booked in ever so slightly different ways. For primitive variables, like the ones we will be getting from xAOD::EventInfo, we pass the TTree::Branch function a pointer to the variable. Per-object quantities, such as the pT of electrons, need a std::vector that holds one value for each electron in the event, and booking a std::vector branch differs slightly from booking a primitive one.

Finally, once the tree and its variables are set up, fill them in the execute() function of the algorithm like this:

  // Read/fill the EventInfo variables:
  const xAOD::EventInfo* eventInfo = nullptr;
  ANA_CHECK (evtStore()->retrieve (eventInfo, "EventInfo"));

  // Print out run and event number from retrieved object
  ANA_MSG_DEBUG ("in execute, runNumber = " << eventInfo->runNumber() << ", eventNumber = " << eventInfo->eventNumber());

  m_runNumber = eventInfo->runNumber ();
  m_eventNumber = eventInfo->eventNumber ();

  // Fill the event into the tree:
  tree ("analysis")->Fill ();

Tip: You may already have the code included to retrieve EventInfo. If you do, you don’t need to add the same lines again.

This concludes the changes to the C++ code. You should now be able to (re-)compile your code with all of these changes included.

Job Configuration (EventLoop)

EventLoop does not automatically create an output file with the tree. You have to tell EventLoop explicitly that you want to create an output file with the tree in it. This is a different file than the output file that will hold any histograms you make.

You do this by adding the following into your job steering macro:

# Add output stream
job.outputAdd (ROOT.EL.OutputStream ('ANALYSIS'))

You can find your created output file under submitDir/data-ANALYSIS.

Job Configuration (Athena)

Athena also requires one more step: you have to tell your job where your algorithm should write the tree(s). You do that by adding the following to your jobOption file (see the main Athena tutorial for more info):

jps.AthenaCommonFlags.HistOutputs = ["ANALYSIS:MyxAODAnalysis.outputs.root"]
svcMgr.THistSvc.MaxFileSize=-1 #speeds up jobs that output lots of histograms

This is needed because every Athena algorithm creates histograms and trees through the THistSvc, which can have any number of files/streams open at one time. With the above instruction you tell the service to (re-)create a file called MyxAODAnalysis.outputs.root and assign it to the ANALYSIS stream, which is the stream all analysis algorithms write to by default.

The output file MyxAODAnalysis.outputs.root will hold any TTrees and histograms you create.

Tip: Keep in mind that different algorithms can be assigned to different files/streams, although it is not recommended to create too many separate files during analysis. Still, if you want to tell your algorithm which stream it should write its histograms into, you can do it like this:

alg.RootStreamName = 'MY_STREAM_01'

You will then have to make sure that MY_STREAM_01 is set up, in the same way as the ANALYSIS stream above.

Commit your changes

Don’t forget to commit and push your changes.