Online Monitoring

Last update: 11 Mar 2024

Introduction

The creation of monitoring histograms should be done using the “Monitored” infrastructure. For technical details and code examples see the Doxygen documentation.

For the documentation on how this infrastructure is used in Data Quality monitoring offline see the DQRun3FrameworkTutorial TWiki.

Using the “Monitored” infrastructure allows the creation and filling of histograms in a thread-safe way with the minimum amount of boilerplate code. There is a separation between the monitored quantity (e.g. energy, momentum, eta, phi) and the final histograms (e.g. momentum vs eta). The variables themselves are defined and filled in the C++ code, whereas the definition of the histograms is entirely done in the Python job configuration via the GenericMonitoringTool. This allows for maximum flexibility without having to re-compile the code for simple histogram changes.

As a starting point add the following empty ToolHandle to your class:

#include "AthenaMonitoringKernel/Monitored.h"
[...]
private:
  ToolHandle<GenericMonitoringTool> m_monTool{this,"MonTool","","Monitoring tool"}; 
  // empty string as third argument turns off monitoring by default

and a conditional retrieve in your initialize() method:

if (!m_monTool.empty()) ATH_CHECK(m_monTool.retrieve());

Finally, make sure you link against the AthenaMonitoringKernelLib library in your CMakeLists.txt:

atlas_add_component( ...
                     LINK_LIBRARIES ... AthenaMonitoringKernelLib ... )

The following sections give more details on monitored variables, the histogram definition in Python, and general guidelines.

Monitored variables

Some of the following examples are show-cased in the AthExMonitored package and can be tested via athena.py AthExMonitored/MonitoredOptions.py.

Scalars

The simplest monitored variable is a scalar (e.g. integer or floating point value):

{
  auto et = Monitored::Scalar<float>("Et");
  auto njets = Monitored::Scalar<int>("nJets");
  auto phi = Monitored::Scalar("phi", 0.0);
  auto eta = Monitored::Scalar("eta", 0.0);
  auto cutType = Monitored::Scalar("cutType", "EtaCut");
  auto mon = Monitored::Group(m_monTool, et, njets, phi, eta, cutType);
  // code to set the values
}

A few remarks to the above code example:

  • The actual type of Monitored::Scalar is irrelevant, which is why we use the auto keyword. It suffices to know that it behaves like a regular builtin arithmetic type.
  • The first argument is the name of the variable; it is used in the histogram definition and, for example, in axis labels. The second (optional) argument is the default value.
  • The underlying type is deduced either from the default value (e.g. 1.0 -> double, 42 -> int) or can be explicitly set via the template parameter.
  • Strings can also be a monitored value. The histogram can either have configured labels or the labels can be assigned dynamically as allowed by the ROOT TH1::Fill method.

Tip Refer to the Monitored::Scalar Doxygen for all details and features.
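
The variable names chosen in C++ are the handles used later in the Python histogram definition (see the Histograms definition section). As a minimal sketch, assuming the Et, nJets and cutType scalars from the example above, matching definitions might look as follows. The helper name define_scalar_histograms, the titles and all binning values are illustrative assumptions; mon_tool is expected to be an already created GenericMonitoringTool.

```python
def define_scalar_histograms(mon_tool):
    """Book one 1D histogram per monitored scalar (hypothetical helper)."""
    # The first argument must match the name given to Monitored::Scalar in C++.
    mon_tool.defineHistogram('Et', path='EXPERT', type='TH1F',
                             title='E_{T};E_{T};Entries',
                             xbins=50, xmin=0, xmax=100)   # assumed range
    mon_tool.defineHistogram('nJets', path='EXPERT', type='TH1F',
                             title='Jet multiplicity;N_{jets};Entries',
                             xbins=10, xmin=0, xmax=10)    # assumed range
    # String-valued variable: no labels are configured here, so the bin
    # labels are assigned dynamically when the histogram is filled.
    mon_tool.defineHistogram('cutType', path='EXPERT', type='TH1F',
                             title='Cut type;;Entries',
                             xbins=5, xmin=0, xmax=5)      # assumed binning
```

A variable that has no histogram definition with a matching name is not histogrammed, so a misspelled name on either side is a common reason for a missing plot.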

Group

In order to maintain correlations when filling histograms (e.g. eta and phi of a track) the monitored quantities need to be grouped within a Monitored::Group. The filling of the histogram occurs when the Monitored::Group object goes out of scope or when fill() is called explicitly.

Tip There is support for filling with weights or cut masks. See the Monitored::Group Doxygen for full details.

Collections

Any iterable container (e.g. std::vector, DataVector) can be monitored directly and one histogram fill will be performed for each element. The simplest case is if the elements are convertible to a floating point value:

// monitoring of std::vector<float> vec;
auto eta = Monitored::Collection("eta", vec);

Warning The above should only be used if the values are already stored in this format. Do not fill a vector just for the sake of monitoring. In most cases one of the following methods can be used.

In case the container (e.g. DataVector<Track>) holds objects with accessors for the monitored quantity, the third parameter specifies how the value is retrieved: either a member method (here a Track class method) or, more generally, a lambda function:

auto eta = Monitored::Collection( "Eta", tracks, &Track::eta );
auto phi = Monitored::Collection( "Phi", tracks, []( const Track& t ) { return t.phi(); } );

A monitored collection or scalar can also contain strings. Monitoring them will result in an alphanumeric histogram fill:

auto det = Monitored::Scalar<std::string>( "DetID", "SCT" );
Monitored::Group(monTool, det);
det = "PIX";
Monitored::Group(monTool, det);

Tip Refer to the Monitored::Collection Doxygen for all details and features.

Timers

Monitored::Timer and Monitored::ScopedTimer can be used to measure and monitor execution times of code sections.

auto t1 = Monitored::Timer( "TIME_t1" );  // default is microseconds
auto t2 = Monitored::Timer<std::chrono::milliseconds>( "TIME_t2" );
{
  auto group = Monitored::Group( monTool, t1, t2 );
  std::this_thread::sleep_for(std::chrono::milliseconds(10));
}
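
Because the two timers above report in different units, the histogram ranges have to be chosen per timer. The sketch below shows matching Python definitions, assuming the 10 ms sleep from the example. The helper name define_timer_histograms and the bin ranges are illustrative choices, not recommendations.

```python
def define_timer_histograms(mon_tool):
    """Book histograms for the two timers above (hypothetical helper)."""
    # TIME_t1 is recorded in microseconds (the default): 10 ms = 10000 us.
    mon_tool.defineHistogram('TIME_t1', path='EXPERT', type='TH1F',
                             title='Time t1;time [us];Entries',
                             xbins=100, xmin=0, xmax=20000)
    # TIME_t2 is recorded in milliseconds, so the range is 1000 times smaller.
    mon_tool.defineHistogram('TIME_t2', path='EXPERT', type='TH1F',
                             title='Time t2;time [ms];Entries',
                             xbins=100, xmin=0, xmax=20)
```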

Histograms definition

The histograms are configured in Python:

from AthenaMonitoringKernel.GenericMonitoringTool import GenericMonitoringTool
monTool = GenericMonitoringTool('MonTool')

#monTool.HistPath = 'MyGroup/MySubDir'  # default is the parent name of MonTool
monTool.defineHistogram( 'nTracks', path='EXPERT', type='TH1F', title='Counts',
                         xbins=10, xmin=0, xmax=10 )
monTool.defineHistogram( 'eta', path='EXPERT', type='TH1F', title='#eta;;Entries',
                         xbins=30, xmin=-3, xmax=3 )
monTool.defineHistogram( 'AbsPhi', path='EXPERT', type='TH1F', title='|#phi|;;Entries',
                         xbins=10, xmin=0, xmax=3.15 )
monTool.defineHistogram( 'eta,AbsPhi', path='EXPERT', type='TH2F', title='#eta vs #phi',
                         xbins=15, xmin=-3, xmax=3, ybins=15, ymin=0, ymax=3.15 )
monTool.defineHistogram( 'TIME_execute', path='EXPERT', type='TH1F', title='Time for execute',
                         xbins=100, xmin=0, xmax=100 )

This will define four 1D histograms and one 2D histogram:

  • The first parameter corresponds to the name of the monitored variable(s) as defined in C++.
  • The histogram can be renamed (aliased) by appending ;anothername to the first argument.
  • path is the top-level directory (note however that an additional directory is created below it, named after the monitored algorithm, tool or service from which the MonTool is invoked).
  • The title uses the same syntax as in ROOT’s TH1 constructor, i.e. 'title;xaxis;yaxis'.
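
As an illustration of the alias and title syntax, the sketch below books the same monitored variable twice with different binning. The helper name define_eta_histograms, the alias eta_coarse and the coarse binning are assumptions made for this example.

```python
def define_eta_histograms(mon_tool):
    """Book 'eta' twice: once fine-binned, once coarse (hypothetical helper)."""
    # Booked under the variable name itself.
    mon_tool.defineHistogram('eta', path='EXPERT', type='TH1F',
                             title='#eta;#eta;Entries',
                             xbins=30, xmin=-3, xmax=3)
    # Same C++ variable, booked under the alias 'eta_coarse'.
    # Title format: 'title;xaxis;yaxis'.
    mon_tool.defineHistogram('eta;eta_coarse', path='EXPERT', type='TH1F',
                             title='#eta (coarse);#eta;Entries',
                             xbins=6, xmin=-3, xmax=3)
```

Without the alias, the second definition would reuse the name of the first, so the alias is what allows the two histograms to coexist.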

In the above example all histograms would be stored in a directory called MyAlg if MyAlg is the instance name of the algorithm to which this tool instance is attached, i.e. MyAlg.MonTool = monTool. In case of many instances of a given type (e.g. HypoAlgs, HypoTools) a different grouping (e.g. by the class name) might be more appropriate. This can be achieved by setting a specific histogram booking path:

monTool.HistPath = "L2CaloHypo/" + threshold

which would e.g. result in all histograms being stored under the path “EXPERT/L2CaloHypo/HLT_e3/…”.
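
The directory rule described above can be summarised in a few lines of plain Python. The function booked_directory is a hypothetical helper written only for this illustration; it is not part of Athena and merely mirrors the rule as stated here: the path comes first, followed by HistPath if set, otherwise by the name of the parent component.

```python
def booked_directory(path, parent_name, hist_path=None):
    """Return the directory histograms end up in, per the rule above."""
    # HistPath, when set, replaces the default (the parent's instance name).
    sub_dir = hist_path if hist_path else parent_name
    return f"{path}/{sub_dir}"


# Default: grouped by the instance name of the parent algorithm.
print(booked_directory('EXPERT', 'MyAlg'))
# prints: EXPERT/MyAlg

# Explicit HistPath, as in the L2CaloHypo example with threshold 'HLT_e3'.
print(booked_directory('EXPERT', 'MyAlg', 'L2CaloHypo/' + 'HLT_e3'))
# prints: EXPERT/L2CaloHypo/HLT_e3
```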

The defineHistogram method has several additional options. Those that are possibly useful in the HLT case are:

  • option to create histograms with alphanumeric labels for the x-axis e.g. labels=['a','b']
  • option to create lumiblock-based histograms. The setting is opt='kLBNHistoryDepth=N', where N is the number of lumiblocks over which the statistics are accumulated in one copy of the histogram. When running the trigger in emulators/offline, the histograms end up in the expert-monitoring.root file and have an additional postfix: _LB1, _LB2 if N=1 and _LBX_X+N if N!=1. For online running, lumiblock-tagged histograms are published under the same name irrespective of the LB. If a histogram with whole-run statistics is needed in addition, it can be booked using the alias option (see above).
  • Auto-binning (opt='kCanRebin') - same number of bins, range is doubled when under/overflows are encountered.
  • Histograms with dynamic axis extension (opt='kAddBinsDynamically') - additional bins are added when under/overflows are encountered (use with caution as this option may result in a large histogram).
  • etc.

Tip See the Doxygen documentation for all available options.
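
The options listed above might be combined as in the following sketch. It books a lumiblock-based histogram together with a whole-run copy under an alias, plus one auto-binned histogram. The helper name define_option_histograms, the alias eta_run, the default depth of 5 and the binning are assumptions for illustration; the opt strings are the ones quoted in the list above.

```python
def define_option_histograms(mon_tool, depth=5):
    """Book histograms using the opt settings above (hypothetical helper)."""
    # Lumiblock-based copy: statistics accumulated over `depth` lumiblocks.
    mon_tool.defineHistogram('eta', path='EXPERT', type='TH1F',
                             title='#eta (per LB range);#eta;Entries',
                             xbins=30, xmin=-3, xmax=3,
                             opt=f'kLBNHistoryDepth={depth}')
    # Whole-run copy of the same variable, booked under an alias.
    mon_tool.defineHistogram('eta;eta_run', path='EXPERT', type='TH1F',
                             title='#eta (whole run);#eta;Entries',
                             xbins=30, xmin=-3, xmax=3)
    # Auto-binning: the number of bins stays fixed, the range is doubled
    # when under/overflows are encountered.
    mon_tool.defineHistogram('nTracks', path='EXPERT', type='TH1F',
                             title='Counts;N_{tracks};Entries',
                             xbins=10, xmin=0, xmax=10,
                             opt='kCanRebin')
```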

It is recommended that the configuration of the MonTool with its histograms is placed within a separate function that creates an instance of ComponentAccumulator with the MonTool as a private tool.

Guidelines

During developments:

  • all variables computed by reconstruction and hypo algorithms (not tools) should be monitored.
  • when developing new chains, make sure the inputs to a given hypo (hypo tool) make sense
    • e.g., the distributions of variables to be used as inputs to the final selection should show the biases expected by selections applied earlier in the sequence
    • checking this may require more monitoring histograms. They can be added temporarily and removed once the sequence is validated
  • you should always have enough histograms to check that any new development does not affect other chains (to avoid cross talking among chains or wrong configurations)
    • NB: checking final counts may not be sufficient given the low statistics of processed events or the low chain rate

For debugging in nightly tests and monitoring at P1:

  • the level of online monitoring should be enough for debugging the vast majority of issues, but it should not produce an unmanageable number of histograms. Critical judgment is up to each signature
  • each plot at P1 should have a clear debugging/monitoring purpose
  • no need to monitor every single hypo tool, but a few plots from a handful of hypos (e.g., from primary or ExpressStream chains) would help
    • e.g., the energy distribution of electrons selected in a given hypo is less sensitive to variations in running conditions and menu evolution, so the comparison to a reference distribution is more reliable
    • don’t duplicate plots of the same variable that is an input to several algorithms (e.g., same input variable to different hypo tools with different thresholds)
  • consider having monitoring plots of inputs from L1 (e.g. RoI multiplicities and energies); this is particularly important for commissioning
  • if useful, include 2D maps (e.g., eta-phi, eta-et) and plots vs mu
    • NB: 2D plots are memory expensive, please be cautious with binning
  • keep in mind that in Run3, due to a possible mixed filling scheme, the spread of mu across BCIDs can be rather large. Each signature should consider whether there are selections particularly susceptible to mu and have some monitoring plots vs mu or BCID
    • e.g., values of scores of MVA classifiers vs mu or BCIDs