ATLAS Analysis-based Software Tutorial

You’ve seen some messages in log files; now let’s look into ATLAS messaging in some more detail. For the tutorial material, only reading through “Printing values” is needed.

Anatomy of messages

In ATLAS, messages are usually identified by three things:

Source: Typically an algorithm or a tool, generally identified by their name. Sometimes we also have standalone code messaging, which should then try to pick its own unique and identifiable name.
Level: This indicates how important/severe the message is. The primary use of this is to filter out irrelevant messages. By default all messages of level INFO or higher will get displayed, but this is configurable.
Content: This can in principle be just any string, but try to include enough information that users can understand what you are trying to say without having to look at your source code. Also, if your class is complex or spread out over multiple files, consider including information like the function name, so that users can find the source more easily. (Note that this is often a warning sign that your algorithm/tool has grown to be overly complex, and could benefit from being split up into multiple classes.)

Message levels

For most of the tutorial we use the message level INFO indicating that we have something to tell the user, but that nothing went wrong and there is nothing to worry about. It is also the lowest message level that still gets displayed by default. The main reason we choose INFO is because we don’t want to complicate the tutorial further by constantly adjusting message levels. However, for production-quality code (which includes your analysis code) you typically want to reduce the amount of print-out you get (otherwise you may miss important messages or have gigantic logfiles). Generally all the print-out in this tutorial would really be more appropriate at the DEBUG or VERBOSE level instead.

A good rule is that production-quality code should not use INFO messages inside execute(), and should only use it sparingly anywhere else. It is perfectly fine for an algorithm or tool not to produce any INFO level printout at all.

The complete list of message levels in decreasing order of priority is:

`FATAL`	Something went severely wrong and we will end the program immediately
`ERROR`	An error occurred and we are about to return a `StatusCode::FAILURE` (normally leading to program termination)
`WARNING`	Something went wrong that the user should investigate and fix, but the program can continue to run
`INFO`	Information that may be relevant for the user, but that doesn’t in itself indicate that anything went wrong
`DEBUG`	Additional information that can be enabled when trying to debug your code in an effort to understand what it is doing. This is typically used to indicate when certain points in the code are reached for each event.
`VERBOSE`	Even more debugging output, that allows to follow in great detail what your code is doing. This can be used to print out values that are set multiple times per event that might shed some light on the cause of a crash.

If you look at existing/older code you may find them using any variety of other messaging mechanisms (e.g. std::cout, Info()). While opinions are somewhat divided on how bad that is, for new code you should try to stick to the messaging macros below.

Do not use WARNING for strange but common situations that the algorithm can handle gracefully. WARNING should indicate that someone should take an action.

There is special code that prevents DEBUG and VERBOSE messages from begin evaluated if the logging is set to a higher level. That can be important in case what you want to print for debugging is hard to calculate. You can use msgLvl functions to check what the message level is in case you’d like to do the same inside your code, e.g.:
if ( msgLvl(MSG::DEBUG) ) {
  some_object->print_function();
}

The compiler recognizes special macros that can be extremely useful for writing debug messages: __FILE__, __LINE__ and __FUNCTION__. As you’d expect, they will provide the file, line number in the file, and function name. When you really can’t tell where code is crashing and want to add print-outs, adding __LINE__ can be handy to see what’s going on, e.g.:
ANA_MSG_DEBUG( "On " << __LINE__ " variable= " << variable);

ANA_MSG_* and ATH_MSG_*

In general, in your C++ code, you should write an INFO message, for example, using:

ANA_MSG_INFO("Hello from my code");

These macros will automatically identify the component (algorithm or tool) from which they are being called and include that information in the log.

There are two sets of message macros, one starting with ANA_MSG_ the other starting with ATH_MSG_. By and large they do the same thing, though in this tutorial we stick with the former rather than the later. The main difference is that inside the Athena environment you cannot use ATH_MSG_* for standalone functions as described below (in AnalysisBase there is literally no difference between the two).

Example: Output in your test algorithm

Let’s print some information from the execute() function of your analysis algorithm. In particular, let’s print the run and event numbers from the EventInfo object. First we have to teach your analysis code where the EventInfo object is defined. In your MyAnalysis/CMakeLists.txt file, you will need to add the library that includes EventInfo to the list the compiler looks through:

# Add the shared library:
atlas_add_library (MyAnalysisLib
  MyAnalysis/*.h Root/*.cxx
  PUBLIC_HEADERS MyAnalysis
  LINK_LIBRARIES AnaAlgorithmLib xAODEventInfo)

In the source code, in Root/MyxAODAnalysis.cxx, you will need to add the header file that defines an EventInfo object. At the top of the file, add:

#include <xAODEventInfo/EventInfo.h>

Now in the execute() function of MyxAODAnalysis.cxx, you will need to retrieve the EventInfo object. Most objects you’ll want to retrieve are stored in a “store” (also called a “white board”). The store that holds data that changes with every event is called the “event store”; there is also a “detector store” and “conditions store”. The EL::AnaAlgorithm base class gives you convenient ways to access the stores, so we can just retrieve the data we want. In the execute() method, add:

  // Retrieve the eventInfo object from the event store
  const xAOD::EventInfo *eventInfo = nullptr;
  ANA_CHECK (evtStore()->retrieve (eventInfo, "EventInfo"));

Then we can add some output to print the run and event numbers in the log file:

  // Print out run and event number from retrieved object
  ANA_MSG_DEBUG ("in execute, runNumber = " << eventInfo->runNumber() << ", eventNumber = " << eventInfo->eventNumber());

That’s it! Try to recompile your algorithm, and then run it again as you did before. Did you see any new output? Why or why not?

Need a hint?
We added the output at "DEBUG" level, and the default output level is "INFO", so your output isn't shown! You will see how to set the output level for your algorithm in the next sections, but if you want to see it right away you can add to your python configuration:

alg.OutputLevel = ROOT.MSG.DEBUG

Printing values

The ANA_MSG_INFO macro works quite similar to std::cout (it uses an std::ostream underneath), so you can print out variables using the same syntax, e.g. you could print out a loop index like this:

for (unsigned iter = 0; iter != 10; ++ iter)
  ANA_MSG_INFO ("iter = " << iter);

Use in standalone functions

If you want to use ATLAS messaging outside of a tool or an algorithm, you need to do a little more work. There are multiple options for this; one of the most simple ones is to use AsgMessaging/MessageCheck.h.

Let’s say you have an existing function:

...
void function ()
{
...
  std::cout << "[some message]" << std::endl;
...
}
...

With ATLAS messaging this would look like this:

#include <AsgMessaging/MessageCheck.h>
...
void function ()
{
  using namespace asg::msgUserCode;
...
  ANA_MSG_INFO ("[some message]");
...
}
...

This will just attach all your messages to a component for user code (i.e. it will all appear to come from the same source), which is fine in some situations and problematic in others. If you use this for anything else than end-user analysis code you should probably define your own component just for that. That will both show the correct name with your messages and allow users to adjust the message level for it separately from other messages. Details can be found in the MessageCheck header.

Changing message output levels

You can change the message output level of any algorithm or tool using the pre-existing “OutputLevel property”. We will cover this in more detail in the next section when we discuss the “OutputLevel” property.

Changing the output level of code outside of algorithms/tools is much less straight forward, so in general the recommendation is to not print DEBUG and VERBOSE messages outside of algorithms and tools. Users will have a very hard time seeing those messages.

Changing message output to assist debugging

Much of the syntax described in this section applies to Athena. The same conceptual changes can be made in AnalysisBase with different commands.

The message output level can be changed to DEBUG or VERBOSE at a global level (for all algorithms and tools being run), e.g. athena -l DEBUG ... or other ways. However, it is often much more useful only change the OutputLevel for the tool or algorithm you are investigating. Most components should have this property. This avoids your output getting cluttered with useless printout. In Athena, there are job properties under ConfigFlags.Exec that allow you to do this easily, e.g. DebugMessageComponents and equivalent for the other types of messages. This is a list that takes in Unix-like path, allowing \* wildcards, to component names. An example is Exec.DebugMessageComponents='["*/FPGATrackSim*"]'. See also Properties for Debugging for a bit more on this kind of functionality.

It is also useful to adjust two other messaging properties. The first is the number of messages before further output is suppressed. This defaults to 500 messages. It can be disabled globally by setting the MessageSvc property enableSuppression=False, and it can be set per message level, e.g. with the property verboseLimit=1000. See MessageSvc.h for the default values and names of the other limits.

The second property controls the format of the output text. With the default formatting, long tool or algorithm names will be cutoff. The Format property of the MessageSvc controls the formatting. For example, setting it to Format="% F%100W%S%7W%R%T %0W%M" will change the default first column width of 18 to 100. The default format string can be found here along with definitions of the magic characters. The best way to adjust this is via the ConfigFlags that are, e.g., available in a preExec. It can be set with flags.Common.MsgSourceLength=100.

Here is an explicit example of adjusting these two properties via a postExec with a Component Accumulator configuration. This is not the recommended way to do this, but enableSuppression is not available via the flags. Since this is run after the configuration, the MessageSvc should exist, and the main CA is available as cfg, see steering doc page.

... --postExec "msgsvc = cfg.getService('MessageSvc'); msgsvc.enableSuppression=False; msgsvc.Format=\"% F%100W%S%7W%R%T %0W%M\""

Log levels in transforms

One common desire when running a job transform like Sim_tf.py or Reco_tf.py is to change the log level of the output, for example to include debug messages.

There is a command-line setting for the transforms to change their output level:

Reco_tf.py --loglevel=DEBUG ...

This setting affects only the transform itself, and not the athena job that is configured by the transform. In order to change the log level for the athena job, you should instead use:

Reco_tf.py --athenaopts="-l DEBUG" ...

ATLAS Messaging