Athena I/O and the xAOD EDM

Last update: 22 Oct 2024

This page gives an overview of the Athena I/O system and the event data model used to write analysis-level data. This is a large and technical topic, so these pages are not intended to be an in-depth manual. Rather, they are intended to give a broad overview, acquaint readers with the main terms and components, and provide links to relevant resources.

Persistent data storage

Data manipulated by Athena are usually in the form of C++ objects. These must be represented on disk such that they can be read back and regenerated in memory at a later time, potentially by a newer version of the software. ATLAS accomplishes this using a software layer called POOL, which has its ATLAS instance in APR, with the back-end I/O provided by ROOT. Athena interacts with POOL via software in the AthenaPOOL packages.

Any class whose instances need to be stored to disk must be provided with a “POOL converter”. It is generally not necessary to write a POOL converter manually: one can be generated automatically by an ATLAS script built into CMake (atlas_add_poolcnv_library). A step-by-step example is given here (internal). More information on the CMake setup for ATLAS can be found here.

Transient/persistent conversion and schema evolution

The transient/persistent (T/P) separation mechanism allows classes to evolve in newer versions of the software while existing persistent versions can still be read back; this is known as “schema evolution”. It is accomplished by means of “T/P converters”. A T/P converter converts between the latest (current) transient version and a persistent version. Unlike the POOL converters, T/P converters need to be written by hand. T/P converters are only required for non-xAOD classes (see below) and when “versioning” is involved in some way (although at this stage in the lifetime of ATLAS this does involve almost all classes).

A full guide to the T/P mechanism is given here (internal).
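
As a rough illustration of the idea (and not the actual Athena converter API), the sketch below uses hypothetical classes: a transient Track, two persistent layouts Track_p1 and Track_p2, and one converter per persistent layout. The class names, the members, and the simplified persToTrans/transToPers signatures are all assumptions made for this example.

```cpp
#include <cassert>

// Hypothetical transient class: the current in-memory representation.
struct Track {
    double pt;
    double eta;
    double phi;
};

// Hypothetical persistent version 1: an older on-disk layout without phi.
struct Track_p1 {
    float pt;
    float eta;
};

// Hypothetical persistent version 2: the current on-disk layout.
struct Track_p2 {
    float pt;
    float eta;
    float phi;
};

// Converter for the old layout: read-only, fills the missing member
// with a default value.
struct TrackCnv_p1 {
    static Track persToTrans(const Track_p1& pers) {
        return Track{pers.pt, pers.eta, 0.0};
    }
};

// Converter for the current layout: used for both reading and writing.
struct TrackCnv_p2 {
    static Track persToTrans(const Track_p2& pers) {
        return Track{pers.pt, pers.eta, pers.phi};
    }
    static Track_p2 transToPers(const Track& trans) {
        return Track_p2{static_cast<float>(trans.pt),
                        static_cast<float>(trans.eta),
                        static_cast<float>(trans.phi)};
    }
};

// Usage: data written with either layout ends up as the same transient type.
void tpExample() {
    const Track fromOld = TrackCnv_p1::persToTrans(Track_p1{25000.f, 1.5f});
    const Track fromNew = TrackCnv_p2::persToTrans(Track_p2{25000.f, 1.5f, 0.25f});
    assert(fromOld.pt == fromNew.pt);
    assert(fromOld.phi == 0.0);   // default chosen by the _p1 converter
    assert(fromNew.phi == 0.25);
}
```

In this sketch only the converters know about old layouts, so client code is written against the current transient class regardless of which persistent version is on disk.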

xAOD event data model

An event data model is a collection of classes (interfaces and concrete types) and their relationships which, together, provide a representation of a physics event recorded by the ATLAS detector. It defines both how they are represented in memory (transient form) and on disk (persistent form). Using the same, coherent classes throughout the software improves commonality and coherence across the experiment, facilitates the use of more common and higher quality software, and allows for common object definitions (in both a software and a physics sense).

The xAOD EDM has been developed by ATLAS to represent analysis-level data objects in AOD and DAOD files. As such, the xAOD objects are the data structures most commonly encountered by ATLAS members. The objects in the xAOD are split into two types: interface objects and payload objects. The interface objects provide a user interface to the type, allowing operations such as electron->pt(), but do not contain the numerical data itself. The numerical payload is instead held in the payload objects. These allocate contiguous memory for the data, and allow the interface objects to access this data. The data structure currently used by ATLAS for the payload is the ROOT TTree, but the EDM is not bound to a specific storage technology. Instances of the payload classes are referred to as “auxiliary stores”.

xAOD classes consequently have a “dual personality”. In Athena and other applications using an explicit event loop, the interface objects are used and the programmer is presented with containers of objects belonging to a given class (e.g. xAOD::TrackParticle, xAOD::Muon), with each data member being provided by an accessor method as shown above. However, in ROOT or a similar column-based application the payload can be read directly, since the variables are contiguous in memory and ultimately written as branches of a TTree. This means that files written in the xAOD format can be opened in ROOT without any ATLAS libraries, and plots made from the event data in exactly the same way as with a plain TTree. The xAOD has also enabled the development of non-Athena analysis frameworks, since Athena libraries are no longer needed to read the files.
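
The interface/payload split and the resulting two views of the same data can be sketched with a toy model. The code below does not use any ATLAS or ROOT classes: ElectronAuxStore, Electron, countAbovePt and sumPt are hypothetical names invented for this illustration, and the real xAOD classes are considerably more general.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical payload object: one contiguous array per variable.
struct ElectronAuxStore {
    std::vector<float> pt;
    std::vector<float> eta;
};

// Hypothetical interface object: it holds no numbers itself, only a
// pointer to the store and the index of "its" element in each column.
class Electron {
public:
    Electron(const ElectronAuxStore* store, std::size_t index)
        : m_store(store), m_index(index) {}
    float pt() const { return m_store->pt[m_index]; }
    float eta() const { return m_store->eta[m_index]; }
private:
    const ElectronAuxStore* m_store;
    std::size_t m_index;
};

// Object-wise view: loop over interface objects and call accessors.
std::size_t countAbovePt(const ElectronAuxStore& store, float threshold) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < store.pt.size(); ++i) {
        const Electron el(&store, i);
        if (el.pt() > threshold) ++n;
    }
    return n;
}

// Column-wise view: operate on one variable's array directly,
// without constructing any interface objects.
float sumPt(const std::vector<float>& ptColumn) {
    float sum = 0.f;
    for (float pt : ptColumn) sum += pt;
    return sum;
}

// Usage: both views read the same underlying numbers.
void dualViewExample() {
    const ElectronAuxStore store{{30.f, 12.f, 45.f}, {0.1f, -1.2f, 2.0f}};
    assert(countAbovePt(store, 20.f) == 2);
    assert(sumPt(store.pt) == 87.f);
}
```

Because each variable lives in its own array, a column-based tool only needs to know the variable's name and type, not the interface class.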

For examples of how to use the xAOD in analysis, please refer to the analysis tutorial.

xAOD design

A formal description of the xAOD design can be found here. The key components, concepts and features are:

  • SG::AuxElement: all xAOD interface objects inherit from this class. It provides a consistent means of accessing the payload from all interface types.
  • xAOD::IParticle: most xAOD objects inherit from this class (which in turn inherits from SG::AuxElement) but not all - xAOD::EventInfo is an obvious exception. It provides a uniform interface for accessing 4-momentum / particle information about different types.
  • DataVector<T>: this represents containers of objects of type T. It behaves much like std::vector<T*> but has a number of additional features, most importantly covariance: since xAOD::Muon inherits from xAOD::IParticle, DataVector<xAOD::Muon> also inherits from DataVector<xAOD::IParticle>. It also includes the code that implements the separation of the interface and payload.
  • SG::IAuxStore: the abstract interface to the auxiliary stores, allowing the connection between the DataVectors and the numerical payload.
  • Versioning: concrete xAOD types always have a name ending in _vX, but these are never used directly in user code; versionless typedefs are used instead. This allows an old _vX on disk to be converted to a newer version _vY in memory by a more up-to-date version of the software.
  • Static and dynamic auxiliary stores: the static auxiliary stores contain the variables that are regarded as class members. In addition, each static store references a dynamic store, which can be used to hold arbitrary data added on the fly. The static store forwards any requests for variables that it does not manage to the dynamic store.
  • Shallow copies: a copy can be made of a DataVector with auxiliary data. The auxiliary store for this copied vector is of the type xAOD::ShallowAuxContainer. It maintains a reference to the original store. Any requests to write a variable will be carried out in the xAOD::ShallowAuxContainer, while read requests for variables not in the xAOD::ShallowAuxContainer will be forwarded to the original store. This allows one to make a copy of a container and change a few variables, but still share the storage for most of the data.
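
The shallow-copy behaviour in the last item can be sketched as follows. This is a toy model and not the xAOD::ShallowAuxContainer implementation: the Columns alias and the ShallowStore class are hypothetical. Copying a whole column on its first write is an assumption of this sketch, chosen to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an auxiliary store: named columns of floats.
using Columns = std::map<std::string, std::vector<float>>;

// Hypothetical shallow store: keeps a pointer to the original store and
// owns only the columns that have been written through it.
class ShallowStore {
public:
    explicit ShallowStore(const Columns& parent) : m_parent(&parent) {}

    // Read: use the local column if one exists, otherwise forward the
    // request to the original store.
    float get(const std::string& name, std::size_t i) const {
        const auto it = m_local.find(name);
        if (it != m_local.end()) return it->second.at(i);
        return m_parent->at(name).at(i);
    }

    // Write: on the first write to a column, copy it from the original
    // store; all modifications then go to the local copy only.
    void set(const std::string& name, std::size_t i, float value) {
        auto it = m_local.find(name);
        if (it == m_local.end()) {
            it = m_local.emplace(name, m_parent->at(name)).first;
        }
        it->second.at(i) = value;
    }

    std::size_t nLocalColumns() const { return m_local.size(); }

private:
    const Columns* m_parent;  // original store, never modified
    Columns m_local;          // only the overridden columns
};

// Usage: recalibrate one variable while sharing everything else.
void shallowCopyExample() {
    const Columns original{{"pt", {10.f, 20.f}}, {"eta", {0.5f, -1.0f}}};
    ShallowStore copy(original);
    copy.set("pt", 0, 11.f);
    assert(copy.get("pt", 0) == 11.f);       // overridden locally
    assert(copy.get("eta", 1) == -1.0f);     // forwarded to the original
    assert(original.at("pt")[0] == 10.f);    // original is untouched
    assert(copy.nLocalColumns() == 1);       // only "pt" is duplicated
}
```

In this sketch the memory cost of the copy scales with the number of modified variables rather than with the full size of the original store.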

Several of these features are only possible due to the separation between interface and payload.
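
As an illustration of one such feature, the forwarding from a static store to a dynamic store can be sketched with a toy model. DynamicStore and MuonStaticStore are hypothetical classes written for this example, not ATLAS code. For brevity the static store here owns its dynamic store by value, whereas the description above says only that it references one.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical dynamic store: creates columns on demand for any name.
class DynamicStore {
public:
    std::vector<float>& column(const std::string& name, std::size_t size) {
        std::vector<float>& col = m_columns[name];  // created if missing
        if (col.size() < size) col.resize(size, 0.f);
        return col;
    }
    bool has(const std::string& name) const {
        return m_columns.count(name) != 0;
    }
private:
    std::map<std::string, std::vector<float>> m_columns;
};

// Hypothetical static store: "pt" and "eta" play the role of class
// members; every other variable name is forwarded to the dynamic store.
class MuonStaticStore {
public:
    explicit MuonStaticStore(std::size_t n) : m_pt(n, 0.f), m_eta(n, 0.f) {}

    std::vector<float>& column(const std::string& name) {
        if (name == "pt") return m_pt;
        if (name == "eta") return m_eta;
        return m_dynamic.column(name, m_pt.size());  // not managed here
    }
    bool isDynamic(const std::string& name) const {
        return m_dynamic.has(name);
    }
private:
    std::vector<float> m_pt;
    std::vector<float> m_eta;
    DynamicStore m_dynamic;
};

// Usage: a variable unknown to the static store is added on the fly.
void dynamicStoreExample() {
    MuonStaticStore store(2);
    store.column("pt")[0] = 30.f;          // static variable
    store.column("isolation")[1] = 0.1f;   // created in the dynamic store
    assert(!store.isDynamic("pt"));
    assert(store.isDynamic("isolation"));
    assert(store.column("isolation").size() == 2);
}
```

Client code in this sketch uses the same column() call for both kinds of variable, so adding a new variable does not require changing the static store's definition.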