EventLoop Motivation and Goals

Last update: 06 Nov 2019

The purpose of the EventLoop package is to relieve users of the burden of writing their own event loop. For the user this has a number of advantages:

  • The steering code for running on a local batch cluster or the grid can be technically complicated, particularly when creating and managing output datasets. Furthermore, this code has to be updated fairly regularly to keep up with changes to the grid infrastructure, etc. The EventLoop package shifts that burden from the individual analyzer to the EventLoop package maintainer.
  • Since the same package provides the interfaces for different architectures, it is fairly straightforward to change from running on your local machine to running on your local batch system or the grid. For analyzers that means that if their analysis evolves and needs more computing power, it takes very little effort to move their jobs to a new site.
  • When running on the grid, individual job failures are relatively common. The grid driver in the EventLoop package is developed and maintained by the grid community with the express purpose of allowing automatic recovery from common job failure modes. For analyzers that means they spend less time retrying failed jobs manually.

The main drawback of using the EventLoop package is that you have to restructure your code a little bit. How much you have to do depends on what your code looks like now. The main points are:

  • Since you no longer run your own event loop, you have to encapsulate your code inside a class that can then be called from the EventLoop package. If you have been using TTree::MakeClass, this may be a new concept. If you have been using TTree::MakeSelector or another event loop package, this concept will already be familiar to you.
  • The build is handled by CMake, meaning your analysis code has to be inside one (or several) CMake packages. This is necessary because CMake is the only build system we have that builds out of the box at all the different sites that your code may run on.
  • The samples are provided through the SampleHandler package. How much use you make of the SampleHandler package is up to you. You can use it to manage your samples as well, or simply create the SampleHandler objects when you initialize the event loop. SampleHandler was chosen to describe the samples because we had to pick one way of doing so, and SampleHandler is both generic and an official PAT package.
  • The submission has to happen through a ROOT or PyROOT script. This is required because you are supposed to configure your objects before handing them over to EventLoop. While this may be a change for some people, it was deemed the easiest solution. The alternative would have been to pass in a configuration file that your objects can read, but that would have created extra work for many users who are not currently set up to read configuration files.
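
As a rough illustration of the first and last points, the sketch below shows the general shape of such a setup in plain Python: user code lives in a class with per-event hooks, the script configures that object, and a separate function owns the loop. This is not the EventLoop API. The names ToyAlgorithm, PtCut, ToyJob and run_local are hypothetical and exist only for this example, and the samples are stand-in dictionaries rather than SampleHandler objects.

```python
from dataclasses import dataclass, field


class ToyAlgorithm:
    """Hypothetical base class: the framework, not the user, drives the loop."""

    def initialize(self):
        pass

    def execute(self, event):
        raise NotImplementedError

    def finalize(self):
        pass


class PtCut(ToyAlgorithm):
    """User code: counts events whose 'pt' exceeds a configurable threshold."""

    def __init__(self, min_pt=20.0):
        self.min_pt = min_pt  # set by the submission script before the job runs
        self.n_passed = 0

    def initialize(self):
        self.n_passed = 0

    def execute(self, event):
        if event["pt"] > self.min_pt:
            self.n_passed += 1


@dataclass
class ToyJob:
    """Hypothetical job description: named samples plus configured algorithms."""
    samples: dict
    algorithms: list = field(default_factory=list)


def run_local(job):
    """Hypothetical stand-in for a driver: it owns the event loop."""
    for alg in job.algorithms:
        alg.initialize()
    for events in job.samples.values():
        for event in events:
            for alg in job.algorithms:
                alg.execute(event)
    for alg in job.algorithms:
        alg.finalize()


# The "submission script": configure the objects, then hand them over.
samples = {"toy_sample": [{"pt": 10.0}, {"pt": 25.0}, {"pt": 40.0}]}
cut = PtCut(min_pt=20.0)
job = ToyJob(samples=samples, algorithms=[cut])
run_local(job)
print(cut.n_passed)
```

The point of the structure is that PtCut never sees the loop itself, so the code that calls it can be replaced without touching the analysis code.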

There are a number of solutions that provide similar services to EventLoop. On the one hand there is Athena; on the other hand there are a number of user packages that do these things. While they have a lot in common, they also differ in certain points. EventLoop has the following features, which at least in this combination are not available in other products:

  • EventLoop is an official ATLAS project. This means that future support and maintenance rest on a much firmer base than for projects that people maintain in private. It also means that the EventLoop package is likely to be integrated better with other ATLAS packages.
  • EventLoop supports running on the local machine, batch systems and the grid (Kubernetes support to come). Most other solutions only work on a subset of these. It should be noted though that most users only use a subset, so they may not care.
  • EventLoop is a very focused package. All it does is run the event loop; it doesn’t try to provide a solution for configuring jobs or reading event data into memory. This gives users the freedom to choose how they want to do those tasks independently of how they run the event loop.
  • EventLoop is designed to be extensible to new environments. Everything that is specific to a certain architecture is factored out into two or three classes. If you want to add another one, you only have to add these classes for your specific case.
  • EventLoop is fairly lightweight and flexible. It should be comparatively easy to take existing code and switch it over to using EventLoop. This should be true both for end user code and other frameworks that want to offload the burden of running the event loop.
  • EventLoop doesn’t require users to copy any source files into their own area and modify them. While this may not seem like much, there is an inherent workload once a user takes over control of files. Essentially, changes no longer get propagated to that user automatically and instead have to be integrated manually.
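
The extensibility point above can be pictured with a small Python sketch in which everything environment-specific sits behind one interface. This is only an illustration of the design idea, not how EventLoop is implemented. ToyDriver, LocalToyDriver and ChunkedToyDriver are hypothetical names, and the chunked variant merely imitates splitting work into sub-jobs inside a single process.

```python
from abc import ABC, abstractmethod


class ToyDriver(ABC):
    """Hypothetical interface: everything environment-specific lives here."""

    @abstractmethod
    def submit(self, process_event, events):
        """Run process_event over all events and return the list of results."""


class LocalToyDriver(ToyDriver):
    """Runs everything in the current process."""

    def submit(self, process_event, events):
        return [process_event(e) for e in events]


class ChunkedToyDriver(ToyDriver):
    """Imitates a batch-like backend: splits the input into sub-jobs
    and merges their outputs in order."""

    def __init__(self, events_per_job):
        self.events_per_job = events_per_job

    def submit(self, process_event, events):
        results = []
        for start in range(0, len(events), self.events_per_job):
            chunk = events[start:start + self.events_per_job]
            results.extend(process_event(e) for e in chunk)
        return results


def passes_cut(event):
    """User code: unchanged no matter which driver runs it."""
    return event["pt"] > 20.0


events = [{"pt": 10.0}, {"pt": 25.0}, {"pt": 40.0}]
local_result = LocalToyDriver().submit(passes_cut, events)
chunked_result = ChunkedToyDriver(events_per_job=2).submit(passes_cut, events)
print(local_result == chunked_result)
```

Adding a new environment in this toy model means adding one more ToyDriver subclass, while passes_cut and the calling script stay the same.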