Using Batch Systems

Last update: 16 Oct 2018

We don’t have a batch system that everyone can use as part of the tutorial, and even if we did, batch jobs can take quite a while to turn around (though a batch system should generally be faster than the grid; otherwise you may as well submit to the grid). A system called lxbatch used to be available from lxplus, but it has been decommissioned.

Users of AthAnalysis or any other Athena project can learn how to submit jobs to the batch systems at this link

Generally the procedure is to replace the driver you have been using so far with a driver specific to your batch system. There are a fair number of different drivers available, and adding another one is often just a matter of a few hours, even if you are not an experienced developer. However, don’t be afraid to ask for help if you feel a driver for your system is missing. The list of supported batch drivers can be found here: Event Loop: Batch System Drivers

A sample setup of a batch driver may look like this:

  EL::LSFDriver driver;
  driver.options()->setString (EL::Job::optSubmitFlags, "-L /bin/bash"); // or whatever shell you are using
  driver.shellInit = "export ATLAS_LOCAL_ROOT_BASE=/cvmfs/ && source ${ATLAS_LOCAL_ROOT_BASE}/user/";

Note the shellInit parameter, which is used to set up AtlasLocalSetup on each of the worker nodes. You may have to adjust it to whatever is needed to set up the ATLAS software on your machines.

If you are using the compiled application to run your macro, you need to do one more thing: in your compiled steering macro MyAnalysis/util/testRun.cxx, include the header for the driver by adding this near the top, with the other #include statements:

#include <EventLoop/LSFDriver.h>

Note that the software generally assumes a shared filesystem, which is used both to share your software between the machines and to return the output files. For Condor batch systems there are options to submit jobs without a shared filesystem. There are plans to switch batch submission to use Docker, which would hopefully make it easier to run without a shared filesystem on other batch systems as well.

There is also a special driver called LocalDriver, which simulates submitting to a batch system on a single machine. If anything goes wrong, running with LocalDriver is the first thing you should try. It is also the only batch driver that is fully supported centrally; all other batch drivers are supported only on a best-effort basis (full support for a diverse set of batch systems would be very difficult).
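Because switching batch systems amounts to swapping one driver object for another, it can be convenient to pick the driver from a single string (for example a command-line argument) and fall back to the local one when debugging. The following is only a minimal, self-contained sketch of that pattern. The types DriverSketch, LocalDriverSketch, LSFDriverSketch and CondorDriverSketch and the function makeDriver are hypothetical stand-ins invented for this illustration; they are not the EventLoop classes, and in a real steering macro you would construct the actual EL drivers instead.

```cpp
// Illustration only: stand-in types, NOT the EventLoop API.
#include <cassert>
#include <memory>
#include <string>

// Hypothetical minimal interface playing the role of a driver base class.
struct DriverSketch {
  virtual ~DriverSketch() = default;
  virtual std::string name() const = 0;
  std::string shellInit;  // command run on each worker before the job starts
};

// Hypothetical stand-ins for a local driver and two batch drivers.
struct LocalDriverSketch : DriverSketch {
  std::string name() const override { return "local"; }
};
struct LSFDriverSketch : DriverSketch {
  std::string name() const override { return "lsf"; }
};
struct CondorDriverSketch : DriverSketch {
  std::string name() const override { return "condor"; }
};

// Pick a driver by backend name. Unknown names fall back to the local
// stand-in, mirroring the advice to try a local run first when debugging.
std::unique_ptr<DriverSketch> makeDriver(const std::string& backend,
                                         const std::string& shellInit) {
  std::unique_ptr<DriverSketch> driver;
  if (backend == "lsf") {
    driver.reset(new LSFDriverSketch);
  } else if (backend == "condor") {
    driver.reset(new CondorDriverSketch);
  } else {
    driver.reset(new LocalDriverSketch);
  }
  assert(driver && "every branch above must assign a driver");
  driver->shellInit = shellInit;  // same worker setup regardless of backend
  return driver;
}
```

With a helper like this, the rest of the steering code does not change when you move between a local test and a batch submission; only the string passed to makeDriver does. Whether a silent fallback to the local driver is appropriate is a design choice, and you may prefer to report an error for unrecognised names.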