Running on the grid

Last update: 26 Oct 2023

In this section of the tutorial we will teach you how to run on the grid. There are two main advantages to running on the grid. First, you have access to its vast computing power, so even very lengthy jobs can finish within hours instead of days or weeks. Second, you have direct access to all the datasets available on the grid, so you don’t need to download them to your site first, which can save hours if not days. There are also two main disadvantages. First, for all but the simplest jobs your turnaround time will be measured in hours, if not days, and it varies with the load at the grid sites. Second, more things beyond your control can go wrong: for example, the only grid site holding your samples may go offline for a day or two, delaying the execution of your jobs.

Users of AthAnalysis or any other Athena project can learn how to submit jobs to the grid at this link

As a first step, set up the PanDA client, which is needed for running on the grid. Set it up before setting up ROOT, so it’s probably best to start from a clean shell and issue the commands:

setupATLAS
lsetup panda 

Now navigate to your working area and set up your Analysis Release, following the recommendations in the earlier section “What to do every time you log in”.

The nice thing about using EventLoop is that you don’t have to change any of your algorithm code; you simply change the driver in the steering script. It is recommended that you use a separate submit script when running on the grid.

Let’s copy the content of ATestRun_eljob.py all the way up to, and including, the driver.submit() statement into a new file ATestSubmit.py.

If you did the section on direct access to files on the grid, the configuration for SampleHandler should already be set. Otherwise, open ATestSubmit.py, comment out the directory scan, and scan using Rucio instead (shown below). Since we are just testing this functionality, we will use a very small input dataset so that your job runs quickly and you get quick feedback on whether it succeeded.

# inputFilePath = os.getenv( 'ALRB_TutorialData' ) + '/mc16_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad.deriv.DAOD_PHYS.e6337_s3126_r10201_p4172/'
# ROOT.SH.ScanDir().filePattern( 'DAOD_PHYS.21569875._001323.pool.root.1' ).scan( sh, inputFilePath )

ROOT.SH.scanRucio(sh, 'data16_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_ZMUMU.repro21_v01/' )
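
For orientation, the input configuration in ATestSubmit.py might look like the sketch below. It assumes the SampleHandler setup carried over from ATestRun_eljob.py (the 'nc_tree' setting and the printContent() call are assumptions about that script), and the function name build_grid_sample_handler is hypothetical. ROOT is imported inside the function, so the snippet can be loaded outside an Analysis Release, but calling it requires one.

```python
def build_grid_sample_handler(dataset):
    """Return a SampleHandler whose samples are discovered through Rucio.

    Hypothetical helper: calling it requires a configured Analysis Release.
    ROOT is imported lazily so that defining the function does not need it.
    """
    import ROOT

    sh = ROOT.SH.SampleHandler()
    # Tree name for xAOD inputs (assumed to match ATestRun_eljob.py).
    sh.setMetaString('nc_tree', 'CollectionTree')
    # Look the dataset up on the grid instead of scanning a local directory.
    ROOT.SH.scanRucio(sh, dataset)
    # Print what was found, as a quick sanity check before submitting.
    sh.printContent()
    return sh


# The small test dataset used in this tutorial.
TEST_DATASET = 'data16_13TeV.periodAllYear.physics_Main.PhysCont.DAOD_ZMUMU.repro21_v01/'
```

In the submit script one would then call sh = build_grid_sample_handler(TEST_DATASET) and hand sh to the job as before.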

Next, replace the driver with the PrunDriver:

# driver = ROOT.EL.DirectDriver()
driver = ROOT.EL.PrunDriver()

We need to give the output dataset name an explicit structure: our input sample has a very long name, and by default the output dataset name contains (among other strings) the input dataset name, which makes it too long for the grid to handle. So after you’ve defined the PrunDriver, add:

driver.options().setString("nc_outputSampleName", "user.<nickname>.grid_test_run")

where you should replace <nickname> with your grid nickname (usually the same as your lxplus username). The name may also contain placeholders such as %in:name[2]% and %in:name[6]%, which EventLoop replaces with dot-separated fields of the input dataset name: for an MC sample like the one commented out above, [2] inserts the dataset ID (MC ID or run number) and [6] inserts the AMI tags, which drops the “physics short” text. Dataset names must be unique: if you rerun with the same input but a slightly different analysis setting, you need to choose a different output dataset name, or you will get an error telling you that the output dataset name already exists.
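
To preview a structured output name before submitting, the field substitution can be mimicked in plain Python. This is only an illustration, not EventLoop's implementation: the %in:name[N]% placeholder form, the 1-based field numbering (inferred from [2] being the dataset ID), the helper name expand_output_name, and the nickname jdoe are all assumptions.

```python
import re


def expand_output_name(template, input_name):
    """Replace %in:name[N]% with the N-th dot-separated field of input_name.

    Illustrative stand-in for the substitution the driver performs;
    fields are counted from 1, so [2] is the dataset ID of an MC sample.
    """
    fields = input_name.rstrip('/').split('.')

    def substitute(match):
        index = int(match.group(1))
        if not 1 <= index <= len(fields):
            raise ValueError(f'input name has no field [{index}]')
        return fields[index - 1]

    return re.sub(r'%in:name\[(\d+)\]%', substitute, template)


mc_sample = ('mc16_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad'
             '.deriv.DAOD_PHYS.e6337_s3126_r10201_p4172/')
short_name = expand_output_name(
    'user.jdoe.grid_test_run.%in:name[2]%.%in:name[6]%', mc_sample)
print(short_name)  # user.jdoe.grid_test_run.410470.e6337_s3126_r10201_p4172
```

The result keeps the dataset ID and AMI tags while dropping the long middle part of the input name.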

The PrunDriver supports a number of optional configuration options that you might recognize from the prun program.
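
As a rough illustration, grid options can be collected in a dictionary and applied to the driver in one place. The option keys nc_nFilesPerJob and nc_excludedSite and the site name SOME_SITE are assumptions modelled on prun flags and should be checked against the grid driver documentation. apply_grid_options and DryRunDriver are hypothetical helpers; the dry-run class only stands in for a real PrunDriver so the logic can be tried without ROOT.

```python
class _RecordingOptions:
    """Stand-in for driver.options(): records what would be set."""

    def __init__(self):
        self.values = {}

    def setString(self, key, value):
        self.values[key] = value

    def setDouble(self, key, value):
        self.values[key] = value


class DryRunDriver:
    """Hypothetical replacement for ROOT.EL.PrunDriver() when testing."""

    def __init__(self):
        self._options = _RecordingOptions()

    def options(self):
        return self._options


def apply_grid_options(driver, settings):
    """Set string values with setString and numeric values with setDouble."""
    for key, value in settings.items():
        if isinstance(value, str):
            driver.options().setString(key, value)
        else:
            driver.options().setDouble(key, float(value))


grid_settings = {
    'nc_outputSampleName': 'user.<nickname>.grid_test_run',
    'nc_nFilesPerJob': 5,            # assumed key: input files per grid job
    'nc_excludedSite': 'SOME_SITE',  # assumed key and placeholder site name
}

dry_run = DryRunDriver()
apply_grid_options(dry_run, grid_settings)
```

In a real submit script, the PrunDriver instance would be passed in place of DryRunDriver.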

Finally, submit the jobs as before:

ATestSubmit.py --submission-dir=submitDir

This job submission process may take a while to complete - do not interrupt it! You will be prompted to enter your Grid certificate password.

You can follow the evolution of your jobs by going to https://bigpanda.cern.ch/user/.

If you need more information on options available for running on the grid, check out the grid driver documentation.