Advanced -- Running a full sample on the grid

Last update: 14 Feb 2024

You may need to run a check on an entire sample, for example when performing “official” physval checks. Running such extensive checks locally is computationally prohibitive, so using the grid is far preferable. First make sure that you have a valid grid certificate (if you do not, please follow the instructions here). The tutorial package includes an example grid submission script, gridsubmit.sh, which you can open in your favorite editor and modify.

#!/bin/bash
UserPrefix="user.mvessell" #replace with your own grid username
suffix=$2
if [ -e "$1" ]
then list="$1"
else list="list.txt"; echo "$1" > "$list"   #otherwise $1 is taken as a dataset name and written to list.txt, a simple text file with the full names of your desired datasets
fi
outDS=${UserPrefix}.HIST_IDPVM.${suffix}

#replace the file below with any AOD file accessible from your machine
#It is only needed so that panda understands that the input will be AOD; no actual analysis is performed on it

ExampleFile=/eos/atlas/atlastier0/rucio/data18_13TeV/physics_Main/00357750/data18_13TeV.00357750.physics_Main.merge.AOD.f1161_m2066/data18_13TeV.00357750.physics_Main.merge.AOD.f1161_m2066._lb0121._0001.1

#The --addNthFieldOfInDSToLFN option needs to be adapted to the naming of the datasets you run on; for standard ATLAS datasets it would, for example, be
#--addNthFieldOfInDSToLFN=1,2,3,6
#It specifies which "pieces" of the DS name (split at each dot) get added to the output file names
pathena \
--filesInput ${ExampleFile} \
InDetPhysValMonitoring/InDetPhysValMonitoring_topOptions.py \
--inDsTxt $list \
--outDS $outDS \
--addNthFieldOfInDSToLFN=5,6 \
--express \
--useNewCode \
- --doExpertPlots

# IMPORTANT: do NOT use --mergeOutput; the ATLAS merge script is broken and breaks our efficiency plots
# You can add ${@:3} if you want to pass any further command line args through as IDPVM command line arguments
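
To see which pieces of a dataset name a given field list picks out, the selection can be mimicked with cut. This is only a sketch of the field numbering (1-based, split at each dot): pick_fields is a hypothetical helper, not part of pathena, and the exact way pathena assembles the output file name from these fields is not reproduced here.

```shell
# pick_fields is a hypothetical helper, not part of pathena
pick_fields() {
    # $1 = dataset name, $2 = comma-separated field numbers (1-based)
    printf '%s\n' "$1" | cut -d. -f"$2"
}

ds="data18_13TeV.00357750.physics_Main.merge.AOD.f1161_m2066"
pick_fields "$ds" 1,2,3,6   # prints data18_13TeV.00357750.physics_Main.f1161_m2066
pick_fields "$ds" 5,6       # prints AOD.f1161_m2066
```

Running this against one of your own dataset names before submitting is a quick way to check that the chosen fields keep the output file names distinguishable.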

This submission script takes a simple text file with a list of datasets as its first argument. If you open list.txt in your favorite editor, you will see:

data18_13TeV.00357750.physics_Main.merge.AOD.f1164_m2066
data18_13TeV.00357750.physics_Main.merge.AOD.f1141_m2066
data18_13TeV.00357750.physics_Main.merge.AOD.f1138_m2066

Replace these with your desired test dataset names. Then you can just do

setupATLAS
asetup Athena,22.0.45
lsetup rucio
voms-proxy-init -voms atlas
lsetup panda
source GridSubmit.sh list.txt testnamestring

to submit your jobs.
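
If you extend the script with ${@:3}, as its closing comment suggests, everything after the second argument on the source line is forwarded. Below is a minimal bash sketch of that expansion. Here extra_args is a hypothetical helper, and --flagA and --flagB are placeholders rather than real IDPVM options.

```shell
#!/bin/bash
# extra_args is a hypothetical helper that prints what ${@:3} expands to
extra_args() {
    # "${@:3}" = all positional parameters from the third onward (bash only)
    printf '%s\n' "${@:3}"
}

# --flagA and --flagB are placeholders, not real IDPVM options
extra_args list.txt testnamestring --flagA --flagB
```

The call prints --flagA and --flagB on separate lines: the list file and the name suffix are skipped, and only the remaining arguments are passed on.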

You can check on the status of your jobs in BigPanDA by clicking myBigPanDA in the top menu. Once they have finished, you can download the output histograms from the grid using rucio and the output dataset name you specified (generally you will get a helpful email with the full name of the output dataset, which will look something like user.yourusername.HIST_IDPVM.testnamestring_M_output):

rucio get user.yourusername.HIST_IDPVM.testnamestring_M_output

Now you can hadd together all of the output and proceed with the rest of the section. :-)
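
The merging step could look roughly like the sketch below. It assumes that rucio get placed the files in a directory named after the output dataset and that they end in .root; merge_outputs and the merged file name are hypothetical, while hadd itself ships with ROOT and takes the target file first, followed by the input files.

```shell
# merge_outputs is a hypothetical helper; hadd ships with ROOT
merge_outputs() {
    outdir=$1   # directory holding the downloaded files
    merged=$2   # name of the merged file to create
    # Reuse the function's positional parameters as the file list
    set -- "$outdir"/*.root
    if [ ! -e "$1" ]; then
        echo "no .root files found in $outdir" >&2
        return 1
    fi
    hadd "$merged" "$@"
}

# Example call (assumed directory name, hypothetical output file name):
# merge_outputs user.yourusername.HIST_IDPVM.testnamestring_M_output merged_physval.root
```

Note that hadd refuses to overwrite an existing target file unless you pass -f, so pick a fresh name or remove the old file first.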
