The grid driver is currently broken. Do not do the exercises in this section until it is fixed.
You can run your algorithm on the grid through built-in functionality.
In a new shell, navigate to AnalysisTutorial. Set up panda and then set up your analysis release.
The nice thing about using EventLoop (and Athena) is that you don’t have to change any of your algorithm’s code; you simply change the driver in the steering script. It is recommended that you use a separate submit script when running on the grid.
Let’s copy the content of ATestRun_eljob.py all the way up to, and including, the driver.submit() statement into a new file, ATestSubmit_eljob.py. In the new file, change submit() to submitOnly().
If you use the submit() command, the script will wait until the grid jobs are finished, which we don’t want. The submitOnly() command will launch the jobs and then return control.
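The difference between the two calls can be wrapped in a small helper for the submit script. This is only a sketch: `launch` and its `wait` flag are hypothetical names that are not part of EventLoop, and the only driver calls it relies on are `submit()` and `submitOnly()` as described above.

```python
def launch(driver, job, submit_dir, wait=False):
    """Submit `job` through `driver` and return the name of the call used.

    `launch` and `wait` are illustrative names, not part of EventLoop.
    """
    if wait:
        # Blocks until all jobs have finished; impractical for grid jobs.
        driver.submit(job, submit_dir)
        return "submit"
    # Launches the jobs and returns control right away.
    driver.submitOnly(job, submit_dir)
    return "submitOnly"
```

In ATestSubmit_eljob.py this would replace the final submit line, with `driver`, `job` and the submission directory being whatever the script already defines.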
Next, we need to tell SampleHandler how to find the input file(s) on the grid. In ATestSubmit_eljob.py, comment out the directory scan and instead scan using Rucio (shown below).
Since we are just testing this functionality, we will use a very small input dataset so that your job runs quickly and you get fast feedback on whether it succeeded.
#sample = ROOT.SH.SampleLocal("dataset")
#sample.add (os.getenv ('ASG_TEST_FILE_MC'))
#sh.add (sample)
ROOT.SH.scanRucio(sh, 'mc20_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad.deriv.DAOD_PHYS.e6337_s3681_r13167_p5169/')
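A dataset name with stray whitespace around it is unlikely to match anything in Rucio, so it can help to normalise names before scanning. The helper below is a minimal sketch in plain Python; `clean_dataset_names` is a hypothetical name, and the `scanRucio` call is left as a comment because it needs the `sh` object and the `ROOT` import from the submit script.

```python
def clean_dataset_names(names):
    """Strip surrounding whitespace, drop empty or repeated names, keep order."""
    cleaned = []
    for name in names:
        name = name.strip()
        if name and name not in cleaned:
            cleaned.append(name)
    return cleaned


datasets = clean_dataset_names([
    " mc20_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad.deriv.DAOD_PHYS.e6337_s3681_r13167_p5169/ ",
])

# In the submit script, where `sh` is the SampleHandler and ROOT is imported:
# for name in datasets:
#     ROOT.SH.scanRucio(sh, name)
```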
Next, replace the driver with the PrunDriver:
# driver = ROOT.EL.DirectDriver()
driver = ROOT.EL.PrunDriver()
We need to specify a structure for the output dataset name: our input sample has a very long name, and by default the output dataset name contains (among other strings) this input dataset name, which makes it too long for the Grid to handle. So after you’ve defined this PrunDriver, add:
driver.options().setString("nc_outputSampleName", "user.<nickname>.grid_test_run")
Be sure to replace <nickname> with your own grid nickname, as before.
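One way to avoid submitting with the placeholder still in place, or with a name that is too long, is to build the output name in a small function first. This is a sketch under assumptions: `output_sample_name` and `max_length` are hypothetical names, `jdoe` is a made-up nickname, and the limit of 100 characters is a placeholder rather than the actual grid limit.

```python
def output_sample_name(nickname, tag, max_length):
    """Return 'user.<nickname>.<tag>', rejecting unfilled or overlong names."""
    if not nickname or "<" in nickname or ">" in nickname:
        raise ValueError("replace <nickname> with your own grid nickname")
    name = "user.{}.{}".format(nickname, tag)
    if len(name) > max_length:
        raise ValueError(
            "output name is {} characters, limit is {}".format(len(name), max_length)
        )
    return name


# 'jdoe' is a made-up nickname; 100 is a placeholder, not the real grid limit.
name = output_sample_name("jdoe", "grid_test_run", max_length=100)

# In the submit script, after creating the PrunDriver:
# driver.options().setString("nc_outputSampleName", name)
```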
The PrunDriver supports a number of optional configuration options that you might recognize from the prun program. If you want detailed control over how jobs are submitted, please consult this page for a list of options: GridDriver
Finally, submit the jobs as before:
ATestSubmit_eljob.py --submission-dir=submitDir
This job submission process may take a while to complete - do not interrupt it! You will be prompted to enter your Grid certificate password.
If you need to log out from your computer but still want output to be continuously downloaded so that it is immediately available when you come back, a somewhat more advanced GridDriver exists, which uses Ganga and GangaService to keep running in the background. See the EventLoop twiki page for more info.
If you need more information on options available for running on the grid, check out the grid driver documentation.
If you are using Athena, you can learn how to submit jobs to the grid using pathena at this link.