The PanDA client package contains a number of tools you can use to submit and manage analysis jobs on PanDA.
While pathena is used to submit Athena user jobs to PanDA, more general jobs (e.g. ROOT and Python scripts) can be submitted to the grid by using prun.
Finally, pbook is a Python-based bookkeeping tool for all PanDA analysis jobs.
Detailed information about each of the tools can be found in the above page.
If you are working on lxplus, the client is already installed, so to use it you only need to do:
setupATLAS
lsetup panda
These two commands set up the CVMFS software environment and then the PanDA client tools.
To keep your grid work consolidated, create a new directory within tutorial called GridTutorial.
From your GridTutorial directory, create a new directory for a simple prun test and navigate into the new directory:
mkdir prunTest
cd prunTest
Working in a dedicated directory is important because prun sends (almost) all files in, and below, your current directory to the grid. Any unnecessary files in your current directory will also be sent to the grid and will slow down the job launch.
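Before submitting, it can help to see what is actually sitting in and below the current directory. The following is a minimal sketch, not part of the PanDA client: the function names are hypothetical, and it lists every regular file, whereas prun applies its own exclusion rules, so the real upload may be smaller.

```python
import os


def list_upload_candidates(top="."):
    """Return sorted (relative_path, size_in_bytes) pairs for files under top."""
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip broken symlinks and anything that is not a regular file.
            if os.path.isfile(path):
                rel = os.path.relpath(path, top)
                candidates.append((rel, os.path.getsize(path)))
    return sorted(candidates)


def total_size(candidates):
    """Sum the sizes (in bytes) of all listed files."""
    return sum(size for _path, size in candidates)
```

Calling list_upload_candidates() from inside prunTest should show only the files you intend to send; a long list or a large total is a hint to clean up first.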
Now, create a Python script called HelloWorld.py (using your favorite editor) that contains the following lines:
#!/usr/bin/env python3
print("Hello world!")
Make the Python script executable with:
chmod u+x HelloWorld.py
Before launching it to the grid, it is important to run it locally. You should always do this to avoid wasting grid resources on jobs that crash locally.
./HelloWorld.py
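The same local check can be automated. The sketch below is an illustration only: run_locally is a hypothetical helper, and it starts the script with the current Python interpreter rather than through the script's shebang line. It reports whether the script exited with status 0 and returns what it printed.

```python
import subprocess
import sys


def run_locally(script_path):
    """Run a Python script locally; return (succeeded, captured_stdout)."""
    result = subprocess.run(
        [sys.executable, script_path],
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode the output to str
    )
    return result.returncode == 0, result.stdout
```

If the first element of the returned pair is False, fix the script before sending anything to the grid.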
Now we can use the prun command to submit this script to the grid:
prun --outDS user.<nickname>.pruntest --exec HelloWorld.py
Here <nickname> is your grid nickname/grid name (which is the same as your lxplus username).
This will queue two jobs: one build job that recreates your job environment and one corresponding to the actual Hello World job. The build job will execute first, and once it has finished the Hello World job will be executed.
When the job has finished, we will try to find the “Hello world!” message in the output!
If the job is successfully launched, you will see a confirmation printed to the screen along with a jediTaskID number that you will need in the next step.
To monitor the progress and check the log file output of a job, we can use the BigPanDA monitor at https://bigpanda.cern.ch. Scroll down to the Task ID field, enter the number we noted above, and click on search.
This will take you to a page associated with this task, which shows that there are two jobs (in some stage of running). Have a look at this page to see the various pieces of information provided.
Now we will try to look at the output. We will need both jobs to have finished. If the jobs do not seem to have started running after a few minutes, it is suggested to carry on with the tutorial, and to check back frequently on the status of the jobs.
From the task page, click on the Show jobs drop-down menu.
This will give you several options of associated jobs to view. Click on All (including retries) to see a list of all jobs associated with the task.
If you want to get to the list of jobs directly, you can use the URL https://bigpanda.cern.ch/jobs/?jeditaskid=4203786 and change the jeditaskid to the value corresponding to your task.
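As a small illustration of the URL pattern above, the following sketch builds the job-list link from a task ID. The function name jobs_url is hypothetical; only the base URL and the jeditaskid query parameter come from the text.

```python
from urllib.parse import urlencode

# Base address of the BigPanDA job list, as quoted in the text.
BIGPANDA_JOBS_URL = "https://bigpanda.cern.ch/jobs/"


def jobs_url(jedi_task_id):
    """Return the BigPanDA job-list URL for the given jediTaskID."""
    # int() rejects anything that is not a plain task number.
    query = urlencode({"jeditaskid": int(jedi_task_id)})
    return BIGPANDA_JOBS_URL + "?" + query


print(jobs_url(4203786))
# prints: https://bigpanda.cern.ch/jobs/?jeditaskid=4203786
```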
From here, you can click on the PanDA ID number corresponding to a job. This will take you to a new page with details about that job. From this page, click on the Logs drop-down menu.
Now, click on Log Files to get a list of the log files associated with the job.
The log containing the Hello World output is called payload.stdout. Click on it to open it and try to find the “Hello world!” message.
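For longer logs, searching by eye quickly becomes tedious. The sketch below assumes you have saved a copy of payload.stdout to your local disk; find_lines is a hypothetical helper, not a PanDA tool. It returns every line containing a given piece of text, together with its line number.

```python
def find_lines(log_path, needle):
    """Return (line_number, line) pairs for lines of log_path containing needle."""
    matches = []
    # errors="replace" keeps the search going if the log has odd bytes in it.
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if needle in line:
                matches.append((number, line.rstrip("\n")))
    return matches
```

For example, find_lines("payload.stdout", "Hello world!") should return one match for this job; the same function can be pointed at words such as "Error" or "Traceback" when debugging a failed job.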
This forms the basis of simple debugging of jobs that fail on the Grid. Because you will have tested the job locally first, a failure on the grid may well be a transient grid error, but it is still useful to know how to search the output files for problems. Note that it is also possible to download the log files as a dataset using dq2/Rucio.
Because we did not need to compile any code to run this job (it is just a simple Python script), we do not actually need the build stage of the job.
To launch a prun job without the build stage, type the following command:
prun --noBuild --outDS user.<nickname>.pruntest --exec HelloWorld.py
Now, only the script will be run. Usually though, you will probably want to compile some code first. You can read more about the PanDA tools.
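The two submission commands used in this section differ only in the --noBuild flag. As a rough sketch, the helper below assembles either form as an argument list; prun_command and the nickname "jdoe" are hypothetical, and only the --outDS, --exec and --noBuild options shown above are used.

```python
import shlex


def prun_command(nickname, suffix, exec_string, no_build=False):
    """Build the prun argument list for a simple submission."""
    args = ["prun"]
    if no_build:
        # Skip the build job when there is nothing to compile.
        args.append("--noBuild")
    # Output dataset names follow the user.<nickname>.<suffix> pattern.
    args += ["--outDS", f"user.{nickname}.{suffix}", "--exec", exec_string]
    return args


# "jdoe" is a placeholder nickname for illustration only.
print(shlex.join(prun_command("jdoe", "pruntest", "HelloWorld.py", no_build=True)))
# prints: prun --noBuild --outDS user.jdoe.pruntest --exec HelloWorld.py
```

Keeping the command as a list makes it easy to print for inspection first and, once it looks right, to pass to subprocess.run from a shell where the PanDA client has been set up.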
So far, only the basics of BigPanDA monitoring have been described. As an optional exercise, see if you can find the link that shows all of your jobs. This is a good page to bookmark and come back to in the future.