In the previous step, we saw how to find and view a log file in a web browser. But what if we want to download it? We can do this with the Rucio tools.
Once one of your jobs has completed, we will find and download its log file.
When using Rucio, it is almost always better to run it in a separate terminal from the one where you are running your code or submitting grid jobs. This minimizes the potential for conflicts between different Python versions.
Set up the Rucio tools if you haven’t done so already.
lsetup rucio
If you are working on a new lxplus node, or on any computer where you didn’t just submit your PanDA job, you may need to create a VOMS proxy:
voms-proxy-init -voms atlas:/atlas
Unlike pathena and prun, rucio won’t do that for you.
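Since rucio will not create a proxy for you, it can help to check for one before running any rucio commands. The sketch below is only an illustration: it assumes the installed voms-proxy-info accepts the -exists option (returning a zero exit status when a valid proxy is present), and check_proxy is a hypothetical helper name, not part of any ATLAS tool.

```shell
# Hypothetical helper: report whether a VOMS proxy looks usable.
# Assumption: voms-proxy-info supports -exists on this system.
check_proxy() {
    if ! command -v voms-proxy-info >/dev/null 2>&1; then
        # The VOMS client tools are not set up in this shell.
        echo "missing"
    elif voms-proxy-info -exists >/dev/null 2>&1; then
        echo "valid"
    else
        # No proxy, or the proxy has run out of lifetime.
        echo "expired"
    fi
}

proxy_state=$(check_proxy)
echo "proxy state: $proxy_state"
```

If the state is anything other than valid, run the voms-proxy-init command shown above before continuing.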
Go back to the BigPanDA web page and find the page with the jediTaskID that we used previously. Search for the Output entry in the Containers table, and note the log file container name, e.g., user.aparker.pruntest.log. Back in your terminal session, try to find this log file on the grid:
$ rucio list-dids user.aparker:*pruntest*log*
+--------------------------------------------------+--------------+
| SCOPE:NAME | [DID TYPE] |
|--------------------------------------------------+--------------|
| user.aparker:user.aparker.pruntest.log | CONTAINER |
| user.aparker:user.aparker.pruntest.log.340520924 | DATASET |
+--------------------------------------------------+--------------+
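Each entry in the listing is written as SCOPE:NAME, and it is sometimes handy to pull the two parts apart in a script. A minimal sketch using plain shell parameter expansion, with the dataset name from the listing as the example value:

```shell
# Split a Rucio identifier of the form SCOPE:NAME into its two parts.
did="user.aparker:user.aparker.pruntest.log.340520924"

scope="${did%%:*}"   # everything before the first colon
name="${did#*:}"     # everything after the first colon

echo "scope: $scope"
echo "name:  $name"
```

This only manipulates the string; it does not contact Rucio.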
We now have two options: download the whole container, or download only the dataset inside it.
Let’s do the second:
rucio download user.aparker:user.aparker.pruntest.log.340520924
After it finishes downloading, navigate into the downloaded directory and extract the files from the tarball:
cd user.aparker.pruntest.log.340520924/
tar -xvf user.aparker.pruntest.log.23186476.000001.log.tgz
A tarball (a file with the tgz extension) is a set of files packaged together and compressed using gzip. This is an efficient way to transfer large numbers of small files.
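To see how a tarball behaves without touching your real logs, you can build a tiny one in a scratch directory. This is only a sketch: the names demo_logs and demo.log.tgz are made up for the example and have nothing to do with the files PanDA produces.

```shell
# Build a tiny tarball in a scratch directory, then list and extract it.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/demo_logs"
echo "hello from the payload" > "$workdir/src/demo_logs/payload.stdout"
echo "no errors" > "$workdir/src/demo_logs/payload.stderr"

# -c create, -z compress with gzip, -f archive file, -C change directory first
tar -czf "$workdir/demo.log.tgz" -C "$workdir/src" demo_logs

# -t lists the contents without extracting anything
listing=$(tar -tzf "$workdir/demo.log.tgz")
echo "$listing"

# -x extracts; -C chooses the destination directory
mkdir -p "$workdir/out"
tar -xzf "$workdir/demo.log.tgz" -C "$workdir/out"
```

Listing with -t first is a cheap way to check what a tarball contains before unpacking it.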
Extracting the tarball gives you access to the log file (as well as much more related information) from your job. This can be useful for debugging.
There is a lot of information in here, but once you have extracted the logs, the file you are probably looking for is payload.stdout.
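If you are not sure where payload.stdout ended up after extraction, find can locate it and grep can pull out suspicious lines. The sketch below runs against a mock directory so that it is self-contained; the directory layout, the log contents, and the helper name find_payload_logs are all assumptions made for the example.

```shell
# Hypothetical helper: print the path of every payload.stdout below a directory.
find_payload_logs() {
    find "$1" -type f -name 'payload.stdout'
}

# Mock directory standing in for an extracted log tarball (layout is an assumption).
mock=$(mktemp -d)
mkdir -p "$mock/extracted_job"
printf 'starting job\nERROR: something failed\nfinished\n' > "$mock/extracted_job/payload.stdout"

# Take the first match and search it for lines mentioning an error.
log=$(find_payload_logs "$mock" | head -n 1)
matches=$(grep -n -i 'error' "$log")   # -n line numbers, -i ignore case
echo "$matches"
```

On your real logs, replace "$mock" with the directory you extracted the tarball into.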
As you learned earlier, you can also download individual files directly with Rucio. If you go to the PanDA job page that you saw earlier, you will see the log tarball in the table that says 3 job files. You can download that file directly with rucio:
rucio download user.aparker.pruntest.log.23186476.000001.log.tgz
Here rucio guessed the right scope to use thanks to the name of the tarball.
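The idea of guessing a scope from a name can be sketched in a few lines of shell. This is an illustration of the naming convention, not Rucio's actual code: it assumes that names starting with user. or group. take their first two dot-separated fields as the scope and that any other name takes only the first field. guess_scope is a hypothetical helper name.

```shell
# Sketch of a name-based scope guess (an assumed convention, not Rucio's implementation).
guess_scope() {
    case "$1" in
        user.*|group.*) echo "$1" | cut -d. -f1-2 ;;  # e.g. user.aparker
        *)              echo "$1" | cut -d. -f1 ;;    # first field only
    esac
}

file="user.aparker.pruntest.log.23186476.000001.log.tgz"
guessed=$(guess_scope "$file")

# Print the fully qualified SCOPE:NAME form.
echo "$guessed:$file"
```

When in doubt, spelling out the scope yourself in the SCOPE:NAME form avoids relying on any guess.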