Containers on the grid

Last update: 03 Apr 2023

It is possible to run customised Docker containers built either on athena images, as explained in the docker containers section of the athena tutorial, or on other base images available on Docker Hub and elsewhere. These containers are usually called “standalone” because, unlike the generic OS containers used by default on the grid, they contain all their dependencies (or at least they should). This is particularly useful if the required software hasn’t been installed centrally.

How to run a custom image using prun

Using prun it is possible to run images directly from a registry, typically Docker Hub but also GitLab. For example, to run a Hello World using an alpine image from Docker Hub you can try:

prun --containerImage docker://alpine \
--exec "echo 'Hello World\!'" --outDS user.$RUCIO_ACCOUNT.test

where RUCIO_ACCOUNT is usually your CERN user name.
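If RUCIO_ACCOUNT happens to be empty, the dataset name in the command above would expand to `user..test`. As a minimal sketch, assuming you want to catch that before submitting, the name can be built by a small helper. The function name `outds_name` is hypothetical and not part of prun or Rucio.

```shell
# Hypothetical helper (not part of prun): build an output dataset name of the
# form user.<account>.<suffix>, and fail if RUCIO_ACCOUNT is empty or unset.
outds_name() {
    if [ -z "${RUCIO_ACCOUNT:-}" ]; then
        echo "RUCIO_ACCOUNT is not set" >&2
        return 1
    fi
    printf 'user.%s.%s\n' "$RUCIO_ACCOUNT" "$1"
}

# Usage (left commented out so the sketch itself submits nothing):
# prun --containerImage docker://alpine \
#   --exec "echo 'Hello World\!'" --outDS "$(outds_name test)"
```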

The task will appear in BigPanDA as a normal task and, as explained in the basics, you can access the log files to see the stages of the job. Note that the images have to be public to be accessed by the pilot.

How to add images to CVMFS

For scalability reasons it is better not to run directly from the registries but to put your image in CVMFS.

To do that you need to edit this file on GitLab directly to add your image. The system will then fetch it from the registry, build it and unpack it in CVMFS as a Singularity image usable on the grid by every job. Every time you update the image the system will also update the CVMFS copy. If you use wildcards the system will get all the tags. You can add images from any registry as long as they are publicly accessible (the same requirement as for the pilot).

Once the image is unpacked it will appear in this CVMFS area /cvmfs/unpacked.cern.ch/ under the specific registry directory.

For example, the line https://registry.hub.docker.com/llorente/threejet-nnlo:* will result in a llorente directory with all the ‘threejet-nnlo’ tags unpacked underneath:

ls /cvmfs/unpacked.cern.ch/registry.hub.docker.com/llorente/
threejet-nnlo:0.1  threejet-nnlo:0.2  threejet-nnlo:0.3  threejet-nnlo:0.4
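The mapping from a registry line to its unpacked location can be sketched as a one-line string transformation. This assumes, based only on the example above, that the CVMFS path is the registry URL without its `https://` prefix, placed under /cvmfs/unpacked.cern.ch/. The function name `unpacked_path` is hypothetical.

```shell
# Hypothetical helper: map a registry URL with a concrete tag to the path
# where the unpacked image is expected in CVMFS (assumed layout, see above).
unpacked_path() {
    # Drop the leading "https://" and prepend the CVMFS area.
    printf '/cvmfs/unpacked.cern.ch/%s\n' "${1#https://}"
}

unpacked_path "https://registry.hub.docker.com/llorente/threejet-nnlo:0.1"
# prints /cvmfs/unpacked.cern.ch/registry.hub.docker.com/llorente/threejet-nnlo:0.1
```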

The equivalent of the alpine Hello World command line in the section above is the following:

prun --containerImage /cvmfs/unpacked.cern.ch/registry.hub.docker.com/library/alpine:3.10.2 \
--exec "echo 'Hello World\!'" --outDS user.$RUCIO_ACCOUNT.test

where we just replaced the registry image with its unpacked instance in CVMFS.

The reason this step is preferable is that, from a grid job's point of view, there are currently two main differences between the registry and CVMFS:

  1. If you run from the registry the middleware has to transfer the whole image and build it on the worker nodes (WNs) before using it. While this method can be convenient for low-level testing, it is definitely not suitable for a large-scale submission of jobs. On top of this, registries cap either the number of accesses or the bandwidth users can use, so if you send thousands of jobs that all try to download the image they will fail.
  2. CVMFS instead builds the image only once, and again only when it is updated. It serves only the pieces requested by the job rather than the whole image beforehand, and those pieces are then cached on the WNs for further use.
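
The two options can be combined in a small wrapper that prefers the CVMFS copy and falls back to the registry only when the image has not been unpacked yet. This is a minimal sketch, assuming an unpacked image shows up as a directory under /cvmfs/unpacked.cern.ch/. The function name `pick_image` is hypothetical.

```shell
# Hypothetical helper: print the CVMFS path if the unpacked image directory
# exists, otherwise print the registry reference given as the fallback.
pick_image() {
    if [ -d "$1" ]; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "$2"
    fi
}

# Usage (left commented out so the sketch itself submits nothing):
# image=$(pick_image \
#   /cvmfs/unpacked.cern.ch/registry.hub.docker.com/library/alpine:3.10.2 \
#   docker://alpine)
# prun --containerImage "$image" \
#   --exec "echo 'Hello World\!'" --outDS user.$RUCIO_ACCOUNT.test
```

The registry fallback is only sensible for small tests, for the reasons given in point 1.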