The Basics of Rucio

Last update: 23 Aug 2024

Set up environment

When using Rucio, it is strongly recommended to work in a fresh, clean shell that is separate from the one you use for Athena etc. If you do need Rucio in the same session, set it up at the time you call asetup, using the “wrapper” mode:

lsetup 'rucio -w'

To load the environment on lxplus do:

lsetup rucio
voms-proxy-init -voms atlas

Enter your grid password when prompted by the voms-proxy-init command, and answer yes to any questions the client software asks.

This will set up the environment to allow authentication via grid tools and set up and configure the Rucio clients.

Tip: Note that this will always set up the latest Rucio client version.

Don’t forget to create your ATLAS proxy if you didn’t do this in the first step:

voms-proxy-init -voms atlas

These proxies last 12 hours by default. If you set up software on the same machine, you might find that you still have a valid proxy that can be used. The setup software will let you know.
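As a minimal sketch of reusing a proxy that is still valid, the check below renews only when little lifetime remains. It assumes that voms-proxy-info --timeleft prints the remaining lifetime in seconds; the helper name proxy_needs_renewal and the one-hour threshold are illustrative choices, not part of the ATLAS setup.

```shell
# proxy_needs_renewal SECONDS_LEFT [MIN_SECONDS]
# Succeeds (exit 0) when the remaining lifetime is below the threshold.
# The helper name and the 3600 s default are illustrative assumptions.
proxy_needs_renewal() {
    p_left=${1:-0}
    p_min=${2:-3600}
    # Treat empty or non-numeric input as "no valid proxy".
    case $p_left in
        ''|*[!0-9]*) p_left=0 ;;
    esac
    [ "$p_left" -lt "$p_min" ]
}

# Only query the proxy if the grid tools are available in this shell.
if command -v voms-proxy-info >/dev/null 2>&1; then
    # Assumption: --timeleft prints the remaining lifetime in seconds.
    left=$(voms-proxy-info --timeleft 2>/dev/null) || left=0
    if proxy_needs_renewal "$left"; then
        voms-proxy-init -voms atlas
    else
        echo "Existing proxy is still valid for ${left}s"
    fi
fi
```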

Basic information

First, let’s ask a few simple questions about Rucio, and what Rucio knows about you.

Use the commands:

rucio ping
rucio whoami

The first command contacts the Rucio server and prints the version it reports; the second prints information about the Rucio account you are using.

Tip: Note that Rucio accounts can represent users (i.e., you), groups (e.g., Higgs) or activities (e.g., tier0), and your credential (i.e., your grid certificate) can map to several of these accounts if required (e.g., for group production roles).
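The two commands can also serve as a quick sanity check before doing any real work. The sketch below wraps them in a hypothetical helper, rucio_ready, which relies only on the exit status of rucio ping and rucio whoami and does not parse their output.

```shell
# rucio_ready: succeed only if the client is set up and both basic
# commands work. The helper name is a hypothetical convenience.
rucio_ready() {
    command -v rucio >/dev/null 2>&1 || { echo "rucio client not set up" >&2; return 1; }
    rucio ping   >/dev/null 2>&1 || { echo "rucio ping failed" >&2; return 1; }
    rucio whoami >/dev/null 2>&1 || { echo "rucio whoami failed" >&2; return 1; }
}

if rucio_ready; then
    echo "Rucio is ready"
else
    echo "Rucio is not ready; check the setup and your proxy"
fi
```

If the check fails, setting up the environment again and renewing the proxy is a reasonable first thing to try.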

Scopes

Much like namespaces in C++, Rucio has the concept of a scope, which helps to organise datasets and other data. To see that there is already a scope for you, type:

rucio list-scopes | grep $USER

Tip: On lxplus, the environment variable USER is usually the same as the nickname associated with the Rucio account. On other computers, use the variable RUCIO_ACCOUNT to achieve the same effect.
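One way to make the scope lookup work on both lxplus and other machines is to prefer RUCIO_ACCOUNT and fall back to USER. This is a small sketch; the helper name rucio_account_name is hypothetical.

```shell
# Print the account nickname: RUCIO_ACCOUNT if set, otherwise USER.
# The helper name is a hypothetical convenience.
rucio_account_name() {
    printf '%s\n' "${RUCIO_ACCOUNT:-${USER:-}}"
}

account=$(rucio_account_name)

# Only query Rucio if the client is set up and a nickname was found.
if command -v rucio >/dev/null 2>&1 && [ -n "$account" ]; then
    rucio list-scopes | grep -- "$account" || echo "no scope found for $account"
fi
```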

To see all available scopes, just use the command rucio list-scopes.

All items known to Rucio, e.g., Files, Datasets, and Containers, are created within a scope and given a name that is unique within that scope. The combination of the two is known as a Data Identifier (DID), so everything can be referred to by a scope and a name: <scope>:<name>.
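To illustrate the <scope>:<name> form, the sketch below splits a DID at the first colon using plain shell parameter expansion. The helper names and the example DID are hypothetical.

```shell
# Print the scope of a DID written as <scope>:<name>.
# Fails if the argument contains no colon.
did_scope() {
    case $1 in
        *:*) printf '%s\n' "${1%%:*}" ;;
        *)   return 1 ;;
    esac
}

# Print the name part of a DID (everything after the first colon).
did_name() {
    printf '%s\n' "${1#*:}"
}

did='user.jdoe:user.jdoe.test.dataset'   # hypothetical DID
echo "scope: $(did_scope "$did")"
echo "name:  $(did_name "$did")"
```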

Tip: In the MC generation section, you were introduced to the concept of a DSID, which is a number that appears as part of the DID.

Tip: For most ATLAS datasets that follow standard naming rules, the scope is also the first part of the dataset name. That allows Rucio to guess the scope in many cases, so you don’t always need to provide it. It is good practice to specify it in any case.
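The guessing described in the tip can be sketched as follows. This is an illustration, not Rucio's own code: it assumes the scope is the first dot-separated field of the name, and that for names starting with user. or group. the scope spans the first two fields. The helper name guess_scope and the example names are hypothetical.

```shell
# Guess the scope of a dataset name that follows standard naming rules.
# Assumptions: the scope is the first dot-separated field, except for
# user.* and group.* names, where it is the first two fields.
guess_scope() {
    case $1 in
        *:*)                # scope given explicitly as <scope>:<name>
            printf '%s\n' "${1%%:*}" ;;
        user.*|group.*)     # e.g. user.<nickname>.<rest>
            gs_rest=${1#*.}
            printf '%s.%s\n' "${1%%.*}" "${gs_rest%%.*}" ;;
        *.*)                # e.g. <project>.<rest>
            printf '%s\n' "${1%%.*}" ;;
        *)
            return 1 ;;
    esac
}

guess_scope 'user.jdoe.test.dataset'       # hypothetical user dataset name
guess_scope 'mc16_13TeV.123456.example'    # hypothetical MC dataset name
```

Passing the full <scope>:<name> form avoids relying on the guess at all, which is why specifying the scope explicitly remains the safer habit.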