Now let’s try to list some interesting information. Remember that all datasets (and other DIDs) exist within a scope, so we can try to list all the known items within your scope:
rucio list-dids "user.${USER}:*"
Depending on the type of shell you are using, the quotes may or may not be important.
If this is the first time you are using the grid, then this may well not show anything.
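To see why the quotes can matter, here is a minimal sketch that does not contact Rucio at all. It creates a scratch directory holding one hypothetical file, user.alice:leftover.txt, whose name happens to match the pattern, and then compares what the shell hands to a command with and without quotes. The user name alice and the file name are made up for illustration.

```shell
# Work in a scratch directory so the demonstration cannot touch real files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Hypothetical file whose name happens to match the unquoted pattern.
touch "user.alice:leftover.txt"

unquoted=$(echo user.alice:*)    # the shell expands the pattern before echo runs
quoted=$(echo "user.alice:*")    # the pattern reaches echo unchanged

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# Clean up the scratch directory.
cd - >/dev/null || exit 1
rm -rf "$demo_dir"
```

Without quotes the command receives the matching file name instead of the wildcard. When nothing matches, bash passes the pattern through unchanged by default, whereas zsh reports an error by default, which is why the behaviour depends on your shell. Quoting the pattern avoids both surprises.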
Let’s now try to find some data. We took data in 2022, and the data was (mostly) at 13.6 TeV centre-of-mass energy. The data should then be in the scope data22_13p6TeV.
For MC datasets, the year refers to the start of production in a given configuration, which may not correspond to the year of the data being modeled. For example, MC datasets modeling Run 2 are in the mc20_13TeV scope. For data, the year should match the data-taking year except in rare cases. The way datasets are named is described in the nomenclature rules and here.
Try to find the list of all DIDs for the data22_13p6TeV scope.
rucio list-dids "data22_13p6TeV:*"
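Now try to limit the returned set of items to those of type dataset, from run 450445, and with data type AOD. Note that the commands from here on use the data23_13p6TeV scope, which is where run 450445 appears in the example output.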
Previously you used the wildcard * to search for all names in the scope. You can also search for patterns, e.g., *Main*. Additionally, you can supply an extra --filter argument to the command to filter the results.
rucio list-dids "data23_13p6TeV:*450445*" --filter type=dataset,datatype=AOD
There may be a lot more, but here is a taste of what you might see…
+---------------------------------------------------------------------------------+-----------------+
| SCOPE:NAME | [DID TYPE] |
|---------------------------------------------------------------------------------+-----------------|
| data23_13p6TeV:data23_13p6TeV.00450445.physics_Main.merge.AOD.f1342_m2167 | DIDType.DATASET |
| data23_13p6TeV:data23_13p6TeV.00450445.physics_Main.merge.AOD.x731_m2165 | DIDType.DATASET |
| data23_13p6TeV:data23_13p6TeV.00450445.physics_MinBias.merge.AOD.f1342_m2167 | DIDType.DATASET |
| data23_13p6TeV:data23_13p6TeV.00450445.physics_ZeroBias.merge.AOD.f1340_m2165 | DIDType.DATASET |
| data23_13p6TeV:data23_13p6TeV.00450445.physics_CosmicCalo.merge.AOD.f1340_m2165 | DIDType.DATASET |
| data23_13p6TeV:data23_13p6TeV.00450445.express_express.merge.AOD.f1340_m2165 | DIDType.DATASET |
| data23_13p6TeV:data23_13p6TeV.00450445.express_express.merge.AOD.x731_m2165 | DIDType.DATASET |
| data23_13p6TeV:data23_13p6TeV.00450445.physics_CosmicCalo.merge.AOD.x731_m2165 | DIDType.DATASET |
+---------------------------------------------------------------------------------+-----------------+
Thanks to the strict nomenclature rules, you can also identify the type using the search string:
rucio list-dids "data23_13p6TeV:*450445*.AOD.*" --filter type=dataset
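Because the names follow a fixed dot-separated layout, you can also pick them apart locally. The sketch below splits one of the DIDs from the listing above into its scope and name fields using only standard shell features. The variable names (project, run, stream, step, datatype, tags) are labels chosen for this example and are an assumption about what each field means; check the nomenclature rules for the official definitions.

```shell
# Example DID taken from the listing above.
did="data23_13p6TeV:data23_13p6TeV.00450445.physics_Main.merge.AOD.f1342_m2167"

scope=${did%%:*}    # everything before the first colon
name=${did#*:}      # everything after the first colon

# Split the name on dots; the last variable receives whatever remains.
IFS=. read -r project run stream step datatype tags <<EOF
$name
EOF

echo "scope=$scope run=$run stream=$stream datatype=$datatype tags=$tags"
```

This is only a convenience for inspecting names you already have. Searching is still best left to the list-dids pattern and filter options.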
Now let’s enter the name of a non-existing dataset such as:
rucio list-dids "data23_13p6TeV:data23_13p6TeV.00450445.physics_Main.merge.AOD.f1342_m21670"
The output indicates that no such dataset exists:
+--------------+--------------+
| SCOPE:NAME | [DID TYPE] |
|--------------+--------------|
+--------------+--------------+
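If you want a script to notice an empty result like this one, a rough sketch is to count the data rows in the table. The helper count_dids below is hypothetical, not part of Rucio, and it assumes that every data row carries a "DIDType." label as in the listings above. It is run here on a saved copy of the empty table so that it works without grid access.

```shell
# Count the data rows of a list-dids table read from standard input.
# Assumption: each data row contains a "DIDType." label, as in the listings above.
count_dids() {
    # grep -c prints 0 and returns non-zero when nothing matches; keep the 0.
    grep -c 'DIDType\.' || true
}

# Saved copy of the empty table shown above.
empty_table='+--------------+--------------+
| SCOPE:NAME | [DID TYPE] |
|--------------+--------------|
+--------------+--------------+'

n=$(printf '%s\n' "$empty_table" | count_dids)
if [ "$n" -eq 0 ]; then
    echo "no matching DIDs"
else
    echo "$n matching DIDs"
fi
```

In practice you would pipe the output of the rucio list-dids command into count_dids instead of the saved table, assuming the table is written to standard output in the same format.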
Rucio, like many ATLAS tools, has quite a bit of built-in help. You can try, for example:
rucio list-dids --help
in case you forget the filter format or what options are available.