Basics of Using the Grid

Last update: 21 Nov 2024

In this section, we will teach you how to use the grid to run your jobs. The grid offers two main advantages. First, it has vast computing resources that allow your jobs to finish much more quickly than they would on a single computer. Second, it gives you access to all of the ATLAS datasets, which are stored at sites distributed around the world and take up far more space than you have available locally. Running on the grid is not without disadvantages, however. It requires more overhead than running locally, so it is not ideal for small jobs. More things beyond your control can also go wrong: for example, a grid site that hosts your data may go offline temporarily.

To run analysis jobs on the grid, you need a mechanism to send your software and files to a grid site that hosts the datasets you want to run over, and a tool to monitor the progress of those jobs. In this tutorial, we will show you how to submit your jobs to the grid using a tool called PanDA.

Below are some web links that can help you track the status of your jobs:

  • BigPanDA: The PanDA monitoring site for your jobs (and all ATLAS jobs).
  • PanDA Components: Detailed information about how the systems inside PanDA connect and the terminology used by the system.
  • ADC monitoring tools: The full set of tools for monitoring grid operations, as used by distributed computing experts. For the curious!

We will use these tools throughout the tutorial to check on the status of the jobs.