Shift guide for release coordinators

Last update: 23 Jan 2020

Introduction

This page describes the workflows and procedures a release coordinator is concerned with. It is assumed that you are familiar with the general workflow for ATLAS Offline software development as summarised in the Workflow Quick Reference. An overview of the tasks of the software review shifters and a knowledge of the basics of git are also helpful.

Checking the status of the release

Before accepting new MRs, you should check the status of the latest nightly. The first priority of a release coordinator should be to make sure that the release is in good shape, and obviously existing problems should usually be fixed before adding more MRs (though as release coordinator you should always feel free to use your judgement).

How to check the nightly build status?

The status of the nightly builds can be checked in two places:

The latter contains information on ART build jobs and other detailed information for the different build steps. Information on the nightly (and CI) build infrastructure is available on this twiki page.

For each nightly build check:

  1. All unit tests succeed. If there are failures, try to identify the responsible MR and post a comment on it. If you cannot identify any MR, open a Jira ticket in the relevant Jira tracker.
  2. No new failures in the ART build tests.
  3. Check the ART grid tests to see which tests are running or failing. An MR may have broken some tests that are not part of the CI tests (the CI necessarily runs only a subset of all possible tests).

Any problem seen in the unit tests will very likely also affect the CI jobs. To help the review shifters please mark any relevant Jira tickets with the “CI” label. This will make the issue appear on the CI Status Board.

What to do if nightly is not yet available on cvmfs?

First check on Jenkins whether the nightly has finished or is still running. If it is still running in the afternoon, check that it is not stuck on something. The console log file should give you a good idea of whether the nightly is progressing (select the nightly you are interested in and then click on the “Console” icon).

If the nightly has finished and the build status is a blue ball, the nightly should be available on CVMFS.

If the nightly has finished and the build status is a red ball, the build failed for some reason and you need to identify the exact cause. Every nightly executes the following steps, so check them one by one to see where the problem occurred:

  • git clone
  • building externals
  • building the main project
  • creating the RPMs
  • preparing the RPMs and copying them to EOS – once done the RPMs should be visible under http://atlas-software-dist-eos.web.cern.ch/atlas-software-dist-eos/RPMs/nightlies/
  • installation on CVMFS (at the end of the Console log there will be a line like the following)
    The logfile is copied under: /cvmfs/atlas-nightlies.cern.ch/repo/sw/master/2018-07-15T2059/master__Athena__x86_64-slc6-gcc62-dbg__2018-07-15T2059__1531720274.ayum.log

All the information can be found inside the “Console” log file for the given nightly.
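When checking whether the CVMFS installation step was reached, it can help to search a saved copy of the Console log for the "The logfile is copied under:" line quoted above. The following is a minimal sketch, not an official tool. The function name `find_install_log` is hypothetical, and the meaning of the `__`-separated fields in the file name (branch, project, platform, timestamp) is inferred from the single example above and is an assumption.

```python
import re
from typing import Optional

# Matches the line quoted in the guide; the path is taken to be the first
# whitespace-free token after the colon.
LOGFILE_LINE = re.compile(r"The logfile is copied under:\s*(\S+)")


def find_install_log(console_text: str) -> Optional[dict]:
    """Return details of the last install log mentioned in the text, or None."""
    matches = LOGFILE_LINE.findall(console_text)
    if not matches:
        return None  # the installation step was probably not reached
    path = matches[-1]
    filename = path.rsplit("/", 1)[-1]
    # Assumed layout: branch__project__platform__timestamp__number.ayum.log
    parts = filename.split("__")
    if len(parts) != 5 or not parts[4].endswith(".ayum.log"):
        return {"path": path}  # unknown layout: report the path only
    branch, project, platform, stamp, _ = parts
    return {
        "path": path,
        "branch": branch,
        "project": project,
        "platform": platform,
        "stamp": stamp,
    }
```

If the function returns None for a finished nightly, the failure most likely happened in one of the earlier steps in the list.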

Warning Note that problems with git and with copying the RPMs to EOS are IT infrastructure issues. If there were git problems, you can restart the nightly. If there were problems copying the RPMs to EOS, check with Atlas.Release to redo the copy.

Handling new merge requests

Finding open merge requests for your release

In git, different releases are represented by different branches of the atlas/athena git repository. One subtle point is that which code is built from the branch is controlled by the project-specific scripts that live in the Projects directory of the branch (e.g., here for the master branch). You are, however, the release coordinator for all projects built from a branch.

Upon creation, merge requests will be tagged automatically by the CI system with a label indicating their target branch. In addition, the CI system also adds a label indicating the stage and result of the software review process. Changes approved by the software review shifters are labeled as review-approved. Therefore, you should be able to select all ready-to-be-accepted merge requests from the GitLab merge request overview page. Simply filter the list of merge requests by selecting review-approved and your release branch name from the dropdown menu for labels.
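The same label filter can also be expressed as a query against the GitLab REST API, which may be convenient for scripting. The sketch below only builds the request URL and sends nothing. It assumes that the repository is hosted at gitlab.cern.ch (the host is not stated on this page) and that the target-branch label is identical to the branch name. The function name `approved_mr_query` is hypothetical.

```python
from urllib.parse import quote, urlencode


def approved_mr_query(
    branch: str,
    host: str = "https://gitlab.cern.ch",  # assumed host, not stated in this guide
    project: str = "atlas/athena",
) -> str:
    """Build a GitLab API URL listing open, review-approved MRs for a branch."""
    # The API accepts the URL-encoded project path in place of a numeric ID.
    project_id = quote(project, safe="")
    params = urlencode(
        {
            "state": "opened",
            # A comma-separated list: MRs must carry all of these labels.
            "labels": f"review-approved,{branch}",
        }
    )
    return f"{host}/api/v4/projects/{project_id}/merge_requests?{params}"
```

Fetching the URL for a non-public project additionally requires an access token, which is left out here.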

Full rebuilds need to be triggered after updates where an incremental rebuild may not catch all changes properly, such as an LCG version update. Such updates should be performed in the late evening, with advance notification emailed to ATLAS robot to give the CI system administrators enough time to schedule cleaning of the local disks of the build machines (to ensure the builds start from scratch).

The CI job state is depicted by a small icon on the right (e.g. green checkmark for passed, red cross for failed, blue circle for running). This status sometimes does not get updated correctly (e.g. a CI job is indicated as still running while it has already finished). Therefore, please do not rely on this icon. If in doubt, check the discussion tab of this MR for the latest CI summary comment from the ATLAS robot.
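If the comments of an MR have already been retrieved, for example as the list of note objects returned by the GitLab notes API (each with `author`, `created_at` and `body` fields), the latest robot comment can be picked out as sketched below. The robot account name used here is a placeholder and must be replaced with the real one. The function name is hypothetical as well.

```python
from datetime import datetime
from typing import Optional

ROBOT_USERNAME = "atlasbot"  # placeholder: substitute the real robot account


def latest_robot_note(notes: list, robot: str = ROBOT_USERNAME) -> Optional[dict]:
    """Return the most recent note written by the robot, or None if there is none."""

    def created(note: dict) -> datetime:
        # GitLab timestamps are ISO 8601; map a trailing "Z" to an explicit
        # offset so that older Python versions can parse it.
        return datetime.fromisoformat(note["created_at"].replace("Z", "+00:00"))

    robot_notes = [n for n in notes if n["author"]["username"] == robot]
    return max(robot_notes, key=created, default=None)
```

Reading the body of the returned note is more reliable than the status icon, because the note reflects what the robot last posted.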

Accepting merge requests

Before accepting any merge request, please have a quick look at the description and the discussion on the GitLab merge request page. They may contain more detailed information about the nature of the merge request and possibly also its relation to other merge requests (e.g. dependencies between MRs). You should also make sure that all discussions have been resolved.

Once you are sure that you want to merge these changes into your release, you can accept the merge request by pushing the green “Merge” button.

Warning Please do not use the command line merging procedure as described on GitLab.

Because updates of the CI pipeline status have been found to be unreliable in the past, release coordinators are discouraged from using the “Merge When Pipeline Succeeds” feature.

How to roll back bad merges?

If you want to undo the changes introduced by an (already accepted) merge request, go to the GitLab webpage for this merge request. There is an orange button labeled Revert. Clicking on this button opens a dialog in which you should choose the same target branch as that of the initial merge request. Please leave the checkbox “Open a new merge request” checked. Confirm the settings by clicking on the green Revert button. This opens a new merge request that undoes the changes introduced by the faulty merge request, and the usual CI jobs and review process start automatically.

Tag and deploy a release after a successful build

Once you, as the release coordinator, are satisfied that a nightly release is good enough to be deployed to the production CVMFS server, carry out the following steps:

  • Copy the RPMs of the successful nightly over to EOS using the following Jenkins link (tunnelling is necessary from outside CERN at present).

  • Log in with username “jobrest”. If you do not know the password, ask Alex Undrus.

  • Click “Build with parameters” on the left hand sidebar. Fill in the details on the screen where:

    • nightly_name can be 21.0, master and so on
    • rel_nightly is the timestamp for your nightly, e.g. 2019-05-05-T0017
    • project can be Athena, AthSimulation, AnalysisBase, etc. Once all three fields are filled in, click “Build”.

Repeat the above steps for any other platforms you may need (e.g. the opt and dbg builds typically have slightly different timestamps).

Create a Jira ticket for CVMFS installation

Create a Jira ticket (Task) in ATLINFR requesting that the appropriate nightly release should be promoted to a full numbered release (the Component is CVMFS Release Installation). You need to give the name of the git branch, the Project that was built and the Datestamp of the build, e.g., 21.0 + Athena + 2017-03-29T2135.
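The three items requested in the ticket can be assembled and sanity-checked before filing it. This is a small sketch that assumes the datestamp always has the form shown in the example (YYYY-MM-DDThhmm). The function name `cvmfs_request_summary` is hypothetical.

```python
import re
from datetime import datetime

# Assumed datestamp layout, taken from the example 2017-03-29T2135.
DATESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{4}")


def cvmfs_request_summary(branch: str, project: str, datestamp: str) -> str:
    """Return 'branch + project + datestamp' after validating the datestamp."""
    if not DATESTAMP.fullmatch(datestamp):
        raise ValueError(f"unexpected datestamp format: {datestamp!r}")
    # Reject impossible dates or times such as 2017-02-30T2135 (raises ValueError).
    datetime.strptime(datestamp, "%Y-%m-%dT%H%M")
    return f"{branch} + {project} + {datestamp}"
```

For the example in the text, `cvmfs_request_summary("21.0", "Athena", "2017-03-29T2135")` reproduces the string given above.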

Tip Every project already knows what release candidate it is, which is stored in the version.txt file of the project’s directory (like this). You will update this number later on in this procedure.

Warning It is impossible to install a build under a production release number that differs from the candidate number stored in version.txt when the nightly was built. For example, if the release candidate number is 21.0.21, it is not possible to install this as 21.0.22 or 21.0.21.0 or 21.0.21p1.
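The constraint in the warning above amounts to an exact string comparison between the requested release number and the content of version.txt at build time. A minimal sketch of such a check, with a hypothetical function name and with the file content passed in as a string, could look like this:

```python
def can_install_as(version_txt: str, requested: str) -> bool:
    """True only if the requested release number equals the built candidate.

    version_txt is the content of the project's version.txt as it was when
    the nightly was built; surrounding whitespace and newlines are ignored.
    """
    return version_txt.strip() == requested.strip()
```

With a candidate of 21.0.21, the check accepts only 21.0.21 and rejects 21.0.22, 21.0.21.0 and 21.0.21p1.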

Tag the release in git

Create the tag for the release in the main repository. From the front page of the main repository click on Tags, then New tag. Enter the Tag name, which should always be release/A.B.X[.Y], i.e., the actual numbered release version; the Create from should be the nightly build tag that is being used, which has the format nightly/BRANCH_NAME/DATE_STAMP.

Tip In case you have built multiple platforms for the same release use the nightly stamp of the primary platform (e.g. gcc8-opt). In practice this should not matter as the nightly tags for different platforms should all point to the same git commit if started around the same time.

Then enter an informative message about the reason for deploying this particular release that will be useful to others. An example might be the following:

Alert You can also create the tag locally in any clone of the repository and push it to atlas/athena. It’s not recommended unless you really know what you are doing.
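The two tag naming schemes used in this section, release/A.B.X[.Y] and nightly/BRANCH_NAME/DATE_STAMP, can be checked mechanically before a tag is created. The sketch below assumes that A, B, X and Y are plain integers and that the date stamp has the form YYYY-MM-DDThhmm, as in the examples on this page. Both helper names are hypothetical.

```python
import re

# Assumptions: numeric version components, YYYY-MM-DDThhmm date stamps.
RELEASE_TAG = re.compile(r"release/\d+\.\d+\.\d+(\.\d+)?")
NIGHTLY_TAG = re.compile(r"nightly/.+/\d{4}-\d{2}-\d{2}T\d{4}")


def release_tag(version: str) -> str:
    """Return the tag name for a numbered release, e.g. release/21.0.21."""
    tag = f"release/{version}"
    if not RELEASE_TAG.fullmatch(tag):
        raise ValueError(f"not of the form release/A.B.X[.Y]: {tag!r}")
    return tag


def nightly_tag(branch: str, datestamp: str) -> str:
    """Return the nightly tag to use in the 'Create from' field."""
    tag = f"nightly/{branch}/{datestamp}"
    if not NIGHTLY_TAG.fullmatch(tag):
        raise ValueError(f"not of the form nightly/BRANCH_NAME/DATE_STAMP: {tag!r}")
    return tag
```

For example, `release_tag("21.0.21")` gives the Tag name and `nightly_tag("21.0", "2017-03-29T2135")` gives the corresponding Create from value.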

Update the release candidate number

Finally, now that a nightly has been promoted to a numbered release the release candidate number needs to be updated to be the next number in the release series for this branch + project. The easiest way to do this is via the simple editor in GitLab itself. Navigate to the correct branch and project’s version.txt file, e.g.,:

Then click on the Edit button and change the version number to the new one. The commit message can be quite simple, stating that the number has been updated now that a previous nightly has been promoted.

Tip Only people with push rights to the branch can do this directly. You should have these rights as a release coordinator. (Lesser mortals need to make a merge request.)
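As an illustration of the update described in this section, the next release candidate number can be derived from the current content of version.txt. The sketch below assumes that "the next number in the release series" means incrementing the last numeric component. Whether this is right for a given branch, in particular for four-component numbers, remains the release coordinator's call. The function name is hypothetical.

```python
import re


def next_candidate(version_txt: str) -> str:
    """Increment the last component of a dotted numeric version string."""
    version = version_txt.strip()
    if not re.fullmatch(r"\d+(\.\d+)+", version, flags=re.ASCII):
        raise ValueError(f"not a dotted numeric version: {version!r}")
    parts = version.split(".")
    # Assumption: the series advances by one in its last component.
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts)
```

For a promoted candidate 21.0.21, `next_candidate("21.0.21")` yields the string to write into version.txt.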