This page describes the workflows and procedures that concern a release coordinator. It assumes that you are familiar with the general workflow for ATLAS Offline software development as summarised in the Workflow Quick Reference. An overview of the tasks of the software review shifters and a basic knowledge of git are also helpful.
Before accepting new MRs, you should check the status of the latest nightly. The first priority of a release coordinator should be to make sure that the release is in good shape, and obviously existing problems should usually be fixed before adding more MRs (though as release coordinator you should always feel free to use your judgement).
The status of the nightly (and CI) builds can be checked here:
Information on the nightly (and CI) build infrastructure is available on this twiki page.
For each nightly build check:
Any problem seen in the unit tests will very likely also affect the CI jobs. To help the review shifters and other developers please mark any relevant Jira tickets with the “CI” label. This will make the issue appear on the CI Status Board.
First check on Jenkins whether the nightly has finished or is still ongoing. If it is still ongoing in the afternoon, check that it is not stuck on something. The console log should give you a good idea of whether the nightly is progressing (select the nightly you are interested in and then click on the “Console” icon).
If the nightly has finished and the build status is a blue ball, the nightly should be available on CVMFS.
If the nightly has finished and the status is a red ball, the build failed for some reason and you need to identify the exact cause. Every nightly executes the following steps, so check them one by one to see where the problem is:
The logfile is copied under: /cvmfs/atlas-nightlies.cern.ch/repo/sw/master/2018-07-15T2059/master__Athena__x86_64-slc6-gcc62-dbg__2018-07-15T2059__1531720274.ayum.log
All the information can be found inside the “Console” log file for the given nightly.
Note that issues with git and copying the RPMs to EOS are IT infrastructure related issues. If there were git problems you can restart the nightly. If there were RPM to EOS copy problems check with Atlas.Release to redo the copy.
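When triaging a failed nightly, the name of the log file already tells you which build it belongs to. The following is a minimal sketch that splits such a name into its parts. It assumes that the first four fields separated by a double underscore are branch, project, platform and datestamp; this is an interpretation of the example path above, not a documented format, and parse_nightly_log is a hypothetical helper name.

```shell
# Split a nightly log file name of the assumed form
#   BRANCH__PROJECT__PLATFORM__DATESTAMP__NUMBER.ayum.log
# into labelled fields. The field meanings are an assumption.
parse_nightly_log() {
    # Drop the directory part and the ".ayum.log" suffix.
    name=$(basename "$1" .ayum.log)
    # "__" is used as the field separator; the single "_" in x86_64 is untouched.
    printf '%s\n' "$name" | awk -F'__' \
        '{ printf "branch=%s project=%s platform=%s datestamp=%s\n", $1, $2, $3, $4 }'
}
```

Calling it on the example path above prints branch=master project=Athena platform=x86_64-slc6-gcc62-dbg datestamp=2018-07-15T2059.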
In git, different releases are represented by different branches of the atlas/athena git repository. One subtle point is that what code we build from the branch is controlled by the project specific scripts that live in the Projects directory of the branch (e.g., here for the main branch). You are, however, the release coordinator for all projects built from a branch.
Upon creation, merge requests will be tagged automatically by the CI system with a label indicating their target branch. In addition, the CI system also adds a label indicating the stage and result of the software review process. Changes approved by the software review shifters are labeled as review-approved. Therefore, you should be able to select all ready-to-be-accepted merge requests from the GitLab merge request overview page. Simply filter the list of merge requests by selecting review-approved and your release branch name from the dropdown menu for labels.
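The same label filter can also be expressed as a query against the GitLab REST API, which may be handy for scripting. The sketch below only builds the request URL. It assumes that the standard GitLab v4 API is reachable on gitlab.cern.ch and that the branch label is identical to the branch name, as described above; approved_mrs_url is a hypothetical helper name.

```shell
# Build the GitLab API URL that lists open merge requests carrying both the
# review-approved label and the label of the given release branch.
approved_mrs_url() {
    branch=$1
    # "%%2F" prints as "%2F", the URL-encoded "/" in the project path atlas/athena.
    printf 'https://gitlab.cern.ch/api/v4/projects/atlas%%2Fathena/merge_requests?state=opened&labels=review-approved,%s\n' "$branch"
}

# Possible usage (needs a token with API scope in $TOKEN):
#   curl --header "PRIVATE-TOKEN: $TOKEN" "$(approved_mrs_url main)"
```

The web interface remains the reference view; a query like this is only a convenience for getting the same list on the command line.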
Full rebuilds need to be triggered after updates where an incremental rebuild may not catch all changes properly, such as an LCG version update. Such updates should be performed late in the evening, with advance notification emailed to ATLAS robot, to give CI system administrators enough time to schedule cleaning the local disks of the build machines (to ensure builds are from scratch).
The CI job state is depicted by a small icon on the right (e.g. green checkmark for passed, red cross for failed, blue circle for running). This status sometimes does not get updated correctly (e.g. a CI job is indicated as still running while it has already finished). Therefore, please do not rely on this icon. If in doubt, check the discussion tab of this MR for the latest CI summary comment from the ATLAS robot.
Before accepting any merge request, please have a quick look at the description and the discussion on GitLab merge request page. They may contain some more detailed information about the nature of this merge request and possibly also the relation with other merge requests (e.g. dependencies between MRs). You should also make sure that all discussions were resolved.
Once you are sure that you want to merge these changes into your release, you can accept the merge request by pushing the green “Merge” button.
Please do not use the command line merging procedure as described on GitLab.
Because updates of the CI pipeline status have been found unreliable in the past, release coordinators are discouraged from using the “Merge When Pipeline Succeeds” feature.
If you want to undo the changes introduced by an (already accepted) merge request, go to the GitLab webpage for this merge request. There is an orange button labeled Revert. Clicking on this button opens a new dialog where you should choose the target branch to be the same as the target branch of the initial merge request. Please leave the checkbox “Open a new merge request” checked. Confirm the settings by clicking on the green Revert button. This will open a new merge request undoing the changes introduced by the faulty merge request, and the usual CI jobs and the review process start automatically.
As of the time of writing, the policy for output-changing merge requests (i.e. MRs that change controlled output formats, and indicated by labels, e.g. xyz-output-changed) is that such MRs should not be accepted without explicit approval from Reconstruction and/or Software Coordinators.
In detail:
The procedure is explained on the Frozen Tier0 Policy twiki, but briefly:
Reference files are stored on EOS (/eos/atlas/atlascerngroupdisk/data-art/grid-input/WorkflowReferences/) and then replicated to cvmfs (/cvmfs/atlas-nightlies.cern.ch/repo/data/data-art/WorkflowReferences/). To update these, you will need to increment (i.e. change _vN to _vN+1, and then commit) the reference version in WorkflowTestRunner/python/References.py, and copy new reference pool files to appropriately versioned directories in WorkflowReferences on EOS.
A script is available to perform most steps automatically. It just needs to be passed the URL to the MR CI summary page (which you find by scrolling through a MR and looking for a link labelled “Full details available on this CI monitor view”). It is suggested you run first in --test-run mode, to make sure everything is fine, e.g.
./Tools/PROCTools/python/update_ci_reference_files.py --test-run "https://bigpanda.cern.ch/ciview/?rel=MR-62281-2023-11-10-14-55"
Please note that in order for References.py and the digest summaries to be updated, you will need to have the branch of the MR checked out.
The script will explain, but it will involve steps like:
git fetch --no-tags https://:@gitlab.cern.ch:8443/<MR_AUTHOR>/athena.git <MR_BRANCH>:<BRANCH>
git switch <BRANCH>
git rebase upstream/main
The script will also give you the link to the Jenkins job which copies files from EOS to CVMFS. Since the MR will not succeed until the reference files are on CVMFS, you may want to trigger it manually.
Have a look at this presentation for more details on the script.
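The _vN to _vN+1 increment of the reference version described above is plain integer arithmetic on the version suffix. A minimal sketch, assuming a version tag of the form vN; bump_ref_version is a hypothetical helper for illustration and not part of update_ci_reference_files.py, which handles this step for you:

```shell
# Print the next reference version for a tag of the form vN (e.g. v3 -> v4).
bump_ref_version() {
    # Reject anything that is not "v" followed only by digits.
    case $1 in
        v*[!0-9]* | v | [!v]* | '')
            echo "not a vN version: $1" >&2
            return 1
            ;;
    esac
    n=${1#v}              # strip the leading "v"
    echo "v$((n + 1))"    # integer increment
}
```

For example, bump_ref_version v3 prints v4, while an input such as main is rejected with a non-zero exit status.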
Once you are satisfied as the release coordinator that a nightly release is good enough to be deployed to the production CVMFS server then you need to do the following steps:
These steps are described in the following sections.
See here for instructions on how to set up tunnelling using browser plugins.
Log in with username “jobrest”. If you do not know the password, ask Alex Undrus.
Click “Build with parameters” on the left hand sidebar. Fill in the details on the screen where:
25.0.11
x86_64-el9-gcc13-opt, aarch64-el9-gcc13-opt, etc.
Athena, AthSimulation, AnalysisBase, etc.
main, 24.0, 22.0-mc20, ondemand, and so on
2024-07-22T2101
Repeat the above steps for any other platforms you may need (e.g. the opt and dbg builds typically have slightly different timestamps).
If you are making a release for a branch where this is expected (currently 24.X and 25.X), do not forget to build both x86_64 and aarch64 platforms.
There are two ways to publish a release, manually or using the prepare_release_notes.py script. The recommended approach is to use the script (since this automates a lot of the process), but please note that it requires a GitLab token (see here, for instructions on how to create one: please note that at minimum your token needs to have an API ‘scope’ or permissions) in order to work optimally. Both approaches are described below.
Run the following commands to publish the release (making sure to add your gitlab API token instead of YOUR_GITLAB_API_TOKEN_HERE):
lsetup git
lsetup gitlab
git clone ssh://git@gitlab.cern.ch:7999/atlas/athena.git
cd athena
./Build/AtlasBuildScripts/prepare_release_notes.py release/25.0.11 nightly/main/2024-07-22T2101 -t YOUR_GITLAB_API_TOKEN_HERE
Always use the latest version of the script from the main branch even when building releases from other branches.
The script will ask if you want to let it build the release for you. If you choose yes (which you should!), it will ask you for the associated JIRA ticket and a short description. This description is NOT the release notes (the script will make these for you), but rather a high-level summary of the purpose of the release e.g.
This is the latest build of the Tier-0 data taking branch that fixes rare crashes seen for high pt muons.
If you did NOT let the script make the release, then you will need to follow the relevant instructions in the next section.
The script will now create the release tag, make the release in gitlab and fill in the detailed release notes.
You should now skip the next section and go straight to Update the release candidate number.
Assuming you chose not to use the recommended approach above, you will need to do the following steps manually.
Create a new tag for the release in the main repository. Enter the Tag name, which should always be release/A.B.X[.Y], i.e., the actual numbered release version; the Create from should be the nightly build tag that is being used, which has the format nightly/BRANCH_NAME/DATE_STAMP.
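The two tag formats described above are easy to get wrong by hand, so it can help to compose and sanity-check them before creating the tag. A minimal sketch, assuming release numbers of the form A.B.X[.Y] and nightly datestamps of the form YYYY-MM-DDTHHMM as in the examples on this page; release_tag and nightly_tag are hypothetical helper names:

```shell
# Compose the release tag name, accepting only A.B.X or A.B.X.Y numbers.
release_tag() {
    if printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+(\.[0-9]+)?$'; then
        echo "release/$1"
    else
        echo "unexpected release number: $1" >&2
        return 1
    fi
}

# Compose the nightly tag name from a branch name and a datestamp.
nightly_tag() {
    if [ -n "$1" ] && printf '%s\n' "$2" | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{4}$'; then
        echo "nightly/$1/$2"
    else
        echo "unexpected branch or datestamp: $1 $2" >&2
        return 1
    fi
}
```

For example, release_tag 25.0.11 prints release/25.0.11 and nightly_tag main 2024-07-22T2101 prints nightly/main/2024-07-22T2101.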
In case you have built multiple platforms for the same release use the nightly stamp of the primary platform (e.g. gcc11-opt). In practice this should not matter as the nightly tags for different platforms should all point to the same git commit if started around the same time.
You can also create the tag locally in any clone of the repository and push it to atlas/athena. It is not recommended unless you really know what you are doing.
Then enter an informative message about the reason for deploying this particular release that will be useful to others (do not put the full release notes into the commit message; this is done in a separate step below). An example might be the following:
Even if you are publishing the release manually, we still recommend that you generate the release notes using the prepare_release_notes.py script with the following parameters: the tag of the release you are building, the tag of the nightly used to build it. Even without a token, the script can generate the release notes for you (just a slightly less informative version).
For example:
lsetup git
lsetup gitlab
git clone ssh://git@gitlab.cern.ch:7999/atlas/athena.git
cd athena
./Build/AtlasBuildScripts/prepare_release_notes.py release/23.0.42 nightly/23.0/2023-04-02T2101
# Release notes generated in 'release_notes.md'
Always use the latest version of the script from the main branch even when building releases from other branches.
Next you have to create the release manually:
Release title: same as tag name (e.g. release/23.0.42)
Release date: today’s date
Release notes: copy the release_notes.md content here
You will now need to Update the release candidate number, as described below.
Finally, now that a nightly has been promoted to a numbered release, the release candidate number needs to be updated to be the next number in the release series for this branch + project. The easiest way to do this is via the “Web IDE” editor in GitLab itself. Navigate to the correct branch and project version.txt file and select “Open in Web IDE”:
In the following navigate to all version.txt files that need updating and increment the release number. Once done create a single commit on the release branch with all updates:
All Projects (except AnalysisBase, AthGeneration and AthAnalysis) should be kept in sync. So if, for example, you update the version of Projects/Athena/ to 23.0.43 you should also update AthDataQuality, AthSimulation, DetCommon and VP1Light.
Only people with push rights to the branch can do this directly. You should have these rights as a release coordinator, otherwise you need to make a merge request.
Alternatively, you can do it entirely on the command line in your local athena checkout:
git fetch upstream
git checkout upstream/BRANCH # e.g. main, 24.0, ...
grep "" Projects/*/version.txt # show all project versions
old=23.0.4 new=23.0.5 && git grep -l $old Projects/*/version.txt | xargs sed -i s/$old/$new/
git diff # verify the changes
git commit -a -m "Bump project versions to $new"
git push upstream HEAD:BRANCH # HEAD is detached after the checkout above, so name the target branch explicitly
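One caveat with a plain search-and-replace on version numbers is that the dots act as regex wildcards and that 23.0.4 is also a prefix of 23.0.42. The sketch below shows one way to compute the next release candidate number and to replace only exact matches. It assumes that version.txt holds the version alone on a line; next_version, escape_dots and bump_version_file are hypothetical helper names.

```shell
# Increment the last component of a version: 23.0.42 -> 23.0.43.
next_version() {
    prefix=${1%.*}    # everything before the last dot
    last=${1##*.}     # the last component
    echo "$prefix.$((last + 1))"
}

# Escape the dots so that sed treats them literally.
escape_dots() {
    printf '%s\n' "$1" | sed 's/\./\\./g'
}

# Print FILE with lines equal to OLD replaced by NEW (the file is not modified).
# Usage: bump_version_file FILE OLD NEW
bump_version_file() {
    pattern=$(escape_dots "$2")
    # Anchoring with ^ and $ keeps 23.0.4 from matching inside 23.0.42.
    sed "s/^${pattern}\$/$3/" "$1"
}
```

For example, next_version 23.0.42 prints 23.0.43 and next_version 22.0.41.1 prints 22.0.41.2. Writing to standard output rather than editing in place makes it easy to review the result before committing.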
After the release has been built, please announce the release by sending out an email:
mailto: atlas-sw-spmb@cern.ch, hn-atlas-recoIntegration@cern.ch, atlas-dp-proc@cern.ch, atlas-trigger-operation@cern.ch, hn-atlas-releaseKitAnnounce@cern.ch
subject: Release Athena,23.0.42
Dear all,
This is to let you know that the release Athena,23.0.42 has been built from the nightly
Athena,main,2023-04-02T2101 and is in the process of being distributed to CVMFS.
The JIRA ticket can be found here:
https://its.cern.ch/jira/browse/ATLINFR-4208
The release notes are available at:
https://gitlab.cern.ch/atlas/athena/-/releases/release%252F23.0.42
Regards, myName
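If you send these announcements regularly, the body can be generated from the release parameters to avoid copy-and-paste mistakes. A minimal sketch that reproduces the template above; announce_release is a hypothetical helper name, and the text still needs to be pasted into your mail client and signed:

```shell
# Print the announcement text for a release.
# Usage: announce_release PROJECT VERSION BRANCH DATESTAMP JIRA_TICKET
announce_release() {
    project=$1; version=$2; branch=$3; datestamp=$4; jira=$5
    # The here-document terminator must start in the first column.
    cat <<EOF
subject: Release $project,$version
Dear all,
This is to let you know that the release $project,$version has been built from the nightly
$project,$branch,$datestamp and is in the process of being distributed to CVMFS.
The JIRA ticket can be found here:
https://its.cern.ch/jira/browse/$jira
The release notes are available at:
https://gitlab.cern.ch/atlas/athena/-/releases/release%252F$version
Regards, myName
EOF
}
```

For example, announce_release Athena 23.0.42 main 2023-04-02T2101 ATLINFR-4208 reproduces the example text shown above.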
As of writing, changes to the 24.0 release branch are manually merged (~daily) into main by the main release coordinator. To ignore certain directories and files in the merge (e.g. version.txt) the atlas_git_merge.sh script should be used. The script will fetch the remote, create a local branch, merge 24.0 (but without the final commit) and reject changes to a predefined list of files (see atlas_git_merge.sh --help):
# in your atlas/athena clone:
./Build/AtlasBuildScripts/atlas_git_merge.sh 24.0
# if no conflicts are found
git commit
# if there are conflicts, resolve them and
git merge --continue
# push branch and create MR
git push origin
./Build/AtlasBuildScripts/prepare_release_notes.py --sweep -t YOUR_GITLAB_API_TOKEN_HERE
The last step creates the MR diff (release_notes.md) which will be used as the merge request description. The script will prompt if a pre-filled Draft MR should be created in GitLab. In case you had to apply conflict resolutions, you may want to amend the appropriate lines (or remove entire lines if you rejected some commits). The script will also warn you in case an MR with the sweep:ignore label was merged and provides instructions on how to remove it. The GitLab token (which must have at least API ‘scope’ or permission) is required to decorate the MRs with the domain labels and to allow the script to create the MR on your behalf.
In order to preserve the commit history, do not squash the commits when finalizing the merge request.
For the rare cases where updates to e.g. the externals should be merged, use the --all option to disable the default ignore list.
Useful commands for resolving conflicts:
git merge --abort to abort the merge
git reset --hard upstream/main to start over with current branch
git revert [-n] HASH to revert a specific commit [without creating commit]
git checkout --ours FILE to keep version from main (or --theirs for 24.0) followed by git add FILE to mark conflict as resolved
For emergencies the following procedure can be used to launch an on-demand release build. This would mainly be used in case a patch is required to an older version of the nightly branch. The procedure is very similar to building/deploying a regular nightly build, except that the release is being built from a fixed git tag.
git fetch upstream --tags
git checkout -b 23.0.29-patches release/23.0.29
In case there is already a patch release for this branch point, simply re-use the existing branch.
23.0.29.1, 23.0.29.2, etc.).
In case you have been re-using a branch in step 1, check the list of already existing tags via git tag -l 'release/23.0.29.*'. This step has to be done before building the release.
git push upstream 23.0.29-patches
release/23.0.29.1). Make sure to explain in the “Release notes” which changes went into this tag.
Athena, AthSimulation, AnalysisBase, etc.
x86_64-el9-gcc13-opt
release/23.0.60.1
gcc13 (needs to match the compiler used in the nightly)
3.27.5 (>= regular nightly, check with cmake --version)
11.4 (check version in corresponding nightly via echo $CUDAXX)
ondemand build results on the BigPanDA dashboard and note the datestamp of the build.
ondemand
2023-05-16T2101). You can also find this datestamp in the console output of the Jenkins job from step 2.
To set up a production branch, first make the base release (as above) so there is a clear branching point, and so the release numbering makes sense. Next, make a branch from this release, naming it descriptively e.g. 22.0-mc20. Another acceptable option would be to name it after the base release e.g. 22.0.41.X.
You should then set up permissions for this branch, allowing only the branch managers to push and merge (look for “Protected branches” under “Repository Settings”, and have a look at other protected branches for examples).
Next, it is very important to update the release candidate version (recall that in order to make a release from a nightly, the nightly needs to know the release it will become) e.g. if you branched from 22.0.41, you should immediately change the version.txt in all relevant projects (i.e. all, except AnalysisBase and AthAnalysis) to 22.0.41.1.
Finally, do not forget to update the README.md with details about the new branch (ideally in all branches, but certainly in the new branch and main).
In production branches we will need to make patch or point releases, e.g. 22.0.41.1. The procedure should be exactly the same as above, namely:
22.0.41.1 becomes 22.0.41.2
The CI compiles code changes in the MRs incrementally. Occasionally, the build nodes end up in a bad state and need full rebuilds to recover. Updates to the externals frequently create this problem. A full rebuild can be forced by scheduling the clean-build-dir job on atlas-sit-ci.cern.ch after logging in via SSO. The ongoing CI jobs will be allowed to complete before the cleanup. All new CI jobs will wait in the queue until the clean-build-dir jobs are completed on all CI nodes.
Locate the clean-build-dir job and select Build with Parameters. The default parameters (*) will run a clean up for all branches and all build nodes. This can be customized via:
BRANCH: As the CI runs in separate build directories for each git branch, issues are usually restricted to one branch. Selecting a specific git branch (e.g. main, 24.0) for the cleanup avoids delays in other branches.
BUILD_NODE: Select the CI node (e.g. aibuild64-011) in case the problem only affects one particular node.
Please announce the clean-up of the CI nodes outside of the regular weekly cleanup on the MR shifters mattermost channel at the link.
The following sections contain brief information about how to update the LCG layer version and/or the TDAQ/TDAQ-COMMON version in an Athena release in gitlab.
The LCG layer version is configured in the build_externals.sh or in rare cases the CMakeLists.txt files in the athena/Projects/*/ subdirectories - see the gitlab directory Projects. Set the two variables LCG_VERSION_NUMBER and LCG_VERSION_POSTFIX accordingly. There are a few exceptions: some projects do not use LCG at all (e.g. AnalysisBase), and some only use LCG_VERSION_NUMBER (e.g. DetCommon).
Edit all of the (typically 7) relevant files by hand, or use the following one-liner for a search-and-replace in your local git area before creating the MR:
cd athena/Projects
git grep -l 'b_ATLAS_9' | xargs sed -i 's/b_ATLAS_9/b_ATLAS_11/g'
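Note that a pattern such as b_ATLAS_9 is also a prefix of b_ATLAS_90, so it can be worth previewing a replacement that only matches the exact postfix. A minimal sketch that filters standard input; replace_lcg_postfix is a hypothetical helper name, and the sample line in the usage note is an assumed format for illustration, not copied from build_externals.sh:

```shell
# Replace OLD with NEW on standard input, but only where OLD is not
# immediately followed by another digit.
# Usage: replace_lcg_postfix OLD NEW < input
replace_lcg_postfix() {
    old=$1; new=$2
    # ([^0-9]|$) captures the character after OLD (or end of line) so it
    # can be put back unchanged via \1.
    sed -E "s/${old}([^0-9]|\$)/${new}\\1/g"
}
```

For example, piping the line LCG_VERSION_POSTFIX="b_ATLAS_9" through replace_lcg_postfix b_ATLAS_9 b_ATLAS_11 yields LCG_VERSION_POSTFIX="b_ATLAS_11", while a line containing b_ATLAS_90 is left untouched.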
Here are two example merge requests that update only an LCG layer or a combination of the LCG layer + TDAQ/TDAQ-COMMON:
LCG_102b_ATLAS_11 in MR 59722.
LCG_102b_ATLAS_2 and TDAQ/TDAQ-COMMON version 09-05-00 in MR 58081 + fix for DetCommon in MR 59092.
In the MR description please add a list of the updated packages and their versions and a link to the relevant SPI jira ticket like e.g. https://sft.its.cern.ch/jira/browse/SPI-2277.
Please add the gitlab labels full-build, full-unit-tests and the label RC Attention Required with a small written reminder in the MR text to the current release coordinator to trigger a CI nodes clean-up after the MR has been merged.
The TDAQ and TDAQ-COMMON versions are configured in the CMakeLists.txt files in the athena/Projects/*/ subdirectories and are set by the two variables TDAQ-COMMON_VERSION and TDAQ_VERSION. Not all Projects actually have a TDAQ and/or TDAQ-COMMON configuration since they don’t depend on them.
Here is one example merge request that updates only TDAQ/TDAQ-COMMON:
10-00-00 in MR 59665.
Please add the same gitlab labels as mentioned in the previous section.