Software Merge Review Shift Overview

Last update: 09 Aug 2024

Introduction

This page describes the software review process in ATLAS. It covers the tasks of merge request review shifters and provides guidance and hints for common situations and problems. It is assumed that you are familiar with the general workflow for ATLAS software development as summarised in the Workflow Quick Reference. Furthermore, a solid knowledge of the C++ and Python programming languages is mandatory. Familiarity with the ATLAS coding conventions and style guidelines is desirable.

Shifter roles

For every working day, there are two level-1 and two level-2 shifters on shift. Ideally, they work in different time zones (e.g. CERN and US) to provide developers from all time zones with a timely review of their merge requests. Please note that the shift times indicated in the OTP system are CERN times and may deviate from the actual shift times.

The current shift crew can be found on the OTP S&C Shift Crew page.

Level-1 shifter

As a level-1 shifter you are expected to have a thorough understanding of the C++ language and some basic knowledge about Python. You should be able to check the formal code requirements listed below as well as the criteria for good documentation and the adherence to the ATLAS coding conventions and style guidelines.

If you feel confident about reviewing the changes, go ahead. If the changes are more substantial and require a deeper understanding of the Athena code structure/workflows, you can escalate this MR to the level-2 shifter for review, but please make sure you explain why, e.g. write ‘Escalating to L2, because I’m not sure whether there is a memory leak in the bar() method of Foo.cxx’ or ‘Escalating to L2 because Foo.cxx has been completely re-written.’ If you are in doubt, you can always pass a MR on to the next level, but try to guide the next shifter to save them from re-reading everything.

Level-2 shifter

In addition to the knowledge required of level-1 shifters, you should have some experience with the usage and design of the Athena software and be able to comment on structural changes. A certain level of proficiency in git is helpful as well.

Your main task is the review of MRs which were escalated to level-2 by the level-1 shifters. If these changes require dedicated expertise in a specific software domain, you can always include an expert in the discussion (using the @username mentioning feature in GitLab, see the FAQ for a list of contact persons). Otherwise, you are expected to approve the MR or iterate with the developer on the proposed changes.

Reminder: review process

Your task as review shifter is to review the proposed changes to the ATLAS software and, if necessary, iterate with the developer(s) until these changes fulfil certain criteria outlined below. The outcome of the review process is the recommendation for the release coordinators to accept this merge request (or not). It is up to the release coordinators to finally accept or reject a merge request.

Upon creation, merge requests are labelled automatically by the CI system. The labels relevant for the review process are:

  • review-pending-level-1: The review process has started and a level-1 shifter should have a look at this MR.
  • review-pending-level-2: The review process is with the level-2 shifter for further clarification/review.
  • review-pending-expert: The advice of a software domain expert on the proposed changes is needed.
  • review-user-action-required: The review shifters raised some comments which should be addressed by the developer(s).
  • review-approved: This MR is approved by the review shifters and is recommended to be accepted by the release coordinators.

For MRs that affect packages in the analysis releases there is an extra set of labels that are relevant for the analysis release shifter:

  • analysis-review-required: This MR affects packages in the analysis release and should be reviewed by the analysis release shifter.
  • analysis-review-expert: This MR was manually tagged as requiring review by an expert from the analysis domain.
  • analysis-review-approved: The analysis release shifter has looked at this MR and confirmed that there are no analysis specific issues in the MR that need to be resolved. There may still be general software quality issues with this MR and the shifters should go through their regular merge review process for approval.

Once you are done with your part of the review process you update the review label as appropriate. In case further changes are required (not approved), add a comment with your findings. In case of approval without further comments, changing the label is sufficient.

The regular and analysis-specific reviews are independent and can proceed in parallel. The release coordinator will ensure full approval before merging where necessary.

GitLab has its own approval mechanism, which we are not using in our standard review workflow. Therefore you do not need to click the “Approve” button at the top of the MR.

Special case: output changing merge requests

Merge requests that change the output of the reconstruction are flagged by the CI system with a dedicated label and the CI will be marked as failed.

Depending on which output changed, take the following actions:

  • xyz-output-changed: The reconstruction output changed for one or multiple output types (e.g. !69732).
    • If the changes are intentional (check the MR description or ask the developer):
      • Add the RC Attention Required label.
      • References can only be updated by RC (developers should not touch the reference files) after agreement from Reco or Software coordinators for the main branch, from PROC for frozen branches and from AMG for Derivations.
        • (See the release coordinator guidelines for more information about the procedure)
      • Since the reference update is a delicate procedure, it is advantageous for the code to be fully reviewed beforehand. Therefore, proceed with the code review as normal, ignoring the reference failures in CI but not any other ones. The “output-changed” labels remain on the MR.
    • If the changes are not intentional, apply the review-user-action-required label and remove all the “output-changed” labels. The developer needs to fix the MR until it passes without output changes.
  • changes-trigger-counts: One or several triggers changed the number of accepted events (e.g. !70822).
    • Apply the review-user-action-required label and ask the developer if the changes are intended.
      • If yes, the reference files need to be updated as part of the MR by the developer.
      • If not, the MR needs to be fixed and the changes-trigger-counts label removed.

For both cases, you can proceed with your usual code review before, or in parallel with, the reference file updates, but the final review-approved label should only be applied once the CI passes.

Special case: sweeps

Sweeps are merge requests that merge changes from one release branch to another (e.g. 24.0 to main). There are two types of sweeps:

  1. Automatic sweeps of a single MR initiated by the CI system (e.g. atlas/athena!66591)
  2. Manual periodic merges done by the release coordinator (e.g. atlas/athena!71857)

As the code changes have already been reviewed when merged into the original branch, no further code review is required. If the CI succeeds, you can immediately approve the MR. In case of failure, apply the review-user-action-required label and, in case of an automatic sweep, tag the author of the original MR that is listed in the description of the sweep MR.

What should you do…

…when you start the shift

  • Join the Mattermost atlassoftware team, then join the shifter channel here, which is used for communication among the shifters and experts. Check whether there were any recent discussions (e.g. infrastructure problems or other news).
  • Your account needs to be added to the Jenkins user database in order to get access to job management functionalities on the Jenkins web interface (such as the job rebuild button). In order to get this privilege send an email to the ATLAS robot mentioning your CERN username.
  • Check for open merge requests on the GitLab merge request page for atlas/athena. Filter the open merge requests by the review-pending-level-1/2 labels (depending on what shift you do) and review according to the guidelines given below.
  • Check on long-standing open merge requests (e.g. reverse sort by Last updated).
    • Merge Requests inactive for >3 months should simply be closed (so do NOT ping older MRs unnecessarily, as this will make them appear active).
    • Add a comment saying “Closed after 3 months of inactivity, as per shifter rules.” and then close the MR.

…during your shift

  • Follow discussions in the Mattermost shifter channel.
  • Frequently check for newly created/updated merge requests and review them. First take care of MRs marked as urgent. Then review the remaining MRs from oldest to newest in the Last updated sorting.
  • Watch out for failed CI jobs. In case they failed due to a transient infrastructure problem, restart them (as explained in the FAQ). If you are unsure about the source of the failure, do not restart the job but reach out to the developer and/or the Mattermost channel. Excessive restarting of CI jobs leads to unnecessary load on the CI system.
  • The CI Status Board lists known problems affecting the CI system. To report new issues on the infrastructure open an ATLINFR Jira ticket. For other problems open a ticket in the Core SW, Reconstruction or Trigger Jira project. If you assign the CI label to the issue it will appear in the status board.
  • Check the MR problems page for some suggested actions to take to make sure merge requests don’t slip through the cracks. In particular try to check at least once during your shift whether there are any MRs in the “invisible” and “unlabelled” sections and add review labels to the affected MRs if there are.

…at the end of your shift

  • Say Goodbye in the Mattermost channel and mention any problems/open issues that may be of interest to the next shifters.

Checklist for reviewing a merge request

The review should encompass criteria from the following three categories, ordered by importance:

  1. functionality of code changes
  2. documentation
  3. ATLAS coding conventions and style guidelines

but should also address some general points. After the review and possible iterations with the developer, please make sure that all discussions are marked as resolved before approving a merge request.

Code functionality

  • Did the CI job run successfully?
    • Does the code compile without any errors or warnings? A successful build is indicated by this line in the CI result:

      ✅ Athena: number of compilation errors 0, warnings 0

      whereas a problematic build that needs follow-up is marked with a warning sign:

      ⚠️Athena: number of compilation errors 0, warnings 8

      For a MR to be approved there should be zero compilation errors and warnings. In case the warnings are unrelated to the changes in the MR (see the build logs) they can be ignored.

    • Do all unit tests still succeed? If not, check the CI Status Board for known failures.

  • In case of failed CI jobs, have a quick look to try to diagnose the issue and provide some guidance to the developer. In general, it is the responsibility of the developers to ensure that their MRs pass the CI system. However, new developers may not be familiar with the workflow and help is much appreciated.
  • Do not approve incomplete MRs! Developers sometimes make MRs halfway through a bug fix/feature implementation and want this to be merged in (for various reasons). In this case, please tell the developers to mark the MR as Draft and to request a review only once it is final (i.e. it builds locally and has been tested).
  • Does the code allocate memory?
    • Check for new!
      • Could it be replaced by stack variables?
      • Could it be replaced by a smart pointer? (recommend make_unique() or make_shared())
    • If the interface requires bare pointers, are they deleted?
    • Is memory ownership implied by getting an object back from another call (e.g. factory methods)?
      • Check that this is clearly documented!
      • Suggest change to a unique_ptr where appropriate!
  • Does the code logic look correct?
    • Watch out for logical comparisons in the wrong place, e.g. std::abs(some_value > 2)!
    • Watch out for single-statement blocks without braces and misleading indentation!
  • Are any strings, vectors or other large objects passed by value instead of reference?
  • Are there any uninitialised variables? Declare variables only when they are actually needed.
  • No printing to std::cout/cerr - use the Athena logging service through the message macros. Unit tests are exempt from this rule.
  • In case of non-trivial changes to CMakeLists.txt files (e.g. custom code using more than simple set(...) or atlas_<foo>(...) functions) please notify Attila (@akraszna) as a watcher.
  • UTF8 potentially allows developers to use non-ASCII characters in merge requests. However this can lead to very hard to read code, and could cause various other problems. So:
    • non-ASCII characters are forbidden in identifiers, that is function, class and variable names, irrespective of the language, i.e. this applies to C++, but also to Python, bash etc.
    • non-ASCII characters are permitted in log messages and comments, provided they are reasonable and improve readability (for example, writing Δϕ instead of deltaPhi, or μ instead of muon, or to correctly write someone’s name).
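
Several of the C++ points above can be illustrated with a short sketch. The type Track and the functions makeTrack, exceedsThreshold and sumPt are hypothetical names invented for this illustration and do not refer to actual Athena code; the snippet only shows patterns a reviewer might suggest, assuming a C++14 (or later) compiler.

```cpp
#include <cassert>
#include <cmath>
#include <memory>
#include <vector>

// Hypothetical type, used only for this illustration.
struct Track {
  double pt = 0.0;  // members are initialised where they are declared
};

// Memory: rather than handing out a bare pointer from `new` (where the caller
// has to remember to `delete`), the factory returns a std::unique_ptr, so the
// ownership transfer is visible in the signature.
std::unique_ptr<Track> makeTrack(double pt) {
  assert(pt >= 0.0 && "this sketch expects a non-negative pt");
  auto track = std::make_unique<Track>();
  track->pt = pt;
  return track;
}

// Logic: a misplaced parenthesis such as
//   if (std::abs(value > 2)) ...
// compares first and then takes the absolute value of a bool, so negative
// values never pass. The comparison belongs outside the call:
bool exceedsThreshold(double value) {
  return std::abs(value) > 2;
}

// Passing by value would copy the whole vector on every call; a const
// reference avoids the copy. The braces keep the loop body unambiguous
// regardless of indentation.
double sumPt(const std::vector<Track>& tracks) {
  double sum = 0.0;  // declared where it is needed and initialised
  for (const Track& track : tracks) {
    sum += track.pt;
  }
  return sum;
}
```

With these definitions, makeTrack(5.0) returns an owning pointer that is released automatically when it goes out of scope, and exceedsThreshold(-3.0) evaluates to true, whereas the misplaced-parenthesis variant would yield false.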

Documentation

Functional code is important and so is maintainability. Therefore, please check the following:

  • Are the MR description and/or commit messages informative, and do they follow the guidelines?
    • For MRs with multiple commits, ideally each commit should be properly documented and the commit history should be clean. However, as we “squash” all MRs into one commit (unless the developer explicitly decided otherwise) the MR description should serve as the main documentation.
    • The merge request title will be used within the merge commit’s commit message. Therefore, it should adhere to the guidelines above, i.e.:
      • a concise justification why the changes are necessary
      • possible implications/side effects for other parts of the code
      • link(s) to related JIRA tickets if applicable
    • Auxiliary information (e.g. “Jane, can you comment on it?”) should be added as comments.
    • Feel free to ask the developer to edit the MR description if needed.
  • Copyright statements: yes, a boring and yet important topic!
    • Modifications, deletions or additions to the ATLAS copyright statement are a complete no-go. The correct copyright statement must read (including the comment format):
      /*
        Copyright (C) 2002-2024 CERN for the benefit of the ATLAS collaboration
      */
      
    • Make sure that newly added source code files (easily identifiable from the Changes tab) have this copyright statement! Trivial job option fragments and boilerplate files, usually found in share or in src/components, are exempt from this rule.
    • The end year is not important as the copyright only expires many decades after the creation of the works. For you as a shifter this means you should only ask for an update if:
      • there are significant copyrightable changes and the end date lags by 10 or more years, or
      • you are asking for other changes anyway.

      If the merge request is otherwise fine please don’t bother updating the copyright, which would trigger another CI run.

  • Does the code have appropriate comments and Doxygen documentation?
  • Are the log messages useful and not too verbose?
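
The copyright rules above could be checked mechanically. The following is a minimal sketch, not an official ATLAS tool: the function names copyrightEndYear and endYearLagsByTenOrMore are made up for this example, and only the copyright sentence itself is matched, not the surrounding comment format.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns the end year of the ATLAS copyright line found in `text`,
// or -1 if the line is absent. Only the sentence itself is matched;
// the surrounding comment format is not checked in this sketch.
int copyrightEndYear(const std::string& text) {
  static const std::regex pattern(
      R"(Copyright \(C\) 2002-(\d{4}) CERN for the benefit of the ATLAS collaboration)");
  std::smatch match;
  if (!std::regex_search(text, match, pattern)) {
    return -1;
  }
  return std::stoi(match[1].str());
}

// Encodes the rule of thumb quoted above: the end year is only worth raising
// if it lags by 10 or more years. Whether the changes are significant still
// has to be judged by the reviewer.
bool endYearLagsByTenOrMore(int endYear, int currentYear) {
  assert(endYear <= currentYear && "end year should not lie in the future");
  return currentYear - endYear >= 10;
}
```

For the statement quoted above, copyrightEndYear returns 2024. Taking 2024 as the current year, endYearLagsByTenOrMore(2012, 2024) is true and endYearLagsByTenOrMore(2020, 2024) is false.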

ATLAS coding conventions and style guidelines

Sticking to the ATLAS guidelines for software development helps other people to understand your code better and faster and, thus, facilitates maintainability. As such, the following points should be mentioned as minor suggestions. Since these guidelines have not been followed too closely in the past, you will see a lot of code which could be improved. Some of it may not even have been touched by the actual changes. This is a delicate topic and some developers may react huffily to these kinds of comments. Therefore, the general strategy is to aim for gradual improvement rather than drastic changes:

  • be kind (even more polite than you usually are ;-))
  • suggest rather than request changes
  • focus on things which were touched (of course, you can mention that your comments may also apply to other parts of the code)
  • in case the developer rigidly and repeatedly refuses to implement your suggestions, create a JIRA ticket with your findings (so that they can be addressed later) and approve the MR

Other things to watch out for:

  • The following files should not enter the git repository:
    • ChangeLog files (no nay never)
    • large binary blobs – it’s a code repository.
  • Source files that are > 100kB should probably be split.
  • If you notice that a commit is changing file permissions, please query this with the author. This could be deliberate, but more frequently it is a mistake.

FAQ

  1. How should developers update their merge request?
  2. How do developers notify review shifters after they responded to comments?
  3. Who should resolve discussions?
  4. How can I access atlas-sit-ci.cern.ch:8080 remotely?
  5. How to restart a CI job after an infrastructure failure?
  6. Where are the per-package build log files?
  7. Are Draft/WIP merge requests processed by the CI system?
  8. Which expert to contact?
  9. Why is the pipeline status for a merge request not displayed?
  10. What is a reasonable number of changed files in one MR?
  11. How can I cancel a running pipeline?


  1. How should developers update their merge request?
    If developers would like to update their code changes (either based on feedback from the review shifters or for any other reason), they can simply add further commits to the source branch. The MR will be updated automatically. Since a new CI job is run on every update of the source branch, developers are encouraged to push only working sets of commits instead of pushing each commit individually. This helps reduce the load on the CI system and therefore gives the developers faster turn-around times.
    If developers want to update only the MR title, description or labels, this can all be done directly from the GitLab MR page using the Edit button in the top-right corner.

  2. How do developers notify review shifters after they responded to comments?
    If a MR has been flagged as review-user-action-required, developers should answer and address the questions raised by the review shifters. In case they updated the source branch, a new CI job will reset the review label to review-pending-level-1 automatically. In the event that no code changes were required, developers should manually change the label back to review-pending-level-1/2 (whatever they think is appropriate).

  3. Who should resolve discussions?
    Before a MR is approved, all discussions should be marked as resolved. In general, a discussion should be resolved by the initiator (which usually is the review shifter). If developers agree with a comment and they have implemented the requested change in an update of the MR, they should briefly state in the discussion that this suggestion was implemented and resolve this discussion directly.

  4. How can I access atlas-sit-ci.cern.ch:8080 remotely?
    The Jenkins server is running at the above address, which is only reachable from inside the CERN firewall due to security restrictions for our build machines. In order to access this machine, which provides access to all the detailed Jenkins log files, you need to use port forwarding or SSH tunneling (see here for instructions).

  5. How to restart a CI job after an infrastructure failure?
    There are two options for restarting a CI job:
    • The easiest way is to add the following comment on the GitLab MR.
      Jenkins please retry a build
      
    • If this for some reason doesn’t work, click on the link to the Jenkins output CI-MERGE_REQUEST XYZ on the GitLab MR page. This should bring you to the Jenkins job summary page where you can see the result of the individual sub-jobs and access their log files. The log file of the main job (which is doing the git checkout + git merge) can be accessed from the navigation menu on the left under Console Output. In order to restart (or cancel) a CI job you need to log in with your CERN account using the login button in the top-right corner of the screen. After a successful login, the navigation menu on the left should show a Rebuild button. If that is not the case, your account needs to be added to the relevant Jenkins group as described above.

  6. Where are the per-package build log files?
    Once the CI job is finished ATLAS robot publishes a summary comment on the GitLab MR discussion page which contains a link to NICOS. On this page you can access the log files for the git checkout + git merge operation, the cmake configuration and the build of the externals by clicking on the three icons in the Build time, checkout, conf, inst column. The column #PB also gives a summary about the number of failed packages and packages which had warnings (in parentheses). Clicking on these numbers gets you to a page which lists the build logfiles for each individual package where the colors indicate the severity of compiler warnings/errors.

  7. Are Draft/WIP merge requests processed by the CI system?
    No, merge requests marked as Draft are not processed by the CI system. But the CI can be triggered manually as described in FAQ 5. However, once the CI finishes the system will not add a review label (as of ATLINFR-2754) since the Draft status indicates that the change is not yet ready for review.

  8. Which expert to contact?
    The CI system should automatically tag relevant experts using an expert watch list. If no one is listed as a watcher (or if this list seems out of date), please contact someone from the relevant software domain to fix it. A list of contacts for various software domains (and CP groups) can be found here. Make sure that you mention the expert in GitLab using the @username notation.

  9. Why is the pipeline status for a merge request not displayed?
    Make sure that atlasbot was added as a developer to the source fork as described here. In case you spot a MR where the source repository is lacking atlasbot as a member, you should post a comment like the one below:

     Hi @<username>, please add `atlasbot` as a developer to your fork as described
     [here](https://atlassoftwaredocs.web.cern.ch/athena_git/gitlab-fork/) so that
     the CI pipeline status is displayed correctly for future MRs.
        
     Thanks.
    
  10. What is a reasonable number of changed files in one MR?
    This question is hard to answer in general. Replacing endreq by endmsg in the whole code base would be a valid MR touching thousands of files. It is not the number of changes (or modified files) which should be important but rather whether those changes are all logically related. The goal is that each MR addresses one issue (e.g. one JIRA ticket) and does not contain multiple unrelated updates. If this requires changes to many files, it does not mean it is a bad request. You may as well see it from a different perspective: A merge request is the smallest entity on release level which can be undone easily in case there is a problem. So if a merge request contains multiple changes where there is the hypothetical possibility that you only want to undo certain parts if a bug is discovered, it is better to split the merge request.

  11. How can I cancel a running pipeline?
    Since our CI pipelines run in Jenkins, the GitLab UI cannot be used to cancel pipelines. Instead one has to use the Jenkins web interface. However, since aborting pipelines mid-job can lead to corrupted build nodes, we generally advise against canceling any CI jobs. For the rare cases where this is needed (e.g. a user triggers a large number of CI jobs), inform the release coordinator, who can cancel the jobs and perform a node cleanup if problems arise.

Feedback

Please feel free to contact the ATLAS git team if you have any suggestions on the documentation material, open an issue or improve it yourself by following our contribution guide. For problems with the CI system open a ticket in ATLINFR.