Software Merge Review Shift Reference

Last update: 14 Feb 2024 [History] [Edit]

Introduction

This page contains information about the software review process in ATLAS. It describes the tasks of merge request review shifters and provides some guidance as well as hints for common situations/problems. It is assumed that you are familiar with the general workflow for ATLAS software development as summarised in the Workflow Quick Reference. Furthermore, a solid knowledge of the C++ programming language is mandatory. An overview of the ATLAS coding conventions and style guidelines is desirable.

Shifter roles

For every working day, there are two level-1 and two level-2 shifters on shift. Ideally, they work in different time zones (e.g. CERN and US) to provide developers from all timezones with a timely review of their merge requests. Please note that the shift times indicated in the OTP system are CERN times.

The current shift crew can be found on the OTP S&C Shift Crew page.

Level-1 shifter

As a level-1 shifter you are expected to have a thorough understanding of the C++ language and some basic knowledge about python. You should be able to check the formal code requirements listed below as well as the criteria for good documentation and the adherence to the ATLAS coding conventions and style guidelines.

For trivial changes, you should approve MRs directly. If the changes are more substantial and require a deeper understanding of the Athena code structure/workflows, you can escalate this MR to the level-2 shifter for review (but please make sure you explain why, i.e. write ‘Escalating to L2, because I’m not sure whether there is a memory leak in the bar() method of Foo.cxx’ or ‘Escalating to L2 because Foo.cxx has been completely re-written.’) If you are in doubt, you can always pass a MR on to the next level, but try to guide the next shifter to save them from re-reading everything.

Level-2 shifter

In addition to the knowledge required by the level-1 shifts, you have some experience with the usage and design of the Athena software and are able to comment on structural changes. A certain level of proficiency in git is helpful as well.

Your main task is the review of MRs which were escalated to level-2 by the level-1 shifters. In case that these changes require dedicated expertise in a specific software domain, you could always include an expert in the discussion (using the @username mentioning feature in GitLab, see the FAQ for a list of contact persons). Otherwise, you are expected to approve the MR or iterate with the developer on the proposed changes.

Reminder: review process

Your task as review shifter is to review the proposed changes to the ATLAS software and, if necessary, iterate with the developer(s) until these changes fulfil certain criteria outlined below. The outcome of the review process is the recommendation for the release coordinators to accept this merge request (or not). It is up to the release coordinators to finally accept or reject a merge request.

Upon creation merge requests are labelled automatically by the CI system. The labels relevant for the review process are:

  • review-pending-level-1: The review process has started and a level-1 shifter should have a look at this MR.
  • review-pending-level-2: The review process is with the level-2 shifter for further clarification/review.
  • review-pending-expert: The advice of a software domain expert on the proposed changes is needed.
  • review-user-action-required: The review shifters raised some comments which should be addressed by the developer(s).
  • review-approved: This MR is approved by the review shifters and is recommended to be accepted by the release coordinators.

For MRs that affect packages in the analysis releases there is an extra set of labels that are relevant for the analysis release shifter:

  • analysis-review-required: This MR affects packages in the analysis release and should be reviewed by the analysis release shifter.
  • analysis-review-expert: This MR was manually tagged as requiring review by an expert from the analysis domain.
  • analysis-review-approved: The analysis release shifter has looked at this MR and confirmed that there are no analysis specific issues in the MR that need to be resolved. There may still be general software quality issues with this MR and the shifters should go through their regular merge review process for approval.

Once you are done with your part of the review process you update the review label as appropriate. In case further changes are required (not approved), add a comment with your findings. In case of approval without further comments, changing the label is sufficient.

The regular and analysis-specific review are independent and can proceed in parallel. The release coordinator will ensure full approval before merging where necessary.

GitLab has its own approval mechanism, which we are not using in our standard review workflow. Therefore you do not need to click the “Approve” button at the top of the MR.

What should you do…

…when you start the shift

  • Join the Mattermost atlassoftware team then join the shifter channel here which is used for communication among the shifters and experts. See whether there were any recent discussions (e.g. infrastructure problems or other news).
  • Your account needs to be added to the Jenkins user database in order to get access to job management functionalities on the Jenkins web interface (such as the job rebuild button). In order to get this privilege send an email to the ATLAS robot mentioning your CERN username.
  • Check for open merge requests on the GitLab merge request page for atlas/athena. Filter the open merge requests by the review-pending-level-1/2 labels (depending on what shift you do) and review according to the guidelines given below.
  • Check on long-standing open merge requests (e.g. reverse sort by Last updated).
    • Alert New Rule for 2021: Merge Requests inactive for >3 months should simply be closed (so do NOT ping older MRs unnecessarily, as this will make them appear active).
    • Add a comment saying “Closed after 3 months of inactivity, as per shifter rules.” and then close the MR.
  • Check for failed sweeps. If there are, add a comment to the MR reminding the author that failed sweeps need to be resolved (to either sweep:ignore or sweep:done), as explained in the Git Workflow Tutorial here.

…during your shift

  • Follow discussions in the Mattermost shifter channel.
  • Frequently check for newly created/updated merge requests and review them. First take care of MRs marked as urgent. Then review MRs in reverse chronological order (from oldest to newest in the Last updated sorting).
  • Watch out for failed CI jobs. In case they failed due to a transient infrastructure problem, restart them (as explained in the FAQ). If you are unsure about the source of the failure, do not restart the job but reach out to the developer and/or the Mattermost channel. Excessive restarting of CI jobs leads to unnecessary load on the CI system.
  • The CI Status Board lists known problems affecting the CI system. To report new issues on the infrastructure open an ATLINFR Jira ticket. For other problems open a ticket in the Core SW, Reconstruction or Trigger Jira project. If you assign the CI label to the issue it will appear in the status board.
  • Check the MR problems page for some suggested actions to take to make sure merge requests don’t slip through the cracks. In particular try to check at least once during your shift whether there are any MRs in the “invisible” and “unlabelled” sections and add review labels to the affected MRs if there are.

…at the end of your shift

  • Say Goodbye in the Mattermost channel and mention any problems/open issues that may be of interest to the next shifters.

Checklist for reviewing a merge request

The review should encompass criteria from the following four categories, ordered by importance:

  1. functionality of code changes
  2. documentation
  3. ATLAS coding conventions and style guidelines
  4. Simulation/Overlay/Reconstruction/Derivation output changes

but should also address some general points. After the review and possible iterations with the developer, please make sure that all discussions are marked as resolved before approving a merge request.

Code functionality

  • Did the CI job run successfully?
    • Do all relevant unit tests still succeed?
    • Does the code compile with any error or warnings? A successful build is indicated by this line in the CI result:

      ✅ Athena: number of compilation errors 0, warnings 0

      whereas a problematic build that needs follow-up is marked with a warning sign:

      ⚠️Athena: number of compilation errors 0, warnings 8

      For a MR to be approved there should be zero compilation errors and warnings. In case the warnings are unrelated to the changes in the MR (see the build logs) they can be ignored.

  • In case of failed CI jobs, have a quick look to try to diagnose the issue and provide some guidance to the developer. In general, it is the responsibility of the developers to ensure that their MRs pass the CI system. However, new developers may not be familiar with the workflow and help is much appreciated.
  • Do not approve incomplete MRs! Developers sometimes make MRs halfway through a bug fix/feature implementation and want this to be merged in (for various reasons). In this case, please tell the developers to mark the MR as Draft and request it only once it is final (i.e. it builds locally and has been tested).
  • Does the code assign memory?
    • Check for new!
      • Could it be replaced by stack variables?
      • Could it be replaced by a smart pointer? (recommend make_unique() or make_shared())
    • If the interface requires bare pointers, are they deleted?
    • Is memory ownership implied by getting an object back from another call (e.g. factory methods)?
      • Check that this is clearly documented!
      • Suggest change to a unique_ptr where appropriate!
    • Always get L2 signoff if memory if dynamically assigned!
  • Does the code logic look correct?
    • Watch out for logical comparisons in the wrong place, e.g. std::abs(some_value > 2)!
    • Watch out for single block statements and misleading indentation!
  • Are any strings, vectors or other large objects passed by value instead of reference?
  • Are there any uninitialised variables? Declare variables only when they are actually needed.
  • No printing to cout/cerr - use the Athena logging service through the message macros. Unit tests are exempt from this rule.
  • For python code:
    • Avoid use of from module import *.
    • Use of loggers, not print statements.
    • To be continued…
  • In case of non-trivial changes to CMakeLists.txt files (e.g. custom code using more than simple set(...) or atlas_<foo>(...) functions) please notify Attila (@akraszna) as a watcher.
  • UTF8 potentially allows developers to use non-ASCII characters in merge requests. However this can lead to very hard to read code, and could cause various other problems. So:
    • non-ASCII characters are forbidden in identifiers, that is function, class and variable names, irrespective of the language i.e. this applies to C++, but also to python, bash etc.
    • non-ASCII characters are permitted in log messages and comments, provided they are reasonable and improve readability (for example, writing Δϕ instead of deltaPhi, or μ instead of muon, or to correctly write someone’s name).

Documentation

Functional code is important and so is maintainability. Therefore, please check that:

  • Copyright statements: yes, a boring and yet important topic!
    • Modifications, deletions or additions to the ATLAS copyright statement are a complete no-go. The correct copyright statement must read (including the comment format):
      /*
        Copyright (C) 2002-2024 CERN for the benefit of the ATLAS collaboration
      */
      
    • Make sure that newly added source code files (easily identifiable from the Changes tab) have this copyright statement! Trivial job option fragments and boilerplate files are exempt from this rule, usually found in share or in src/components.
    • The end year is not important as the copyright only expires many decades after the creation of the works. For you as a shifter this means you should only ask for an update if:
      • there are significant copyrightable changes and the end date lags by 10 or more years, or
      • you are asking for other changes anyway.

      If the merge request is otherwise fine please don’t bother updating the copyright (and rerunning the CI)!

  • Does the code have appropriate comments and Doxygen documentation?
  • Are the log messages useful and not too verbose?
  • Are the commit messages informative and follow guidelines?
    • It is not trivial to change commit messages afterwards. This requires using git rebase which we consider an advanced topic. However, it should be at least pointed out to the developers that guidelines for commit messages exist and that they should be considered in the future.
  • The merge request title will be used within the merge commit’s commit message. Therefore, it should adhere to the guidelines above. The MR description is not included in the git history but should contain:
    • a concise justification why the changes are necessary
    • possible implications/side effects for other parts of the code
    • link(s) to related JIRA tickets if applicable

    Sometimes people put information of limited relevance in the description, e.g.

    • This is my first merge request.
    • Jane, can you comment on it?
    • Restarting the CI job.

    These kind of things should rather be added as comments than being put directly in the MR description. You can always ask the developer to edit the MR title and description. Keep in mind that this is the replacement of the ChangeLog and as such a critical piece of information when it comes to bug fixes and debugging.

  • Are there a unreasonable number of tiny or corrective commits?
    • Humans make mistakes and developers are no exception. In case you see commits like “Fix typo”, “Add missing include” etc. that do not add any valuable information but rather fix issues introduced in previous commits, you should mention the feature of git commit --amend to the developers. Merging commits is possible using git rebase -i, which you may recommend, but this again is an advanced operation that requires a certain level familiarity with git. Another option here is to use GitLab’s Squash Commits functionality within the Merge Request.

ATLAS coding conventions and style guidelines

Sticking to the ATLAS guidelines for software development helps other people to understand your code better and faster and, thus, facilitates maintainability. As such, the following points should be mentioned as minor suggestions. Since these guidelines have not been followed too closely in the past, you will see a lot of code which could be improved. Some of it may not even been touched by the actual changes. This is a delicate topic and some developers may react huffish to these kind of comments. Therefore, the general strategy is to aim for gradual improvement than drastic changes:

  • be kind (even more polite than you usually are ;-))
  • suggest rather than request changes
  • focus on things which were touched (of course, you can still mention that your comments may still apply to other parts of the code)
  • in case the developer rigidly and repeatedly refuses to implement your suggestions, create a JIRA ticket with your findings (so that they can be addressed later) and approve the MR

Simulation/Overlay/Reconstruction/Derivation output changes

In the CI builds we are running three main tests that check whether a particular MR is changing the simulation/overlay/reconstruction outputs. These are named, respectively:

  • SimulationTier0Test-test
  • OverlayTier0Test_required-test
  • qTestsTier0_required-test (except for the main branch for which please see below)

If one or more of these tests are failing, please take the following steps:

  • Make sure the changes are intended, and reviewed by the relevant experts as usual,
  • If the changes are accepted, request the developer to run the relevant tests locally to prepare the updated reference file(s),
  • Ask the expert(s) to create a new reference file(s) version and copy the file(s) to EOS,
  • Ask the developer to update Tools/PROCTools/python/RunTier0TestsTools.py accordingly and add the changes to the MR,
  • Make sure the relevant test now succeeds.

Only after these steps, the shifter should add the label review-approved. Also, please don’t forget to make sure the relevant labels, i.e. changes-overlay-output etc., are added to the MR.

As mentioned above, in the main branch we do NOT run qTestsTier0_required but instead run two other tests: q221xAODDigestTest and q431xAODDigestTest. Each test runs the relevant q-test, dump the content of the resulting AOD using the xAODDigest.py script into a text file, and compare it against a predefined reference.

If one or more of these tests are failing, please take the following steps:

  • In the original build, make sure the actual q-test succeeded and only the check against the reference failed due to changes in the output,
  • If the changes are approved (accepted and justified), request the developer to update the reference file(s) and amend the MR w/ the new one(s),
    • The reference files reside under Tools/PROCTools/data
    • The developer can either run the script locally to reproduce the new reference(s) or simply copy-paste the changes from the NICOS page of the MR build
    • The actual test scripts can be found here for the q221 test, and here for the q431 test
  • Make sure the relevant test now succeeds.

Only after these steps, the shifter should add the label review-approved. Also, please don’t forget to add the chaged-reconstruction-output label to the MR.

The derivation test, currently named FrozenDerivationTest_optional-test, is handled differently. If this test is failing, again make sure things are labelled correctly, and the changes are reviewed by the relevant expert(s). The reference file is updated automatically on a daily basis (nominally around 8 AM CERN time). Therefore, no further action is needed from the shifter but please be vigilant as any CI build that happens between the time this MR is accepted by the coordinator and the time the reference file is updated will have this test failed. In that case, the shifter can ask Jenkins to rebuild once the reference file is updated.

Other things to watch out for:

  • The following files should not enter the git repository:
    • ChangeLog files (no nay never)
    • large binary blobs – it’s a code repository.
  • Source files that are > 100kB should probably be split.
  • If you notice that a commit is changing file permissions, please query this with the author. These could be deliberate, but more frequently this is by mistake.

FAQ

  1. How should developers update their merge request?
  2. How do developers notify review shifters after they responded to comments?
  3. Who should resolve discussions?
  4. How can I access atlas-sit-ci.cern.ch:8080 remotely?
  5. How to restart a CI job after a infrastructure failure?
  6. Where are the per-package build log files?
  7. Are Draft/WIP merge requests processed by the CI system?
  8. Which expert to contact?
  9. Why is the pipeline status for a merge request not displayed?
  10. Where can I find the output of the unit tests?
  11. What is a reasonable number of changed files in one MR?
  12. How can I cancel a running pipeline?


  1. How should developers update their merge request?
    If developers would like to update their code changes (either based on feedback from the review shifters or for any other reason), they can simply add further commits to the source branch. The MR will be updated automatically. Since a new CI job is run on every update of the source branch, developers are encouraged to push only working sets of commits instead of pushing each commit individually. This helps reducing the load on the CI system and therefore gives the developers faster turn-around times.
    If developers want to update only the MR title, description or labels, this can all be done directly from the GitLab MR page using the Edit button in the top-right corner.

  2. How do developers notify review shifters after they responded to comments?
    If a MR as been flagged as review-user-action-required, developers should answer and address the questions raised by the review shifters. In case they updated the source branch, a new CI job will reset the review label to review-pending-level-1 automatically. In the event that no code changes were required, developers should manually change the label back to review-pending-level-1/2 (whatever they think is appropriate).

  3. Who should resolve discussions?
    Before a MR is approved, all discussions should be marked as resolved. In general, a discussion should be resolved by the initiator (which usually is the review shifter). If developers agree with a comment and they have implemented the requested change in an update of the MR, they should briefly state in the discussion that this suggestion was implemented and resolve this discussion directly.

  4. How can I access atlas-sit-ci.cern.ch:8080 remotely?
    The Jenkins server is running at the above address which is only reachable from inside the CERN firewall due to security restrictions for our build machines. In order to access this machine which provides access to all the detailed Jenkins log files, you need use port forwarding or SSH tunneling (see here for instructions).

  5. How to restart a CI job after a infrastructure failure?
    There are two options for restarting a CI job:
    • The easiest way is to add the following comment on the GitLab MR.
      Jenkins please retry a build
      
    • If this for some reason doesn’t work, click on the link to the Jenkins output CI-MERGE_REQUEST XYZ on the GitLab MR page. This should bring you to the Jenkins job summary page where you can see the result of the individual sub-jobs and access their log files. The log file of the main job (which is doing the git checkout + git merge) can be accessed from the navigation menu on the left under Console Output. In order to restart (or cancel) a CI job you need to login with you CERN account using the login button in the top-right corner of the screen. After a successful login, the navigation menu on the left should show a Rebuild button. If that is not the case, your account needs to be added to the relevant Jenkins group as described above.
  6. Where are the per-package build log files?
    Once the CI job is finished ATLAS robot publishes a summary comment on the GitLab MR discussion page which contains a link to NICOS. On this page you can access the log files for the git checkout + git merge operation, the cmake configuration and the build of the externals by clicking on the three icons in the Build time, checkout, conf, inst column. The column #PB also gives a summary about the number of failed packages and packages which had warnings (in parentheses). Clicking on these numbers gets you to a page which lists the build logfiles for each individual package where the colors indicate the severity of compiler warnings/errors.

  7. Are Draft/WIP merge requests processed by the CI system? No, merge requests marked as Draft are not processed by the CI system. But the CI can be triggered manually as described in FAQ 5. However, once the CI finishes the system will not add a review label (as of ATLINFR-2754) since the Draft status indicates that the change is not yet ready for review.

  8. Which expert to contact? The CI system should automatically tag relevant experts using an expert watch list. If no one is listed as a watcher (or if this list seems out of date), please contact someone from the relevant software domain to fix it. A list of contacts for various software domains (and CP groups) can be found here. Make sure that you mentioned the expert in the GitLab using the @username notation.

  9. Why is the pipeline status for a merge request not displayed?
    Make sure that atlasbot was added as a developer to the source fork as described here. In case you spot a MR where the source repository is lacking atlasbot as a member, you should make a comment like to the one below

     Hi @<username>, please add `atlasbot` as a developer to your fork as described
     [here](https://atlassoftwaredocs.web.cern.ch/gittutorial/gitlab-fork/) so that
     the CI pipeline status is displayed correctly for future MRs.
        
     Thanks.
    
  10. Where can I find the output of the unit tests?
    Unfortunately, the output of the testing stage can only be viewed directly in the Jenkins logfiles for the moment. We are working on uploading the test results to NICOS.

  11. What is a reasonable number of changed files in one MR? This question is hard to answer in general. Replacing endreq by endmsg in the whole code base would be a valid MR touching thousands of files. It is not the number of changes (or modified files) which should be important but rather whether those changes are all logically related. The goal is that each MR addresses one issue (e.g. one JIRA ticket) and does not contain multiple unrelated updates. If this requires changes to many files, it does not mean it is a bad request. You may as well see it from a different perspective: A merge request is the smallest entity on release level which can be undone easily in case there is a problem. So if a merge request contains multiple changes where there is the hypothetical possibility that you only want to undo certain parts if a bug is discovered, it is better to split the merge request.

  12. How can I cancel a running pipeline? Since our CI pipelines run in Jenkins, the GitLab UI cannot be used to cancel pipelines. Instead one has to use the Jenkins web interface. However, since aborting pipelines mid-job can lead to corrupted build nodes, we generally advise against canceling any CI jobs. For the rare cases where this is needed (e.g. a user triggers a large number of CI jobs), the release coordinator should be informed who can cancel the jobs and perform a node cleanup if problems arise.

Feedback

Please feel free to contact the ATLAS git team if you have any suggestions on the documentation material, open an issue or improve it yourself by following our contribution guide. For problems with the CI system open a ticket in ATLINFR.