Software Merge Review Shift Reference

Last update: 15 Oct 2019 [History] [Edit]

Introduction

This page contains information about the software review process in ATLAS. It describes the tasks of merge request review shifters and provides some guidance as well as hints for common situations/problems. It is assumed that you are familiar with the general workflow for ATLAS software development as summarised in the Workflow Quick Reference. Furthermore, a solid knowledge of the C++ programming language is mandatory. An overview of the ATLAS coding conventions and style guidelines is desirable.

Shifter roles

For every working day, there are two level-1 and two level-2 shifters on shift. Ideally, they work in different time zones (e.g. CERN and US) to provide developers from all timezones with a timely review of their merge requests. Please note that the shift times indicated in the OTP system are CERN times.

The current shift crew can be found on the OTP S&C Shift Crew page.

Level-1 shifter

As a level-1 shifter you are expected to have a thorough understanding of the C++ language and some basic knowledge about python. You should be able to check the formal code requirements listed below as well as the criteria for good documentation and the adherence to the ATLAS coding conventions and style guidelines.

For trivial changes, you should approve MRs directly. If the changes are more substantial and require a deeper understanding of the Athena code structure/workflows, you can escalate this MR to the level-2 shifter for review (but please make sure you explain why, i.e. write ‘Escalating to L2, because I’m not sure whether there is a memory leak in the bar() method of Foo.cxx’ or ‘Escalating to L2 because Foo.cxx has been completely re-written.’) If you are in doubt, you can always pass a MR on to the next level, but try to guide the next shifter to save them from re-reading everything.

Level-2 shifter

In addition to the knowledge required by the level-1 shifts, you have some experience with the usage and design of the Athena software and are able to comment on structural changes. A certain level of proficiency in git is helpful as well.

Your main task is the review of MRs which were escalated to level-2 by the level-1 shifters. In case that these changes require dedicated expertise in a specific software domain, you could always include an expert in the discussion (using the @username mentioning feature in GitLab, see the FAQ for a list of contact persons). Otherwise, you are expected to approve the MR or iterate with the developer on the proposed changes.

Reminder: review process

Your task as review shifter is to review the proposed changes to the ATLAS software and, if necessary, iterate with the developer(s) until these changes fulfil certain criteria outlined below. The outcome of the review process is the recommendation for the release coordinators to accept this merge request (or not). It is up to the release coordinators to finally accept or reject a merge request.

Upon creation merge requests are labelled automatically by the CI system. The labels relevant for the review process are:

  • review-pending-level-1: The review process has started and a level-1 shifter should have a look at this MR.
  • review-pending-level-2: The review process is with the level-2 shifter for further clarification/review.
  • review-pending-expert: The advice of a software domain expert on the proposed changes is needed.
  • review-user-action-required: The review shifters raised some comments which should be addressed by the developer(s).
  • review-approved: This MR is approved by the review shifters and is recommended to be accepted by the release coordinators.

Once you are done with your part of the review process, you should enter a (short) comment about your findings and update the review label appropriately. The image below summarises the workflow.

What should you do…

…when you start the shift

  • Join the Mattermost shifter channel here which is used for communication among the shifters and experts. See whether there were any recent discussions (e.g. infrastructure problems or other news).
  • Your account needs to be added to the Jenkins user database in order to get access to job management functionalities on the Jenkins web interface (such as the job rebuild button). In order to get this privilege send an email to the ATLAS robot mentioning your CERN username.
  • Check for open merge requests on the GitLab merge request page for atlas/athena. Filter the open merge requests by the review-pending-level-1/2 labels (depending on what shift you do) and review according to the guidelines given below.
  • Check on long-standing open merge requests (e.g. sort by Oldest updated) and try to trigger some action (e.g. ping developers, include experts) if appropriate.
  • Check for failed sweeps. If there are, add a comment to the MR reminding the author that failed sweeps need to be resolved (to either sweep:ignore or sweep:done), as explained in the Git Workflow Tutorial here.

…during your shift

  • Follow discussions in the Mattermost shifter channel.
  • Frequently check for newly created/updated merge requests and review them.
  • Watch out for failed CI jobs. In case they failed due to a transient infrastructure problem, restart them (as explained in the FAQ). If you are unsure about the source of the failure, do not restart the job but reach out to the developer and/or the Mattermost channel. Excessive restarting of CI jobs leads to unnecessary load on the CI system.
  • The CI Status Board lists known problems affecting the CI system. To report new issues on the infrastructure open an ATLINFR Jira ticket. For other problems open a ticket in the Core SW, Reconstruction or Trigger Jira project. If you assign the CI label to the issue it will appear in the status board.
  • Check the MR problems page for some suggested actions to take to make sure merge requests don’t slip through the cracks. In particular try to check at least once during your shift whether there are any MRs in the “invisible” and “unlabelled” sections and add review labels to the affected MRs if there are.

…at the end of your shift

  • Say Goodbye in the Mattermost channel and mention any problems/open issues which may be of interest to the next shifters.

Checklist for reviewing a merge request

The review should encompass criteria from the following four categories, ordered by importance:

  1. functionality of code changes
  2. documentation
  3. ATLAS coding conventions and style guidelines
  4. Simulation/Overlay/Reconstruction/Derivation output changes

but should also address some general points. After the review and possible iterations with the developer, please make sure that all discussions are marked as resolved before approving a merge request.

Code functionality

  • Did the CI job run successfully?
    • Do all relevant unit tests still succeed?
    • Does the code compile with any error or warnings? A successful build is indicated by this line in the CI result:

      ✅ Athena: number of compilation errors 0, warnings 0

      whereas a problematic build that needs follow-up is marked with a warning sign:

      ⚠️Athena: number of compilation errors 0, warnings 8

      For a MR to be approved there should be zero compilation errors and warnings. In case the warnings are unrelated to the changes in the MR (see the build logs) they can be ignored.

  • In case of failed CI jobs, have a quick look to try to diagnose the issue and provide some guidance to the developer. In general, it is the responsibility of the developers to ensure that their MRs pass the CI system. However, many developers are still getting familiar with the new workflow and any help is much appreciated.
  • Do not approve incomplete MRs! Developers sometimes make MRs halfway through a bug fix/feature implementation and want this to be merged in (for various reasons). In this case, please tell the developers to mark the MR as WIP and request it only once it is final (i.e. it builds locally and has been tested).
  • Does the code assign memory?
    • Check for new!
      • Could it be replaced by stack variables?
      • Could it be replaced by a smart pointer? (recommend make_unique() or make_shared())
    • If the interface requires bare pointers, are they deleted?
    • Is memory ownership implied by getting an object back from another call (e.g. factory methods)?
      • Check that this is clearly documented!
      • Suggest change to a unique_ptr where appropriate!
    • Always get L2 signoff if memory if dynamically assigned!
  • Does the code logic look correct?
    • Watch out for logical comparisons in the wrong place, e.g. std::abs(some_value > 2)!
    • Watch out for single block statements and misleading indentation!
  • Are any strings, vectors or other large objects passed by value instead of reference?
  • Proper use of inline
    • [Need to explain what that is - it’s maybe just not advisable at all now as it’s only a suggestion to the compiler]
  • Are there any uninitialised variables? Declare variables only when they are actually needed!
  • No printing to cout/cerr - use the Athena logging service through the message macros!
  • For python code:
    • Avoid use of from module import *!
    • Use of loggers, not print statements!
    • To be continued…
  • In case of non-trivial changes to CMakeLists.txt files (e.g. custom code using more than simple set(...) or atlas_<foo>(...) functions) please notify Attila (@akraszna) as a watcher.

Documentation

Functional code is important and so is maintainability. Therefore, please check that:

  • Copyright statements: yes, a boring and yet important topic!
    • Modifications, deletions or additions to the ATLAS copyright statement are a complete no-go. The correct copyright statement must read (including the comment format):
      /*
        Copyright (C) 2002-2019 CERN for the benefit of the ATLAS collaboration
      */
      
    • Make sure that newly added source code files (easily identifiable from the Changes tab) have this copyright statement! Trivial job option fragments and boilerplate files are exempt from this rule, usually found in share or in src/components.
    • Strictly speaking the end year only needs to be updated for significant “copyrightable” changes to the source code. And in general, the end year is not that important as the Copyright only expires many decades after the creation of the works. For you as a shifter this means you don’t need to be too strict about the end year.
    • If there are many files with the copyright statement removed, it could be that this is a MR porting a SVN tag to git. Developers sometimes do this by applying patches rather than using svnpull. If you spot this situation, please recommend to the developers to use svnpull as this handles the copyright statement automatically.
  • Does the code have appropriate comments and Doxygen documentation?
  • Are the log messages useful and not too verbose?
  • Are the commit messages informative and follow guidelines?
    • It is not trivial to change commit messages afterwards. This requires using git rebase which we consider an advanced topic. However, it should be at least pointed out to the developers that guidelines for commit messages exist and that they should be considered in the future.
  • The merge request title will be used within the merge commit’s commit message. Therefore, it should adhere to the guidelines above. The MR description is not included in the git history but should contain:
    • a concise justification why the changes are necessary
    • possible implications/side effects for other parts of the code
    • link(s) to related JIRA tickets if applicable

    Sometimes people put information of limited relevance in the description, e.g.

    • This is my first merge request.
    • XYZ, can you comment on it?
    • Restarting the CI job.

    These kind of things should rather be added as comments than being put directly in the MR description. You can always ask the developer to edit the MR title and description. Keep in mind that this is the replacement of the ChangeLog and as such a critical piece of information when it comes to bug fixes and debugging.

  • Are there a unreasonable number of tiny or corrective commits?
    • Humans make mistakes and developers are no exception. In case you see commits like “Fix typo”, “Add missing include” etc. which do not add any valuable information but rather fix issues introduced in previous commits, you should mention the feature of git commit --amend to the developers. Merging commits is possible using git rebase -i which you may recommend, but this again is an advanced operation which requires a certain level familiarity with git. Another option here is to use GitLab’s Squash Commits functionality within the Merge Request.

ATLAS coding conventions and style guidelines

Sticking to the ATLAS guidelines for software development helps other people to understand your code better and faster and, thus, facilitates maintainability. As such, the following points should be mentioned as minor suggestions. Since these guidelines have not been followed too closely in the past, you will see a lot of code which could be improved. Some of it may not even been touched by the actual changes. This is a delicate topic and some developers may react huffish to these kind of comments. Therefore, the general strategy is to aim for gradual improvement than drastic changes:

  • be kind (even more polite than you usually are ;-))
  • suggest rather than request changes
  • focus on things which were touched (of course, you can still mention that your comments may still apply to other parts of the code)
  • in case the developer rigidly and repeatedly refuses to implement your suggestions, create a JIRA ticket with your findings (so that they can be addressed later) and approve the MR

Simulation/Overlay/Reconstruction/Derivation output changes

In the CI builds we’re running three main tests that check whether a particular MR is changing the simulation/overlay/reconstruction outputs. These are named, respectively:

  • SimulationTier0Test-test
  • OverlayTier0Test_required-test
  • qTestsTier0_required-test

If one or more of these tests are failing, please take the following steps:

  • Make sure the changes are intended, and reviewed by the relevant experts as usual,
  • If the changes are accepted, request the developer to run the relevant tests locally to prepare the updated reference file(s),
  • Ask the expert(s) to create a new reference file(s) version and copy the file(s) to EOS,
  • Ask the developer to update Tools/PROCTools/python/RunTier0TestsTools.py accordingly and add the changes to the MR,
  • Make sure the relevant test now succeeds.

Only after these steps, the shifter should add the label review-approved. Also, please don’t forget to make sure the relevant labels, i.e. changes-overlay-output etc., are added to the MR.

The derivation test, currently named FrozenDerivationTest_optional-test, is handled differently. If this test is failing, again make sure things are labelled correctly, and the changes are reviewed by the relevant expert(s). The reference file is updated automatically on a daily basis (nominally around 8 AM CERN time). Therefore, no further action is needed from the shifter but please be vigilant as any CI build that happens between the time this MR is accepted by the coordinator and the time the reference file is updated will have this test failed. In that case, the shifter can ask Jenkins to rebuild once the reference file is updated.

Other things to watch out for:

  • The following files should not enter the git repository:
    • ChangeLog files (no nay never)
    • cmt directories or files therein (There won’t be any cmt based builds from git)
    • .svn folders (Developers sometimes add them by mistake)
    • large binary blobs – it’s a code repository.
  • Source files that are > 100kB should probably be split.

FAQ

  1. How should developers update their merge request?
  2. How do developers notify review shifters after they responded to comments?
  3. Who should resolve discussions?
  4. How can I access atlas-sit-ci.cern.ch:8080 remotely?
  5. How to restart a CI job after a infrastructure failure?
  6. Where are the per-package build log files?
  7. Are WIP merge requests processed by the CI system?
  8. Which expert to contact?
  9. Why is the pipeline status for a merge request not displayed?
  10. Where can I find the output of the unit tests?
  11. What is a reasonable number of changed files in one MR?


  1. How should developers update their merge request?
    If developers would like to update their code changes (either based on feedback from the review shifters or for any other reason), they can simply add further commits to the source branch. The MR will be updated automatically. Since a new CI job is run on every update of the source branch, developers are encouraged to push only working sets of commits instead of pushing each commit individually. This helps reducing the load on the CI system and therefore gives the developers faster turn-around times.
    If developers want to update only the MR title, description or labels, this can all be done directly from the GitLab MR page using the Edit button in the top-right corner.

  2. How do developers notify review shifters after they responded to comments?
    If a MR as been flagged as review-user-action-required, developers should answer and address the questions raised by the review shifters. In case they updated the source branch, a new CI job will reset the review label to review-pending-level-1 automatically. In the event that no code changes were required, developers should manually change the label back to review-pending-level-1/2 (whatever they think is appropriate).

  3. Who should resolve discussions?
    Before a MR is approved, all discussions should be marked as resolved. In general, a discussion should be resolved by the initiator (which usually is the review shifter). If developers agree with a comment and they have implemented the requested change in an update of the MR, they should briefly state in the discussion that this suggestion was implemented and resolve this discussion directly.

  4. How can I access atlas-sit-ci.cern.ch:8080 remotely?
    The Jenkins server is running at the above address which is only reachable from inside the CERN firewall due to security restrictions for our build machines. In order to access this machine which provides access to all the detailed Jenkins log files, you need use port forwarding or SSH tunneling. There are several ways of doing this. Below is an example which has been tested by several users:

     ssh -N -L localhost:9999:atlas-sit-ci.cern.ch:8080 <username>@lxplus.cern.ch
    

    Then you should be able to access the Jenkins server and all log files by replacing atlas-sit-ci.cern.ch:8080 by localhost:9999 in the URL.

  5. How to restart a CI job after a infrastructure failure?
    There are two options for restarting a CI job:
    • The easiest way is to add the following comment on the GitLab MR.
      Jenkins please retry a build
      
    • If this for some reason doesn’t work, click on the link to the Jenkins output CI-MERGE_REQUEST XYZ on the GitLab MR page. This should bring you to the Jenkins job summary page where you can see the result of the individual sub-jobs and access their log files. The log file of the main job (which is doing the git checkout + git merge) can be accessed from the navigation menu on the left under Console Output. In order to restart (or cancel) a CI job you need to login with you CERN account using the login button in the top-right corner of the screen. After a successful login, the navigation menu on the left should show a Rebuild button. If that is not the case, your account needs to be added to the relevant Jenkins group as described above.
  6. Where are the per-package build log files?
    Once the CI job is finished ATLAS robot publishes a summary comment on the GitLab MR discussion page which contains a link to NICOS. On this page you can access the log files for the git checkout + git merge operation, the cmake configuration and the build of the externals by clicking on the three icons in the Build time, checkout, conf, inst column. The column #PB also gives a summary about the number of failed packages and packages which had warnings (in parentheses). Clicking on these numbers gets you to a page which lists the build logfiles for each individual package where the colors indicate the severity of compiler warnings/errors.

  7. Are WIP merge requests processed by the CI system?
    Yes, new pushes to a merge request trigger the CI regardless of the WIP status (as of ATLINFR-2717), however if the MR is WIP then the system will not add a review label (as of ATLINFR-2754) since the WIP status indicates that the change is not yet ready for review.

  8. Which expert to contact? The CI system should automatically tag relevant experts using an expert watch list. If no one is listed as a watcher (or if this list seems out of date), please contact someone from the relevant software domain to fix it. A list of contacts for various software domains (and CP groups) can be found here. Make sure that you mentioned the expert in the GitLab using the @username notation.

  9. Why is the pipeline status for a merge request not displayed?
    Make sure that atlasbot was added as a developer to the source fork as described here. In case you spot a MR where the source repository is lacking atlasbot as a member, you should make a comment like to the one below

     Hi @<username>, please add `atlasbot` as a developer to your fork as described
     [here](https://atlassoftwaredocs.web.cern.ch/gittutorial/gitlab-fork/) so that
     the CI pipeline status is displayed correctly for future MRs.
        
     Thanks.
    
  10. Where can I find the output of the unit tests?
    Unfortunately, the output of the testing stage can only be viewed directly in the Jenkins logfiles for the moment. We are working on uploading the test results to NICOS.

  11. What is a reasonable number of changed files in one MR?
    This question is hard to answer in general. Replacing endreq by endmsg in the whole code base would be a valid MR touching thousands of files. It is not the number of changes (or modified files) which should be important but rather whether those changes are all logically related. The goal is that each MR addresses one issue (e.g. one JIRA ticket) and does not contain multiple unrelated updates. If this requires changes to many files, it does not mean it is a bad request. You may as well see it from a different perspective: A merge request is the smallest entity on release level which can be undone easily in case there is a problem. So if a merge request contains multiple changes where there is the hypothetical possibility that you only want to undo certain parts if a bug is discovered, it is better to split the merge request.

Feedback

Please feel free to contact the ATLAS git team if you have any suggestions on the documentation material. Feedback regarding the CI system should be sent to the ATLAS robot.