Migrating from the SVN workflow

Last update: 06 Aug 2024

Workflow

If you have experience with ATLAS’ previous source code management system, Subversion, and its associated workflow (SVN repository layout, package tags, Tag Collector, projects, etc.), then git can seem intimidatingly different at first. Don’t worry, it will pass… and most people do end up really appreciating git’s power and flexibility.

Here we provide a guide to thinking like a git (pun intended) when you work with ATLAS code. It maps the old workflow onto the new one as far as possible and explains why the new workflow is the way it is.

We go through the workflow top to bottom, from a developer’s point of view:

Repository preamble

In SVN there was one single repository and everyone worked from that. However, this was inflexible: committing code to the single main repository was the only general way to let others look at it or test it.

With git we use personal or group copies of the main repository (aka the upstream repository). So there is an extra step here to take your own fork of the main repository. However, this only needs to be done once - the cost is small and the benefits are great.

Get a copy of the code locally to work on

Assuming that your basic account was set up correctly, in SVN developers would take a copy of the repository directly from the single SVN master:

svn co $SVNOFF/Tools/PyJobTransforms/trunk Tools/PyJobTransforms

(Of course, as that was tedious, svnco was used as a wrapper.)

Instead with git you clone the repository:

git clone https://:@gitlab.cern.ch:8443/YOUR_USER_NAME/athena.git
cd athena  # git remote must be run inside the new clone
git remote add upstream https://:@gitlab.cern.ch:8443/atlas/athena.git

The second git command makes sure your git repository knows about the main repository - it’s important to be able to synchronise with changes there directly.
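To see the resulting two-remote layout without touching GitLab, here is a minimal sketch that uses throwaway local repositories as stand-ins for the main repository and your fork. All paths and the `dev` identity are hypothetical; only the git commands themselves are standard.

```shell
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.org
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.org
demo=$(mktemp -d)   # scratch area; every path below is hypothetical

# "main" stands in for atlas/athena, "fork.git" for YOUR_USER_NAME/athena.
git init -q "$demo/main"
git -C "$demo/main" commit -q --allow-empty -m "Initial commit"
git clone -q --bare "$demo/main" "$demo/fork.git"

# The same steps as on GitLab: clone the fork, then register the main repository.
git clone -q "$demo/fork.git" "$demo/athena"
cd "$demo/athena"
git remote add upstream "$demo/main"
git fetch -q upstream

# origin points at the fork, upstream at the main repository.
git remote -v
remotes=$(git remote | sort | tr '\n' ' ')
```

Running `git remote -v` in a real checkout is a quick way to confirm that both remotes are configured before you start work.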

Good The git checkout is a fully functioning repository on its own - so many operations like diffing code, searching or viewing history become extremely fast. The default checkout is also of all packages, so you can see the whole of the code at once (and, of course, if you don’t want that then just use a sparse checkout).

Bad Making a local copy of the repository does take up more space - about 200MB for a sparse checkout, 800MB for a complete one…

Tip …however, one checkout can serve many developments if you use branches and fetch updates - so reuse it.

Develop new code

Actual code development is pretty independent of the SCM that ATLAS uses. However, a fully functioning local repository is more powerful.

Good You can save local snapshots of your code as a work in progress as often as you like (just git commit whenever). You can even use side branches of your main topic branch for isolating different sub-developments. You can also trivially diff your code against any version in the repository. This is really powerful when you do incremental cycles of local development and testing.
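As a rough illustration of this snapshot-and-diff cycle, the sketch below works in a scratch repository. The file name, branch name and commit messages are made up for the example; only the package path is borrowed from the text.

```shell
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.org
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.org
repo=$(mktemp -d)   # scratch repository, hypothetical
cd "$repo"
git init -q

# Two work-in-progress snapshots on the topic branch
mkdir -p Tools/PyJobTransforms
echo "step 1" > Tools/PyJobTransforms/notes.txt
git add Tools/PyJobTransforms
git commit -q -m "WIP: first snapshot"
echo "step 2" >> Tools/PyJobTransforms/notes.txt
git commit -q -a -m "WIP: second snapshot"

# Isolate a sub-development on a side branch
git checkout -q -b side-experiment
echo "experiment" >> Tools/PyJobTransforms/notes.txt
git commit -q -a -m "WIP: experiment"

# Diff the current state against any earlier snapshot (here: two commits back)
git --no-pager diff HEAD~2 -- Tools/PyJobTransforms
added=$(git diff HEAD~2 -- Tools/PyJobTransforms | grep -c '^+[^+]')
```

None of these commits leaves your machine until you push, so snapshots like these cost nothing to make.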

In git you also create a branch for your development at this point, which is equivalent to a set of package tags plus a Tag Collector bundle. However, it’s far easier and faster to do:

git fetch upstream  # Sync with latest changes in main repo
git checkout -b my-new-development upstream/[PARENT_BRANCH] --no-track

The SVN plus Tag Collector equivalent we don’t even try to show…

Good The git topic branch is far more versatile and better supported by tools.
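The sketch below, run against a scratch repository, shows one visible effect of `--no-track`: the new branch records no upstream of its own. This leaves a later `git push -u origin ...` free to set your fork as the tracking remote, which is our reading of why the flag is used rather than something the text states. For brevity the stand-in main repository plays both `origin` and `upstream`; all paths are hypothetical.

```shell
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.org
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.org
demo=$(mktemp -d)   # scratch area, hypothetical

git init -q "$demo/main"
git -C "$demo/main" commit -q --allow-empty -m "Initial commit"
parent=$(git -C "$demo/main" rev-parse --abbrev-ref HEAD)  # stands in for [PARENT_BRANCH]

git clone -q "$demo/main" "$demo/athena"
cd "$demo/athena"
git remote add upstream "$demo/main"

git fetch -q upstream
git checkout -q -b my-new-development "upstream/$parent" --no-track

# With --no-track no upstream is recorded for the new branch,
# so this lookup finds nothing and falls back to "none".
tracking=$(git config --get branch.my-new-development.remote || echo none)
echo "tracking remote: $tracking"
```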

Commit your code

svn commit ...

becomes

git add ...
git commit ...
git push -u origin ...

Why the difference? Git distinguishes between your local repository and the GitLab one - so you can keep everything local (git commit) until it’s fully ready (git push). Git also has the concept of a staging area, which is why git add is needed. (Although there are more steps, git gives you a lot more flexibility.)
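A small sketch of the three stages, using a local bare repository as a hypothetical stand-in for your GitLab fork. File names are invented for the example.

```shell
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.org
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.org
demo=$(mktemp -d)   # scratch area, hypothetical

git init -q --bare "$demo/fork.git"      # stands in for your GitLab fork
git init -q "$demo/athena"
cd "$demo/athena"
git remote add origin "$demo/fork.git"

echo "finished work" > ready.txt
echo "still in progress" > unfinished.txt

git add ready.txt                         # stage only what is ready
git commit -q -m "Add ready.txt"          # recorded locally, nothing sent yet
git status --short                        # unfinished.txt is listed as untracked (??)

committed=$(git ls-tree --name-only HEAD)
before=$(git ls-remote --heads origin | wc -l | tr -d ' ')
git push -q -u origin HEAD                # now the fork receives the commit
after=$(git ls-remote --heads origin | wc -l | tr -d ' ')
```

The commit contains only the staged file, and the stand-in fork has no branches at all until the push.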

Conflict resolution

In SVN updating to HEAD is pretty easy:

svn update

In git the process is just to apply the changes from the upstream branch onto your topic branch:

git fetch upstream                  # Get all upstream changes
                                    # (does not touch your checkout!)
git merge upstream/[PARENT_BRANCH]

If there is a line-by-line conflict you must resolve it by hand.

Git has much more powerful facilities for rolling back out of a failed merge: git merge --abort returns your checkout to its pre-merge state (git status during the merge lists the available options).
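The following sketch provokes a conflict in a scratch repository and then backs out of it with `git merge --abort`. The branch name `parent` is a hypothetical stand-in for upstream/[PARENT_BRANCH], and the file content is invented.

```shell
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.org
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.org
repo=$(mktemp -d)   # scratch repository, hypothetical
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/parent   # name the initial branch "parent"

echo "original" > file.txt
git add file.txt
git commit -q -m "Base"

# Topic branch and parent branch both change the same line
git checkout -q -b my-new-development
echo "my change" > file.txt
git commit -q -a -m "Topic change"
git checkout -q parent
echo "upstream change" > file.txt
git commit -q -a -m "Upstream change"
git checkout -q my-new-development

before=$(git rev-parse HEAD)
if ! git merge --no-edit parent >/dev/null 2>&1; then
    git status --short     # "UU file.txt" marks the unresolved conflict
    git merge --abort      # give up and restore the pre-merge state
fi
after=$(git rev-parse HEAD)
content=$(cat file.txt)
```

To resolve instead of aborting, edit the conflicted file, `git add` it, and `git commit` to conclude the merge.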

Tagging

svn cp -r 123456 $SVNOFF/Tools/PyJobTransforms/trunk $SVNOFF/Tools/PyJobTransforms/tags/PyJobTransforms-01-02-03

becomes… nothing at all, because the change is already uniquely identified in git by the commits made during development and by the topic branch that was used.

Good Just creating a branch in git is easier, faster and less error prone than the fiddly SVN equivalent.

Requesting a tag

In the old workflow a developer would request that a package tag be incorporated into a release by either

  • Using the Tag Collector,
  • Sending an email to the release coordinators.

In git this is done from GitLab by just clicking on the Merge Request button in the GitLab web page. Handling merge requests in GitLab is vastly superior, not just because the user interface is a thing of joy compared to Tag Collector:

  • Code discussion can be far more detailed (line by line discussions in the diff are possible)
  • The code can be updated to address points of concern, before acceptance - we can finally review code properly
  • If the code is rejected the main repository was never polluted by it
  • Discussion and outcome are archived for history

Good Faster, easier and more functional in GitLab.

Tag bundles

When packages needed to be changed in concert they needed to be handled specially in Tag Collector using a bundle. This was an awkward step and caused a lot of problems, especially if packages needed to be swept from one release to another as the bundle identity was lost when the bundle was accepted.

However, in git, a merge request will encapsulate coherently all changes, regardless of package boundaries. The merge request can be coherently cherry picked between release branches as well. (In fact, one can think of a topic branch in git as being inherently like a bundle, implemented correctly.)
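As a sketch of that idea, the example below makes one commit touching two packages and carries it from one release branch to another with `git cherry-pick`. The branch and package names are hypothetical, and the topic branch holds a single commit to keep things simple. If what you pick is a merge commit, `git cherry-pick` additionally needs `-m` to select the parent.

```shell
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.org
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.org
repo=$(mktemp -d)   # scratch repository, hypothetical
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/release-a   # hypothetical release branch

echo base > base.txt
git add base.txt
git commit -q -m "Base"
git branch release-b                         # second hypothetical release branch

# One topic branch, one commit, two packages changed together
git checkout -q -b my-fix
mkdir -p PackageA PackageB
echo fix > PackageA/a.txt
echo fix > PackageB/b.txt
git add PackageA PackageB
git commit -q -m "Coherent fix across two packages"
fix=$(git rev-parse HEAD)

# Accept it into release-a, then carry the same change to release-b
git checkout -q release-a
git merge -q --no-edit my-fix
git checkout -q release-b
git cherry-pick "$fix" >/dev/null

files=$(git ls-tree -r --name-only HEAD | tr '\n' ' ')
```

Both packages arrive on the second branch together, which is the bundle-like behaviour described above.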

Good Faster, easier and more functional with merge requests of a topic branch.

Other Issues

Where are my package tags…?

Probably the single most unsettling thing for developers used to the old workflow is the lack of package tags. These were used in SVN to snapshot a particular version of a particular package, then collected by Tag Collector to assemble a release.

We already discussed the superiority of GitLab merge requests to a Tag Collector workflow, but people will probably still ask: where are my package tags?

  • Git does have tags, but they snapshot the state of the entire repository. This is a good thing, because it’s trivial to look at the code differences between releases without needing to go via Tag Collector:
    git diff release/21.0.1..release/21.0.8
    

    This probably shows more changes than you want, so give a path to only see the changes for a package, e.g.,

    git diff release/21.0.1..release/21.0.8 Tools/PyJobTransforms
    
  • We will also make lightweight tags per nightly so you can see changes between any two nightly builds, e.g.,
    git diff nightly/21.0/2016-12-01T2230..nightly/21.0/2016-12-07T2230 Tools/PyJobTransforms
    
  • Further, you can use git log PATH to see only commits that affected a particular PATH (which can be a package, but more powerfully can be single files or domain areas). Then follow up with a git show to see the commit log and diff for a particular commit.
  • Finally, do remember that git can also diff or log between any arbitrary commit ids, and any developer can make their own private tags at significant development points, e.g.,
    git tag my_package/some_label
    

    (These tags will not be in the main repository, although other developers can get a copy of them if they add your private fork as a remote.)
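The recipes in this list can be tried out on a scratch repository, as in the sketch below. The tag names `release/0.0.1` and `release/0.0.2`, the file names and the second package are hypothetical, not real ATLAS releases.

```shell
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.org
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.org
repo=$(mktemp -d)   # scratch repository, hypothetical
cd "$repo"
git init -q

mkdir -p Tools/PyJobTransforms Other/Package
echo v1 > Tools/PyJobTransforms/trf.py
echo v1 > Other/Package/x.txt
git add .
git commit -q -m "First release content"
git tag release/0.0.1

echo v2 > Other/Package/x.txt
git commit -q -a -m "Change Other/Package"
echo v2 > Tools/PyJobTransforms/trf.py
git commit -q -a -m "Change PyJobTransforms"
git tag release/0.0.2

# Repository-wide tags, but the view is restricted to one package by the path
git --no-pager diff --stat release/0.0.1..release/0.0.2 -- Tools/PyJobTransforms
git --no-pager log --oneline release/0.0.1..release/0.0.2 -- Tools/PyJobTransforms
git --no-pager show --stat HEAD      # log message and summary for one commit

git tag my_package/some_label        # private tag, stays local unless pushed

ncommits=$(git log --oneline release/0.0.1..release/0.0.2 -- Tools/PyJobTransforms | wc -l | tr -d ' ')
changed=$(git diff --name-only release/0.0.1..release/0.0.2 -- Tools/PyJobTransforms)
```

Two commits separate the two tags, but the path filter reports only the one that touched the package.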

We have a collection of hints that will provide more recipes that can be used.

svnpull.py

Only SVN tags that were part of a release (or, for dev, at least validated) were imported into git. If there is code you need to take from SVN into git then you can use the svnpull.py script to achieve that.

Usage is very simple: giving a package name will import the current SVN trunk; giving a package tag will import that tag. Use svnpull.py --help for full usage, which allows you to import any SVN path into git, also restricting to only some files or using an arbitrary SVN revision.

The script will only copy the code from SVN into your git checkout. It is then up to you to add and commit when you have reviewed the change and are ready to make a merge request.

Some very important notes when importing from SVN:

  • Manually remove the cmt directory, or at least remove the files cmt/Makefile.RootCore and cmt/requirements
  • Mention the tag you pulled in inside your git commit message. This is needed for us to check whether all tags in the SVN based release have made it into the git based release.
  • Copy the relevant sections of the ChangeLog into the git commit message.
  • When pulling multiple packages make a separate commit for every package. They can be in a single merge request, but make them separate commits.
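The notes above could be followed along these lines. This is only a sketch in a scratch repository: svnpull.py is not run here, so the first loop fabricates stand-in files in its place, and the package names and SVN tags are hypothetical. In real use you would also paste the relevant ChangeLog sections into each commit message.

```shell
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.org
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.org
repo=$(mktemp -d)   # scratch repository, hypothetical
cd "$repo"
git init -q

# Stand-in for the files svnpull.py would have copied into the checkout
for pkg in PackageA PackageB; do
    mkdir -p "$pkg/cmt" "$pkg/src"
    echo "placeholder" > "$pkg/cmt/requirements"
    echo "// code" > "$pkg/src/code.cxx"
done

# One commit per package: drop cmt, name the SVN tag in the message
for entry in PackageA:PackageA-00-00-01 PackageB:PackageB-00-00-02; do
    pkg=${entry%%:*}
    tag=${entry#*:}
    rm -rf "$pkg/cmt"
    git add "$pkg"
    git commit -q -m "Import $pkg from SVN tag $tag"
done

git --no-pager log --oneline
ncommits=$(git rev-list --count HEAD)
cmtfiles=$(git ls-files | grep -c '/cmt/' || true)
```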

Additional notes

svnpull.py has been added to lsetup, but it requires Python 2.7, so run lsetup git python. The code is maintained on the svnpull branch in this GitLab repo.

Cheat Sheet

This table tries to summarise what was described above. However, it’s just not possible to map a git command to an SVN one and plug this into the new workflow (which was possible with the CVS to SVN migration), because SVN and git are just not two beasts from the same stable. So, instead, we try to map some of the key concepts and give commands or procedures that are similar.

It’s also useful to keep our git cheat sheet close at hand.

| To... | SVN | git |
|---|---|---|
| Checkout code | svn co | git clone |
| Show differences | svn diff | git diff |
| Update | svn update | git fetch; git merge |
| Commit code | svn ci | git add; git commit; git push |
| Identify code patch for a release | SVN package tag: svn cp | git topic branch: git checkout -b new_branch_name |
| Request tags | Tag Collector or Email | GitLab merge request |
| Tag bundles | Tag Collector | unnecessary |