Git Projects with Multiple Contributors

Please note: the older repository scm.physics has passed from "on life support" to "disconnected", and will only ever be revived in mission-critical circumstances. New projects should use git instead, optionally with gitlab.physics as a remote server. Suggestions on migrating projects from svn to git, and on one's first steps in using git, are documented elsewhere.

The modus operandi often seen in svn projects is to have most or all of the code in the trunk, and only rarely to use branches. This mainly reflects the comparative awkwardness of merging branches back into the trunk in svn, and has the downside that collaborators will need to co-ordinate with each other outside svn (or find they ought to have done, the hard way) before pushing potentially-overlapping commits to the server.
Those using a git server can take advantage of the fact that git was designed with merging branches as a first-class citizen. This Article outlines what the GitLab documentation calls the feature branch workflow, which is one of the lighter-weight methods for doing business in this manner, and which can be elaborated to (eg) their Gitlab workflow later if and as necessary.

PLEASE NOTE: This Article is under development; contents somewhat provisional, and known to both be incomplete and have a tendency to drift out of date. If you can still see this paragraph, tread with care.

Please see also:

Feature-Branch Workflow

What (for example) Git for Beginners yields will suffice for a single-user project with development taking place on one local system. In order to handle multiple collaborators safely (which can include yourself doing development on more than one system at the same time), we make use of git branches.

The basic idea of the comparatively simple feature-branch workflow method is that the master branch is (or is treated as if it were) read-only; contributors work in private branches in their own local copies, which are later merged back into master on the gitlab server. Specifically, in the feature-branch workflow, there is one master branch into which every other branch is eventually merged. For more complicated ways of doing business to suit more complex situations, please see (for example):

.... and be prepared to think of master as a metavariable name (like foo).

PLEASE NOTE: The following is at best only lightly tested by your humble Author, who is convinced only that there are further wrinkles and bear traps which he has yet to discover and document. Caveat lector. All assistance humbly accepted.

Setting up:

On the server:

(project home)
  -> Settings (gearwheels):
    Ensure that master is your default branch
  -> Protected Branches (padlock):
    Ensure branch master is protected

These are the defaults if you've done nothing unusual, but it's always worth verifying this sort of thing. (Those of you with a release-branch structure may wish to also protect branches corresponding to software releases.)

It's worth at this point defining a project convention about naming of development branches. One useful one is of the form contribname_featurename (eg majoc_debug-configs): this says what's being done, and by whom, without need to look up anything elsewhere.
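
A convention is easier to keep if it can be checked mechanically. What follows is a minimal sketch only: the function name check_branch_name and the exact pattern (a lower-case contributor id, one underscore, then a feature name of lower-case letters, digits and hyphens) are assumptions made for illustration, not part of git or gitlab. Adjust the pattern to whatever your project agrees on.

```shell
# Hypothetical helper: succeeds if the name looks like contribname_featurename.
check_branch_name() {
    # Assumed pattern: contributor id, one underscore, then a feature name
    # made of lower-case letters, digits and hyphens.
    printf '%s\n' "$1" | grep -Eq '^[a-z][a-z0-9]*_[a-z0-9][a-z0-9-]*$'
}

# Usage: check the name before creating the branch.
if check_branch_name "majoc_debug-configs"; then
    echo "branch name follows the convention"
else
    echo "branch name does not follow the convention" >&2
fi
```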

Development phase:

On your local system:

  • Ensure your local checkout of master is up-to-date with respect to the server's copy.

  • git checkout -b branchname (create a local branch for your work)

You've now got a local branch to work in. Don't be tempted to start work before you have: your humble Author did exactly that while preparing for this Article, and had to backtrack with git reset --soft before he could proceed. (Happily, he hadn't pushed the results to the server.)

  • (hack hack hack, ending with a successful .... ) git commit

  • git push -u origin branchname (send to a branch on the server)
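
The development steps above can be tried out safely in a throwaway sandbox before they are used on a real project. The sketch below assumes a local bare repository standing in for the gitlab server; the directory names (seed, server.git, work), the file config.txt and the sandbox identity are all invented for the purpose, and the branch name follows the convention suggested earlier.

```shell
set -e
# Throwaway sandbox: a bare repository stands in for the gitlab server.
sandbox=$(mktemp -d)
cd "$sandbox"
# Sandbox-only identity, so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Sandbox User" GIT_AUTHOR_EMAIL="sandbox@example.invalid"
export GIT_COMMITTER_NAME="Sandbox User" GIT_COMMITTER_EMAIL="sandbox@example.invalid"

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master   # make sure the first branch is master
echo "v1" > seed/config.txt
git -C seed add config.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed server.git               # the pretend server
git clone -q server.git work                      # your local checkout
cd work

# The development phase proper starts here.
git checkout -q master
git pull -q --ff-only origin master               # bring master up to date first
git checkout -q -b majoc_debug-configs            # only then create the feature branch
echo "debug = on" >> config.txt                   # (hack hack hack)
git commit -q -a -m "Enable debug configs"
git push -q -u origin majoc_debug-configs         # publish, and record the upstream
git branch -vv                                    # shows which remote branch is tracked
```

Note that master on the pretend server is untouched throughout: only the feature branch receives the new commit.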

On the test system, if different, provided this is not also the production system:

  • If this is the first time, git clone from master in the normal way; otherwise ensure your checkout of master is up-to-date with respect to the server's copy.

  • git checkout -b branchname (create a local branch to receive your work)

  • git branch -u origin/branchname (associate it with the right branch on the server)

  • git pull (fetch and merge)

  • (test test test)
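
The test-system steps can be rehearsed in the same way. This is a sketch under assumptions: the pretend server (server.git), the checkout directory (testbox) and the file config.txt are invented, and the feature branch is planted on the server by the setup lines rather than by a real push from a development system.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
# Sandbox-only identity, so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Sandbox User" GIT_AUTHOR_EMAIL="sandbox@example.invalid"
export GIT_COMMITTER_NAME="Sandbox User" GIT_COMMITTER_EMAIL="sandbox@example.invalid"

# Pretend server holding master plus one already-pushed feature branch.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "v1" > seed/config.txt
git -C seed add config.txt
git -C seed commit -q -m "Initial commit"
git -C seed checkout -q -b majoc_debug-configs
echo "debug = on" >> seed/config.txt
git -C seed commit -q -a -m "Enable debug configs"
git -C seed checkout -q master
git clone -q --bare seed server.git

# The test-system steps proper start here.
git clone -q server.git testbox                   # first time: clone in the normal way
cd testbox
git checkout -q -b majoc_debug-configs            # local branch to receive the work
git branch -u origin/majoc_debug-configs          # associate it with the server's branch
git pull -q --ff-only                             # fetch and fast-forward
cat config.txt                                    # (test test test)
```

The --ff-only flag is an addition to the recipe above: it makes git pull refuse to do anything other than a fast-forward, which is all a test checkout should ever need.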

Rinse and repeat, until your code's master-ready.

PLEASE NOTE: production systems are sacred: only ever check out master on them. If you can't test your code on your development system, either find a separate non-production system you can test it on, or improve your testing regime. Checking out a test branch to the production server can seriously confuse your collaborators, as your humble Author can attest (with embarrassment) from personal experience.

Creating a Merge Request:

BOOKMARK: I haven't had to do this yet. Here be dragons.

Performing the merge:

The nominated project leader of the week (who may or may not be you) does the requested merge(s) into master, including resolving any conflicts which may arise.

PLEASE NOTE: This is an alternative to pushing the Merge button via gitlab's WebUI. Sadly, I can't yet see what to click therein to copy the short log message of each merged commit into the merge commit message, as the --log argument does at the command line. All assistance humbly accepted.

On the local system:

  • git remote show origin

  • Do whatever's necessary to ensure that your local copy of master is up-to-date with respect to the server, and vice versa; and likewise with the branch to be merged. This may involve saying git pull or perhaps git push in each branch in turn.

  • git checkout master

  • Check for any potential merge conflicts: git log --name-status otherbranch and git diff master..otherbranch will be your friends.

  • git merge --no-ff --log otherbranch (see Notes below)

  • This will launch the editor with a suitable commit message skeleton. Edit if and as necessary, and save.

  • BOOKMARK: sorting out merge conflicts.

  • Test the result, and commit any necessary changes. It is, after all, quite possible that (eg) minor changes to one function have blown another function clean out of the water.

  • git push origin
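
For the cautious, here is the merge sequence as a sandbox sketch. The assumptions: a local bare repository (server.git) plays the server, the clone directory (leader) and file names are invented, and --no-edit is added only so that the sketch runs unattended; in real use you would leave it off and edit the message. The pre-merge checks are narrowed to master..otherbranch so that they list only what the merge would bring in.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
# Sandbox-only identity, so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Sandbox User" GIT_AUTHOR_EMAIL="sandbox@example.invalid"
export GIT_COMMITTER_NAME="Sandbox User" GIT_COMMITTER_EMAIL="sandbox@example.invalid"

# Pretend server holding master plus one feature branch awaiting merge.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "v1" > seed/config.txt
git -C seed add config.txt
git -C seed commit -q -m "Initial commit"
git -C seed checkout -q -b majoc_debug-configs
echo "debug = on" >> seed/config.txt
git -C seed commit -q -a -m "Enable debug configs"
git -C seed checkout -q master
git clone -q --bare seed server.git

# The merge steps proper start here.
git clone -q server.git leader
cd leader
git remote show origin
git checkout -q majoc_debug-configs               # local copy of the branch to be merged
git pull -q --ff-only
git checkout -q master
git pull -q --ff-only
git log --oneline --name-status master..majoc_debug-configs   # what would come in?
git diff --stat master..majoc_debug-configs
git merge --no-ff --log --no-edit majoc_debug-configs
git push -q origin master
```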

On the production server:

git checkout master (this should be redundant)
git fetch origin
git show-branch
git log --first-parent

Do not at this point re-merge the branch, whatever git log without --first-parent may tell you (see Notes below).

Notes:

In the git merge line above, --log adds the short commit messages from otherbranch. The argument --no-ff helps keep history straight: if master has no commits since otherbranch was branched from it, the default action would be to "fast-forward" the branch's commits, and they'd appear as if they'd been committed direct into master. This can be useful in branches from development branches, but does the Wrong Thing here.

After a merge, the output of git log will show the commits from both branches, as will the commits page on the gitlab server; this is agreed to be confusing. To see only master, add --first-parent to the git log incantation. (Other utilities such as git show-branch or gitk --all aren't so easily confused, or confusing.)
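
The difference between the two views of history is easiest to see in a small experiment. The sketch below is purely local (no server involved); the repository name demo, the file name and the commit messages are invented, and --no-edit is there only to keep the run unattended.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
# Sandbox-only identity, so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Sandbox User" GIT_AUTHOR_EMAIL="sandbox@example.invalid"
export GIT_COMMITTER_NAME="Sandbox User" GIT_COMMITTER_EMAIL="sandbox@example.invalid"

git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
echo "one" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Two commits on a branch, while master itself stands still.
git checkout -q -b otherbranch
echo "two" >> file.txt
git commit -q -a -m "Feature step 1"
echo "three" >> file.txt
git commit -q -a -m "Feature step 2"

# Without --no-ff this merge would be a fast-forward, leaving no merge commit.
git checkout -q master
git merge -q --no-ff --log --no-edit otherbranch

echo "Everything reachable from master:"
git log --oneline
echo "Commits made on master itself:"
git log --oneline --first-parent
```

In this sandbox the first listing has four entries (the initial commit, the two feature commits and the merge), while the second has only two (the initial commit and the merge).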

Forking gitlab projects

THIS SECTION IS VERY MUCH A WORK IN PROGRESS. The contents should probably move to a different Page.

One of the strengths of gitlab is that anybody with commit rights can change any branch other than master, eg to add new code. One of the weaknesses in a collaborative environment is that anybody with commit rights can affect any such branch, including (much to your humble Author's embarrassment) inadvertently removing gitlab's copy of an active branch which somebody else is relying on.

One way round this is to use a fork of the main repo, work in that, then issue pull requests of your workings back into the main project; you can then be demoted (or demote yourself) on the main project to Reporter, who can do less damage. Where both projects are on the same gitlab system, the pull request takes the form of a merge request between two branches which happen to be in different projects.

Here's a show-and-tell of how that's done, using a sacrificial repo where I'm in separate charge of both ends of the proceedings.

On the upstream project's page in gitlab:

  • Go to the main page, and click on Fork. This will start a dialogue for setting up a copy of the upstream project (hereafter "Upstream") in your own space, with you as Maintainer.

  • Proceed to set up your forked project in the normal way, including the all-important e-mail on submit.

On your workstation/laptop/whatever:

  • Move your current local checkout of Upstream aside:

    cd ..
    mv sacrificial sacrificial-upstream

  • Clone your fork in the usual way. This will give you a local project where origin points to your fork:

    git clone git@physics.ox.ac.uk:carter/sacrificial

  • Now cd into your project, and say git remote -v. You should get something of the form:

    origin git@gitlab.physics:carter/sacrificial.git (fetch)
    origin git@gitlab.physics:carter/sacrificial.git (push)

    .... where I've had to shorten the URLs to fit in this margin.

  • Now add upstream as an alternate remote, pointing to Upstream (again with shortened URLs, and folding the line at the backslash):

    git remote add upstream \
      git@gitlab.physics:CarterSU/sacrificial.git

  • Check the results with git remote -v, which should give you the added lines:

    upstream git@gitlab.physics:CarterSU/sacrificial.git (fetch)
    upstream git@gitlab.physics:CarterSU/sacrificial.git (push)

  • Now replace the push URL with something which will produce only an error at your end:

    git remote set-url --push upstream PUSH_DISABLED

  • Now check the result, which should be of the form:

    origin git@gitlab.physics:carter/sacrificial.git (fetch)
    origin git@gitlab.physics:carter/sacrificial.git (push)
    upstream git@gitlab.physics:CarterSU/sacrificial.git (fetch)
    upstream PUSH_DISABLED (push)

  • Fetch the upstream:

    git fetch upstream

  • Verify, eg using gitk, that the branch heads for upstream and origin coincide.
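
The remote setup above can be rehearsed without touching gitlab at all. In this sketch two local bare repositories (upstream.git and fork.git, both invented names) stand in for Upstream and your fork; everything else follows the steps just described. The final push attempt is there to show that the disabled push URL really does fail at your end.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
# Sandbox-only identity, so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Sandbox User" GIT_AUTHOR_EMAIL="sandbox@example.invalid"
export GIT_COMMITTER_NAME="Sandbox User" GIT_COMMITTER_EMAIL="sandbox@example.invalid"

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "v1" > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed upstream.git             # stands in for Upstream
git clone -q --bare upstream.git fork.git         # stands in for your fork

# The workstation steps proper start here.
git clone -q fork.git sacrificial                 # origin now points at the fork
cd sacrificial
git remote add upstream "$sandbox/upstream.git"
git remote set-url --push upstream PUSH_DISABLED  # pushes to upstream can only fail
git remote -v
git fetch -q upstream

# Any attempt to push to upstream should now fail locally.
if git push -q upstream master 2>/dev/null; then
    echo "unexpected: push to upstream succeeded" >&2
else
    echo "push to upstream is blocked, as intended"
fi
```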

Local editing and testing:

Proceed as normal, with the usual edit/commit/push/test loop. The only thing to beware of here is that gitlab will assume that any merge requests on your fork will be to the master branch on Upstream, which for local testing is not what you want. Be careful what you ask for.

Whenever Upstream is updated, pull the updates into your forked copy in a slight variation on the usual manner:

git fetch upstream
git checkout production
git pull upstream production

.... after which, your local checkout is up-to-date, and git push will update the copy in your fork on gitlab.
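
Here is the same update cycle as a sandbox sketch. The assumptions: local bare repositories (upstream.git, fork.git) stand in for the two gitlab projects, the branch is called production as in the commands above, and the "somebody else" who updates Upstream is simulated by pushing a commit from the seed repository. The --ff-only flag is an addition, so that the pull stops rather than creating a merge if your branch has diverged.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
# Sandbox-only identity, so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Sandbox User" GIT_AUTHOR_EMAIL="sandbox@example.invalid"
export GIT_COMMITTER_NAME="Sandbox User" GIT_COMMITTER_EMAIL="sandbox@example.invalid"

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/production
echo "v1" > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed upstream.git             # stands in for Upstream
git clone -q --bare upstream.git fork.git         # stands in for your fork
git clone -q fork.git sacrificial
git -C sacrificial remote add upstream "$sandbox/upstream.git"

# Somebody else updates Upstream in the meantime.
echo "v2" >> seed/file.txt
git -C seed commit -q -a -m "Upstream change"
git -C seed push -q "$sandbox/upstream.git" production

# The update steps proper start here.
cd sacrificial
git fetch -q upstream
git checkout -q production
git pull -q --ff-only upstream production
git push -q origin production                     # bring the fork on the server up to date
```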

Merge request to Upstream:

This is the equivalent of a "pull request" on github, or presumably between gitlab setups on different host machines.

  • Verify that the target branch exists in the upstream project. If necessary (eg if you've surrendered upstream commit rights), ask that it be created.

  • In your fork on gitlab, select the branch of interest, and go for Merge. The default target will be to the master branch on Upstream.

  • Search for and select the correct target branch on Upstream.

  • If you're submitting your branch for review, now's the time to add "WIP:" to the front end of the Subject, so the results won't inadvertently be merged into Upstream before they're ready.

  • Submit, and if necessary e-mail Someone In Authority to ask that your workings be double-checked.

.... and that, modulo discussions with collaborators which are outside this article's remit, is about it.
