Mercurial vs Git; it’s all in the branches

I have been thinking on comparing Git to other SCM’s. I’ve analyzed several of them and so far I’ve only done the comparison with Monotone (here). Nowadays it’s pretty clear that Git is the most widely used DSCM out there, but Mercurial is the second. So, here is a comparison between Git and Mercurial.

Note: I’m a long-time advocate of Git, and a contributor to the project, so I am biased (but I hope my conclusions are not).

Google’s analysis

So, let’s start with Google’s analysis where they compared Git with Mercurial in order to choose what to use in Google Code. Here I’ll tackle what Google considered to be Mercurial advantages.

Learning Curve. Git has a steeper learning curve than Mercurial due to a number of factors. Git has more commands and options, the volume of which can be intimidating to new users. Mercurial’s documentation tends to be more complete and easier for novices to read. Mercurial’s terminology and commands are also a closer to Subversion and CVS, making it familiar to people migrating from those systems.

That is a value judgement and while that seems to be the consensus, I don’t think Git is much more difficult than Mercurial. In the last Git user survey, people answered the question ‘Have you found Git easy to learn?’ mostly as ‘Reasonably easy’ and few as ‘Hard’ or ‘Very Hard’. Plus, the documentation is not bad, as people answered ‘How useful have you found the following forms of Git documentation?‘ very positively. However, it’s worth noting that the best documentation is online (not the official one) (according to the survey).

Regarding the amount of commands, most of them can be ignored. If you type ‘git’ only the important ones would be listed.

The last point is that Mercurial is easier for people migrating from Subversion and CVS; that is true, but personally I don’t consider that’s a good thing. IMO it’s perpetuating bad habits and mental models. For people that have not been tainted by CVS/Subversion, it might be easier to learn Git.

Windows Support. Git has a strong Linux heritage, and the official way to run it under Windows is to use cygwin, which is far from ideal from the perspective of a Windows user. A MinGw based port of Git is gaining popularity, but Windows still remains a “second class citizen” in the world of Git. Based on limited testing, the MinGW port appeared to be completely functional, but a little sluggish. Operations that normally felt instantaneous on Linux or Mac OS X took several tenths of a second on Windows. Mercurial is Python based, and the official distribution runs cleanly under Windows (as well as Linux, Mac OS X, etc).

That was probably true at the time, but nowadays msysGit works perfectly fine. There has been a lot of work to make Git more portable, and the results have been positive.

Maintenance. Git requires periodic maintenance of repositories (i.e. git-gc), Mercurial does not require such maintenance. Note, however, that Mercurial is also a lot less sophisticated with respect to managing the clients disk space (see Client Storage Management above).

Not any more.

History is Sacred. Git is extremely powerful, and will do almost anything you ask it to. Unfortunately, this also means that Git is perfectly happy to lose history. For example, git-push –force can result in revisions becoming lost in the remote repository. Mercurial’s repository is structured more as an ever-growing collection of immutable objects. Although in some cases (such as rebase), rewriting history can be an advantage, it can also be dangerous and lead to unexpected results. It should be noted, however, that a custom Git server could be written to disallow the loss of data, so this advantage is minimal.

This was an invalid argument from the beginning. Whether history is sacred or not depends on the project, many Git projects have such policy, and they don’t allow rebases of already published branches. You don’t need your SCM to be designed specifically to disallow your developers to do something (in fact rebases are also possible in Mercurial); this should be handled as a policy. If you really want to prevent your developers from doing this, it’s easy to do that with a Git hook. So really, Mercurial doesn’t have any advantage here.

In terms of implementation effort, Mercurial has a clear advantage due to its efficient HTTP transport protocol.

Git has had smart HTTP support since quite some time now.

So, of all the supposed advantages of Mercurial, only the learning curve stands, and I already explained, it’s not that strong of an argument.

IMO Google made a bad decision; it was a matter of time before Git resolved the issues they listed, and in fact Google could have helped to achieve them. Now they are thinking on adding Git support; “We’re listening, we promise. This is one of the most starred issues in our whole bugtracker.” (Update: already done). My recommendation to other people facing similar decisions is to choose the project that has a brighter future, chances are the “disadvantages” you see in the present would be resolved soon enough.

stackoverflow’s comparison

The best comparison I’ve seen is the one on stackoverflow’s question Git and Mercurial – Compare and Contrast, the answer from Jakub Narebski is simply superb.

Here’s the summary:

  • Repository structure: Mercurial doesn’t allow octopus merges (with more than two parents), nor tagging non-commit objects.
  • Tags: Mercurial uses versioned .hgtags file with special rules for per-repository tags, and has also support for local tags in .hg/localtags; in Git tags are refs residing in refs/tags/ namespace, and by default are autofollowed on fetching and require explicit pushing.
  • Branches: In Mercurial basic workflow is based on anonymous heads; Git uses lightweight named branches, and has special kind of branches (remote-tracking branches) that follow branches in remote repository.
  • Revision naming and ranges: Mercurial provides revision numbers, local to repository, and bases relative revisions (counting from tip, i.e. current branch) and revision ranges on this local numbering; Git provides a way to refer to revision relative to branch tip, and revision ranges are topological (based on graph of revisions)
  • Mercurial uses rename tracking, while Git uses rename detection to deal with file renames
  • Network: Mercurial supports SSH and HTTP “smart” protocols; modern Git supports SSH, HTTP and GIT “smart” protocols, and HTTP(S) “dumb” protocol. Both have support for bundles files for off-line transport.
  • Mercurial uses extensions (plugins) and established API; Git has scriptability and established formats.

If you read the whole answer you would see that most of the differences between Mercurial and Git are very subtle, and most people wouldn’t even notice them, however, there’s a fundamental difference that I’ll tackle in the next section: branches.

It’s all in the branches

Update: here’s a new blog post that goes deeper into the branching model differences.

Let’s go directly for an example. Say I have a colleague called Bob, and he is working on a new feature, and create a temporary branch called ‘do-test’, I want to merge his changes to my master branch, however, the branch is so simple that I would prefer it to be hidden from the history.

In Git I can do that in the following way:
git checkout -b tmp-do-test bob/do-test
git rebase master
git mergetool # resolve conflicts
git rebase --continue
git mergetool # resolve conflicts
git rebase --continue
git checkout master
git merge tmp-do-test
git branch -D tmp-do-test

(Of course this is a simplified case where a single ‘git cherry-pick’ would have done the trick)

Voilá. First I created a temporary branch ‘tmp-do-test’ that is equal to Bob’s ‘do-test’, then I rebase it on top of my master branch and resolve the conflicts, Git is smart enough to notice that by not picking any line of Bob’s last commit, the commit should be dropped. Then I go to the master branch and merge this temporary branch, and finally remove that temporary branch. I do variations of this flow quite a lot.

This is much more tricky in Mercurial (if possible). Why?

hg branch != git branch

In Git, a branch is merely one of the many kinds of ‘refs’, and a ‘ref’ is simply a pointer to a commit. This means that there’s nothing fundamentally different between ‘bob/do-test’ or ‘tmp-do-test’, or ‘master’, they are all pointers to commits, and these pointers can be easily be deleted, renamed, fetched, pushed, etc. IOW you can do pretty much whatever you want with them.

In Mercurial, a branch is embedded in a commit; a commit done in the ‘do-test’ branch will always remain in such a branch. This means you cannot delete, or rename branches, because you would be changing the history of the commits on those branches. You can ‘close’ branches though. As Jakub points out, these “named branches” can be better thought as “commit labels”.

Bookmarks

However, there’s an extension (now part of the core) that is relatively similar to Git branches, the bookmarks. These are like Git ‘refs’, and although they were originally intended to be kept local, since Mercurial 1.6 they can be shared, but of course, both sides would need this extension enabled. Even then, there is no namespace to delimit say, my ‘do-test’ bookmark from Bob’s ‘do-test’ bookmark, like Git automatically does.

Future

In the future perhaps Mercurial would add namespaces to bookmarks, and would make the bookmarks and rebase extensions part of the core. At this point it would be clear that traditional Mercurial branches didn’t make much sense anyway, and they might be dropped. And then of course there would not be much practical difference between Mercurial and Git, but in the meantime Git branches are simply better.

Conclusion

The conclusion shouldn’t be that surprising; Git wins. The way Git deals with branches is incredibly simple, and yet so powerful, that I think that’s more than enough reason to crush Mercurial in respect to functionality. Of course, for simple use-cases they are the same, and perhaps Mercurial would be easier for some people, but as soon as you start doing some slightly complicated handling of branches, the “complexity” of Git translates into very simple commands that do exactly what you want. So, every workflow is supported in Git in a straight-forward way. No wonder most popular projects have chosen Git.

Projects using Git: Linux Kernel, Android, Clutter, Compiz Fusion, Drupal, Fedora, FFmpeg (and libav), Freedesktop.org (X.Org, Cairo, D-Bus, Mesa3D), GCC, GNOME, GStreamer, KDE, LibreOffice, Perl5, PulseAudio, Qt, Ruby on Rails, Samba, Wine, VLC

Update: Eclipse is also using Git for a bunch of projects.

Projects using Mercurial: Adium, Go Programming Language, Mozilla, OpenJDK, OpenSolaris, Pidgin, Python, Vim

Advertisements