The Linux way; never ever break user experience

Through the years it has become more and more obvious to me that there are two camps in open source development, and one camp is not even aware of how the other camp works (or succeeds, rather), often to its own detriment. This was blatantly obvious in Miguel de Icaza’s blog post What Killed the Linux Desktop, in which he accused Linus Torvalds of setting the attitude of breaking APIs to the developers’ heart’s content, without even realizing that they (Linux kernel developers) do the exact opposite of what he claimed; Linux never ever breaks the user-space API. This triggered a classic example of the many thorny discussions between the two camps, which illustrate how one side doesn’t have a clue about how the other side operates, or even that there’s an entirely different way of doing things. I will name the camps “the Linux guys” (even though they don’t work strictly on the Linux kernel) and “the user-space guys”: the people who work on user-space, GNOME being one of the prime examples.

This is not an attempt to put people into two black-and-white categories; there’s a spectrum of behaviors, and surely you can find people in one group with the mentality of the other camp. Ultimately, it’s you who decides whether there’s a divide in attitudes or not.

The point of this post is to explain the number one rule of Linux kernel development (never ever break user experience), why that rule is important, and how far off the user-space camp is.

When I say “Linux” I’m referring to the Linux kernel, because that’s the name of the kernel project.

The Linux way

There are many details to how Linux kernel development works on a day-to-day basis, many reasons why it is the way it is, and many reasons why it’s good and desirable to work that way. But for now I will simply concentrate on what it is.

Never ever break user experience. This point cannot be stressed enough. Forget about development practices, forget about netiquette, forget about technical competence… what is important is the users, and to never break their expectations. Let me quote Linus Torvalds:

The biggest thing any program can do is not the technical details of the program itself; it’s how useful the program is to users.

So any time any program (like the kernel or any other project), breaks the user experience, to me, that’s the absolute worst failure that a software project can make.

This is a point that is so obvious, and yet many projects (especially big ones) often forget: they are nothing without their user-base. If you start a small project all by yourself, you are painfully aware that in order for your project to succeed, you need users. Without users you cannot get more developers, and your project could very well disappear from the face of the Earth without anybody noticing. But once your project is big enough, one or two users complaining about something starts being less of an issue; in fact, hundreds of them might be insignificant, and at some point you lose any measure of what percentage of users are complaining. This problem might grow to the point where developers say “users don’t know what they want” in order to dismiss the importance of users and their needs.

But that’s not the Linux way; it doesn’t matter if you have one user, or ten, or millions, your project still succeeds (or fails) for the same reason: it’s useful to some people (or not). And if you break user experience, you risk that usefulness, and you risk your project becoming irrelevant to your users. That is not good.

Of course, there are compromises; sometimes you can do a bit of risk analysis: OK, this change might affect 1% of our current users, and the change would be kind of annoying, but it would make the code so much more maintainable; let’s go for it. It’s all about where you draw the line. Sometimes it might be OK to break user experience, if you have good reasons for it, but you should really try to avoid it, and if you go forward, provide an adjustment period, a configuration option for the old behavior, and even involve your users in the whole process to make sure the change is indeed needed and their discomfort is minimized.

At the end of the day it’s all about trust. I use project X not only because it works for me, but because I trust that it will keep working for me in the years to come. If for some reason I expected it to break next year, I might be better off looking right now for something else that I trust I can keep relying on indefinitely, rather than having project X break on me while I’m in the middle of a deadline, with no time for their shenanigans.

Obvious stuff, yet many projects don’t realize it. One example is when the udisks2 project decided to change the location of mount directories from /media/foo to /run/media/$user/foo. What?! I’m in the middle of something important, and all of a sudden I can’t find my disks’ content in /media? I had to spend a considerable amount of time before I found the reason: no, udisks2 didn’t have a bug; they introduced this change willingly and knowingly. They didn’t give any deprecation warning while they moved to the new location, they didn’t provide an option to keep the old behavior; they just moved it, with no explanation, in one single commit (here), from one version to the next. Am I going to keep using their project? No. Why would I? Who knows when the next time will be that they decide to break some user experience unilaterally, without deprecation warnings or anything? The trust is broken, and many others agree.

How about the Linux kernel? When was the last time your Linux kernel failed you in some way that was not a bug, but the developers knowingly and willingly breaking things for you? Can’t think of any? Me neither. In fact, people often forget about the Linux kernel, because it just works. The external drivers (like NVIDIA’s or AMD’s) are not a problem of the kernel, but of the drivers themselves, as I will explain later on. You have people bitching about all kinds of projects, threatening forks, complaining about the leadership, and whatnot. None of that happens with the Linux kernel. Why? Because it just works. Not for me, not for 90% of the users; for everybody (or 99.99% of everybody).

Because they never ever break user experience. Ever. Period.

The deniers

Miguel de Icaza, after accusing Linus of not maintaining a stable ABI for drivers, went on to argue that it was the kernel developers’ fault for spreading attitudes like:

We deprecated APIs, because there was a better way. We removed functionality because “that approach is broken”, for degrees of broken from “it is a security hole” all the way to “it does not conform to the new style we are using”.

What part of “never ever break user experience” did Icaza not understand? He mentions only the internal API, which does change all the time in the Linux kernel, and which has never had any resemblance of a promise that it wouldn’t (thus the “internal” part), while ignoring the public user-space API, which indeed never breaks, and which is why you, as a user, don’t have to worry about your user-space not working on Linux v3.0, or Linux v4.0. How can he not see that? Is Icaza blind?

Torvalds:

The gnome people claiming that I set the “attitude” that causes them problems is laughable.

One of the core kernel rules has always been that we never ever break any external interfaces. That rule has been there since day one, although it’s gotten much more explicit only in the last few years. The fact that we break internal interfaces that are not visible to userland is totally irrelevant, and a total red herring.

I wish the gnome people had understood the real rules inside the kernel. Like “you never break external interfaces” – and “we need to do that to improve things” is not an excuse.

Even after Linus Torvalds and Alan Cox explained to him how the Linux kernel actually works in a Google+ thread, he didn’t accept anything.

Lennart Poettering, face to face with both (Torvalds and Cox), argued that this mantra (never break user experience) wasn’t actually followed (video here). Yet at the same time his software (the systemd+udev beast) had recently been criticized for knowingly and willingly breaking user experience by making the boot hang for 30s per device that needed firmware. Linus’ reply was priceless (link):

Kay, you are so full of sh*t that it’s not funny. You’re refusing to
acknowledge your bugs, you refuse to fix them even when a patch is
sent to you, and then you make excuses for the fact that we have to
work around *your* bugs, and say that we should have done so from the
very beginning.

Yes, doing it in the kernel is “more robust”. But don’t play games,
and stop the lying. It’s more robust because we have maintainers that
care, and because we know that regressions are not something we can
play fast and loose with. If something breaks, and we don’t know what
the right fix for that breakage is, we *revert* the thing that broke.

So yes, we’re clearly better off doing it in the kernel.

Not because firmware loading cannot be done in user space. But simply
because udev maintenance since Greg gave it up has gone downhill.

So you see, it’s not that GNOME developers understand the Linux way and simply disagree with it; they don’t even understand it, even when it’s explained to them directly, clearly, face to face. This behavior is not exclusive to GNOME developers; udisks2 is another example, and there are many more, though probably not as extreme.

More examples

Linus Torvalds gave Kay a pretty hard time for knowingly and willingly introducing regressions, but does Linux fare better? As an example I can think of a regression I found with Wine: after realizing the problem was in the kernel, I bisected the commit that introduced it and notified the Linux developers. If this was udev, or GNOME, or any other crappy user-space software, I know what the answer would be: Wine is doing something wrong, Wine needs to be fixed, it’s Wine’s problem, not ours. But that’s not Linux; Linux has a contract with user-space and they never break user experience, so what they did was revert the change, even though that made things less than ideal on the kernel side; that’s what was required so that you, the user, don’t experience any breakage. The LKML thread is here.

Another example is what happened when Linux moved to 3.0; some programs expected a 2.x version, or even 2.6.x. These programs were clearly buggy, as they should have checked that the version was greater than 2.x, but the bugs were already there, and people didn’t want to recompile their binaries, and might not even have been able to. It would be stupid for Linux to report 2.6.x when in fact it’s 3.x, but that’s exactly what they did: they added an option so the kernel would report a 2.6.x version, so users would have the option to keep running these old buggy binaries. Link here.
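
For the curious, here’s roughly what using that option looks like today; a sketch assuming your util-linux setarch supports the --uname-2.6 flag (which sets the UNAME26 personality this change introduced):

uname -r                                      # e.g. 3.11.0
setarch "$(uname -m)" --uname-2.6 uname -r    # reports a 2.6.x version, so old version checks pass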

Now compare the switch to Linux 3.0, which was transparent and as painless as possible, to the move to GNOME 3. There couldn’t be a more perfect example of blatant disregard for the current user experience. If your workflow doesn’t work correctly in GNOME 3… you have to change your workflow. If GNOME 3 behaves almost as you would expect, but you only need a tiny configuration change… too bad. If you want to use GNOME 3 technology, but you would like a grace period during which you can use the old interface while you adjust to the new one… sucks to be you. In fact, it’s really hard to think of any way in which they could have increased the pain of moving to GNOME 3. And when users reported their user experience broken, the talking points were not surprising: “users don’t know what they want”, “users hate change”, “they will stop whining in a couple of months”. Boy, they sure value their users. And now they are going after middle-click copy.

If you have more examples of projects breaking user experience, or preserving it, feel free to mention them in the comments.

No, seriously, no regressions

Sometimes even Linux maintainers don’t realize how important this rule is, and in such cases, Linus doesn’t shy away from explaining it to them (link):

Mauro, SHUT THE FUCK UP!

It’s a bug alright – in the kernel. How long have you been a
maintainer? And you *still* haven’t learnt the first rule of kernel
maintenance?

If a change results in user programs breaking, it’s a bug in the
kernel. We never EVER blame the user programs. How hard can this be to
understand?

> So, on a first glance, this doesn’t sound like a regression,
> but, instead, it looks tha pulseaudio/tumbleweed has some serious
> bugs and/or regressions.

Shut up, Mauro. And I don’t _ever_ want to hear that kind of obvious
garbage and idiocy from a kernel maintainer again. Seriously.

I’d wait for Rafael’s patch to go through you, but I have another
error report in my mailbox of all KDE media applications being broken
by v3.8-rc1, and I bet it’s the same kernel bug. And you’ve shown
yourself to not be competent in this issue, so I’ll apply it directly
and immediately myself.

WE DO NOT BREAK USERSPACE!

The fact that you then try to make *excuses* for breaking user space,
and blaming some external program that *used* to work, is just
shameful. It’s not how we work.

Fix your f*cking “compliance tool”, because it is obviously broken.
And fix your approach to kernel programming.

And if you think that was an isolated incident (link):

Rafael, please don’t *ever* write that crap again.

We revert stuff whether it “fixed” something else or not. The rule is
“NO REGRESSIONS”. It doesn’t matter one whit if something “fixes”
something else or not – if it breaks an old case, it gets reverted.

Seriously. Why do I even have to mention this? Why do I have to
explain this to somebody pretty much *every* f*cking merge window?

This is not a new rule.

There is no excuse for regressions, and “it is a fix” is actually the
_least_ valid of all reasons.

A commit that causes a regression is – by definition – not a “fix”. So
please don’t *ever* say something that stupid again.

Things that used to work are simply a million times more important
than things that historically didn’t work.

So this had better get fixed asap, and I need to feel like people are
working on it. Otherwise we start reverting.

And no amount “but it’s a fix” matters one whit. In fact, it just
makes me feel like I need to start reverting early, because the
maintainer doesn’t seem to understand how serious a regression is.

Compare and contrast

Now that we have a good dose of examples, it should be clear that the difference in attitudes between the two camps couldn’t be starker.

In the GNOME/PulseAudio/udev/etc. camp, if a change in an API causes a regression on the receiving end of that API, the problem is in the client; the “fix” is not reverted, it stays, and the application needs to change. If the user suffers as a result, too bad; the client application is to blame.

In the Linux camp, if a change in an API causes a regression, Linux has a problem; the change is not a “fix”, it’s a regression, and it must be reverted (or otherwise fixed), so the client application doesn’t need to change (even though it probably should), and the user never suffers as a result. To even hint otherwise is cause for harsh public shaming.

Do you see the difference? Which of the two approaches do you think is better?

What about the external API?

Linux doesn’t support external modules; if you use an external module, you are on your own. They have good reasons for this: all modules can and should be part of the kernel, which makes maintenance easier for everybody.

Each time an internal API needs to be changed, the person making the change can do it for all the modules that use that API. So if you are a company, let’s say Texas Instruments, and you manage to get your module into the Linux mainline, you don’t have to worry about API changes, because they (Linux developers) would do the updates for you. This allows the internal API to always be clean, consistent, relevant, and useful. As an example of a recent change: Russell King (the ARM maintainer) introduced a new API to set the DMA mask, and in the process updated all users of dma_set_mask() to use the new function dma_set_mask_and_coherent(), and by doing so found potential bugs in many instances. So companies like Intel, NVIDIA, and Texas Instruments benefit from cleaner and more robust code without lifting a finger; Russell did it all in his 51-patch series.

In addition, with all the modules in the same source tree, when a generic API is to be added it’s easy to consider all possible use-cases, because the code is readily available. An example of this is the preliminary Common Display Framework, which takes into consideration drivers from Renesas, NVIDIA, Samsung, Texas Instruments, and other Linaro companies. After this framework is done, all existing display drivers will benefit, but things will be especially easier for future generations of drivers. It’s only because of this kind of refactoring that the number of drivers supported by Linux can grow without the amount of code exploding uncontrollably, which is one of the advantages Linux has over the kernels of Windows, OS X, and other operating systems.

If companies don’t play along in this collaborative effort, as is the case with NVIDIA’s and AMD’s proprietary drivers, it is to their own detriment, and there’s nobody to blame but those companies. Whenever you load one of these drivers, Linux goes immediately into tainted mode, which means that if you find problems with the Linux kernel, Linux developers may not be able to help you. It’s not that they don’t want to help; they might be physically incapable. If a closed-source module has a bug and corrupts memory on the kernel side, there is no way to find that out, and it might show up as some other module, or even the core itself, crashing. So if a Linux developer sees a crash, say, in a wireless driver, but the kernel is tainted, there is only so much he can do before deciding it’s not worth his time to investigate an issue that has a good chance of being caused by a proprietary driver.

Thus if a Linux update broke your NVIDIA driver, blame NVIDIA. Or even better, don’t use the proprietary driver; use nouveau.

Conclusion

Hopefully after reading this article it is clear to you what the number one rule of Linux kernel development is, why it is a good rule, and why other projects should follow it.

Unfortunately it should also be clear that other projects, particularly those related to GNOME, don’t follow it, and why that causes so much backlash, controversy, and forking.

In my opinion there’s no hope of GNOME, or any other user-space project, being nearly as successful as Linux if they don’t follow this simplest, most important rule. Linux will keep growing in importance and development power, while the others are forever doomed to forks, nearly identical alternatives, and developers jumping ship after their trust gets broken. If only they would follow this simple rule, or at least understand it.

Advanced Git concepts; the upstream tracking branch

Probably one of the most powerful and under-utilized concepts in Git is the upstream tracking branch. To be honest, it was probably too difficult to use properly in the past, but not any more.

Here I’ll try to explain what it is, and how you can take the most advantage of it.

Remote tracking branches

Before trying to understand what the upstream tracking branch is, you need to be familiar with remote branches (e.g. origin/master). If you are not, you probably want to read the section about them in the Pro Git book here.

To see all your remote tracking branches, you can use ‘git branch --remotes’.

The upstream tracking branch

Even if you have never heard of the concept, you probably already have at least one upstream tracking branch: master -> origin/master. When you clone a repository, the current HEAD (usually ‘master’) is checked out for you, but it’s also set up to track ‘origin/master’, and thus ‘origin/master’ is the “upstream” of ‘master’.

This has implications for some Git tools; for example, when you run ‘git status‘ you might see a message like this:

# Your branch is behind 'origin/master' by 1 commit.

Also, if you run ‘git branch -vv‘:

* master 549ca22 [origin/master: behind 1] Add bash_profile

This is useful in order to keep your local branches synchronized with the remote ones, but it’s only scratching the surface.

Once you have realized that your local branch has diverged from the remote one, you will probably want to either rebase or merge, so you might want to do something like:

git rebase origin/master

However, ‘origin/master’ is already configured as the upstream tracking branch of ‘master’, so you can do:

git rebase master@{upstream}

Maybe you think @{upstream} is too much to type; you can use @{u} instead, and since we are already on ‘master’ we can use HEAD@{u}, or even simpler:

git rebase @{u}

But Git is smarter than that: by default both ‘git merge’ and ‘git rebase’ will use the upstream tracking branch, so all you need is:

git rebase

Configuring the upstream branch

So now you know that upstream tracking branches are incredibly useful, but how do you configure them? There are many ways.

By default, when you check out a new branch using a remote branch as the starting point, the upstream tracking branch will be set up automatically:

git checkout -b dev origin/dev

Or:

git checkout dev

If the starting point is a local branch, you can force the tracking by specifying the --track option:

git checkout --track -b dev master

If you have already created the branch, you can update only the tracking info:

git branch --set-upstream-to master dev

There’s a very similar option called --set-upstream; however, it’s not intuitive, and it’s now deprecated in favor of --set-upstream-to. To be sure and avoid confusion, simply use -u.
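
For example, to set the upstream of ‘dev’ to the local ‘master’ with the short option (equivalent to the command above):

git branch -u master dev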

You can also set it up at the same time as you are pushing:

git push --set-upstream origin dev

Finally, you can configure Git so upstreams are always set up, even if you don’t specify the --track option:

git config --global branch.autosetupmerge always
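
After that, branching off a local branch tracks it automatically; a quick sketch (the branch name and output are just illustrative):

git checkout -b feature master
git branch -vv
* feature 549ca22 [master] Add bash_profile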

Conclusion

So there you have it, go nuts and configure the upstream branch for all your branches ;)

[Screenshot: ‘git branch -vv’ showing the upstream tracking branches]

An in-depth analysis of Mercurial and Git branches

I’ve discussed the advantages of Git over Mercurial many times (e.g. here, and here), and I even created a challenge for Mercurial supporters, but in this blog post I’ll try to refrain from making judgments and concentrate on the actual facts (the key word being “try”).

Continuing with the full disclosure: I’ve never actually used Mercurial on a day-to-day basis, where I actually had to get something done. But I’ve used it plenty of times, testing many different things, precisely to find out how to do things that I can do easily in Git. In addition, I’ve looked deep into the code to figure out how to overcome some of what I considered limitations of the design. And finally, I wrote Git’s official Git<->Mercurial bridge; git-remote-hg (more here).

So, because I’ve spent months figuring out how to achieve certain things in Mercurial, after talking with the best and the brightest (Git, gitifyhg, hg-git, and Mercurial developers), and after exploring the code myself, I can say with a good degree of confidence that if I claim something cannot be done in Mercurial, that’s probably the case. In fact, I invited people from the #mercurial IRC channel on Freenode to review this article, and I invite everyone to comment down below if you think there’s any mistake (comments are welcome).

Git vs. Mercurial branches

Now, I’ve explained before why I think the only real difference between Git and Mercurial is how they handle branches. Basically: Git branches are all-purpose and all-terrain, while Mercurial has different tools for different purposes, which can almost do as much as Git branches, but not quite.

I thought the only real limitation was that Mercurial branches (or rather bookmarks) didn’t have a per-repository namespace. For example, in Git the branch “development” can exist in different repositories and point to different commits, and to distinguish them you can refer to “max/development” (Max’s development branch), “sarah/development” (Sarah’s), “origin/development” (the central repository’s version), and “development” (your own version). In Mercurial you only have “development”, and that’s it. I consider that a limitation of Mercurial, but feel free to consider it a “difference”. But it turns out there’s more.
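
A sketch of what that looks like on the Git side, assuming hypothetical remotes named ‘max’ and ‘sarah’ (the URLs are placeholders):

git remote add max https://example.com/max/project.git
git remote add sarah https://example.com/sarah/project.git
git fetch --all
git branch --remotes
  max/development
  origin/development
  sarah/development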

In Git, it’s easy to add, remove, rename, and move branches. In Mercurial, bookmarks are supposed to work like Git branches; however, they don’t change the basics of how Mercurial works. In Mercurial it doesn’t matter whether a bookmark points to a commit or not; the commit is still there, and completely visible, and each branch can have multiple “heads” regardless of whether a bookmark points to them. So in order to remove a bookmark (and its commits), you need the “hg strip” command, and to use that command you need to enable the MqExtension. However, that’s for local repositories; for remote ones you need to cross your fingers and hope your server has a way to do it (Bitbucket does, through its web UI), but it’s possible that there is just no way.
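
For reference, removing a bookmark and its commits locally goes roughly like this; a sketch assuming the mq extension is enabled, with a hypothetical bookmark name and revision:

# in ~/.hgrc: [extensions] mq =
hg bookmark --delete feature-a
hg strip 42    # the first revision unique to feature-a (hypothetical number)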

Mercurial advocates often repeat the mantra “history is sacred”, and Mercurial’s documentation attempts to explain why changing history is hard; that shows why it’s hard to remove bookmarks (and their commits): it’s just Mercurial’s design.

On the other hand, if you want to remove a branch in Git, you can just do “git push :feature-a“. Whether “history is sacred” or not is left for each project to decide.

Solving divergence

In any version control system divergence is bound to happen, and in distributed ones even more so. Mercurial and Git solve this problem in very different ways; let’s see how by looking at a very simple divergent repository:

[Figure: a diverged repository, with a local “Fix” and a remote “Update”]

As you can see, we have a “Fix” commit in our local branch, but somebody already did an “Update” to this branch in the remote repository. Both Mercurial and Git would barf when you try to push the “Fix” commit, but let’s see how to solve the problem in each.

In Git this problem is called a “non-fast-forward” push, which means that the tip of the remote branch (“Update”) is not an ancestor of “Fix”, so the branch cannot be fast-forwarded to “Fix”. There are three options: 1) force the push (git push --force), which basically means overriding “origin/master” to point to “master”, effectively dropping “Update”; 2) merge “Update” and “Fix” and then push; 3) rebase “Fix” on top of “Update” and then push. Obviously dropping commits is not a good idea, so either a merge or a rebase is recommended, and both create a new commit that the remote branch can be fast-forwarded to from “Update”.
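
In commands, the two recommended options look roughly like this:

git fetch origin
git rebase origin/master    # or: git merge origin/master
git push origin master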

In Mercurial, the problem is called “multiple heads”. In Git, “origin/master” and “master” are two different branches, but in Mercurial they are two heads of the same branch. To solve the problem, you can start by running “hg heads“, which will show you all the heads of all the branches; in this case “Fix” and “Update” would both be heads of the “default” branch (aka “master”). Then you again have three options: 1) force the push (hg push --force); although in appearance this looks the same as the Git command, it does something completely different: it pushes the new head to the remote; 2) merge and push; 3) rebase and push (you need the rebase extension). Once again, the first option is not recommended, because it shifts the burden from one developer to many. In theory, the developer pushing the new commit knows how to resolve the conflicts in case they arise, so (s)he is the one that should resolve them, not take the lazy way out and shift the burden to other developers.
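
And the Mercurial equivalent, sketched here with the merge option (the rebase one requires enabling the rebase extension):

hg pull
hg heads                # shows both “Fix” and “Update” as heads of default
hg merge
hg commit -m "Merge"
hg push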

Either way solves the problem, but Git does it with remote namespaces, which I already showed are useful regardless, while Mercurial requires the concept of multiple heads. That is one reason why “anonymous heads”, often cited as a feature Mercurial has over Git, are not really needed.

Mercurial bookmarks and the forced push problem

The biggest issue (IMO) I found with Mercurial bookmarks is how to create them in the first place. The issue is subtle, but it affects Git-like workflows, and especially Git<->Mercurial bridges; either way it’s useful for understanding Mercurial’s design and behavior.

Suppose you have a very simple repository:

[Figure: a simple repository]

In Git, “feature-a” is a branch, and you can just push it without problems. In Mercurial, if “feature-a” is a bookmark, you can’t just push it, because if you did, the “default” branch would have two heads. To push this new bookmark, you need “hg push --force“. However, this only happens if the “Update” commit exists; you can push “feature-a” if it points to “Init”, and after pushing the bookmark, you can update it to include the “Feature A” commit. The end result is the same, but Mercurial barfs if you try to push the bookmark and the commits at the same time while there’s an update on the branch.
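
In commands, the failing and the working sequences look roughly like this (the exact error message may vary):

hg bookmark feature-a
hg push -B feature-a           # aborts: push creates new remote head
hg push --force -B feature-a   # goes through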

There’s no real reason why this happens; it’s probably baggage from the fact that Mercurial bookmarks are not an integral part of the design, and in fact began as an extension that was merged into the core in v1.8.

To work around this problem in git-remote-hg, I wrote my own simplified version of the push() method that skips the checks for new heads, because in Git there cannot be more than one head per branch. The code still checks that the remote commit of the branch is an ancestor of the new one; if it’s not, you need to do ‘git push --force’, just like in Git. Essentially, you get exactly the same behavior as Git branches, with Mercurial bookmarks.

Fixing Git

All right, I’m done trying to avoid judgment, but to try to be fair, I’ll start by mentioning the one (and only one) feature that Git lacks in comparison to Mercurial: finding the branch point of a branch, that is, the point where a branch was created (or rebased onto). It is trivial to figure that out visually, and there are scripts that do a pretty good job of finding it from the topology of the repository, but there are always corner cases where this doesn’t work. For more details on the problem and proposed solutions, check the stackoverflow question.

Personally I’ve never needed this, but if you absolutely do, it’s easy to patch Git; I wrote a few patches that implement it:

https://github.com/felipec/git/commits/fc/base

These implement the @{tail} notation, which is similar to the official @{upstream} notation, so you can do something like “development@{tail}”, which points to the first commit the “development” branch was created from.
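
For example, with those patches applied you could inspect or rebase only the commits that belong to the branch (a sketch; @{tail} only exists in my fork):

git log --oneline development@{tail}..development
git rebase --onto master development@{tail} development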

If this was really needed, the patches could be merged into upstream Git, but really, it’s not.

Fixing Mercurial

On the other hand fixing Mercurial wouldn’t be that easy:

  1. Support remote ‘hg strip’. Just like Git can easily delete remote commits, Mercurial should be able to.
  2. Support remote namespaces for bookmarks. Being able to see where “sarah/development” points is an invaluable feature.
  3. Improve bookmark creation, so the user doesn’t need to force the push depending on the circumstances.

Thanks to git-remote-hg, you can solve 2) and 3) by using Git to work with Mercurial repositories; unfortunately, there’s nothing anybody can do about 1), as it’s something that has to be fixed in Mercurial’s core.

Conclusion

I often hear people say that what you can achieve with Git you can achieve with Mercurial, and vice versa, and that at the end of the day it’s a matter of preference, but that’s not true. Hopefully after reading this blog post, you are able to distinguish what can and cannot be done with each tool.

And again, as usual, all comments are welcome, so if you see a mistake in the article, by all means point it out.

Cheers.

What’s new in Git v1.8.4 Mercurial bridge

Git v1.8.4 has been released, and git-remote-hg received a lot of good updates; here’s a summary of them:

Precise branch tracking

Git is able to find out and report whether a branch is new, whether it needs to be updated, whether it can be fast-forwarded or not, etc.

   b3f6f3a..c0d1c89  master -> master
 * [new branch]      new -> new
 ! [rejected]        bad -> bad (non-fast-forward)
 ! [rejected]        updated -> updated (fetch first)

Unfortunately, Mercurial’s code doesn’t make this easy (you can’t just push new bookmarks), but this has been worked around by writing a custom push() method.

In addition, if you use my patched version of Git (here), you can also use --force and --dry-run when pushing.

+ 51c1c5f...0faf0ed bad -> bad (forced update)

In short, git-remote-hg now makes interacting with Mercurial repositories exactly the same as with Git ones, except for deleting remote branches (Mercurial just cannot do that).

Shared repository

One of the most useful features of Git (and one that Mercurial doesn’t have) is remote namespaces, so you can easily track “max/development”, “sarah/development”, etc. However, to properly track multiple Mercurial repositories, git-remote-hg needs to create a clone of the Mercurial repo, and if the repository is a big one, having multiple unrelated clones wastes a lot of space.

The solution is to use the Mercurial share extension, which is not really an extension, as it’s part of the core (though it can only be used by activating the extension). With it, you can add as many Mercurial remotes as you want, and they will all share the same object store.

Use SHA-1’s to identify revisions

Previously, Mercurial revisions were stored as revision numbers (e.g. the tenth commit was stored as 10), which means that if history was rewritten there was no way to tell that the revision changed, so the Git commit wouldn’t change either (as it’s cached).

By using SHA-1’s, Mercurial revisions are always tracked properly.

Properly update bookmarks

Previously, Mercurial bookmarks were only fetched once; this is now fixed so they are always updated.

All extensions are loaded

This way, any extensions the user has configured will affect git-remote-hg, for example the keyring extension.

Make sure history rewrites get updated

Before, Git would complain that a non-fast-forward update happened; not any more.

Always point HEAD to “default”

Mercurial properly reports which branch and bookmark are current, but only for local repositories. To get rid of the mismatch, we always track “default”.

Don’t force bookmark updates

We were inadvertently forcing the update of bookmarks, effectively overriding the previous one even if the update was not fast-forward.

Use Git author for lightweight tags

Unannotated tags don’t have an author in Git, but one is needed for Mercurial, so instead of providing an empty author, we use the one configured for Git.

Fix replacing a file with a directory

What’s next

There are a few features missing, and they might not land in upstream Git any more, but:

Support for revision notes

This feature allows showing the Mercurial revision as Git notes:

commit 6c88a31540012991de3add247a958fd83531256f
Author: Felipe Contreras 
Date:   Fri Aug 23 13:00:30 2013 -0500

    Test

Notes (hg):
    e392886b34c2498185eab4301fd0e30a888b5335

If you want to have the latest fixes and features, you need to use my personal repository:

https://github.com/felipec/git

Unfortunately, you not only need the Python script, but to compile Git itself to get all the benefits (like push --force and --dry-run).

Wiki

Also, there’s more information and detailed instructions about how to install and configure this remote helper:

https://github.com/felipec/git/wiki/git-remote-hg

I’m now quite confident that git-remote-hg is by far the best bridge between Git and Mercurial, and here’s a comparison between it and other projects.

Enjoy :)

What it takes to improve Git or: How I fixed zsh completion

I’ve used Git since pretty much day one, and I use it all the time, so it’s important to me that typing Git commands is quick and efficient. I use zsh, which I believe is way superior to bash; unfortunately, I found many issues with its Git completion.

In this blog post I will try to guide you through the ordeal, from how I identified the problem to how I ended up fixing it, years later, for everyone’s benefit.

The issue

I work on the Linux (kernel) source tree from time to time, and I noticed that sometimes completion took a long, looong time. Specifically, I found that typing ‘git show v’ took several seconds to complete.

I decided to bring the issue up with the zsh developers, and it caused a lot of fuss. I won’t go into every detail of the discussion, but long story short: they were not going to fix the issue because of their uncompromising principles; correctness over functionality, even if very few people use that correctness, and the functionality is almost completely broken, to the point where the completion is not usable in certain cases. I argued that completion is meant to make typing commands more efficient, and if completing a command takes longer than typing it manually would, the completion is failing its purpose. I thought any sane person would see the problem with that, but apparently I was wrong (or was I?).

Fortunately zsh has bash completion emulation, so it’s possible to use Git’s official bash completion in zsh. You lose some of the features of zsh completion, but it works very efficiently (‘git show v’ was instantaneous).
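
In case you want to try that route, the setup is roughly this; a sketch assuming you have downloaded Git’s git-completion.bash to your home directory:

# in ~/.zshrc
autoload -U +X compinit && compinit
autoload -U +X bashcompinit && bashcompinit
source ~/git-completion.bash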

Unfortunately, zsh’s bash emulation and zsh’s bash completion emulation (two different things) are not perfect, so some workarounds were needed in Git’s bash completion script, and those workarounds were not working properly by the time I started to use said completion; that’s when my involvement began.

Fixing the bridge

Each time I found a bug, I tried to fix it in Git (patch), and made sure the zsh folks fixed it on their side too (commit), so that eventually no workarounds would be needed and everything would work correctly.

The completion worked for the most part, but with workarounds, and not quite as well as bash’s. So I decided to fix zsh’s bash completion emulation once and for all. After my patches were applied by the zsh developers, Git’s official completion worked much closer to how it did in bash, but there were still minor issues.

Moreover, Git’s bash completion was constantly changing, and it was only a matter of time before some change broke zsh’s completion, so I decided to get involved, understand the code, and simplify it to minimize that possibility (e.g. d79f81a, 583e4d5). I saw a lot of areas of improvement, but in order to make sure nothing got broken in the process of simplification, I thought it would make sense to have some tests (5c293a6). Git’s testing framework is one of the most powerful and simple there is, so it was a pleasure to write those tests. Eventually the completion tests were good enough that I became confident in changing a lot of the completion code.

At the same time I realized most of zsh’s bash completion emulation code was not needed at all, so I wrote a very small version of it that only worked with Git’s completion. The result was very simple, and it worked perfectly, yet it could have been even simpler, if only I could simplify Git’s completion even more.

The culmination of that work was the creation of __git_complete (6b179ad), a helper that has nothing to do with zsh, but which solved a long-standing problem with Git completion and aliases. It’s not worth going into the details of what the problem was, or why it received so much push-back from Git developers (mostly because of naming issues); what is important is that I implemented it with a wrapper function, a wrapper function that was *exactly* what my simple zsh completion wrapper needed.

Now that everything was in place, the final wrapper script ended up very small and simple (c940786); it didn’t have any of the bugs zsh’s bash completion emulation had, and it was under the full control of the Git project, so it could be improved later on.

Finally, I had Git completion in zsh that worked *perfectly*; it worked exactly the same as it did in bash. But that was not enough.

Now that Git completion worked just like in bash, it was time to implement some extras. zsh completion is extremely powerful, and does things bash cannot even dream of doing, and with my custom wrapper it was possible to have the best of both worlds, and that’s exactly what I decided to do (4911589).

[Screenshot: Git completion in zsh]

Finally

So there it is: after years of work, several hundred emails, and tons of patches through different iterations… Git now has nice zsh completion that not only works as efficiently as bash’s, without any differences, but in fact has even more features.

If you want to give it a try, just follow the instructions: contrib/completion/git-completion.zsh
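
Roughly, the installation goes like this (the paths are just a convention; the script’s header has the authoritative instructions):

mkdir -p ~/.zsh
cp contrib/completion/git-completion.zsh ~/.zsh/_git
# then in ~/.zshrc, before compinit is called:
fpath=(~/.zsh $fpath)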

;)

Felipe Contreras (54):
      git-completion: fix regression in zsh support
      git-completion: workaround zsh COMPREPLY bug
      completion: work around zsh option propagation bug
      completion: use ls -1 instead of rolling a loop to do that ourselves
      completion: simplify __gitcomp and __gitcomp_nl implementations
      tests: add initial bash completion tests
      completion: simplify __gitcomp_1
      completion: simplify by using $prev
      completion: add missing general options
      completion: simplify __git_complete_revlist_file
      completion: add new __git_complete helper
      completion: rename internal helpers _git and _gitk
      completion: add support for backwards compatibility
      completion: remove executable mode
      completion: split __git_ps1 into a separate script
      completion: fix shell expansion of items
      completion: add format-patch options to send-email
      completion: add comment for test_completion()
      completion: standardize final space marker in tests
      completion: simplify tests using test_completion_long()
      completion: consolidate test_completion*() tests
      completion: refactor __gitcomp related tests
      completion: simplify __gitcomp() test helper
      completion: add new zsh completion
      completion: start moving to the new zsh completion
      completion: fix warning for zsh
      completion: add more cherry-pick options
      completion: trivial test improvement
      completion: get rid of empty COMPREPLY assignments
      completion: add new __gitcompadd helper
      completion: add __gitcomp_nl tests
      completion: get rid of compgen
      completion: inline __gitcomp_1 to its sole callsite
      completion: small optimization
      prompt: fix untracked files for zsh
      completion: add file completion tests
      completion: document tilde expansion failure in tests
      completion; remove unuseful comments
      completion: use __gitcompadd for __gitcomp_file
      completion: refactor diff_index wrappers
      completion: refactor __git_complete_index_file()
      completion: add hack to enable file mode in bash < 4
      completion: add space after completed filename
      completion: remove __git_index_file_list_filter()
      completion: add missing format-patch options
      complete: zsh: trivial simplification
      complete: zsh: use zsh completion for the main cmd
      completion: zsh: don't override suffix on _detault
      completion: cleanup zsh wrapper
      completion: synchronize zsh wrapper
      completion: regression fix for zsh
      prompt: fix for simple rebase
      completion: zsh: improve bash script loading
      completion: avoid ls-remote in certain scenarios

The problem with GNOME 3

Since I started using Linux I have used GNOME, v1.2 in those days. It has always done what I needed, maybe not perfectly, and not fully, but for the most part. GNOME 3 changed all that.

I have complained about GNOME 3 since day one; I discussed with GNOME 3 developers the many problems with their rationale for why what they were doing made sense, and I foresaw many of the problems they are now facing.

I even started the GNOME user survey, in an effort to make GNOME developers see the light.

I blogged about GNOME 3 once before, but only superficially, and even then GNOME developers didn’t take it well. Some GNOME developers might be tired of my arguments on Google+ and other media, but it’s time to dive into the issues with GNOME 3 on this blog.

Two years after the first release of GNOME 3, now that GNOME developers have had the time to polish the rough edges, it’s time to make the call: GNOME 3 sucks. Here’s why.

The purpose of GNOME

Before we start, we need to clarify the purpose of GNOME, or rather, the purpose of any software project in general, which is to be useful to its users. Your software might be the most efficient at doing a certain task, it might have the simplest code and the best development practices, but it’s all for naught if it’s not useful to anybody.

Software has to be useful to other people, otherwise nobody will use it, and nobody will contribute to it. Quite likely there will always be other software that does something similar to yours, so you need to convince users that your software is better than the alternatives, usually by providing a good user experience.

Once users have spent time analyzing the alternatives and have chosen your software, they expect it to keep working, and in the same way. If the software keeps changing how it behaves from one version to the next, it’s not achieving its main goal: to be useful to the users. If you do this, your users will move on to another project, one that might not have as many features, but that they can at least rely on without spending more of their valuable time learning what broke in the new release.

More important than providing a good user experience is not breaking existing user expectations.

You would think this is as obvious as a leopard’s spots, yet many software projects, including GNOME, don’t seem to understand it.

No project is more important than the users of the project. — Linus Torvalds

What users want

The problems I’ll talk about are not my invention; they are confirmed by the GNOME user survey, which shows the most important issues according to the users:

  1. Lack of configuration options
  2. Developers don’t listen to users
  3. Bring back traditional interface

I knew these were important issues before the survey, but a survey was needed in order to know for sure. Of course, even after the survey, GNOME developers reject the idea that these issues are important to their users.

Developer attitude

In my opinion the main problem with GNOME is not the code; code can be fixed. It’s the attitude of the developers (which is reflected in the code), as the users made clear in the survey.

Many open source projects would kill to have the user-base GNOME had, and would welcome their input with open arms, but GNOME neglected its users, thought they were irrelevant, and tried to dismiss their complaints with typical defenses, which of course don’t make sense.

People hate new things

By far the most widely used defense of the changes in GNOME 3 is that some people are backwards and always hate new things, so it’s no surprise that they hate GNOME 3.

Any rational person should see the problem with this claim, but unfortunately I must spell it out, because I had to explain it at great length to GNOME developers, and based on the responses, one would think they are incapable of seeing hasty generalization fallacies.

Say 50% of users hate GNOME 3; how many of them have legitimate reasons, and how many hate it just because they always hate new things? It’s impossible to know. So dismissing every complaint by saying “people hate new things” is totally stupid; you cannot know beforehand whether someone belongs to that group or not. You first need to hear their complaints, and then decide whether the complaints are valid or not.

Saying “Jon doesn’t have legitimate complaints” makes sense (after analyzing his complaints); saying “people hate new things” does not, and GNOME developers do the latter when people complain about GNOME 3.

Users don’t know what they want

Really? People don’t know whether they want a punch in the face or not? Some people might not know what they want, but they definitely know what they don’t want.

Moreover, that’s a very rudimentary notion of user choice. Malcolm Gladwell explains rather well how it’s true that people don’t know what they want, or rather, they can’t tell you what they want; but when they get a taste of it, they most definitely can tell you: this is what I want.

If it wasn’t clear by now; GNOME 3 is not what a lot of people wanted.

People complained about GNOME 2 too

Did this actually happen? If you google GNOME 2 for results between 2002 and 2004, you will find barely any negative comments (if any at all), so if it did happen, the backlash probably wasn’t that bad, and it probably died down rather quickly. Check the thread on Slashdot and try to find one comment suggesting going back to GNOME 1; I couldn’t find any.

Did anybody create forks after GNOME 2 was released, the way people forked after GNOME 3? No.

Yes, people complained about GNOME 2, but not nearly as much as they do about GNOME 3. There’s even an entire Wikipedia article dedicated to the controversy over GNOME 3.

 

Sorry, GNOME developers: none of your excuses for not listening to your users make sense. You should always listen to your users. Without them your project is nothing.

Configuration

The biggest complaint users had was the lack of configuration options, which could have been solved very easily if the developers had more than two neurons in their brains; instead, there are only sub-par alternatives.

Advanced mode

GNOME 3 got rid of a bunch of useful configurations users relied on, and in their defense the developers said you could use gsettings manually, or install (or write) an extension, and so on. I argued that there should be an “advanced mode” that the user can optionally turn on in order to further configure the system the way they want.

GNOME developers argued that this was a bad idea because more code would need to be maintained, which is absolutely not true for gsettings; the configurations are already there in the code, the only thing missing is the UI. Then they argued that it would mess up the configuration UI, which they wanted to keep clean, but adding a single check box can’t possibly mess anything up. In the end, I never received a single satisfactory answer as to why an advanced mode didn’t make sense.

Tweak tool

The tweak tool is the closest thing to a sane configuration interface; however, it should be cleaned up, distributed by default, and integrated into the GNOME control center through an “Advanced mode” checkbox that, when switched on, enables these settings, which should be distributed among the different “capplets”.

Even then, the number of configuration options available is not enough.

Shell extensions

Finally, there are shell extensions, which sound like a great idea at first, but if you think about it for thirty seconds, you realize the problem: there’s no API for extensions. This means extensions bring many problems.

First, there’s no guarantee that an extension will work in the next version of GNOME, because it modifies the JavaScript code directly, and there’s no guarantee that the code will remain the same. There’s no extension API like with Chrome or Firefox. The result is not unexpected: extensions break very easily from one version of GNOME to the next.

The second issue is that extensions can easily break GNOME, because they can modify absolutely anything in the shell.

gsettings

Another option is to manually change the configurations through the command line. This is obviously not user-friendly, and not ideal either.

 

GNOME developers made the mistake of not allowing configurability directly in GNOME, and as a result the much-needed configurations are second-class citizens that, no matter what you do, don’t work as expected. The problem is very easy to solve: the configurations should be integrated into GNOME and activated by a single switch, so the normal configurations are not disturbed. But talking sense to these guys is next to impossible.

Classic mode

Finally, after years of backlash from the users, GNOME developers had to implement the classic mode, so users would be able to run their beloved GNOME 2 interface on GNOME 3 technology.

Unfortunately, GNOME classic is more of a hack than anything else; it’s full of bugs and inconsistencies. Sure, this mode is still in its infancy, and will probably improve in the next releases, but it’s safe to bet that it will never be as polished and usable as GNOME 2 was.

Sorry GNOME devs; too little, too late.

Suckage

I am not going to list every single reason why GNOME 3 sucks; I would never finish this post, and plenty of people have already added their share (I’ll list some of them).

Personally, for me the most obvious way to see that GNOME 3’s defaults are brain-dead is the way alt+tab works. The whole purpose of work-spaces is to separate work: instead of having all your browser windows, terminals, editors, email client, and so on on the same work-space, you split them among different ones. Even their own design references for GNOME shell say so. So making alt+tab cycle through the windows in all work-spaces, instead of only the current one, is idiotic; it undermines the whole purpose of work-spaces.

Instead of accepting this obvious fact (which many users complained about), they disregard any feedback and say “oh, you can use an extension for that”. Never mind that a huge percentage of their users need to install an extension just to get sane behavior, and that the extension can break at any moment.

Another obviously stupid behavior is switching to an already running “application” instead of opening a new instance. For example, if I press the Windows key and type chromium, I expect a new browser window to open; but no, instead I’m unexpectedly dragged to another work-space where there’s already a browser window. This makes sense for single-window applications, but not for the rest.

Linus Torvalds and others complained about this, and I proposed a solution: add a menu item when right-clicking on the application that says “Always open new window”; problem solved. Now each time the user tries to open this application, it’s done in a new window. But no; users don’t know what they want, and GNOME developers know better what’s best for us.

The list of stupidities in GNOME 3 is never ending:

[Image: the list of stupidities in GNOME 3]

The Linux way

What GNOME should have done is simple: don’t ever, ever break user experience. This is how the Linux project has managed to become the most successful software project in history. There are no forks of the Linux kernel simply because there’s no need; user experience is never broken: what works in v2.0 works in v3.0, and will work in v4.0. New features are added carefully, making sure nothing breaks, and users always have a way to make the system work the way they expect it to work.

GNOME shell should have been developed alongside GNOME 2, and users should have been given the option to turn this mode on and off. Slowly, GNOME shell would gain more features from GNOME 2 until it was a full replacement, and at the same time GNOME 2 would evolve to integrate GNOME 3 technologies (like GTK+ 3.0). Eventually GNOME shell would become the default, but always with the option to use the classic interface. After some cycles, the GNOME shell interface would be bound to be a full replacement for the classic one.

Unfortunately, they chose not to do this.

Going this way requires more effort, of course, but it’s the only way to move forward without losing developers and users along the way. GNOME developers argued that they didn’t have the resources for such a careful move, but that is clearly false, as the developers working on MATE, Cinnamon, and GNOME classic show; there are developers interested in making things work for the traditional mode.

They chose not to listen to the warnings; they became impatient and went forward with a move that had the potential of angering a lot of users. Well, that potential was fully realized, and now they are paying the consequences.

They were wrong

They argued this was like GNOME 2: the people complaining about it would eventually learn the new ways, and most would be happy. They were wrong.

Even to this day people keep complaining about GNOME 3: how the interface doesn’t make sense, how the developers don’t listen, and how the design is brain-dead.

There’s no other way to put it; GNOME 3 was a mistake.

Bridge support in git for mercurial and bazaar

I’ve been involved in the git project for some time now, and through the years I’ve had to work with other DSCMs, like mercurial and monotone. Not really as a user, but investigating their inner workings, mostly for conversion purposes, and I’ve written tools that nobody used, like mtn2git and pidgin-git-import, but eventually I decided: why not write something that everybody can use?

So, I present to you git-remote-hg, and git-remote-bzr. What do they do?

git clone "hg::http://selenic.com/repo/hello"
git clone "bzr::lp:gnuhello"

That’s right, you can now clone both mercurial and bazaar repositories with little to no effort.

The original repository will be tracked like any git repository:

% git remote show origin
* remote origin
  Fetch URL: hg::http://selenic.com/repo/hello
  Push  URL: hg::http://selenic.com/repo/hello
  HEAD branch: master
  Remote branches:
    branches/default tracked
    master           tracked
  Local branch configured for 'git pull':
    master merges with remote master
  Local ref configured for 'git push':
    master pushes to master (create)

You can pull, and in fact push as well. Actually, you wouldn’t even notice that this remote is not a git repository. You can create and push new branches… everything.

You might be thinking that code this new might have many issues, and that you might screw up a mercurial repository. While that is true to some extent, there’s already a project that tries to act as a bridge between mercurial and git: hg-git. This project has a long history, and a lot of people use it successfully. So I took their test framework, cleaned it up, and used it to test my git-remote-hg code. Eventually all the tests passed, and I cloned big mercurial repositories without any issue, and the results were exact replicas; I know that because the SHA-1’s were the same :)

So you can even keep working on top of clones you have created with hg-git, and switch back and forth between git-remote-hg and hg-git as you wish. The advantage over hg-git is that you don’t need to use the mercurial interface at all; you can use git :)

To enable the hg-git compatibility mode you need this:

% git config --global remote-hg.hg-git-compat true

What about the others?

Surely I must not be the first one to try something this cool, right? I don’t want to dwell on all the details of why none of the other tools suited my needs, but let’s take a quick look:

hg-git

It’s for mercurial, not git, so you need to use it through the mercurial UI.

fast-export

This is a very nice tool, but it only allows you to export stuff (fetch an hg repo), and you have to manually run the ‘git fast-import’ command, set up the marks, and so on. Also, it’s not very well maintained.

Another git-remote-hg

It needs hg-git.

Yet another git-remote-hg

It needs quite a lot of patches on top of git, so your vanilla version of git is not going to cut it. In addition, it doesn’t support as many features: no tags, no bookmarks. Plus, it fails all my extensive tests for git-remote-hg.

git-hg

It needs separate tools, such as ‘hg convert’ and fast-export, and it doesn’t use the remote helper infrastructure, so it’s not transparent: you have to run git-hg clone, git-hg fetch, and so on.

Another git-hg

It needs hg-git, and doesn’t use git’s remote helper infrastructure either: you have to do git hg-clone, git hg-pull, etc.

There might be others, but you get the point.

The situation in bazaar is not much better, but I’m not going to bother listing the tools.

More coolness

In fact, you can push and pull from and to mercurial and bazaar :)

% git remote -v
bzr	bzr::lp:~felipec/+junk/test (fetch)
bzr	bzr::lp:~felipec/+junk/test (push)
gh	gh:felipec/test.git (fetch)
gh	gh:felipec/test.git (push)
hg	hg::bb+ssh://felipec/test (fetch)
hg	hg::bb+ssh://felipec/test (push)

Isn’t that cool?

So, are there any drawbacks? Surely some information must be lost.

Nope, not really. At least not when using the hg-git compat mode, because all the extra information is saved in the commit messages.

Update: I forgot to mention the only bidirectionality problem: octopus merges. In git a merge can have any number of parent commits, but in mercurial only two are allowed, so you might have problems converting from a git repository to a mercurial one. It’s the only feature hg-git has that git-remote-hg doesn’t. As for bazaar, it also allows multiple parents, so it should work fine, but this hasn’t been thoroughly tested.

The only consideration is that mercurial branches are quite different from git branches, so you need to be really careful to get the behavior you want, which is why it’s best to just ignore mercurial branches and use bookmarks instead. When you create git branches and push them to mercurial, bookmarks will be created on whatever branch is current.

Trying it out

Just copy the files (git-remote-hg, git-remote-bzr) to any location available in your $PATH (e.g. ~/bin), and start cloning :)
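
Something like this, assuming you have already downloaded both scripts to the current directory:

% mkdir -p ~/bin
% cp git-remote-hg git-remote-bzr ~/bin
% chmod +x ~/bin/git-remote-hg ~/bin/git-remote-bzr
% export PATH="$HOME/bin:$PATH"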

Soon they might be merged into upstream git, so they would be distributed like other contrib scripts (e.g. /usr/share/git), and eventually possibly be enabled by default. Update: they were distributed as contrib scripts, and they were going to be enabled by default, until the maintainer decided not to, for no reason at all.

Enjoy!