The white and gold dress, and the illusion of free will

At first I didn’t really understand what all the fuss was about: the dress was obviously white and gold, and everybody who saw it any other way was wrong, end of story. Then I saw an article in IFLScience that explained why this might be an optical illusion, but I still thought I was seeing it right and the other people were the ones getting it wrong. Then I saw the original dress:

Original dress

#TheDress

Well, maybe it was a different version of the dress, or maybe the colors were washed out, or maybe it was a weird camera filter, or a bug in the lens. Sure, anything is possible, but maybe I was just seeing it wrong.

I’ve read and heard a lot about cognitive science, and the more we learn about the brain, the more faults we find in it. We don’t see the world as it is; we see the world in the way that is useful for us. In fact, we cannot see the world as it is, in atoms and quarks, because we don’t even fully understand it yet. We see the world in ways that managed to get us where we are. We sometimes get an irrational fear of the dark and run quickly up the stairs in our safe home, even though we know there can’t possibly be any tigers chasing us. In the past it was better to be safe than sorry, and the ones who didn’t have that fear gene are not with us any more; they got a Darwin award.

I know what some people might be thinking; my brain is not faulty! I see the world as it truly is! Well, sorry to burst your bubble, but you don’t. Optical illusions are a perfect example, and here is one:

Optical illusion

If you are human, you will see the orange spot at the top as darker than the one at the bottom. Why? Because your brain assumes the one at the bottom is in a shadow, where everything should look darker, and corrects for it. However, they are exactly the same color (#d18600 in hex notation). Remove the context and you’ll see that; put the context back and you can’t see them as the same, you just can’t. All of us humans have the same “fault”.

This phenomenon can be explained by the theory of color constancy, and these faults are not limited to our eyes; they also affect our ears, and even our rational thinking.

So, could the white and gold vs. blue and black debate be an example of this? The argument is that the people who see the dress as white and gold perceive it to be in a shadow behind a brightly lit part of a room, while the people who see it as blue and black see it washed in bright light. Some people say they can see it both ways: sometimes white, sometimes blue.

XKCD

I really did try not to see it in a shadow, but I just couldn’t, even after I looked at modified photos; I just saw a white and gold dress with a lot of contrast. I decided they were all wrong: no amount of lighting would turn a royal blue dress white.

But then I fired up GIMP (an open source alternative to Photoshop) and played around with filters. Eventually I found what did the trick for me, and here you can see the progress:

So eventually I managed to see it. Does that mean I was wrong? Well, yes, my brain saw something that wasn’t there. However, it happened for a reason: if the context had been different, what my brain saw would have been correct. Perhaps in a parallel universe there’s a photo that looks exactly the same, but the dress was actually white and gold.

At the end of the day our eyes are the windows through which we see reality, and they are imperfect, just like our brains. We can be one hundred percent sure that what we are seeing is actually there, that what we remember is what happened, and that we are being rational in a discussion. Sadly one can be one hundred percent sure of something, and still be wrong.

To me the most perfect example is the illusion that we are in control of our lives. The more science finds out about the brain, the more we realize how little we know of what actually happens in the 1.5 kg meatloaf between our ears. You are not in control of your next thought any more than you are of my next thought, and when people try to explain their decisions, their reasons are usually wrong. Minds can be easily manipulated, and we rarely realize it.

There’s a lot of interesting stuff on the Internet about the subconscious and how the brain really works (as far as we know). Here’s one talk that I find particularly interesting.

So, if you want to believe you are the master of your own will, go ahead, you can also believe the dress was white and gold. Those are illusions, regardless of how useful they might be. Reality, however, is different.

My favorite public intellectuals

Here’s a selection of my favorite public intellectuals. I love how these guys talk, write, and generally everything they do. Might be worth checking them out :)

Sam Harris

Sam Harris is an author, philosopher, and neuroscientist. Among his most notable books are The End of Faith, and The Moral Landscape. He has a blog, is on Twitter, appears on many TV shows as a guest, has taken part in many debates as well as lengthy talks, and has written numerous articles in respectable publications such as The New York Times.

His topics mostly concentrate around religion, faith, morality, and science.

What I like about Sam Harris the most is the way he conveys very complex and nuanced ideas in a very effective way. He is very precise with words and has the patience to go on for ages in order to explain his ideas, but also, he is very witty and can deliver crushingly funny one-liners.

@samharrisorg

In the following video Harris is in a debate with a religious apologist and shows, with a very funny train of thought, the ridiculousness of believing in things without evidence.

This is a quick talk at TED in which he explains how science can answer moral questions, which is the main idea behind The Moral Landscape.

Finally, my favorite talk, in which he basically destroys the idea of free will. Every minute in this hour long talk is pure gold.

Steven Pinker

Steven Pinker is an experimental psychologist, cognitive scientist, linguist, and popular science author. He is best known for his advocacy of evolutionary psychology, and the computational theory of mind.

Being an expert on language, the way he communicates in every medium is simply superb. Aside from linguistics, he goes into other topics, such as the history of violence, religion, and reason.

@sapinker

Here Pinker explains why taboos are bad, and political correctness can be dangerous.

This is a quick video where Pinker explains the importance of language in order to understand human nature.

Here’s a much longer version in which he goes into a lot of detail to explain language, and what we know about it.

Noam Chomsky

Noam Chomsky should need no introduction. He is a linguist, philosopher, cognitive scientist, logician, political commentator, and anarcho-syndicalist activist. He has hundreds of books and countless articles, has been in many debates, and gives constant talks all around the globe. In fact, he has done so many things in his life that there is even a documentary devoted to him: Noam Chomsky: Rebel Without a Pause. Not content with defining the whole field of modern linguistics at an early age, he devoted his life to political activism, even risking the well-being of his own family. Today he is considered the most influential living intellectual, and the most cited author alive, right after Plato. Even at his advanced age and after losing his wife of almost 60 years, he continues to tirelessly inform the public about what happens in the world, and as he has stated before, he will continue to do so as long as he is ambulatory.

Chomsky might not be the most entertaining public speaker, but what he lacks in charisma he makes up for in content. He is basically a human encyclopedia, and he rarely states his opinion; everything he says is basically facts gathered from one place or another, and for every fact he states, he knows the reference where you can verify it.

It’s hard to find a short video that shows Chomsky’s brilliance, but this interview seems to do the job perfectly. Watch this interviewer get completely owned by Chomsky. Don’t forget part two.


Manufacturing Consent is one of Chomsky’s most powerful ideas, and if you are not in the mood for reading the book, this documentary explains the idea very well. It’s long, but you won’t regret watching it.

Sorry Lennart, but you are wrong once again

Lennart Poettering’s post on G+ is gathering a lot of attention these days. Most of the feedback is supportive and positive, which is not surprising to me, because although Poettering would like us to believe otherwise, most of the open source community is pretty accommodating and non-confrontational.

I am however going to go against the current here, and criticize him, but first let me state clearly that I do not condone any physical attacks towards his person, or the threats of such. His ideas however are a different matter.

Lennart’s chief mistake is to attack the way the Linux kernel community is run, and to say its success happens despite this. How does he know? Has he ever run a more successful community? Has anybody ever? Linux is the most successful software project in history, by more than one order of magnitude, any way you look at it. It would be presumptuous for anybody to say they know how to run this project better, especially without any evidence to back such a claim, which is precisely what Poettering is doing.

In this blog I’ve analyzed the many reasons why the Linux kernel is so successful, and one of them is its combative style of discussion in which ideas are not exempt from ridicule, and strong language is often used to drive one’s point home as efficiently as possible. Many people in the community agree this is desirable, and there’s even scientific evidence that supports this notion; the best ideas arise in a confrontational environment, not in a protective one.

What’s more, Poettering himself accepts he hasn’t been involved in this community. So what the hell does he know about it? Nothing.

Poettering’s second mistake is to assume that for non-white, non-western, non-straight people the situation surely must be worse… That is not the case. Maybe, just maybe, he receives such vitriolic feedback not just because of what he does, but because of the horrible way he does it. Of course not, Poettering doesn’t need to change, his approach is perfect, in fact, the only reason he receives criticism is because he is too progressive, too audacious, too efficient, surely, that must be the reason!

Personally, my beef with Poettering starts from the fact that he blocked me on Google+. Why? Because I was complaining about a technical issue with systemd, which he initially spotted and commented on, but then ignored. In the middle of the discussion I made some value judgements about certain systemd code, and he stopped responding and blocked me. That is the worst way to end a discussion: block the people who disagree with you.

Sorry Lennart, but actions have consequences, and you can only make so many disruptive changes to the Linux ecosystem without much care or consideration for others; there’s a limit to the number of people you can block, and the criticism you can ignore. You can grow as thick a skin as you want, you are still wrong. No community is going to let you continue being wrong and acting as if you are beyond reproach just like that (unless you run that community and have blocked any dissident voices of course).

Maybe it’s time to take a hard look in the mirror.

What’s missing in Git v2.0.0

I recently blogged about the Git v2.0.0 release: what changed, and why you should care. Unfortunately the conclusion was that nothing much changed (other than the usual new features and bug fixes). In this post I will discuss what should have changed, and why.

What is needed

Fortunately, Git has had the Git User’s Survey in the past, so we know what users want.

  1. user-interface: 3.25
  2. documentation: 3.22
  3. tools (e.g. GUI): 3.01
  4. more features: 2.41
  5. portability: 2.34
  6. performance: 2.28
  7. community (mailing list): 1.70
  8. localization (translation): 1.65
  9. community (IRC): 1.65

Obviously, since user-interface and documentation are the areas that need the most improvement, that’s what Git v2.0.0 should have focused on, right?

History

I already mentioned this in the other post, but I’ll do it again.

First of all, Git has a long history of never breaking user expectations, other than the Git v1.6.0 fiasco (which replaced all the git-foo commands with ‘git foo’), and as such a lot of thought is devoted to ways to minimize changes in behavior, or even to avoid them completely. Perhaps too much care is devoted to this.

The preparation for Git v2.0.0 started more than three years ago with a mail from Junio C Hamano, asking for developers to submit ideas for changes that normally would not happen because they break backwards compatibility; he invited us to think as if “we were writing Git from scratch”. This big release that would break backwards compatibility was going to be named “1.8.0”, and people started to submit ideas for this important release. Eventually too much time passed, the versioning scheme changed, v1.8.0 was released, and the changes proposed for v1.8 slipped into what is now v2.0.

Since no substantial changes in behavior happened since v1.0, it would follow that v2.0 was an important release, and a good opportunity to gather all the ideas about what needs to change in Git. However, seemingly out of nowhere, without any discussion or even a warning, the maintainer tagged v2.0.0-rc0, and therefore all the features that were not already merged couldn’t be merged for v2.0.0.

Thus v2.0.0 was destined to have a small list of changes, and that’s how it remained.

What could have changed

The following is a list of things that I argued should be part of Git v2.0.0.

git update

I wrote a whole post about the issue, but basically, ‘git pull’ is broken for the most common use-case: update the current branch.

This is a known issue that has been discussed over and over, and everyone agrees that it is indeed an issue, and something needs to be done to fix it.

There have been different proposals, but by far the most comprehensive and simple is to add a new ‘git update’ command.

This way when you want to merge a pull request, you do ‘git pull’, and when you just want to update the current branch, you do ‘git update’, which by default would barf if there’s divergence between your local branch (e.g. ‘master’) and the remote one (e.g. ‘origin/master’), instead of doing a merge by default. This should substantially decrease the number of “evil merges”: merges that happened by mistake, usually made by somebody who is not familiar with Git.

The patches are relatively new, but the command is simple, so there isn’t much danger of screwing things up.
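The "barf on divergence" part can be approximated with stock Git by fetching and then allowing only a fast-forward merge. The following is just a sketch of that behavior, not the proposed ‘git update’ command; the repository names (‘central’, ‘work’) and file names are made up for the demonstration.

```shell
#!/bin/sh
# Sketch: refuse to update the current branch when it has diverged.
# Uses only stock Git; everything happens in a throwaway directory.
set -e
# Throwaway identity so commits work without any global configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
cd "$tmp"

# A "central" repository with one commit on master.
git init -q central
cd central
git symbolic-ref HEAD refs/heads/master
echo one > file
git add file
git commit -q -m "first"
cd ..

# A clone that is about to diverge from the central repository.
git clone -q central work

# One new commit upstream...
(cd central && echo two > upstream && git add upstream && git commit -q -m "upstream change")

# ...and a different new commit locally.
cd work
echo three > local
git add local
git commit -q -m "local change"

# Update the current branch, but only if that is a fast-forward.
git fetch -q origin
if git merge --ff-only origin/master >/dev/null 2>&1; then
    result=updated
else
    result=diverged
fi
echo "$result"
```

Here the update is refused because both sides have new commits, and it is then up to you to decide between a rebase and a merge.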

The publish tracking branch

I also wrote a blog post about this; basically Git’s support for triangular workflows is not the best.

A triangular workflow is when you pull from one location (e.g. central repo), and push to another (e.g. personal GitHub fork). If you are using upstream tracking branches (you should), you have to make a decision where you set your upstream: the central repo, or your personal one. Depending on which one you use, you get different advantages, but you cannot have it all.

But with the publish tracking branch you can have all the advantages.

I’ve been cooking these patches for a long, long time, and I have to say this is one essential feature for me, and the patches work perfectly.

Support for Mercurial and Bazaar

Support for Mercurial and Bazaar repositories has been cooking for a long time in the “contrib” area (you can both pull and push). At this point in time the code is production-ready, and it was already graduated and merged to be released in Git v2.1.

However, the maintainer suddenly changed his mind and decided it would be better to distribute them as third party tools. He didn’t give any valid reason and clearly didn’t think it through, but they are now separate.

The code is already widely used (git-remote-hg, git-remote-bzr), and could easily be merged.

Use “stage” instead of “index”

Everybody agrees that “index” is a horrible name for Git’s “staging area”, however, nobody has done much to fix the problem.

One first step is to replace all the --cached and --index options with --staged and --no-work, which are much simpler to understand.

Another step is to add a ‘git stage’ command that acts as a helper to work with the staging area: ‘git stage add’, ‘git stage diff’, ‘git stage reset’, ‘git stage rm’, ‘git stage edit’, and so on.

The patches are very straight-forward.
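To give an idea of what those helpers would wrap, here is a small sketch using only commands that stock Git already has (‘git diff’ accepts --staged as a synonym for --cached). The mapping in the comments is my own guess at the rough equivalents, not taken from the patches, and the repository is a throwaway one.

```shell
#!/bin/sh
# Sketch: stock Git commands that work on the staging area.
# The "roughly ..." comments are assumed equivalents of the proposed helpers.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
echo hello > README
git add README
git commit -q -m "initial"

echo more >> README
git add README                                 # roughly 'git stage add'
staged=$(git diff --staged --name-only)        # roughly 'git stage diff'
git reset -q -- README                         # roughly 'git stage reset'
after_reset=$(git diff --staged --name-only)
git rm -q --cached README                      # roughly 'git stage rm'
tracked=$(git ls-files)

echo "staged after add:    $staged"
echo "staged after reset:  $after_reset"
echo "tracked after rm:    $tracked"
```

Four different commands and two different option names for what is conceptually one thing, which is the inconsistency a single ‘git stage’ helper would hide.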

Default aliases

Virtually every version control system has default aliases (e.g. hg co, cvs ci, svn di, etc.), except Git.

Adding default aliases is very simple to do and only brings advantages. If you don’t like the default alias, you can override it.

Patches here.
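Until something like that exists, aliases have to be configured by hand. A minimal sketch, assuming the alias names ‘co’, ‘ci’ and ‘di’ borrowed from the other tools mentioned above; it sets them only inside a throwaway repository, whereas in real life you would probably pass --global.

```shell
#!/bin/sh
# Sketch: defining the usual short aliases yourself with 'git config'.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
echo hello > README
git add README
git commit -q -m "initial"

# Repository-local aliases; the names are only a convention.
git config alias.co checkout
git config alias.ci commit
git config alias.di diff

git co -q -b topic              # same as 'git checkout -q -b topic'
echo change >> README
git ci -q -a -m "second"        # same as 'git commit -q -a -m second'

branch=$(git symbolic-ref --short HEAD)
count=$(git rev-list --count HEAD)
echo "on $branch with $count commits"
```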

Shoulda coulda woulda

It would have been great if you could just do ‘git clone hg::mercurial-repo’ without installing anything extra, if everybody could start using ‘git update’ instead of ‘git pull’, if you could do ‘git stage diff’, or ‘git reset --stage’. Also, if triangular workflows were properly supported.

Unfortunately that’s not the case, and Git v2.0.0 is already released, and there isn’t much to be excited about.

You might think “perhaps for Git v3.0” (which could happen in two years, or ten, who knows), but if the past is any indication of the future, it won’t happen, especially since I’ve given up on all these patches.

The fact of the matter is that in every release of Git, there is only one focus: performance. Despite the fact that it’s #6 in the list of concerns of users, Git developers work on this because that’s their area of expertise, because it’s fun for them, and because they get paid to do so. There are occasional new features, and a bit of portability now and then, but for the most part Windows support is neglected in Git, which is why the msysgit project was born.

The documentation will always remain cryptic, because for the developers, it’s not cryptic, it’s very clear. And the user-interface will never change, because the developers don’t like change.

If you don’t believe me look at the backwards-incompatible changes in Git v2.0.0, or in fact, try to think back to the last time Git changed anything. Personally other than the git-foo -> ‘git foo’ change in v1.6.0 (which was horribly handled), I can’t think of anything but minor changes.

Anyway, you can use all these features I listed today (and more) if you use git-fc instead of Git. It is my own fork of Git that has all the features of Git, plus more.

Is there anything in that list that I missed? Do you think Git v2.0.0 has enough changes as it is?

Git v2.0.0, what changed, and why should you care

Git v2.0.0 is a backward-incompatible release, which means you should expect differences since the v1.x series.

Unless you’ve been following the Git mailing list closely, you probably don’t know the history behind the v2.0 release, which started a long time ago (more than three years). It all started with a mail from Junio C Hamano, asking for developers to submit ideas for changes that normally would not happen because they break backwards compatibility; he invited us to think as if “we were writing Git from scratch”. This big release that would break backwards compatibility was going to be named “1.8.0”, and people started to submit ideas for this important release. Eventually too much time passed, the versioning scheme changed, v1.8.0 was released, and the changes proposed for v1.8 slipped into what is now v2.0.

Parts of v2.0 have already been deployed one way or the other (for example if you have configured ‘push.default = simple’), but finally today we have v2.0 final. And here are the big changes that we got.

‘git push’ default has changed

Here’s what the release notes say:

When "git push [$there]" does not say what to push, we have used the
traditional "matching" semantics so far (all your branches were sent
to the remote as long as there already are branches of the same name
over there).  In Git 2.0, the default is now the "simple" semantics,
which pushes:

 - only the current branch to the branch with the same name, and only
   when the current branch is set to integrate with that remote
   branch, if you are pushing to the same remote as you fetch from; or

 - only the current branch to the branch with the same name, if you
   are pushing to a remote that is not where you usually fetch from.

You can use the configuration variable "push.default" to change
this.  If you are an old-timer who wants to keep using the
"matching" semantics, you can set the variable to "matching", for
example.  Read the documentation for other possibilities.

Is that clear? Given the bad track record of Git documentation it wouldn’t surprise me if you didn’t get what this chunk of text is trying to say at all. Personally I find it much easier to read the code to figure out what is happening.

So let me try to explain. When you type ‘git push’ (without any arguments), Git uses the configuration ‘push.default’ in order to find out what to push. Before ‘push.default’ defaulted to ‘matching’, and now it defaults to ‘simple’.

The ‘matching’ configuration essentially converts ‘git push’ into ‘git push origin :’, which means push all the matching branches, so if you have a local ‘master’, and there’s a remote ‘master’, ‘master’ is pushed; if you have a local and remote ‘fix-1’, ‘fix-1’ is pushed; if you have a local ‘ext-feature-1’, but there’s no matching remote branch, it’s not pushed, and so on.

The ‘simple’ configuration pushes a single branch instead, and it uses your configured upstream branch (see this post for a full explanation of the upstream branch), so if your current branch is ‘master’, and if ‘origin/master’ is the upstream of your ‘master’ branch, ‘git push’ will basically be the same as ‘git push origin master’, or to be more specific ‘git push origin master:master’ (the upstream branch can have a different name).

Note: If you are not familiar with the src:dst syntax; you can push a local branch ‘src’ and have the ‘dst’ name on the server, so you don’t need to rename a local branch, you can do ‘git push origin foobar:feature-a’, and your local branch “foobar” will be named “feature-a” on the server. This has nothing to do with v2.0.

However, if the current branch is ‘fix-1’ and the upstream is ‘origin/master’, ‘git push’ will complain that the name of the destination branch is not the same, because it doesn’t know whether to do ‘git push origin fix-1:master’ or ‘git push origin fix-1:fix-1’.

Additionally if you do ‘git push github’ (not the remote of your upstream branch), Git will simply use the name of the current branch, essentially ‘git push github fix-1’ (‘fix-1’ being the name of the current branch).

This mode is anything but simple to describe. But perhaps the name is OK, because you can expect it to “simply work”.
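If you want to see the difference for yourself, the following sketch sets up a throwaway bare repository with two branches and pushes once with each setting, passing ‘push.default’ on the command line with ‘-c’ so nothing in your configuration is touched. The helper ‘in_sync’ and all the paths are made up for the demonstration.

```shell
#!/bin/sh
# Sketch: 'simple' pushes only the current branch, 'matching' pushes
# every local branch that already exists on the remote.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
cd "$tmp"
git init -q --bare remote.git
git init -q work
cd work
git symbolic-ref HEAD refs/heads/master
git remote add origin "$tmp/remote.git"

# Publish 'master' and 'fix-1', each with its upstream configured.
echo one > file
git add file
git commit -q -m "first"
git push -q -u origin master
git branch fix-1
git push -q -u origin fix-1

# New local commits on both branches; end up on 'master'.
echo two >> file
git commit -q -a -m "change on master"
git checkout -q fix-1
echo fix > fix.txt
git add fix.txt
git commit -q -m "change on fix-1"
git checkout -q master

# Hypothetical helper: is the remote branch at the same commit as ours?
in_sync() {
    [ "$(git rev-parse "refs/heads/$1")" = \
      "$(git --git-dir="$tmp/remote.git" rev-parse "refs/heads/$1")" ]
}

git -c push.default=simple push -q
if in_sync master; then simple_pushed_master=yes; else simple_pushed_master=no; fi
if in_sync fix-1; then simple_pushed_fix=yes; else simple_pushed_fix=no; fi

git -c push.default=matching push -q
if in_sync fix-1; then matching_pushed_fix=yes; else matching_pushed_fix=no; fi

echo "simple:   master=$simple_pushed_master fix-1=$simple_pushed_fix"
echo "matching: fix-1=$matching_pushed_fix"
```

With ‘simple’ only ‘master’ reaches the remote; ‘fix-1’ only follows once the push is repeated with ‘matching’.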

Would I care?

If you don’t type ‘git push’, but instead specify what and where to push… you don’t care.

If you have configured ‘push.default’ already, which you most likely did, because otherwise you would have been getting the following annoying message all the time for the past two years… you don’t care.

warning: push.default is unset; its implicit value is changing in
Git 2.0 from 'matching' to 'simple'. To squelch this message
and maintain the current behavior after the default changes, use:

  git config --global push.default matching

To squelch this message and adopt the new behavior now, use:

  git config --global push.default simple

When push.default is set to 'matching', git will push local branches
to the remote branches that already exist with the same name.

In Git 2.0, Git will default to the more conservative 'simple'
behavior, which only pushes the current branch to the corresponding
remote branch that 'git pull' uses to update the current branch.

See 'git help config' and search for 'push.default' for further information.
(the 'simple' mode was introduced in Git 1.7.11. Use the similar mode
'current' instead of 'simple' if you sometimes use older versions of Git)

So, most likely you don’t care.

‘git add’ in directory

Here’s what the release notes say:

When "git add -u" and "git add -A" are run inside a subdirectory
without specifying which paths to add on the command line, they
operate on the entire tree for consistency with "git commit -a" and
other commands (these commands used to operate only on the current
subdirectory).  Say "git add -u ." or "git add -A ." if you want to
limit the operation to the current directory.

Although this is a clearer explanation, it’s not very clear what is changing, so let me give you an example.

Say you have modified two files, ‘README’ and ‘test/basic.t’, then you go to the ‘test’ directory and run ‘git add -u’. In pre-v2.0 only ‘test/basic.t’ will be staged; in post-v2.0 both files will be staged. If you run the command in the top level directory, nothing changes.
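The same scenario as a script you can run, assuming Git v2.0 or later is installed. The explicit ‘.’ in the first call gives the old, directory-limited behavior, so both can be compared side by side; the repository is a throwaway one.

```shell
#!/bin/sh
# Sketch: 'git add -u .' versus plain 'git add -u' from a subdirectory.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
mkdir test
echo readme > README
echo basic > test/basic.t
git add .
git commit -q -m "initial"

# Modify both files, then work from inside 'test'.
echo more >> README
echo more >> test/basic.t
cd test

git add -u .        # limited to the current directory (the pre-v2.0 default)
staged_dot=$(git diff --cached --name-only)

git add -u          # Git v2.0 and later: the whole tree
staged_all=$(git diff --cached --name-only)

echo "with '.':    $staged_dot"
echo "without '.': $staged_all"
```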

Would I care?

If you haven’t seen the following warning while doing ‘git add -u’ or ‘git add -A’, or if you don’t even use those options, you are fine.

warning: The behavior of 'git add --update (or -u)' with no path argument from a
subdirectory of the tree will change in Git 2.0 and should not be used anymore.
To add content for the whole tree, run:

  git add --update :/
  (or git add -u :/)

To restrict the command to the current directory, run:

  git add --update .
  (or git add -u .)

With the current Git version, the command is restricted to the current directory.

‘git add’ adds removals

Here’s what the release notes say:

"git add <path>" is the same as "git add -A <path>" now, so that
"git add dir/" will notice paths you removed from the directory and
record the removal.  In older versions of Git, "git add <path>" used
to ignore removals.  You can say "git add --ignore-removal <path>" to
add only added or modified paths in <path>, if you really want to.

Again, it should be clearer with an example. Say you removed the file ‘test/basic.t’ and added a new file ‘test/main.t’. Those changes are not staged, so you stage them with ‘git add test/’. Pre-v2.0, ‘test/basic.t’ would remain tracked; post-v2.0, the removal of ‘test/basic.t’ is staged as well.
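Here is that example as a runnable sketch, assuming Git v2.0 or later. The first ‘git add’ uses --ignore-removal to reproduce the old behavior, the second one is the new default; ‘git ls-files’ shows what is in the staging area after each step.

```shell
#!/bin/sh
# Sketch: 'git add --ignore-removal <path>' versus plain 'git add <path>'.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
mkdir test
echo basic > test/basic.t
git add test
git commit -q -m "initial"

# Remove one file and create another, without staging anything.
rm test/basic.t
echo main > test/main.t

git add --ignore-removal test/      # the pre-v2.0 behavior
old_style=$(git ls-files test/)

git add test/                       # the default since v2.0
new_style=$(git ls-files test/)

echo "old: $old_style"
echo "new: $new_style"
```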

Would I care?

If you haven’t seen the following warning while doing ‘git add’, you are fine.

warning: You ran 'git add' with neither '-A (--all)' or '--ignore-removal',
whose behaviour will change in Git 2.0 with respect to paths you removed.
Paths like 'test/basic.t' that are
removed from your working tree are ignored with this version of Git.

* 'git add --ignore-removal <pathspec>', which is the current default,
  ignores paths you removed from your working tree.

* 'git add --all <pathspec>' will let you also record the removals.

Run 'git status' to check the paths you removed from your working tree.

The rest

The "-q" option to "git diff-files", which does *NOT* mean "quiet",
has been removed (it told Git to ignore deletion, which you can do
with "git diff-files --diff-filter=d").

Most people don’t use this command, thus don’t care.

"git request-pull" lost a few "heuristics" that often led to mistakes.

Again, most people don’t use this command, which is mostly broken anyway.

The default prefix for "git svn" has changed in Git 2.0.  For a long
time, "git svn" created its remote-tracking branches directly under
refs/remotes, but it now places them under refs/remotes/origin/ unless
it is told otherwise with its "--prefix" option.

If you don’t use ‘git svn’, you don’t care. If you don’t see a difference between ‘trunk’ and ‘origin/trunk’, you don’t care.

tl;dr

You probably don’t care about these backward-incompatible changes. Sure, Git v2.0.0 received a good dose of new features and bug-fixes, but so did v1.9.0, and all the versions before.

Given the fact that Git v2.0.0 has been cooking for three years, I think it’s a big missed opportunity that nothing really changed, especially given that in previous user surveys people have said the user-interface and documentation need to improve, and there have been patches to try to do so. In a separate post I discuss what I think Git v2.0.0 should have included.

Is ‘git pull’ broken? If so, what’s the fix?

Is ‘git pull’ really broken? I know what you are thinking; such a pervasive and basic command cannot possibly be broken. Unfortunately, it is.

It is not some marginal issue. Many experienced Git users avoid ‘git pull’ and even urge newcomers to avoid it, there are many sites that encourage you not to use the command, and there have been a lot of threads on the mailing list about the issue (Pull is mostly evil, A failing attempt to use Git in a centralized environment). The maintainer, Junio C Hamano, has accepted there’s a big problem, and even Linus Torvalds agreed something needs to change.

In order to identify the problem we first need to define the two main ways ‘git pull’ is used.

Pull requests

One way ‘git pull’ is used, is to integrate pull requests into the mainline. For example in the Linux kernel, the DRM maintainer sends a pull request to Linus Torvalds, saying basically:

The following changes are available in the git repository at:

git://people.freedesktop.org/~airlied/linux drm-next

So Linus can just do:

git pull git://people.freedesktop.org/~airlied/linux drm-next

In this mode ‘git pull’ actually works fine, which is not too surprising, since it’s the main thing Linus Torvalds does.

However, this is not the way most people use ‘git pull’.

Update branch

What most people do is, for example, update their local ‘master’ branch to the remote ‘origin/master’ branch, essentially doing ‘git fetch origin’ followed by ‘git merge origin/master’.

However, that’s not exactly what most people actually want to do.

If you don’t have any changes of your own in ‘master’, then yes, ‘git pull’ does what you want, but if you do have changes, and thus the branches have diverged, then ‘git pull’ will create a new merge commit. This might or might not be what you want, but the majority of Git newbies do not want that, or rather, the team they contribute to doesn’t want those “evil merges”. Unfortunately these newbies don’t know what they are doing, and Git is not making it easier.

So you end up with something like this:

git-pull

Most likely what the team wants is that the local changes are rebased on top of the remote ones, but if they want a merge, they want it the other way around, that is: merge the local changes to the remote ones, as if a topic branch was merged.

git-pull-fix

A merge with this order of parents has many advantages, including a clearer history. However, it’s not possible to do that with ‘git pull’, so you have to do ‘git fetch’, create a new branch, switch to the master branch, merge the other branch, and finally remove the other branch. It’s not straight-forward at all.
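One way to do that dance with stock Git is sketched below. This is my own variation on the steps above, with a temporary branch I call ‘tmp’ started from ‘origin/master’ (with --no-track, so it can be deleted afterwards without complaints); the repositories are throwaway ones created just for the demonstration.

```shell
#!/bin/sh
# Sketch: merge local 'master' into 'origin/master' so that the remote
# history ends up as the first parent of the merge commit.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
cd "$tmp"

# Central repository and a clone that diverges from it.
git init -q central
cd central
git symbolic-ref HEAD refs/heads/master
echo one > file
git add file
git commit -q -m "first"
cd ..
git clone -q central work
(cd central && echo two > upstream && git add upstream && git commit -q -m "upstream change")
cd work
echo three > local
git add local
git commit -q -m "local change"
local_tip=$(git rev-parse master)

# The manual steps.
git fetch -q origin
git checkout -q --no-track -b tmp origin/master    # start from the remote branch
git merge -q --no-ff -m "Merge local changes" master
git checkout -q master
git merge -q --ff-only tmp                         # move 'master' to the merge
git branch -q -d tmp

first_parent=$(git rev-parse HEAD^1)
second_parent=$(git rev-parse HEAD^2)
remote_tip=$(git rev-parse origin/master)
echo "first parent:  $first_parent"
echo "origin/master: $remote_tip"
```

The first parent of the resulting merge is the tip of ‘origin/master’ and the second parent is the local work, which is the opposite of what ‘git pull’ produces.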

It is this mode that is broken, and that’s the reason many people try to avoid ‘git pull’; it rarely does what you want by default.

The solution

There have been many solutions proposed; however, there are many, many use-cases to consider, and a solution that takes them all into consideration for the future is not easy to find.

The best solution that seems to accommodate all present use-cases and future ones is the introduction of a new command: ‘git update’.

By default this command will complain if the branches have diverged, so you have to do either ‘git update --rebase’ or ‘git update --merge’. This ensures that newbies aren’t going to do “evil merges” by mistake.

Also, when you do a ‘git update --merge’ the order of the parents is reversed, which means it appears you are merging ‘master’ to ‘origin/master’, and not the other way around as happens with ‘git pull’. In other words, it appears as if you are merging a topic branch, which is what most people want.

git-update

There are many more advantages to this new command, but they are probably too subtle to mention in this post.

When will this be ready?

Probably never. I sent a summary of the issues and the solution to the mailing list, which addresses all the use-cases that were discussed. I have the required patches with tests and documentation on my personal branch, and I’ve been using this new command for a while now.

Why hasn’t this been picked up? Maybe it’s because none of the core developers experience these issues. Maybe because they don’t use ‘git pull’ in the second form. Who knows.

The fact is that there is no interest in getting this fixed, even though the issue has been acknowledged, so it’s not likely to be fixed any time soon.

So what can you do about it? The best thing you can do right now is simply avoid using ‘git pull’. Additionally, you might want to instruct your coworkers to avoid using it as well, especially the ones that are not very familiar with Git.

Also, you might want to use my fork, git-fc, which does have the ‘git update’ command. It works better than ‘git pull’ even when there’s no branch divergence, and when there is, ‘git update --merge’ is also superior, because the order of the parents is right.

Using Git with triangular workflows; tips, tricks, and more

Chances are you are using a triangular workflow, even if you don’t know it. A triangular workflow simply means that you pull from one repository, and push to another. This is what the vast majority of Git users do. Unfortunately, most of the good stuff is buried in the nearly incomprehensible official manpages.

In this blog post I’ll try to shed some light on triangular workflows, how to make use of the upstream tracking branch for them, and explain the new publish tracking branch.

The basics

Say you clone a repository:

% git clone https://github.com/tiimgreen/github-cheat-sheet
% cd github-cheat-sheet

Then you make some changes and want to share them back.

What most people would do is create a fork in GitHub and push their changes there.

% git remote add mine https://github.com/felipec/github-cheat-sheet
% git push mine

After doing that they do a pull request so their changes can be merged to the original repository.

This workflow is not specific to GitHub by any means, for example the Linux kernel developers have the main repository in git.kernel.org, and they send pull requests by mail using repositories all over the map (example).

The help

If you do this over and over it becomes clear that a little help from Git would be nice.

The first thing you can do is set the configuration ‘remote.pushdefault’ to the repository you usually push to (in the above case ‘mine’). So now you can type `git push` instead of `git push mine` every time.

The next thing would be to set up an upstream tracking branch (read my blog post about it if you are not familiar with it).
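As a rough sketch of that first step, the following sets ‘remote.pushdefault’ in a throwaway clone and then pushes without naming the remote. The local directories ‘upstream’, ‘mine.git’ and ‘work’ are assumptions that stand in for the original repository, your fork, and your clone; it also assumes a Git recent enough that the default ‘push.default’ is ‘simple’.

```shell
#!/bin/sh
# Sketch only: local directories stand in for the real repositories.
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
scratch=$(mktemp -d)
cd "$scratch"

# 'upstream' plays the original repository, 'mine.git' plays the fork.
git -c init.defaultBranch=master init -q upstream
git -C upstream commit -q --allow-empty -m "Base"
git init -q --bare mine.git
git clone -q upstream work
cd work

# Register the fork and make it the default place to push to.
git remote add mine "$scratch/mine.git"
git config remote.pushdefault mine

git checkout -q -b fix-typos
git commit -q --allow-empty -m "Fix a bunch of typos"

# No remote named here: the push goes to 'mine', not to 'origin'.
git push -q

pushed=$(git --git-dir="$scratch/mine.git" rev-parse refs/heads/fix-typos)
echo "fix-typos in the fork is at $pushed"
```

You fetch from ‘origin’ and push to ‘mine’, which is exactly the triangle.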

% git branch --set-upstream-to mine/fix-typos

Then Git would greet you with the following help:

Your branch is ahead of 'mine/fix-typos' by 1 commit.

This is telling you that you probably want to push your branch again, since it’s not up-to-date in the remote. Git shows this message each time you switch to that branch, or when you do `git status`.

Moreover, `git branch -vv` would show you this help:

* fix-typos ... [mine/fix-typos: ahead 1] Fix a bunch of typos
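If you want the same information in a script, a sketch along these lines should work. The repositories are throwaway stand-ins (the directory names are made up for the example), and it uses `git for-each-ref` with the ‘upstream:short’ and ‘upstream:track’ format fields instead of parsing the output of `git branch -vv`.

```shell
#!/bin/sh
# Sketch only: local directories stand in for the real repositories.
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
scratch=$(mktemp -d)
cd "$scratch"

git -c init.defaultBranch=master init -q upstream
git -C upstream commit -q --allow-empty -m "Base"
git init -q --bare mine.git
git clone -q upstream work
cd work
git remote add mine "$scratch/mine.git"

# Publish the branch once, then track the published copy.
git checkout -q -b fix-typos
git commit -q --allow-empty -m "Fix a bunch of typos"
git push -q mine fix-typos
git branch --set-upstream-to mine/fix-typos

# One more local commit that has not been pushed yet.
git commit -q --allow-empty -m "Fix one more typo"

# Upstream name plus the ahead/behind summary for the branch.
tracking=$(git for-each-ref \
    --format='%(upstream:short) %(upstream:track)' refs/heads/fix-typos)
echo "$tracking"   # prints: mine/fix-typos [ahead 1]
```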

So it seems Git already has tons of help for this workflow, doesn’t it? Not so fast.

The real upstream

The upstream tracking branch is useful for other purposes, but for that we need to set a different upstream:

% git branch --set-upstream-to origin/master

Now the upstream is ‘master’ in the ‘origin’ remote, and when you run `git status`, you get:

Your branch and 'origin/master' have diverged,
and have 2 and 10 different commits each, respectively.

What that message is telling you is that ‘origin/master’ has moved, so there are 10 commits in ‘origin/master’ that your branch doesn’t have (and your branch has 2 commits ‘origin/master’ doesn’t have). In those cases you probably would want to rebase on top of ‘origin/master’ so that it’s easier for upstream maintainers to merge your branch, although you can merge ‘origin/master’ too, or simply do nothing and hope there are no conflicts. Either way the information is useful so you can decide what to do.
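The two numbers in that message can be computed directly. Here is a small sketch that recreates the 2-and-10 situation in throwaway repositories (the directory names are made up, and empty commits stand in for real work) and counts both sides with `git rev-list --left-right --count`.

```shell
#!/bin/sh
# Sketch only: reproduces "2 and 10 different commits" in scratch repositories.
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
scratch=$(mktemp -d)
cd "$scratch"

git -c init.defaultBranch=master init -q upstream
git -C upstream commit -q --allow-empty -m "Base"
git clone -q upstream work

# Two local commits on 'fix-typos'.
cd work
git checkout -q -b fix-typos
git commit -q --allow-empty -m "Fix a typo"
git commit -q --allow-empty -m "Fix another typo"

# Meanwhile 'origin/master' moves ahead by ten commits.
for i in 1 2 3 4 5 6 7 8 9 10; do
    git -C "$scratch/upstream" commit -q --allow-empty -m "Upstream change $i"
done
git fetch -q
git branch --set-upstream-to origin/master

# Left count: commits only in HEAD. Right count: commits only in the upstream.
counts=$(git rev-list --left-right --count 'HEAD...@{upstream}')
set -- $counts
ahead=$1
behind=$2
echo "ahead=$ahead behind=$behind"   # prints: ahead=2 behind=10
```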

In addition, if you want to rebase, the command is easier; instead of `git rebase origin/master` you can just type `git rebase`, since `git rebase` by default uses the upstream tracking branch.

Moreover, if you always stay up-to-date, you can do `git pull --rebase`, which will fetch all the remote branches, and then rebase your current branch (e.g. ‘fix-typos’) on top of the upstream (e.g. ‘origin/master’). You can also configure ‘pull.rebase = true’ to always do this when you type `git pull`.

Not to mention that `git branch -vv` gives much more useful information:

* fix-typos ... [origin/master: ahead 2, behind 10] Fix a bunch of typos
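Here is a minimal sketch of that configuration in action, again with throwaway repositories and made-up file names. It sets ‘pull.rebase’ and then runs a bare `git pull`, which should leave the local commit sitting directly on top of ‘origin/master’.

```shell
#!/bin/sh
# Sketch only: scratch repositories; file names are placeholders.
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
scratch=$(mktemp -d)
cd "$scratch"

git -c init.defaultBranch=master init -q upstream
git -C upstream commit -q --allow-empty -m "Base"
git clone -q upstream work

# One local change on 'fix-typos', tracking 'origin/master' as its upstream.
cd work
git checkout -q -b fix-typos
echo "fixed" > typos.txt
git add typos.txt
git commit -q -m "Fix a bunch of typos"
git branch --set-upstream-to origin/master

# The upstream repository moves in the meantime.
cd "$scratch/upstream"
echo "news" > other.txt
git add other.txt
git commit -q -m "Upstream change"
cd "$scratch/work"

# With pull.rebase set, a plain 'git pull' fetches and then rebases.
git config pull.rebase true
git pull -q

parent=$(git rev-parse HEAD^)
upstream_tip=$(git rev-parse origin/master)
git log --oneline
```

After this, `git status` reports the branch as ahead of ‘origin/master’ by one commit and no longer diverged.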

Check how it looks in my real repository:

git branch --vv with upstream

You get additional benefits too; for example, you get warned if you try to delete a branch that hasn’t been merged to its upstream:

warning: not deleting branch 'fix-typos' that is not yet merged to
'origin/master', even though it is merged to HEAD.
error: The branch 'fix-typos' is not fully merged.
If you are sure you want to delete it, run 'git branch -D fix-typos'.

This is actually what the upstream tracking branch is meant for: to track the upstream, that is, the target branch where all the commits of the source branch should eventually end up. All the commits of ‘fix-typos’ should end up in ‘origin/master’, therefore ‘origin/master’ is the upstream of ‘fix-typos’.

We want to have all the goodies of tracking ‘origin/master’ as our upstream, but we also want to track ‘mine/fix-typos’ so we know when we need to push. Unfortunately we can’t set them both as upstream, so we must choose one set of benefits over the other. Or should we?

The solution

The solution is not that hard to figure out: we need another upstream! Or rather; we need some concept that is similar to the upstream tracking branch, but instead of tracking the final destination, we track the location we push our commits to.

This is the publish tracking branch.

When you set it up, you get all the information:

Your branch and 'origin/master' have diverged,
and have 2 and 10 different commits each, respectively.
Some commits haven't been published to 'mine/fix-typos'.

* fix-typos ... [origin/master, mine/fix-typos *: ahead 2, behind 10]

Notice the extra ‘*’ next to the publish branch, which hints that it needs to be published.

Also, you can type `git pull` and `git rebase`, which will use the upstream branch as you would expect, and `git push` which will use the publish branch.

In other words; everything just works perfectly.

You set up the publish branch just like you set up the upstream branch:

% git branch --set-publish-to mine/fix-typos

Or:

% git push --set-publish mine

But wait, there’s more: you are not tied to push to a single remote; you can set different branches in different remotes as publish tracking. For example ‘fix-typos’ to ‘github/fix-typos’, ‘bug-fix’ to ‘client/bug-fix’, and so on. You can even choose a different branch name in the remote: ‘client-b-bug-fix’ to ‘client-b/bug-fix’.

Nice, isn’t it?

git branch -vv publish

The problem

There is only one problem with the publish branch: it’s not in upstream git :(

It is part of my fork, git-fc. If you use my fork, you will get this and other features, and you won’t lose any feature from official Git. Or you can use the specific branch, ‘fc/publish’.

I’ve been using this code for more than half a year, and it has been reviewed in the Git mailing list, so you can trust it won’t eat your babies :)

Why isn’t it in official Git?

WARNING: if you don’t like conflicts or you know me for my “adversarial” style (and don’t like it), skip this section.

That’s a very good question. If the maintainer (Junio C Hamano) has accepted that triangular workflows are lacking, and that a separate ‘upstream’ tracking branch is needed, why isn’t it there?

The short answer is that they have an ad hominem thing against me, so even if my patches are correct and they solve a long-standing problem, they are not applied. They are only picked if they are trivial, or not controversial, or obvious fixes. Which is why I started a fork.

I sent the original version of the patches in September 2013, and received virtually no comments. Then in January 2014 people started discussing (once again) the issues with triangular workflows, and even complained about the lack of @{publish}. Eventually they started writing preparatory patches. But I had already written the whole thing several months earlier!

This can’t be explained by the patches going inadvertently unnoticed, because I re-sent the series once, and because I wrote about the support for @{publish} when I announced the git-fc fork.

Then I returned to the project after a long hiatus, and noticed they were working on something I had already done, so I let them know and sent the patches again. This time they received more feedback, and even made it into Junio’s “pu” (proposed updates) branch. Patches are often dropped from “pu”, sometimes for no reason at all, so this is no guarantee they will get in.

This is the message Junio attached to the patch series:

 Add branch@{publish}; it seems that this is somewhat different from
 Ram and Peff started working on.  There were many discussion
 messages going back and forth but it does not appear that the
 design issues have been worked out among participants yet.

The “design issues” have not been worked out because “Ram” is not actively working on Git anymore (possibly thanks to the fact that nothing ever changes), and “Peff” said he wasn’t interested in the @{publish} concept, but in something more like a @{push} concept, which would only benefit him and his weird bare-bones mode of interacting with Git. The fact that the @{publish} concept is what would benefit a vast majority of the user base is of no consequence to “Peff”.

So will it ever get into Git’s mainline? Who knows.

Get the goodies

If you want to use the publish tracking branch feature, get git-fc and follow the installation instructions. In addition you will get a ton of other features, and will lose none :)

If you use ArchLinux, you can get the package from AUR.

Enjoy :)