How not to ban a prolific git developer

On the 28th of July 2021 I received an email from Git’s Project Leadership Committee saying that my recent behavior on the mailing list was found to be in violation of the Code of Conduct. This was a complete surprise to me.

What was particularly surprising to me is the fact that I’m very familiar with the Code of Conduct, not only have I reported CoC violations in the past, but I’m the only person who has sent a patch to try to improve it, not only to the git mailing list, but the upstream project Contributor Covenant as well. It is because of this that I know the document doesn’t demand the examples of positive behavior in the section titled “Our Standards”, so the only objective reason why a project leader could argue that I violated the CoC would be if I had committed an action on the list of examples of unacceptable behavior:

  • The use of sexualized language or imagery, and sexual attention or advances of any kind
  • Trolling, insulting or derogatory comments, and personal or political attacks
  • Public or private harassment
  • Publishing others’ private information, such as a physical or email address, without their explicit permission
  • Other conduct which could reasonably be considered inappropriate in a professional setting

Did I do any of those things? You will be the judge.

If you don’t like drama then don’t read this post. Although there’s some technical discussion, it’s mostly an analysis of my alleged CoC violations and their context. According to the Project Leadership Committee the following ten emails–which I will list as exhibits–violated the CoC.

Exhibit 1

This is a common debate tactic known as “shifting the burden of proof”.

Ævar does not need to prove that your patch is undesirable, you have to prove that it is desirable.

You have the burden of proof, so you should answer the question.

https://www.logicallyfallacious.com/logicalfallacies/Shifting-of-the-Burden-of-Proof

Felipe Contreras

In this mail I am telling Johannes Schindelin my opinion about what is the default position regarding any patch: the patch is not needed. This is not even contentious. The first thing Junio C Hamano (the maintainer) does is ask: why is this patch needed? If this question is not answered to his satisfaction the patch is rejected. And I’m not the only one that has argued precisely this point:

I don’t need that data. You are proposing a change so it is your duty to support your claim that the change is worthwhile.
Otherwise it’s a change just for the sake of change.

Michal Suchánek

How is my comment remotely close to any of the examples of unacceptable behavior? For that matter, isn’t Michal’s behavior worse? Not to mention that it’s a response to Johannes’s comment, which is objectively way more aggressive:

> Why put this in an ifdef?

Why not? What benefit does this question bring to improving this patch series?

Johannes Schindelin

Johanness is avoiding a well-intentioned question by doubting the good faith of Ævar, and it became even more clear later on in the thread:

You still misunderstand. This is not about any “opinion” of yours, it is about your delay tactics to make it deliberately difficult to finish this patch series, by raising the bar beyond what is reasonable for a single patch series.

And you keep doing it. I would appreciate if you just stopped with all those tangents and long and many replies that do not seem designed to help the patch series stabilize, but do the opposite.

Johannes Schindelin

How am I the bad guy here? And for the record, Ævar is part of the Project Leadership Committee.

Exhibit 2

This is loaded language. You are inserting your opinion into the text.

Don’t. The guidelines are not a place to win arguments.

Note that this sounds ungrammatical and unnatural to some people.

And it sounds ungrammatical because it is ungramatical, not only to native English speakers, but professional linguists.

Felipe Contreras

This is part of a long discussion in which Derrick Stolee attempted to change the guidelines to explicitly avoid any and all gendered language (he/she) for ungendered (they).

One argument that Derrick kept repeating is that only non-native English speakers find the singlar “they” ungrammatical, but I kept repeating that’s not true, and as example I used the test the American Heritage Dictionary used in their usage note regarding “they”:

We thank the anonymous reviewer for their helpful comments.

58% of the panel found the sentence unacceptable. The Usage Panel is comprised of writers, professors, linguists, editors, and multiple Pulitzer Prize winners.

Derrick ignored all my feedback, and for the record he also ignored all the feedback from Ævar, and in fact native speakers in the mailing list stated that they found these sentences ungrammatical too.

Regardless of on which side you are on, Derrick tried to win the argument by adding this to the guidelines:

Note that this sounds ungrammatical and unnatural to readers who learned English in a way that dictated “they” as always plural, especially those who learned English as a second language.

All I said is that he shouldn’t insert his opinion into the guideline, and instead should use something neutral he knows everyone can agree with:

Note that this sounds ungrammatical and unnatural to some people.

Once again, how is stating my opinion “unacceptable”?

Exhibit 3

Yeah, now you are starting to see the problem.

How many more failed attempts do you need to go through before accepting that the approach you thought was feasible is in fact not feasible?

The solution is simple and self-documenting:

pull.mode={fast-forward,merge,rebase}

Felipe Contreras

In my 8,200-word article git update: the odyssey for a sensible git pull I explored around 13 years of discussions in the mailing list regarding problems with git pull, and the first and more obvious solution is my proposed pull.mode configuration which was a direct competitor to different solutions proposed by Junio C Hamano.

For some reason Junio didn’t accept this proposal in 2013 even though other people were in favor of it. I tried again in 2020 multiple times, and after trying all the approaches other people proposed I became convinced it was the only feasible solution.

Elijah Newren–who we will later see is an important actor in this story–decided to try to fix the problem by himself, but every patch series he sent struck a dead end. It became clear to me that Elijah started to see the problem:

However, even if the above table is filled out, it may be complicated enough that I’m at a bit of a loss about how to update the documentation to explain it short of including the table in the documentation.

Elijah Newren

Yes, that’s precisely the reason why I opted for a solution that was a) easy to understand, b) easy to document, c) easy to test, and d) easy to program: pull.mode.

Elijah ignored all my feedback to all his patch series, which is why he still hasn’t realized his patches will break behavior current users rely on. I ran a poll on r/git and 19% of users responded that they rely on the behavior Elijah plans to break, and 15% said even though they don’t use it, they know what it should do.

All Elijah is achieving by ignoring me is at best wasting time, and at worst hurting users. Either way it’s not my fault.

How is my comment unacceptable? I’m just stating facts and my opinion. Even if my opinion is wrong that doesn’t make my comment “unacceptable”.

Exhibit 4

That is a problem specific for your shop.

The defaults are meant for the majority of users. If a minority of users (who happen to be working under the same umbrella) have a problem with the defaults, they can change the defaults.

Felipe Contreras

I understand why some people might think this is an attack on Randall S. Becker, but it’s really not.

Randall is a contributor that often sends reports when the latest git version breaks something in his platform: NonStop OS. I think it’s fair to say this is a relatively obscure OS. Even a smaller subset are the people that work with git inside his company.

Now, of course git should consider the NonStop platform, and of course it should consider all the users inside Randall’s company, but they are probably not even 0.01% of all git users, so why should 99.99% of users suffer at their expense?

No. The defaults are for 99.99% of users, if people in Randall’s company have a problem with the defaults and want to avoid git rebase like the plague, they can configure git anyway they want. That’s what the configurations are for: for the minority.

My opinion is that the defaults are for the majority.

If anyone has a problem with my opinion, we can debate it, but how is stating my opinion “unacceptable behavior”?

Exhibit 5

I’m sending this stub series because 1. it’s still in ‘seen’ [1], 2. my conflicting version is not in ‘seen’ [2], and 3. brian has not responded to my constructive criticism of his version [3].

Felipe Contreras

In the cover letter of my patch series: doc: asciidoctor: direct man page creation and fixes (brian’s version), I explained the reasons why I was sending it, but mainly it’s because Junio continued to refuse to drop brian’s version and include mine, even though brian himself told Junio to drop his version.

Now, if you look at my comments on the patches you could say I trashed them, but this is not my fault: brian stated plenty of things are just not true. Just on the first patch:

  • We generally require Asciidoctor 1.5, but versions before 1.5.3 didn’t contain proper handling of the apostrophe… Not true
  • [GNU_ROFF] for the DocBook toolchain, as well as newer versions of Asciidoctor, makes groff output an ASCII apostrophe instead of a Unicode apostrophe in text… Not true
  • These newer versions of Asciidoctor (1.5.3 and above) detect groff and do the right thing in all cases… Not true
  • Because Asciidoctor versions before 2.0 had a few problems with man page output… Not true

The changes were correct, but I was the one that originally wrote the patch, brian merely changed the commit message and took authorship because I objected to his text.

I understand that some people have trouble hearing that they are wrong, but this is not my problem. If you send a patch to the mailing list you should be prepared to hear all the ways in which it’s wrong. And brian m. carlson is not some rookie, he is the top #5 contributor to the Git project in the past five years, he shouldn’t need training wheels.

I do not blame brian for all these inaccuracies tough, there’s way too many details in the documentation toolchain, and the only reason why I know he was wrong is that I spent several man-days investigating these details. I explained part of the story in my post: Adventures with man color.

Moreover, if you think I’m lacking tact, remember that this is the second time I’m bringing these issues, the first time brian argued back all my suggestions and did not implement a single one. Eventually he just stopped responding to me.

Now, regarding integration: the “seen” branch is an integration branch which is maintained by Junio and contains all the topic branches that are currently being discussed and Junio is considering merging. Junio picked brian’s branch even though it was full of inaccuracies and in my opinion the commits were not properly split (each commit was doing four to five different things at once).

On the other hand my commit message did not contain inaccuracies, I wrote the original patch, my patches were properly split, contained patches from other people–including Jeff King and Martin Ågren–and in addition contained plenty of cleanups and more fixes to the output of the documentation.

If that was not enough, just the commit message of first patch (doc: remove GNU troff workaround) took me several hours of investigation to write.

It’s not my fault that brian decided to stop arguing any further, it’s not my fault that Junio decided to carry brian’s version for two months and ignore my version which was objectively superior. Those are the facts, and I was merely stating them.

Suppose that I’m wrong, let’s suppose that brian’s version is superior, in that case my statement of fact is incorrect. OK, but how is that “unacceptable behavior”?

Exhibit 6

I meant that I meant what he said I meant.

Felipe Contreras

I understand how this clarification I sent to Junio might look like to some people, but it’s simply a convoluted way of saying “what he said”. SZEDER Gábor said “I think you meant X”, I replied “yes, I meant Y, which in practical terms means what you said (X)”, and Junio (who has a habit of rewriting what I write) asked if my previous statement said “yes, yours is better and I’ll use it in an update, thanks”, but I don’t agree with Junio’s restatement.

I meant Y, which in practical terms means what SZEDER said I meant (X). So I meant either of these:

  • Y: Otherwise commands like ‘for-each-ref’ are not completed correctly by __gitcomp_builtin.
  • X: Otherwise options of commands like ‘for-each-ref’ are not completed.

Which one of these is really “better”? I don’t know, and to be honest I don’t care. I’ve been sending this patch for nine months now and in truth the original “Otherwise commands like ‘for-each-ref’ are not completed” is good enough for me.

The important thing is the fix, and the code continues to be broken to this day.

Additionally, it’s not uncommon for Junio to update the commit messages himself, in fact he did so for my last patch (doc: pull: fix rebase=false documentation), even though I sent an updated commit message he ignored my suggested modification and used his own. So why can’t he simply do the same here with a simple “s/commands/options of commands/” as I suggested?

To me this simply looks like an excuse not to merge the series, which I’ve sent eleven times already.

To avoid problems I re-sent the patch series nine minutes after Junio sent his message, and I used SZEDER’s suggestion. That was on June 8, and to this day Junio keeps repeating this on his status mails:

* fc/completion-updates (2021-06-07) 4 commits
 - completion: bash: add correct suffix in variables
 - completion: bash: fix for multiple dash commands
 - completion: bash: fix for suboptions with value
 - completion: bash: fix prefix detection in branch.*

 Command line completion updates.

 Expecting a reroll.
 cf. <60be6f7fa4435_db80d208f2@natae.notmuch>

I’ve already replied multiple times that I did the reroll immediately after, and after that I’ve re-sent the series four times since then.

Now, I understand if Junio took my reply in a way that I did not intend, but why should git users suffer? The problems these patches fix are real. SZEDER Gábor has already reviewed part of the series, and David Aguilar has tested it and he confirmed the problems exist, and the fixes work.

I have plenty more fixes on top of these (41), the reason why I kept this series small was to maximize the possibility of them getting merged, and now it turns out the reason Junio hasn’t merged them is because he didn’t like one comment I said?

To me this seems like pettiness. We are adults, if he found one of my replies objectionable, he could have simply stated so on the mailing list, or he could have sent me a personal reply (he has never done so). He could have said “I don’t like your tone, so I’m going to drop this series”, but instead he made me waste my time resending patches he was never going to merge. He kept his disapproval for himself, and only used it for ammunition to justify a future ban.

The slowness from Junio to accept these and other patches is why I chose to start the git-completion project, which is a fork of the bash and zsh completion stuff. For the record I was the one that started the zsh completion, and I started the bash completion tests in order for the completion stuff to be first-class citizens of the git project, which Junio has refused to accept as well.

Choosing to knowingly hurt git users because he didn’t find one comment palatable does not seem to me to be a behavior fit for a project leader.

I foresee that some people will conclude that I’m being petty too, but I don’t think that’s the case. I found the problems, I wrote the patches, I sent the patches, I addressed the feedback, and I updated the patches with the feedback. I’ve been trying to get them merged for nine months. What more do you want? If Junio has any further problems with the patches, he can just let me know and I’ll address them. But instead he says nothing.

To exemplify even more how Junio’s pettiness is hurting users, Harrison McCullough reported a regression with the __git_complete helper (which I wrote) on June 16, and it was caused by a change from Denton Liu which introduced a variable __git_cmd_idx, but he forgot to initialize it in __git_complete. Initially I fixed the problem by initializing __git_cmd_idx to 1, and Harrison reported that my fix indeed got rid of the issue. Two days later Fabian Wermelinger reported the same issue but sent a patch that initialized __git_cmd_idx to 0. Initially I thought he made a mistake, but upon further reflection I realized that 0 was more correct, so I updated my patch.

My patch is superior because it fixes the regression not only for bash, but for zsh too. In addition it mentions Harrison McCullough reported the issue, and it’s also simpler and more maintainable too. Junio picked Fabian’s patch and ignored my patch, and my feedback, therefore the regression is still present for zsh users. This is a fact.

Ignoring me is objectively hurting users. I’ve resent my fix on top of Fabian’s patch, so all Junio has to do to fix the regression is pick it.

How does the fact that Junio is primed against me make my convoluted statement “unacceptable behavior”?

Exhibit 7

That makes me think we might want a converter that translates (local)main -> (remote)master, and (remote)master -> (local)mail everywhere, so if your eyes have trouble seeing one, you can configure git to simply see the other… Without bothering the rest of the word.

Felipe Contreras

Once again I can see how people might misinterpret what I said here, but it’s nothing nefarious.

The rename of the “master” branch has been one of the most hotly debated topics of late (this has nothing to do with me as I didn’t even participate in the discussion). Even today it’s not entirely clear what’s the future of this proposal. If you are not familiar with the debate, you can read my blog post: Why renaming Git’s master branch is a terrible idea.

But what I attempted to do was achieve what the original poster–Antoine Beaupré–wanted to achieve but in another way. I listened to Antoine, and I proposed a different solution, that’s all.

What Antoine wanted (as I understand it) was to change all his “master” branches to “main” so that he didn’t have to see “master” everywhere. But he wanted to do this properly, so he wrote a python script to address as many renaming issues as he could.

I see some value in what Antoine wanted to do, but he literally said: “I am tired of seeing the name “master” everywhere“, if that was literally the problem, then a mapping of branch names would fix it. And this is not even that foreign to me.

When I wrote git-remote-hg and git-remote-bzr one of the main features people wanted was a way to map branch names. So for example a branch named “default” in Mercurial could be mapped to “master” in Git (this was done by default, but you get the point).

Even if this didn’t help Antoine, I thought this would help other people, say for example that some Linux developers couldn’t manage to convince Linus Torvalds to rename the master branch to “main”, but for some reason they found the name “master” offensive. Well, with my patch they didn’t have to convince Linus, they could simply configure a branch mapping so they “never had to see the name “master”“.

After listening to my reply Antoine did a more adult approach than Junio, and actually replied his discontent to my comment:

I guess that I’ll take that as a “no, not welcome here” and move on…

Antoine Beaupré

This saddened me. It was never my intention to shit on Antoine’s idea, and in fact I never intended to act a representative of the Git community, so I explained very clearly to Antoine that he shouldn’t just give up:

Do not take my response as representative of the views community.

I do believe there’s value in your patch, I’m just not personally interested in exploring it. I don’t see much value in renaming branches, especially on a distributed SCM where the names are replicated in dozens, potentially thousands of repositories. But that’s just me.

Felipe Contreras

However, brian m. carlson either didn’t read my response, or chose to not factor it in, because he replied:

There is a difference between being firm and steadfast, such as when responding to someone who repeatedly advocates an inadvisable technical approach, and being rude and sarcastic, especially to someone who is genuinely trying to improve things, and I think this crosses the line.

brian m. carlson

I wasn’t trying to be “rude and sarcastic”, I was simply trying to suggest a different approach. It did’t even need to be a competing approach, because both approaches could be implemented at the same time. I explained that to brian, did he respond back? No.

Now, even if you remove me from the picture nobody else responded to Antoine, so how exactly did my responses hinder in any way the community?

If you think my response was “rude and sarcastic” I would love to debate that, but the fact of the matter is that only I know what my intentions were, and they were definitely not that.

Moreover, I don’t believe in being offended by proxy. If Antoine Beaupré had a problem with my comment, I would listen to his objection, and even though I think I have already clarified what I meant, I would even consider apologizing to him. But I will not apologize to brian m. carlson who I’m pretty sure got offended by proxy.

Also, for the record, reductio ad absurdum arguments are not uncommon in the Git mailing list, and they don’t necessarily imply anything nefarious. Here’s one recent example from Junio:

I am somewhat puzzled. What does “can imagine” exactly mean and justify this change? A script author may imagine “git cat-file” can be expected to meow, but the command actually does not meow and end up disappointing the author, but that wouldn’t justify a rename of “cat-file” to something else.

Junio C Hamano

Can Junio’s response be considered “rude and sarcastic”? Yes. But you can also assume good faith and presume he didn’t intend to offend anyone and simply tried to prove a point using a ridiculous example.

Can my comment be considered “unacceptable behavior”? In this case I’d say yes, but not necessarily so. You can also give me the benefit of the doubt and simply not assume bad faith (and I can tell you no bad faith was intended).

Elijah Newren

So far the incidents have been pretty sporadic and could easily be reduced to misunderstandings, but the following are not. Elijah Newren decided to wage a personal vendetta against me (literally), and that context is necessary to understand the rest of the evidence.

Spring cleanup challenge

It all started when I sent my git spring cleanup challenge in which I invited git developers to get rid of their carefully crafted git configuration for a month. The idea was to force ourselves to experience git as a newcomer does, and figure out which are the most important configurations we rely on.

This challenge was a success and very quickly we figured out at least two configurations virtually every experienced git developer enables: merge.conflictstyle=diff3 and rerere.enabled=true. Additionally my contention is that git should also have default aliases (e.g br => branch), but there was no clear consensus on that.

Take for example the configuration merge.defaulttoupstream. Back in 2010 while exploring issues with git pull I proposed that git merge should by default merge the upstream branch. I later sent the patch in 2011 which received pretty universal support, and there were interesting comments, like:

I totally agree — this would be a good change[*] — but this suggestion has been made before, and always gets shot down for vaguely silly reasons…

Miles Bader

Just a few days later Jared Hance sent his own version of the patch by introducing merge.defaultupstream to activate the new behavior. Jared sent in total five versions and implemented all the suggestions from everyone, including Junio, but in the end Junio decided to rewrite the whole thing, take authorship, not mention that Jared wrote the original version of the patch (see the final commit), nor the fact that originally the idea came from me.

I personally do not care too much. I had an idea and the idea got implemented, that’s mostly all I cared about. However, it wouldn’t have costed anything to Junio to add “Original-idea-by: Felipe Contreras” as it’s customary (here’s an example).

Then in 2014 I realized that before git 2.0 (which was meant to break backwards compatibility) it was the perfect time to enable merge.defaulttoupstream by default (literally no one disabled it anyway), so I sent a patch for that and other defaults. Junio disagreed and excluded these from 2.0, but in the end he ended up merging them.

So finally in git 2.1 git merge ended up merging the upstream branch by default.

What I find interesting is that in 2021 (7 years later), Ævar–a prominent git developer–didn’t even know that merge.defaulttoupstream changed it’s default value, so he still had merge.defaulttoupstream=true in his configuration. Thanks to my challenge he argued it should be true by default, and thus realized that was already the case. Additionally I noticed the default value was not mentioned in the documentation, so I sent a fix for that.

At that point my relationship with Elijah was amicable, and he decided to join the challenge in his words “to join in on the fun“.

zdiff3

The problems started when I took it upon myself to try to enable merge.conflictstyle=diff3 by default. This task turned out to be much more complicated than I initially thought. Flipping the switch is extremely easy, but once you do that a ton of tests that expect the diff2 format start to fail. No problem, I thought by simply changing merge.conflictstyle to the old value at the beginning of each test file that fails would solve it. Later on each test file can be updated to expect the diff3 format.

It turned out that didn’t work. Many commands completely ignored the configuration merge.conflictstyle, and those are clearly bugs. So even before attempting to change merge.conflictstyle we have to fix those bugs first.

This is what I attempted to do on my patch series: Make diff3 the default conflict style.

Immediately Johannes Sixt pointed out that such change resulted in a very convoluted output for him. This however I think is just bad luck, since never in all my years of using diff3 have I ever seen an output similar to that, and that’s because recursive merges are involved.

Jeff King provided a link to a discussion from 2013 regarding the output of diff3, and that discussion itself lead to other discussions from 2008. It took me a while to read all those discussions, but essentially it boils down to a disagreement between Junio C Hamano and Jeff King (both part of the leadership committee) about whether or not a level higher than XDL_MERGE_EAGER (1) made sense for diff3. Junio argued that the output wasn’t proper, but Jeff argued that even though it wasn’t proper, it was still useful. That lead to Uwe Kleine-König–a Linux kernel developer–to implement zdiff3, which is basically diff3 but without artificially capping the level as Junio did in order for the output to be proper.

When I said maybe we should consider adding this zdiff3 mode, Jeff King mentioned his experience with it:

I had that patch in my daily build for several years, and I would occasionally trigger it when seeing an ugly conflict. IIRC, it segfaulted on me a few times, but I never tracked down the bug. Just a caution in case anybody wants to resurrect it.

Jeff King

My interpretation of that message is extremely crucial. I am very precise with language, both when writing, and reading. So I read what Jeff said exactly how he said it. First, he said in “several years” (1+ years) he occasionally would try this. Then he said of of the times he tried it, it crashed a few times, but crucially he said “if I recall correctly”, so this means he is not sure. Maybe it crashed more than a few times, or maybe it crashed less than a few times, he is not sure.

Whatever the issue was, Jeff did not find it serious enough to track the bug, since he never tracked down the bug in the several years he was using the patch.

Jeff sent his message on a Thursday, on Friday Elijah Newren said he might investigate the zdiff3 stuff, and on Sunday I re-sent the 2013 patch from Uwe. I added a note to the patch stating why I was sending it, and what I did to test it. Essentially I ran the entire test suite using zdiff3 instead of diff3 and everything passed. This implies that if there’s any issue with zdiff3 it probably is not that serious.

I sent the patch at 9:30. One hour later Jeff King replied “I take it you didn’t investigate the segfault I mentioned”. I don’t know how I was supposed to investigate that, other than what I already did: run the whole test suite with zdiff3. Jeff had the idea to recreate all the merges of the git repository using zdiff3 and after 2500 merges you can find one that crashes. This is a good idea, but it never occurred to me to do that.

By 13:00 I replied with a command to replicate the issue very simply using a git merge-file command. By 16:24 I had found the issue and sent a fix.

The next Monday Elijah Newren started to complain:

This is going to sound harsh, but people shouldn’t waste (any more) time reviewing the patches in this thread or the “merge: cleanups and fix” series submitted elsewhere. They should all just be rejected.

Elijah Newren

Elijah provided a list of reasons, none of which were true–like “no attempt was made to test” (I did attempt to test, as I explained in the note of the patch). Additionally he stated that in his opinion my submissions were “egregiously cavalier”.

Egregiously cavalier? I spent several hours on a Sunday trying to find the fix for a patch one of the most prolific git developers didn’t bother to fix for several years, and I actually did it… the very same day.

This was the start of Elijah’s personal vendetta against me. He urged Junio not only to drop Uwe’s patch that I resent, but to drop all my patches:

If I were in charge, at this point I would drop all of Felipe’s patches on the floor (not just the ones from these threads), and not accept any further ones. I am not in charge, though, and you have more patience than me. But I will not be reviewing or responding to Felipe’s patches further. Please do not trust future submissions from him that have an “Acked-by” or “Reviewed-by” from me, including resubmissions of past patches.

Elijah Newren

If that wasn’t enough, Elijah accused me of fabricating his endorsement of my patches:

Troubled enough that I do not want my name used to endorse your changes, particularly when I already pointed out that you have used my name in a false endorsement of your patch and you have now responded to but not corrected that problem (including again just now in this thread), making it appear to me that this was no mistake.

Elijah Newren

This is blatantly false. Elijah very clearly said “Yes, very nice! Thanks.” to one of my patches, and that can be considered endorsement by many people. I do not take those kinds of accusations lightly, so that prompted me to start an investigation into how many “reviewed-by” commit trailers are explicitly given as opposed to inferred, and I found that 38% of them are not explicit.

Not only that, but I found one example a month earlier in which Derrick Stolee took a “Looks good to me” comment from Elijah and used it as “reviewed-by”. Did Elijah accuse Derrick of “false endorsement”? No.

I don’t know what happened to Elijah, nor why from one day to the next he decided to make me his enemy, but clearly he is not being objective. If he was he would be objecting to Derrick’s behavior as well, since he did exactly the same thing as I. Elijah’s responses are clearly emotional, as can be seen from this reply in which he cites nineteen references, some going back to 2014–a dubious debating technique called Gish gallop, but writing this response he didn’t even pause to see that the links matched what he was referring to. Moreover, the first thing he said: “attacking a random bystander like Alex is rather uncalled for”, was completely off-base because my reply wasn’t even addressed to Alex.

I have absolutely nothing against Elijah, but clearly he does have something against me, and the rest of the reports (and probably all the previous ones) are entirely the result of that antagonism.

Exhibit 8

If you didn’t mean this patch to be applied then perhaps add the RFC prefix.

Felipe Contreras

When Alex Henrie took me up on the challenge to try to fix git pull, he created a patch that broke 11 test files, each one with many unit tests broken. This is not a huge deal, some people send patches that break the test suite, but generally when they do that they add the “RFC” (request for comments) prefix to make sure this patch is not picked by Junio.

So, assuming good faith there’s two options a) Alex didn’t see the tests breaking, or b) he knew the tests were breaking, but didn’t know he had to add RFC in that case. I simply said if this was case b), RFC should have been added.

Only a person already primed to presume malice would have seen a problem with my comment.

Exhibit 9

Wouldn’t you consider sending a patch without running ‘make test’ “cavalier”?

Felipe Contreras

This response was directed to Elijah Newren, not Alex Henrie, and I wrote it because I was genuinely curious to see if Elijah was capable of seeing the obvious discrepancy between his response to Alex, and his response to me.

The patch I sent (which wasn’t even my patch):

  • Did not break any tests by itself
  • Did not break any tests by forcing diff3 to be zdiff3
  • The patch was not going to land on the integration branch “seen”
  • The patch doesn’t change anything, so everyone using diff3 wouldn’t see the crashes
  • If the patch was applied, and a person manually enables zdiff3, out of the 16,000 merges in git.git it crashes on 13 of them (0.08%)
  • Has already been tested by multiple people for many years
  • Was fixed hours later

So the patch was relatively safe by any standard.

On the other hand Alex’s patch:

  • Broke a ton of tests by default
  • Broke the existing user interface
  • Hasn’t been used by anyone
  • Was intended to be integrated

Elijah’s response to my patch is that it was “egregiously cavalier”, and his response to Alex’s patch was “thanks for working on this”.

If this is not double standards I don’t know what is.

I did not criticize Alex, I was asking a question to Elijah.

Exhibit 10

When Alex responded to my response, he thought it was directed to him. I explained to him that it wasn’t and I made sure he knew I didn’t blame him for anything:

I do appreciate all contributions, especially if they were done pro bono.

Felipe Contreras

I explained that Elijah’s personal attacks towards me are a different subject that he should probably not attempt to touch, because I didn’t think anything productive could come from that (it’s a matter between Elijah and me).

And just to be crystal clear, I thanked him again:

This has absolutely nothing to do with you, again… I appreciate you used some of your free time to try to improve git pull.

Felipe Contreras

In my opinion all the reports that have anything to do with Elijah Newren are primed and could only be judged “unacceptable” if you assume bad faith (and probably not even in that case).

Judgement

I have been a moderator on pretty big online communities, therefore I’m familiar on how justice is supposed to be dispensed outside of the justice system.

The single most important thing is the presumption of innocence. The accused should not be considered guilty until evidence against him has been presented and he has had a chance to defend himself. Even in something as silly as Twitch chat, the people that get banned have the opportunity to appeal those decisions.

The leadership committee not only did not allow me to present my case, they weren’t interested in the least, when I asked them about presenting my case their response was “this isn’t a trial”. I was presumed guilty and that’s that.

My punishment was to avoid all interaction with 17 people for three months. That included Junio C Hamano and the entire leadership committee. If I engage in any “unwanted interaction” with any of these people or if I say something similar to the alleged violations above, then I’ll get banned.

Ævar Arnfjörð Bjarmason offered to be my “point of contact”, that means he was the only person I was allowed to ask questions to. While initially Ævar did a good job of keeping an open dialog with me, eventually he stopped responding because he went on vacation.

Other than the initial email I only received one response from the leadership committee.

However Ævar did manage to respond to a few questions for this blog post (as himself):

  1. Do you believe there’s a possibility the PLC might have made a mistake?

Sure, but I think that’s more in the area of how we’d deal with clashes like this generally than that nothing should have been done in this case. Did we do the optimal thing? I don’t know.

  1. Do you believe the PLC could have done more to address this particular situation?

Yes, I think it’s the first case (at least that I’m aware of) under the CoC framework that’s gone this far.

That’s a learning process, one particular thing in this case I think could have gone better is more timely responses to everyone involved. That’s collectively on us (the PLC), harder when there’s little or no experience in dealing with these cases, everyone volunteering their time across different time zones etc.

  1. Do you believe the identify of the person(s) who reported the complaints had an effect on the final verdict?

I wouldn’t mind honestly answering this for my part at least (and not for other PLC members), but feel I can’t currently due to our promise that CoC reports and identities of reporters etc. not be made public unless on their request.

While I’m thankful for Ævar’s responses these did not answer the particulars of what I asked, especially the last one, which he could very well be answered without revealing the identity of the people who reported the “violations”.

At the end of the day I’m still fairly certain that the person who did the vast majority of the reports (if not all of them) was Elijah Newren, and it’s only because he is a big name (#7 contributor in the past 5 years) that those reports were taken seriously. Additionally he sent as many as he could as an eristic technique–Gish gallop–because he knew that way nobody in the leadership team would investigate any single one at depth.

Ultimately I do not blame Elijah: if he thinks my comments violated the code of conduct, he is entitled to believe so and report them. But I do blame the leadership team because they didn’t do their job. They didn’t investigate any of the reports carefully enough, and they did not do the minimum job any judge should do (inside or outside a real courtroom); hear the defendant.

Not only did they not do their job, but they didn’t even want to attempt to do it, and even more… it seems they don’t know what their minimum job is.

Conclusion

Linus Torvalds once said in a TED interview:

What I’m trying to say is we are different. I’m not a people person; it’s not something I’m particularly proud of, but it’s part of me. And one of the things I really like about open source is it really allows different people to work together. We don’t have to like each other — and sometimes we really don’t like each other. Really — I mean, there are very, very heated arguments. But you can, actually, you can find things that — you don’t even agree to disagree, it’s just that you’re interested in really different things.

Linus Torvalds

You may not like my style of communication, but “being abrasive” is not a crime. The question is not “can Felipe be nicer?”, the question is “did Felipe violate the code of conduct?”. I believe any objective observer who carefully investigates any of the reports would have to conclude that no violation actually took place. You would have to believe that I was trolling, insulted people, threw personal attacks, or harassed people; none of that actually took place.

I debate with people all the time, and this particular issue comes up very often, but it’s a fallacy called tone policing. Any time anybody focuses on the way somebody states an argument instead if what the argument itself is, that person is being unproductive.

Paul Graham’s hierarchy of disagreement

I often bring Paul Graham’s hierarchy of disagreement, where tone policing is the third lowest level of “argumentation”, right after ad hominem attacks.

Nobody benefits by my tone being policed, especially the Git community. My patches not only provide much needed improvements to the user interface, but fixes clear bugs that have already been verified and tested, not to mention regressions. Ignoring me only hurts git users.

And even if in your opinion the tone I used in the reports above is not acceptable, it’s worth nothing that these are merely the worst ten instances out of two thousand emails, that’s just 0.5%.

And let’s remember that unlike the vast majority of git developers: I’m doing this work entirely for free.

Moreover, if anyone is violating the code of conduct I would venture to say it’s other members of the community:

  • Being respectful of differing opinions, viewpoints, and experiences

It is very clear other members of the community do not respect my opinions. I have suggested that this particular point should be tolerance, not respect (see Cambridge University votes to safeguard free speech), but even if you lower the requirement from respect to tolerance, not even that is happening: other members do not tolerate my opinions.

Why am I being punished because other members (who happen to be big names) can’t tolerate my opinions?

The Git community claims to be inclusive, but as this incident shows that’s not truly the case since the most important diversity is not welcomed: diversity of thought.

Freedom of speech in online communities

I have debated freedom of speech countless times, and it is my contention that today (in 2021) the meaning of that concept is lost.

The idea of freedom of speech didn’t exist as such until censorship started to be an issue, and that was after the invention of the printing press. It was after people starting to argue in favor of censorship that other people started to argue against censorship. Freedom of speech is an argument against censorship.

Today that useful meaning is pretty much lost. Now people wrongly believe that freedom of speech is a right, and only a right, and worse: they equate freedom of speech with the First Amendment, even though freedom of speech existed before such law, and exists in countries other than USA. I wrote about this fallacy in my other blog in the article: The fatal freedom of speech fallacy.

The first problem when considering freedom of speech a right is that it shuts down discussion about what it ought to be. This is the naturalistic fallacy (confusing what is to what ought to be). If we believed that whatever laws regarding cannabis are what we ought to have, then we can’t discuss any changes to the laws, because the answer to everything would be “that’s illegal”. The question is not if X is illegal currently, the question is should it? When James Damore was fired by Google for criticizing Google’s ideological echo chamber, a lot of people argued that Google was correct in firing him, because it was legal, but that completely misses the point: the fact that something is legal doesn’t necessarily mean it’s right (should be illegal).

Today people are not discussing what freedom of speech ought to be.

Mill’s argument

In the past people did debate what freedom of speech ought to be, not in terms of rights, but in terms of arguments. The strongest argument comes from John Stuart Mill which he presented in his seminal work On Liberty.

Mill’s argument consists on three parts:

  1. The censored idea may be right
  2. The censored idea may not be completely wrong
  3. If the censored idea is wrong, it strengthens the right idea

It’s obvious that if an idea is right, society will benefit from hearing it, but if an idea is wrong, Mill argues that it still benefits society.

Truth is antifragile. Like the inmune system it benefits from failed attacks. Just like a bubble boy which is shielded from the environment becomes weak, so do ideas. Even if an idea is right, shielding it from attacks makes the idea weak, because people forget why the idea was right in the first place.

I often put the example of the idea of flat-Earth. Obviously Earth is round, and flat-Earthers are wrong, but is that a justification for censoring them? Mill argues that it’s not. I’ve seen debates with flat-Earthers, and what I find interesting are the arguments trying to defend the round Earth, but even more interesting are the people that fail to demonstrate that the Earth is round. Ask ten people that you know how would they demonstrate that the Earth is round. Most would have less knowledge about the subject than a flat-Earther.

The worst reason to believe something is dogma. If you believe Earth is round because science says so, then you have a weak justification for your belief.

My notion of the round Earth only became stronger after flat-Earth debates.

Censorship hurts society, even if the idea being censored is wrong.

The true victim

A common argument against freedom of speech is that you don’t have the right to make others listen to your wrong ideas, but this commits all the fallacies I mentioned above, including confusing the argument of freedom of speech with the right, and ignores Mill’s argument.

When an idea is being censored, the person espousing this idea is not the true victim. When the idea was that Earth was circling the Sun (and not the other way around as it was believed), Galileo Galilei was not the victim: he already knew the truth: the victim was society. Even when the idea is wrong, like in the case of flat-Earth, the true victim is society, because by discussing wrong ideas everyone can realize by themselves precisely why they are wrong.

XKCD claims the right to free speech means the government can't arrest you for what you say.
XKCD doesn’t know what freedom of speech is

The famous comic author Randall Munroe–creator of XKCD–doesn’t get it either. Freedom of speech is an argument against censorship, not a right. The First Amendment came after freedom of speech was already argued, or in other words: after it was already argued that censorship hurts society. The important point is not that the First Amendment exists, the important point is why.

This doesn’t change if the censorship is overt, or the idea is merely ignored by applying social opprobrium. The end result for society is the same.

Censorship hides truth and weakens ideas. A society that holds wrong and weak ideas is the victim.

Different levels

Another wrong notion is that freedom of speech only applies in public spaces (because that’s where the First Amendment mostly applies), but if you follow Mill’s argument, when Google fired James Damore, the true victim was Google.

The victims of censorship are at all levels: society, organization, group, family, couple.

Even at the level of a couple, what would happen to a couple that just doesn’t speak about a certain topic, like say abortion?

What happens if the company you work for bans the topic of open spaces? Who do you think suffers? The people that want to criticize open spaces, or the whole company?

The First Amendment may apply only at a certain level, but freedom of speech, that is: the argument against censorship, is valid at every level.

Online communities

Organizations that attempt to defend freedom of speech struggle because while they want to avoid censorship, some people simply don’t have anything productive to say (e.g. trolls), and trying to achieve a balance is difficult, especially if they don’t have a clear understanding of what freedom of speech even is.

But my contention is that most of the struggle comes from the misunderstandings about freedom of speech.

If there’s a traditional debate between two people, there’s an audience of one hundred people, and one person in the audience starts to shout facts about the flat-Earth, would removing that person from the venue be a violation of freedom of speech? No. It’s just not part of the format. In this particular format an audience member can ask a question at the end in the Q&A part of the debate. It’s not the idea that is being censored, it’s the manner in which the idea was expressed that is the problem.

The equivalent of society in this case is not hurt by a disruptive person being removed.

Online communities decide in what format they wish to have discussions in, and if a person not following the format is removed, that doesn’t hide novel ideas nor weakens existing ideas. In order words: the argument against censorship doesn’t apply.

But in addition the community can decide which topics are off-topic. It makes no sense to talk about flat-Earth in a community about socialism.

But when a person is following the format, and talking about something that should be on-topic, but such discussion is hindered either by overt censorship (e.g. ban), or social opprobrium (e.g. downvotes), then it is the community that suffers.

Ironically when online communities censor the topic of vaccine skepticism, the only thing being achieved is that the idea becomes weak, that is: the people that believe in vaccines do so for the wrong reasons (even if correct), so they become easy targets for anti-vaxxers. In other words: censorship creates the exact opposite of what it attempts to solve.

Online communities should fight ideas with ideas, not censorship.

Why renaming Git’s master branch is a terrible idea

Back in May (in the inauspicious year of 2020) a thread in the Git mailing list with the tile of “rename offensive terminology (master)” was started, it lasted for more than a month, and after hundreds of replies, no clear ground was gained. The project took the path of least resistance (as you do), and the final patch to do the actual rename was sent today (November).

First things first. I’ve been a user of Git since 2005 (before 1.0), and a contributor since 2009, but I stopped being active, and only recently started to follow the mailing list again, which is why I missed the big discussion, but just today read the whole enchilada, and now I’m up-to-date.

The discussion revolved around five subjects:

  1. Adding a new configuration (init.defaultbranch)
  2. Should the name of the master branch be changed?
  3. Best alternative name for the master branch
  4. Culture war
  5. The impact to users

I already sent my objection, and my rationale as to why I think the most important point–the impact to users–was not discussed enough, and in fact barely touched.

In my opinion the whole discussion was a mess of smoke screen after smoke screen and it never touched the only really important point: users. I’m going to tackle each subject separately, leaving the most important one at the end, but first I would like to address the actual context and some of the most obvious fallacies people left at the table.

The context

It’s not a coincidence that nobody found the term problematic for 15 years, and suddenly in the height of wokeness–2020 (the year of George Floyd, BLM/ANTIFA uprising, and so on)–it magically becomes an issue. This is a solution looking for a problem, not an actual problem, and it appeared precisely at the same time the Masters Tournament received attention for its name. The Masters being more renowned than Git certainly got more attention from the press, and plenty of articles have been written explaining why it makes no sense to link the word “masters” to slavery in 2020 in this context (even though the tournament’s history does have some uncomfortable relationship with racism) (No, the masters does not need renaming, Masters Name Offensive? Who Says That?, Will Masters Be Renamed Due to BLM Movement? Odds Favor “No” at -2500, Calls for The Masters to change its name over ‘slave’ connotations at Augusta). Few are betting on The Masters actually changing its name.

For more woke debates, take a look at the 2 + 2 = 5 debate (also in 2020).

The obvious fallacies

The most obvious fallacy is “others are doing it”. Does it have to be said? Just because all your friends are jumping off a cliff doesn’t mean you should too. Yes, other projects are doing it, that doesn’t mean they don’t have bad reasons for it. This is the bandwagon fallacy (argumentum ad populum).

Even if it was desirable for the git.git project to change the name of the master branch for itself–just like the Python project did, it’s an entirely different thing to change the name of the master branch for everyone. The bandwagon argument doesn’t even apply.

The second fallacy comes straight out of the title “offensive terminology”. This is a rhetorical technique called loaded language; “what kind of person has to deny beating his wife?”, or “why do you object to the USA bringing democracy to Iraq?”. Before the debate even begins you have already poisoned the well (another fallacy), and now it’s an uphill battle for your opponents (if they don’t notice what you are doing). It’s trying to smuggle a premise in the argument without anyone noticing.

Most people in the thread started arguing why it’s not offensive, while the onus was on the other side to prove that it was offensive. They had the burden of proof, and they inconspicuously shifted it.

If somebody starts a debate accusing you of racism, you already lost, especially if you try to defend yourself.

Sorry progressives, the word “master” is not “offensive terminology”. That’s what you have to prove. “What kind of project defends offensive terminology?” Is not an argument.

Adding a new configuration

This one is easy. There was no valid reason not to add a new configuration. In fact, people already had configurations that changed the default branch. Choice is good, this configuration was about making it easier to do what people were already doing.

The curious thing is that the only places in the thread where the configuration was brought up was as a diversion tactic called motte and bailey.

What they started with was a change of the default branch, a proposition that was hard to defend (bailey), and when opponents put enough pressure they retreated to the most defensible one (motte): “why are you against a configuration?”

No, nobody was against adding a new configuration, what people were against was changing the default configuration.

Should the name of the master branch be changed?

This was the crux of the matter, so it makes sense that this is where most of the time debating was spent. Except it wasn’t.

People immediately jumped to the next point, which is what is a good name for the default branch, but first it should be determined that changing the default is something desirable, which was never established.

You don’t just start discussing with your partner what color of apartment to choose. First, your girlfriend (or whatever) has to agree to live together!

Virtually any decision has to be weighted in with pros and cons, and they never considered the cons, nor established any real pro.

Pro

If the word “master” is indeed offensive, then it would be something positive to change it. But this was never established to be the case, it was just assumed so. Some arguments were indeed presented, but they were never truly discussed.

The argument was that in the past (when slavery was a thing), masters were a bad thing, because they owned slaves, and the word still has that bad connotation.

That’s it. This is barely an argument.

Not only is very tenuously relevant in the present moment, but it’s not actually necessarily true. Slavery was an institution, and masters simply played a role, they were not inherently good or bad. Just because George Washington was a slave owner, that doesn’t mean he was a monster, nor does it mean the word “master” had any negative connotation back then. It is an assumption we are making in the present, which, even if true: it’s still an assumption.

This is called presentism. It’s really hard to us to imagine the past because we didn’t live it. When we judge it we usually judge it wrong because we have a modern bias. How good or bad masters were really viewed by their subjects is a matter for debate, but not in a software project.

Note: A lot of people misunderstood this point. To make it crystal clear: slavery was bad. The meaning of the word “master” back then is a different issue.

Supposing that “master” was really a bad word in times of slavery (something that hasn’t been established), with no other meaning (which we know it isn’t true) this has no bearing in the modern world.

Prescriptivism

A misunderstanding many people have of language is the difference between prescriptive and descriptive language. In prescriptivism words are dictated (how they ought to be used). In descriptivism words are simply described (how they are actually used). Dictionaries can be found on both camps, but they are mainly on the descriptive side (especially the good ones).

This misunderstanding is the reason why many people think (wrongly) that the word “literally” should not mean “virtually” (even though many people use it this way today). This is prescriptiveness, and it doesn’t work. Words change meaning. For example, the word “cute” meant “sharp” in the past, but it slowly changed meaning, much to the dismay of prescriptivists. It does not matter how much prescriptivists kick and scream; the masses are the ones that dictate the meaning of words.

So it does not matter what you–or anyone–thinks, today the word “literally” means “virtually”. Good dictionaries simply describe the current use, they don’t fight it (i.e. prescribe against it).

You can choose how you use words (if you think literally should not mean virtually, you are free to not use it that way). But you cannot choose how others use language (others decide how they use it). In other words, you cannot prescribe language, it doesn’t matter how hard you try: you can’t fight everyone.

Language evolves on its own, and like democracy: it’s dictated by the masses.

So, what do the masses say about the word “master”? According to my favorite dictionary (Merriam-Webster):

  1. A male teacher
  2. A person holding an academic degree higher than a bachelor’s but
    lower than a doctor’s
  3. The degree itself (of above)
  4. A revered religious leader
  5. A worker or artisan qualified to teach apprentices
  6. An artist, performer, or player of consummate skill
  7. A great figure of the past whose work serves as a model or ideal
  8. One having authority over another
  9. One that conquers or masters
  10. One having control
  11. An owner especially of a slave or animal
  12. The employer especially of a servant
  13. A presiding officer in an institution or society
  14. Any of several officers of court appointed to assist a judge
  15. A master mechanism or device
  16. An original from which copies can be made

These are not even all the meanings, just the noun meanings I found relevant to today, and the world in general.

Yes, there is one meaning which has a negative connotation, but so does the word “shit”, and being Mexican, I don’t get offended when somebody says “Mexico is the shit”.

So no, there’s nothing inherently bad about the word “master” in the present. Like all words: it depends on the context.

By following this rationale the word “get” can be offensive too, one of the definitions is “to leave immediately”. If you shout “get!” to a subordinate, that might be considered offensive (and with good reason)–especially if this person is a discriminated minority. Does that mean we should ban the word “get” completely? No, that would be absurd.

Also, there’s another close word that can be considered offensive: git.

Prescriptives would not care how the word is actually used today, all they care about is to dictate how the word should be used (in their opinion).

But as we saw above: that’s not how language works.

People will decide how they want to use the word “master”. And thanks to the new configuration “init.defaultbranch”, they can decide how not to use that word.

If and when the masses of Git users decide (democratically) to shift away from the word “master”, that’s when the Git project should consider changing the default, not before, and certainly not in a prescriptive way.

Moreover, today the term is used in a variety of contexts that are unlikely to change any time soon (regardless of how much prescriptivists complain):

  1. An important room (master bedroom)
  2. An important key (master key)
  3. Recording (master record)
  4. An expert in a skill (a chess master)
  5. The process of becoming an expert (mastering German)
  6. An academic degree (Master of Economics)
  7. A largely useless thing (Master of Business Administration [MBA])
  8. Golf tournaments (Masters Tournament [The Masters])
  9. Famous classes by famous experts (MasterClass Online Classes)
  10. Online tournament (Intel Extreme Masters [IEM])
  11. US Navy rank (Master-at-Arms [MA])
  12. Senior member of a university (Master of Trinity College)
  13. Official host of a ceremony (master of ceremonies [MC])
  14. Popular characters (Jedi Master Yoda)
  15. A title in a popular game (Dungeon Master)
  16. An important order (Grand Master)
  17. Vague term (Zen master)
  18. Stephen Hawking (Master of the Universe)
  19. Inside joke (master of your domain)

And many, many more.

All these are current uses of the word, not to mention the popular BDSM context, where having a master is not a bad thing at all.

Subjectiveness

Even if we suppose that the word is “bad” (which is not), changing it does not solve the problem, it merely shuffles it around. This notion is called language creep (also concept creep). First there’s the n-word (which I don’t feel comfortable repeating, for obvious reasons), then there was another variation (which ends in ‘o’, I can’t repeat either), then there was plain “black”, but even that was offensive, so they invented the bullshit term African-American (even for people that are neither African, nor American, like British blacks). It never ends.

This is very well exemplified in the show Orange Is The New Black where a guard corrects another guard for using the term “bitches”, since that term is derogatory towards women. The politically correct term now is “poochies”, he argues, and then proceeds to say: “these fucking poochies”.

Words are neither good or bad, is how you use them that make them so.

You can say “I love you bitches” in a positive way, and “these fucking women make me vomit” in a completely derogatory way.

George Carlin became famous in 1972 for simply stating seven words he was forbidden from using, and he did so in a completely positive way.

So no, even if the word “master” was “bad”, that doesn’t mean it was always bad.

But supposing it’s always bad, who are the victims of this language crime? Presumably it’s black people, possibly descended from slaves, who actually had masters. Do all black people find this word offensive? No.

I’m Mexican, do I get offended when somebody uses the word “beaner”? No. Being offended is a choice. Just like nobody can make you angry, it’s you the one that gets angry, nobody inflicts offense on other people, it’s the choice of the recipients. There’s people with all the reason in the world, who don’t get offended, and people that have no reason, and yet they get easily offended. It’s all subjective.

Steve Hughes has a great bit explaining why nothing happens when you get offended. So what? Be offended; nothing happens. Being offended is part of living in a society. Every time you go out the door you risk being offended, and if you can’t deal with that, then don’t interact with other people. It’s that simple.

Collective Munchausen by proxy

But fine, let’s say for the sake of argument that “master” is a bad word, even on modern times, in any context, and the people that get offended by it have all the justification in the world (none of which is true). How many of these concerned offended users participated in the discussion?

Zero.

That’s right. Not one single person of African descent (or whatever term you want to use) complained.

What we got instead were complainers by proxy: people who get offended on behalf of other (possibly non-existent) people.

Gad Saad coined a term Collective Munchausen by proxy that explains the irrationality of modern times. He borrows from the established disorder called Munchausen Syndrome by Proxy.

So you see, Munchausen is when you feign illness to gain attention. Munchausen by proxy is when you feign the illness of somebody else to gain attention towards you. Collective Munchausen is when a group of people feign illness. And collective Munchausen by proxy is when a group of people feign the illness of another group of people.

If you check the mugshots of BLM activists arrested, most of them are actually white. Just like the people pushing for the rename (all white), they are being offended on behalf of others by proxy.

Black people did not ask for this (the master rename (but probably many don’t appreciate the destruction of their businesses in riots either)).

Another example is the huge backlash J. K. Rowling received for some supposedly transphobic remarks, but the people that complained were not transgender, they were professional complainers that did so by proxy. What many people in the actual transgender community said–like Blair White–is that this was not a real issue.

So why on Earth would a group of people complain about an issue that doesn’t affect them directly, but according to them it affects another group of people? Well, we know it has nothing to do with the supposed target victim: black people, and everything to do with themselves: they want to win progressive points, and are desperate to be “on the right side of history”.

They are like a White Knight trying to defend a woman who never asked for it, and in fact not only can she defend herself, but she would prefer to do so.

This isn’t about the “victims”, it’s all about them.

The careful observer probably has already noticed this: there are no pros.

Cons

Let’s start with the obvious one: it’s a lot of work. This is the first thing proponents of the change noticed, but it wasn’t such a big issue since they themselves offered to do the work. However, I don’t think they gauged the magnitude of the task, since just changing the relevant line of code basically breaks all the tests.

The tests are mostly done now, but all the documentation still needs to be updated. Not only the documentation of the project, but the online documentation too, and the Pro Git book, and plenty of documentation scattered around the web, etc. Sure, a lot of this doesn’t fall under the purview of Git developers, but it’s something that somebody has to do.

Then we have the people that are not subscribed to the mailing list and are completely unaware that this change is coming, and from one day to the next they update Git and they find out there’s no master branch when they create a new repository.

I call these the “silent majority”. The vast majority of Git users could not tell you the last Release Notes they read (probably because they haven’t read any). All they care about is that Git continues to work today as it did yesterday.

The silent majority doesn’t say anything when Git does what it’s supposed to do, but oh boy do they complain when it doesn’t.

This is precisely what happened in 2008, when Git 1.6.0 was released, and suddenly all the git-foo commands disappeared. Not only did end-users complained, but so did administrators in big companies, and distribution maintainers.

This is something any project committed to its user-base should try to avoid.

And this is a limited list, there’s a lot more than could go wrong, like scripts being broken, automated testing on other projects, and many many more.

So, on one side of the balance we have a ton of problems, and in other: zero benefits. Oh boy, such a tough choice.

Best alternative name for the master branch

Since people didn’t really discuss the previous subject, and went straight to the choice of name, this is where they spent a lot of the time, but this is also the part where I paid less attention, since I don’t think it’s interesting.

Initially I thought “main” was a fine replacement for “master”. If you had to choose a new name, “main” makes more sense, since “master” has a lot of implications other than the most important branch.

But then I started to read the arguments about different names, and really think about it, and I changed my mind.

If you think in terms of a single repository, then “main” certainly makes sense; it’s just the principal branch. However, the point of Git is that it’s distributed, there’s always many repositories with multiple branches, and you can’t have multiple “main” branches.

In theory every repository is as important as another, but in practice that’s not what happens. Humans–like pretty much all social animals–organize themselves in hierarchies, and in hierarchies there’s always someone at the top. My repository is not as important as the one of Junio (the maintainer).

So what happens is that my master branch continuously keeps track of Junio’s master branch, and I’d venture to say the same happens for pretty much all developers.

The crucial thing is what happens at the start of the development; you clone a repository. If somebody made a clone of you, I doubt you would consider your clone just as important as you. No, you are the original, you are the reference, you are the master copy.

The specific meaning in this context is:

an original from which copies can be made

Merriam-Webster

In this context the word has absolutely nothing to do with master/slaves. The opposite of a master branch is either a descendant (most branches), or an orphan (in rare cases).

The word “main” may describe correctly a special branch among a bunch of flat branches, but not the hierarchical nature of branches and distributed repositories of clones of clones.

The name “master” fits like a glove.

Culture war

This was the other topic where a lot of time was spent on.

I don’t want to spend too much time on this topic myself–even though it’s the one I’m most familiar with–because I think it’s something in 2020 most people are faced with already in their own work, family, or even romantic relationships. So I’d venture to say most people are tired of it.

All I want to say is that in this war I see three clear factions. The progressives, who are in favor of ANTIFA, BLM, inclusive language, have he/him in bio, use terms like anti-racism, or intersectional feminism, and want to be “on the right side of history”. The anti-progressives, who are pretty much against the progressives in all shapes or forms, usually conservatives, but not necessarily so. But finally we have the vast majority of people who don’t care about these things.

The problem is that the progressives are trying to push society into really unhealthy directions, such as blasphemy laws, essentially destroying the most fundamental values of modern western society, like freedom of speech.

The vast majority of people remain silent, because they don’t want to deal with this obvious nonsense, but eventually they will have to speak up, because these dangerous ideologies are creeping up everywhere.

For more about the subject I can’t recommend enough the new book of Gad Saad: The Parasitic Mind: How Infectious Ideas Are Killing Common Sense.

It really is a parasitic mindset, and sensible people must put a stop to it.

Update: The topic has been so controversial that as a result of this post reddit’s r/git decided to ban the topic completely, and remove the post. Hacker News also banned this post.

The impact to users

I already touched on this on the cons of the name change, but what I didn’t address are the mitigation strategies that could be employed.

For any change there’s good and bad ways of going about it.

Even if the change from “master” to “main’ was good and desirable (which it isn’t), simply jumping to it in the next version (Git 2.30) is the absolute worst way of doing it.

And this is precisely what the current patch is advancing.

I already briefly explained what happened in 2008 with the v1.6.0 release, but what I find most interesting is that looking back at those threads many of the arguments of how not to do a big change, apply exactly in the same way.

Back then what most people complained about was not the change itself (from git-foo to “git foo“) (which they considered to be arbitrary), but mainly the manner in which the change was done.

The main thing is that there was no deprecation period, and no clear warning. This lesson was learned, and the jump to Git 2.0 was much smoother precisely because of the warnings and period of adjustment, along with clear communication from the development team about what to expect.

This is not what is being done for the master branch rename.

I also find what I told Linus Torvalds very relevant:

What other projects do is make very visible when something is deprecated, like a big, annoying, unbearable warning. Next time you deprecate a command it might be a good idea to add the warning each time the command is used, and obsolete it later on.

Also, if it’s a big change like this git- stuff, then do a major version bump.

If you had marked 1.6 as 2.0, and added warnings when you deprecated the git-foo stuff then the users would have no excuse. It would have been obvious and this huge thread would have been avoided.

I doubt anyone listened to my suggestion, but they did this for 2.0, and it worked.

I like to refer to a panel Linus Torvalds participated in regarding the importance of users (educating Lennart Poettering). I consider this an explanation of the first principles of software: the main purpose of software is that it’s useful to users, and that it continues to be useful as it moves forward.

“Any time a program breaks the user experience, to me that is the absolute worst failure that a software project can make.”

Linus Torvalds
Linus Torvalds schools Lennart Poettering on the importance of users

Now it’s the same mistake of not warning the users of the upcoming change, except this time it’s much worse, since there’s absolutely no good reason for the change.

Update: the git.git developers listened to my advice and they eventually added a warning.

The Git project is simply another victim of the parasitic mindset that is infecting our culture. It’s being held hostage by a tiny amount of people pushing for a change nobody else wants, would benefit no one, would affect negatively everyone, and they want to do it in a way that maximizes the potential harm.

If I was a betting man, my money would be on the users complaining about this change when it hits them on the face with no previous warning.

Sorry Lennart, but you are wrong once again

Lennart Poettering’s post in G+ is gathering a lot of attention these days, most of the feedback is supportive, and positive, which is not surprising to me, because although Poettering would like us to believe otherwise, most of the open source community is pretty accommodating and non-confrontational.

I am however going to go against the current here, and criticize him, but first let me state clearly that I do not condone any physical attacks towards his person, or the threats of such. His ideas however are a different matter.

Lennart’s chief mistake is to attack the way the Linux’s kernel community is run, and say their success happens despite this. How does he know? Has he ever run a more successful community? Has anybody ever? Linux is the most successful software project in history, by more than one order of magnitude from any way you look at it. It would be presumptuous for anybody to say they know how to run this project better, specially without any evidence to back such claim, which is precisely what Poettering is doing.

In this blog I’ve analyzed the many reasons why the Linux kernel is so successful, and one of them is its combative style of discussion in which ideas are not exempt from ridicule, and strong language is often used to drive one’s point home as efficiently as possible. Many people in the community agree this is desirable, and there’s even scientific evidence that supports this notion; the best ideas arise in a confrontational environment, not in a protective one.

What’s more, Poettering himself accepts he hasn’t been involved in this community. So what the hell does he know about it? Nothing.

Poettering’s second mistake is to assume that for non-white, non-western, non-straight people the situation surely must be worst… That is not the case. Maybe, just maybe, he receives such vitriolic feedback not just because of what he does, but because of the horrible way he does it. Of course not, Poettering doesn’t need to change, his approach is perfect, in fact, the only reason he receives criticism is because he is too progressive, too audacious, too efficient, surely, that must be the reason!

Personally, my beef with Poettering starts from the fact that he blocked me from Google+. Why? Because I was complaining about a technical issue with systemd, which he initially spotted and commented, but then ignored. In the middle of the discussion I made some value judgements about certain systemd code, and he stopped responding and blocked me. That is the worst way to end a discussion; block the people who disagree with you.

Sorry Lennart, but actions have consequences, and you can only do so much disruptive changes to the Linux ecosystem without much care or consideration for others, there’s a limit to the amount of people you can block, and the criticism you ignore. You can grow as thick a skin as you want, you are still wrong. No community is going to let you continue being wrong and acting as if you are beyond reproach just like that (unless you run that community and have blocked any dissident voices of course).

Maybe it’s time to take a hard look in the mirror.

What’s missing in Git v2.0.0

I recently blogged about the Git v2.0.0 release, what changed, and why should you care. Unfortunately the conclusion was that nothing much changed (other than the usual new features and bug fixes). In this post I will discuss what should have changed, and why.

What is needed

Fortunately, Git has had the Git User’s Survey in the past, so we know what users want.

  1. user-interface: 3.25
  2. documentation: 3.22
  3. tools (e.g. GUI): 3.01
  4. more features: 2.41
  5. portability: 2.34
  6. performance: 2.28
  7. community (mailing list): 1.70
  8. localization (translation): 1.65
  9. community (IRC): 1.65

Obviously, since user-interface and documentation are the areas that need more improvement, that’s what Git v2.0.0 should have focused, right?

History

I already mentioned this in the other post, but I’ll do it again.

First of all, Git as a long history of never breaking user expectations (other than the Git v1.6.0 fiasco (which changed all the git-foo commands with ‘git foo’)), and as such a lot of thought is devoted into ways to minimize changes in behavior, or even how to avoid it completely. Perhaps too much care is devoted into this.

The preparation for Git v2.0.0 started more than three years ago with a mail from Junio C Hamano, asking for developers to submit ideas for changes that normally would not happen because they break backwards compatibility, he invited us to think as if “we were writing Git from scratch”. This big release that would break backwards compatibility was going to be named “1.8.0″ and people started to submit ideas for this important release. Eventually too much time passed, the versioning scheme changed, v1.8.0 was released, and the changes proposed for v1.8. slipped into what is now v2.0.

Since no substantial changes in behavior happened since v1.0, it would follow that v2.0 was an important release, and a good opportunity to gather all the ideas about what needs to change in Git. However, seemingly out of nowhere, without any discussion or even a warning, the maintainer tagged v2.0.0-rc0, and therefore all the features that were not already merged couldn’t be merged for v2.0.0.

Thus v2.0.0 was destined to have a small list of changes, and that’s how it remained.

What could have changed

The following is a list of things that I argued should be part of Git v2.0.0.

git update

I wrote a whole post about the issue, but basically, ‘git pull‘ is broken for the most common use-case: update the current branch.

This is a known issue that has been discussed over and over, and everyone agrees that it is indeed an issue, and something needs to be done to fix it.

There have been different proposals, but by far the most comprehensive and simple is to add a new ‘git update‘ command.

This way when you want to merge a pull request, you do ‘git pull‘, and when you just want to update the current branch, you do ‘git update‘, which by default would barf if there’s divergence between your local branch (e.g. ‘master’), and the remote one (e.g. ‘origin/master’), instead of doing a merge by default. This should decrease substantially the amount of “evil merges”, merges that happened by mistake, usually by somebody that is not familiar with Git.

The patches are relatively new, but the command is simple, so there isn’t much danger of screwing things up.

The publish tracking branch

I also wrote a blog post about this; basically Git’s support for triangular workflows is not the best.

A triangular workflow is when you pull from one location (e.g. central repo), and push to another (e.g. personal GitHub fork). If you are using upstream tracking branches (you should), you have to make a decision where you set your upstream; the central repo, or your personal one. Depending on which you use, is the advantages you get, but you cannot have it all.

But with the publish tracking branch you can have all the advantages.

I’ve been cooking these patches for a long long time and I have to say this is one essential feature for me, and they patches work perfectly.

Support for Mercurial and Bazaar

Support for Mercurial and Bazaar repositories has been cooking for a long time in the “contrib” area (you can both pull and push). At this point in time the code is production-ready, and it was already graduated and merged to be released in Git v2.1.

However, the maintainer suddenly changed his mind and decided it would be better to distribute them as third party tools. He didn’t give any valid reason and clearly didn’t think it through, but they are now separate.

The code is already widely used (git-remote-hg, git-remote-bzr), and could easily be merged.

Use “stage” instead of “index”

Everybody agrees that “index” is a horrible name for Git’s “staging area”, however, nobody has done much to fix the problem.

One first step is to replace all the –cached and –index options with –staged and –no-work, which are much simpler to understand.

Another step is to add a ‘git stage‘ command that acts as a helper to work with the staging area: ‘git stage add‘, ‘git stage diff‘, ‘git stage reset‘, ‘git stage rm‘, ‘git stage edit‘, and so on.

The patches are very straight-forward.

Default aliases

Virtually every version control system has default aliases (e.g. hg co, cvs ci, svn di, etc.), except Git.

Adding default aliases is very simple to do and only brings advantages. If you don’t like the default alias, you can override it.

Patches here.

Shoulda coulda woulda

It would have been great if you could just do ‘git clone hg::mercurial-repo‘ without installing anything extra, if everybody could start using ‘git update‘ instead of ‘git pull‘, if you could do ‘git stage diff‘, or ‘git reset --stage‘. Also, if triangular workflows were properly supported.

Unfortunately that’s not the case, and Git v2.0.0 is already released, and there isn’t much to be excited about.

You might think “perhaps for Git v3.0” (which could happen in two years, or ten, how knows), but if the past is any indication of the future, it won’t happen, specially since I’ve given up on all these patches.

The fact of the matter is that in every release of Git, there is only one focus: performance. Despite the fact that it’s #6 in the list of concerns of users, Git developers work on this because that’s their area of expertise, because it’s fun for them, and because they get paid to do so. There are occasional new features, and a bit of portability now and then, but for the most part Windows support is neglected in Git, which is why the msysgit project was born.

The documentation will always remain cryptic, because for the developers, it’s not cryptic, it’s very clear. And the user-interface will never change, because the developers don’t like change.

If you don’t believe me look at the backwards-incompatible changes in Git v2.0.0, or in fact, try to think back to the last time Git changed anything. Personally other than the git-foo -> ‘git foo’ change in v1.6.0 (which was horribly handled), I can’t think of anything but minor changes.

Anyway, you can use all these features I listed today (and more) if you use git-fc instead of Git. It is my own fork of Git that has all the features of Git, plus more.

Is there anything in that list that I missed? Do you think Git v2.0.0 has enough changes as it is?

Git v2.0.0, what changed, and why should you care

Git v2.0.0 is a backward-incompatible release, which means you should expect differences since the v1.x series.

Unless you’ve been following closely the Git mailing list, you probably don’t know the history behind the v2.0 release, which started long time ago (more than three years). It all started with a mail from Junio C Hamano, asking for developers to submit ideas for changes that normally would not happen because they break backwards compatibility, he invited us to think as if “we were writing Git from scratch”. This big release that would break backwards compatibility was going to be named “1.8.0” and people started to submit ideas for this important release. Eventually too much time passed, the versioning scheme changed, v1.8.0 was released, and the changes proposed for v1.8. slipped into what is now v2.0.

Parts of v2.0 have been already been deployed one way or the other (for example if you have configured ‘push.default = simple’), but finally today we have v2.0 final. And here are the big changes that we got.

‘git push’ default has changed

Here’s what the release notes say:

When "git push [$there]" does not say what to push, we have used the
traditional "matching" semantics so far (all your branches were sent
to the remote as long as there already are branches of the same name
over there).  In Git 2.0, the default is now the "simple" semantics,
which pushes:

 - only the current branch to the branch with the same name, and only
   when the current branch is set to integrate with that remote
   branch, if you are pushing to the same remote as you fetch from; or

 - only the current branch to the branch with the same name, if you
   are pushing to a remote that is not where you usually fetch from.

You can use the configuration variable "push.default" to change
this.  If you are an old-timer who wants to keep using the
"matching" semantics, you can set the variable to "matching", for
example.  Read the documentation for other possibilities.

Is that clear? Given the bad track record of Git documentation it wouldn’t surprise me if you didn’t get what this chunk of text is trying to say at all. Personally I find it much easier to read the code to figure out what is happening.

So let me try to explain. When you type ‘git push’ (without any arguments), Git uses the configuration ‘push.default’ in order to find out what to push. Before ‘push.default’ defaulted to ‘matching’, and now it defaults to ‘simple’.

The ‘matching’ configuration essentially converts ‘git push‘ into ‘git push origin :‘, which means push all the matching branches, so if you have a local ‘master’, and there’s a remote ‘master’, ‘master’ is pushed; if you have a local and remote ‘fix-1’, ‘fix-1’ is pushed, if you have a local ‘ext-feature-1’, but there’s no matching remote branch, it’s not pushed, and so on.

The ‘simple’ configuration pushes a single branch instead, and it uses your configured upstream branch (see this post for a full explanation of the upstream branch), so if your current branch is ‘master’, and if ‘origin/master’ is the upstream of your ‘master’ branch, ‘git push’ will basically be the same as ‘git push origin master‘, or to be more specific ‘git push origin master:master‘ (the upstream branch can have a different name).

Note: If you are not familiar with the src:dst syntax; you can push a local branch ‘src’ and have the ‘dst’ name on the server, so you don’t need to rename a local branch, you can do ‘git push origin foobar:feature-a’, and your local branch “foobar” will be named “feature-a” on the server. This has nothing to do with v2.0.

However, if the current branch is ‘fix-1’ and the upstream is ‘origin/master’, ‘git push’ will complain that the name of the destination branch is not the same, because it doesn’t know if to do ‘git push origin fix-1:master‘ or ‘git push origin fix-1:fix-1‘.

Additionally if you do ‘git push github‘ (not the remote of your upstream branch), Git will simply use the name of the current branch, essentially ‘git push github fix-1‘ (‘fix-1’ being the name of the current branch).

This mode is anything but simple to describe. But perhaps the name is OK, because you can expect it to “simply work”.

Would I care?

If you don’t type ‘git push’, but instead specify what and where to push… you don’t care.

If you have configured ‘push.default’ already, which most likely you already did, because otherwise you will be getting the following annoying message all the time since two years ago… you don’t care.

warning: push.default is unset; its implicit value is changing in
Git 2.0 from 'matching' to 'simple'. To squelch this message
and maintain the current behavior after the default changes, use:

  git config --global push.default matching

To squelch this message and adopt the new behavior now, use:

  git config --global push.default simple

When push.default is set to 'matching', git will push local branches
to the remote branches that already exist with the same name.

In Git 2.0, Git will default to the more conservative 'simple'
behavior, which only pushes the current branch to the corresponding
remote branch that 'git pull' uses to update the current branch.

See 'git help config' and search for 'push.default' for further information.
(the 'simple' mode was introduced in Git 1.7.11. Use the similar mode
'current' instead of 'simple' if you sometimes use older versions of Git)

So, most likely you don’t care.

‘git add’ in directory

Here’s what the release notes say:

When "git add -u" and "git add -A" are run inside a subdirectory
without specifying which paths to add on the command line, they
operate on the entire tree for consistency with "git commit -a" and
other commands (these commands used to operate only on the current
subdirectory).  Say "git add -u ." or "git add -A ." if you want to
limit the operation to the current directory.

Although this is a clearer explanation, it’s not very clear what is changing, so let me give you can example.

Say you have modified two files, ‘README’ and ‘test/basic.t’, then you go to the ‘test’ directory, and run ‘git add -u‘, in pre-v2.0 only ‘test/basic.t’ will be staged, in post-v2.0 both files will be staged. If you run the command in the top level directory, nothing changes.

Would I care?

If you haven’t seen the following warning while doing ‘git add -u‘ or ‘git add -A‘, or if you don’t even use those options, you are fine.

warning: The behavior of 'git add --update (or -u)' with no path argument from a
subdirectory of the tree will change in Git 2.0 and should not be used anymore.
To add content for the whole tree, run:

  git add --update :/
  (or git add -u :/)

To restrict the command to the current directory, run:

  git add --update .
  (or git add -u .)

With the current Git version, the command is restricted to the current directory.

‘git add’ adds removals

Here’s what the release notes say:

"git add " is the same as "git add -A " now, so that
"git add dir/" will notice paths you removed from the directory and
record the removal.  In older versions of Git, "git add " used
to ignore removals.  You can say "git add --ignore-removal " to
add only added or modified paths in , if you really want to.

Again, it should be clearer with an example. Say you removed the file ‘test/basic.t’ and added a new file ‘test/main.t’, those changes are not staged, so you stage them with ‘git add test/’, pre-v2.0 ‘test/basic.t’ would remain tracked, post-v2.0, ‘test/basic.t’ is removed from the stage.

Would I care?

If you haven’t seen the following warning while doing ‘git add‘, you are fine.

warning: You ran 'git add' with neither '-A (--all)' or '--ignore-removal',
whose behaviour will change in Git 2.0 with respect to paths you removed.
Paths like 'test/basic.t' that are
removed from your working tree are ignored with this version of Git.

* 'git add --ignore-removal ', which is the current default,
  ignores paths you removed from your working tree.

* 'git add --all ' will let you also record the removals.

Run 'git status' to check the paths you removed from your working tree.

The rest

The "-q" option to "git diff-files", which does *NOT* mean "quiet",
has been removed (it told Git to ignore deletion, which you can do
with "git diff-files --diff-filter=d").

Most people don’t use this command, thus don’t care.

"git request-pull" lost a few "heuristics" that often led to mistakes.

Again, most people don’t use this command, which is mostly broken anyway.

The default prefix for "git svn" has changed in Git 2.0.  For a long
time, "git svn" created its remote-tracking branches directly under
refs/remotes, but it now places them under refs/remotes/origin/ unless
it is told otherwise with its "--prefix" option.

If you don’t use ‘git svn’, you don’t care. If you don’t see a difference between ‘trunk’ and ‘origin/trunk’, you don’t care.

tl;dr

You probably don’t care about these backward-incompatible changes. Sure, Git v2.0.0 received a good dosage of new features and bug-fixes, but so did v1.9.0, and all the versions before.

Given the fact that Git v2.0.0 has been cooking for three years, I think it’s a big missed opportunity that nothing really changed, specially given that in previous user surveys people have said the user-interface and documentation needs to improve, and there have been patches to try to do so. In a separate post I discuss what I think Git v2.0.0 should have included.

Is ‘git pull’ broken? If so, what’s the fix?

Is ‘git pull’ really broken? I know what you are thinking; such a pervasive and basic command cannot possibly be broken. Unfortunately, it is.

It is not some marginal issue, many experienced Git users avoid ‘git pull’ and even urge newcomers to avoid using that command, there’s many sites that encourage you to not use the command, and there have been a lot of threads on the mailing list about the issue (Pull is mostly evil, A failing attempt to use Git in a centralized environment), the maintainer, Junio C Hamano has accepted there’s a big problem, even Linus Torvalds agreed something needs to change.

In order to identify the problem we first need to define the two main ways ‘git pull’ is used.

Pull requests

One way ‘git pull’ is used, is to integrate pull requests into the mainline. For example in the Linux kernel, the DRM maintainer sends a pull request to Linus Torvalds, saying basically:

The following changes are available in the git repository at:

git://people.freedesktop.org/~airlied/linux drm-next

So Linus can just do:

git pull git://people.freedesktop.org/~airlied/linux drm-next

In this mode ‘git pull’ actually works fine, which is not too surprising, since it’s the main thing Linus Torvalds does.

However, this is not the way most people use ‘git pull’.

Update branch

What most people do is for example update their local ‘master’ branch, to the remote ‘origin/master’ branch. Essentially doing ‘git fetch origin’, ‘git merge origin/master’.

However, that’s not exactly what most people actually want to do.

If you don’t have any changes of your own in ‘master’, then yes, ‘git pull’ does what you want, but if you do have changes, and thus the branches have diverged, then ‘git pull’ will create a new merge commit. This might or might not be what you want, but the majority of Git newbies do not want that, or rather, the team they contribute to don’t want those “evil merges”. Unfortunately these newbies don’t know what they are doing, and Git is not making it easier.

So you end up with something like this:

git-pull

Most likely what the team wants is that the local chances are rebased on top of the remote ones, but if they want a merge, they want it the other way around, that is: merge the local changes to the remote ones, as if a topic branch was merged.

git-pull-fix

A merge with this order of parents has many advantages, including a clearer history, however, it’s not possible to do that with ‘git pull’, so you have to do ‘git fetch’, create a new branch, switch to the master branch, merge the other branch, and finally remove the other branch. It’s not straight-forward at all.

It is this mode that is broken, and that’s the reason many people try to avoid ‘git pull’; it rarely does what you want by default.

The solution

There have been many solutions proposed, however, there are many many use-cases to consider, and a solution that takes them all into consideration for the future is not easy to find.

The best solution that seems to accommodate all present use-cases and future ones is the introduction of a new command: ‘git update‘.

By default this command will complain if the branches have diverged, so you have to either do ‘git update --rebase‘ or ‘git update --merge‘, this ensures that newbies aren’t going to do “evil merges” by mistake.

Also, when you do a ‘git update --merge‘ the order of the parents is reversed, which means it appears you are merging ‘master’ to ‘origin/master’, and not the other way around as it happens with ‘git pull’, which means it appears as if you are merging a topic branch, which is what most people want.

git-update

There are many many more advantages to this new command, but probably too subtle to mention in this post.

When will this be ready?

Probably never. I sent a summary of the issues and the solution to the mailing list, which addresses all the use-cases that were discussed. I have the required patches with tests and documentation on my personal branch, and I’ve been using this new command for a while now.

Why isn’t this picked? Maybe it’s because none of the core developers experience these issues. Maybe because they don’t use ‘git pull’ in the second form. Who knows.

The fact is that there is no interest to get this fixed, even though the issue has been acknowledged, so it’s not likely to be fixed any time soon.

So what can you do about it? The best thing you can do right now is simply avoid using ‘git pull’. Additionally, you might want to instruct your fellow coworkers to avoid unsing it as well, specially the ones that are not very familiar with Git.

Also, you might want to use my fork, git-fc, which does have the ‘git update‘ command, which works better than ‘git pull‘ even when there’s no branch divergence, and when there is, ‘git update --merge‘ is also superior, because the order of the parents is right.

Using Git with triangular workflows; tips, tricks, and more

Chances are you are using a triangular workflow, even if you don’t know it. A triangular workflow simply means that you pull from one repository, and push to another. This is what the vast majority of Git users do, unfortunately most of the good stuff is buried in the nearly incomprehensible official manpages.

In this blog post I’ll try to shine some light into triangular workflows, how to make use of the upstream tracking branch for them, and explain the new publish tracking branch.

The basics

Say you clone a repository:

% git clone https://github.com/tiimgreen/github-cheat-sheet
% cd github-cheat-sheet

Then you do some changes and want to share them back.

What most people would do is create a fork in GitHub and push their changes there.

% git remote add mine https://github.com/felipec/github-cheat-sheet
% git push mine

After doing that they do a pull request so their changes can be merged to the original repository.

This workflow is not specific to GitHub by any means, for example the Linux kernel developers have the main repository in git.kernel.org, and they send pull requests by mail using repositories all over the map (example).

The help

If you do this over and over it becomes clear that a little help from Git would be nice.

The first thing you can do is setup the configuration ‘remote.pushdefault’ to the repository you usually push to (in the above case ‘mine’). So now you can type `git push` instead of `git push mine` every time.

The next thing would be to setup an upstream tracking branch (read my blog post about it if you are not familiar with it).

% git branch --set-upstream-to mine/fix-typos

Then Git would greet you with the following help:

Your branch is ahead of 'mine/fix-typos' by 1 commit.

This is telling you that you probably want to push your branch again, since it’s not up-to-date in the remote. It shows you that each time you switch to that branch, or when you do `git status`.

Moreover, `git branch -vv` would show you this help:

* fix-typos ... [mine/fix-typos: ahead 1] Fix a bunch of typos

So it seems Git already has tons of help for this workflow, doesn’t it? Not so fast.

The real upstream

The upstream tracking branch is useful for other purposes, but for that we need to set a different upstream:

% git branch --set-upstream-to origin/master

Now that the upstream is ‘master’ in the ‘origin’ remote, and when you run `git status`, you get:

Your branch and 'origin/master' have diverged,
and have 2 and 10 different commits each, respectively.

What that message is telling you is that ‘origin/master’ has moved, so there are 10 commits in ‘origin/master’ that your branch doesn’t have (and your branch has 2 commits ‘origin/master’ doesn’t have). In those cases you probably would want to rebase on top of ‘origin/master’ so that it’s easier for upstream maintainers to merge your branch, although you can merge ‘origin/master’ too, or simply do nothing and hope there are no conflicts. Either way the information is useful so you can decide what to do.

In addition, if you want to rebase, the command is easier; instead of `git rebase origin/master` you can just type `git rebase`, since `git rebase` by default uses the upstream tracking branch.

Moreover, if you always stay up-to-date, you can do `git pull --rebase`, which will fetch all remote the branches, and then rebase your current branch (e.g. ‘fix-typos’) on top of the upstream (e.g. ‘origin/master’). You can also configure ‘pull.rebase = true’ to always do this when you type `git pull`.

Not to mention that `git branch -vv` gives a much more useful information:

* fix-typos ... [master: ahead 2, behind 10] Fix a bunch of typos

Check how it looks in my real repository:

git branch --vv with upstream

You get other additional benefits, like for example you get warned if you try to delete a branch that hasn’t been merged to its upstream:

warning: not deleting branch 'fix-typos' that is not yet merged to
'origin/master', even though it is merged to HEAD.
error: The branch 'fix-typos' is not fully merged.
If you are sure you want to delete it, run 'git branch -D fix-typos'.

This is actually what the upstream tracking branch is meant for: to track the upstream, that is; the target branch where eventually all the commits of the source branch eventually should end up. All the commits of ‘fix-typos’ should end up in ‘origin/master’, therefore ‘origin/master’ is the upstream of ‘fix-typos’.

We want to have all the goodies of tracking ‘origin/master’ as our upstream, but we also want to track ‘mine/fix-typos’ so we know when we need to push. Unfortunately we can’t set them both as upstream, so we must choose one set of benefits over the other. Or should we?

The solution

The solution is not that hard to figure out: we need another upstream! Or rather; we need some concept that is similar to the upstream tracking branch, but instead of tracking the final destination, we track the location we push our commits to.

This is the publish tracking tracking branch.

When you set it up, you get all the information:

Your branch and 'origin/master' have diverged,
and have 2 and 10 different commits each, respectively.
Some commits haven't been published to 'mine/fix-typos'.

* fix-typos ... [origin/master, mine/fix-typos *: ahead 2, behind 10]

Notice the extra ‘*’ next to the publish branch, which hints that it needs to be published.

Also, you can type `git pull` and `git rebase`, which will use the upstream branch as you would expect, and `git push` which will use the publish branch.

In other words; everything just works perfectly.

You set up the publish branch just like you set up the upstream branch:

% git branch --set-publish-to mine/fix-typo

Or:

% git push --set-publish mine

But wait, there’s more: you are not tied to push to a single remote; you can set different branches in different remotes as publish tracking. For example ‘fix-typos’ to ‘github/fix-typos’, ‘bug-fix’ to ‘client/bug-fix’, and so on. You can even choose a different branch name in the remote: ‘client-b-bug-fix’ to ‘client-b/bug-fix’.

Nice, isn’t it?
git branch -vv publish

The problem

There is only one problem with the publish branch: it’s not in upstream git 😦

It is part of my fork, git-fc. If you use my fork, you will get this and other features, and you won’t loose any feature from official Git. Or you can use the specific branch, ‘fc/publish‘.

I’ve been using this code for more than half a year, and it has been reviewed in the Git mailing list, so you can trust it won’t eat your babies 🙂

Why isn’t it in official Git?

WARNING: if you don’t like conflicts or you know me for “adversarial” style (and don’t like it), skip this section

That’s a very good question. If the maintainer (Junio C Hamano) has accepted the triangular workflows are lacking, and a separate ‘upstream’ tracking branch is needed. Why isn’t it there?

The short answer is that they have an ad hominem thing against me, so even if my patches are correct and they solve a long-standing problem, they are not applied. They are only picked if they are trivial, or not controversial, or obvious fixes. Which is why I started a fork.

I sent the original version of the patches in September 2013, with virtually no comments. Then on January 2014 people start discussing (once again) about the issues with triangular workflows, and even complain about the lack of @{publish}. Eventually they start writing preparatory patches. But I had already written the whole thing several months ago!

It can’t be attributed to the fact they went inadvertently unnoticed because I re-sent the series once, and because I wrote about the support for @{publish} when I announced the git-fc fork.

Then I returned to the project after a long hiatus, and noticed they were working on something I already did, so let them know and send the patches again. This time they receive more feedback, and even make it into Junio’s “pu” (proposed updates) branch. Patches are often dropped from “pu”, sometimes for no reason at all, so this is not a reason they will get in.

This is the message Junio attached to the patch series:

 Add branch@{publish}; it seems that this is somewhat different from
 Ram and Peff started working on.  There were many discussion
 messages going back and forth but it does not appear that the
 design issues have been worked out among participants yet.

The “design issues” have not been worked out because “Ram” is not actively working on Git anymore (possibly thanks to the fact that nothing ever changes), and “Peff” said he wasn’t interested in the @{publish} concept, but more like a @{push} concept which will only benefit him and his weird bare-bones mode of interacting with Git. The fact that the @{publish} concept is what would benefit a vast majority of the user base is of no consequence to “Peff”.

So will it ever get into Git’s mainline? Who knows.

Get the goodies

If you want to use the publish tracking branch feature, get git-fc and follow the installation instructions. In addition you would get a ton of other features, and will loose none 🙂

If you use ArchLinux, you can get the package from AUR.

Enjoy 🙂

Demystifying the init system (PID 1)

With all the talk about debian choosing a default init system (link, link), I’ve decided to share with the world a little project I’ve been working on to help me understand /sbin/init aka. PID 1.

In this blog post I will go step by step showing what an init system must do to be functional. I will ignore all the legacy SysVinit stuff, and technical nuances, and just concentrate on what’s really important.

Introduction

First of all, what is ‘init‘? In it’s essence it’s a process that must be running at all times, if this process ends, the kernel enters into a panic mode, after which you cannot do anything else, except rebooting.

This process doesn’t need to do anything special, you can use /bin/sh as your init, or even /bin/yes (although the latter wouldn’t be very useful).

So let’s write our very first init.

#!/usr/bin/ruby
Process.spawn('agetty', 'tty1')
sleep

Believe it or not, this is actually a rather useful init. How useful it is depends on how your kernel was compiled, your partitioning scheme, and if your root file-system is mounted rw or not. But either way, it covers the basics: rule #1; always keep running no matter what.

This is almost true, except that we need to be listening for SIGCHLD, otherwise some processes wouldn’t be cleaned up properly, so:

Signal.trap(:SIGCHLD) do
  loop do
    begin
      status = Process.wait(-1, Process::WNOHANG)
      break if status == nil
    rescue Errno::ECHILD
      break
    end
  end
end

Reboot

Now that we have the running indefinitely under control, it’s time to stop running (only when requested), but in order to do that we need some kind of IPC with the running process. There’s many ways to achieve this, but I chose UNIX sockets to do that.

So instead of sleeping forever, we listen for commands issued to /run/initctl:

begin
  server = UNIXServer.open('/run/initctl')
rescue Errno::EADDRINUSE
  File.delete('/run/initctl')
  retry
end

loop do
  ctl = server.accept
  cmd = ctl.readline.chomp.to_sym
  # do stuff
end

And when the user is calling us with arguments, we pass those commands through /run/initctl.

def do_cmd(*cmd)
  ctl = UNIXSocket.open('/run/initctl')
  ctl.puts(cmd.join(' '))
  puts(ctl.readline.chomp)
  exit
end

case ARGV[0]
when 'poweroff', 'restart', 'halt'
  do_cmd(ARGV[0].to_sym)
end

So can issue the command init poweroff to turn off the machine, but in order to do that we need to tell the kernel:

def sys_reboot(cmd)
  map = { poweroff: 0x4321fedc, restart: 0x01234567, halt: 0xcdef0123 }
  syscall(169, 0xfee1dead, 537993216, map[cmd])
end

These numbers are not important, what is important is that the kernel understands them, and with this we actually turn off the machine (or halt, or reboot).

Thread carefully

Obviously it would be tedious to type a bunch of commands each time the machine starts, so we need to actually do stuff after booting, however, if we do something wrong, we might render the system unusable. A simple way to solve this is to use scripts, fork a shell, and let it run those, so if there’s something wrong with the scripts, the shell dies, but not PID 1, so the system remains usable, which again, is rule #1.

Fortunately Ruby has exceptions, so we can run code with a safety net that catches all exceptions, and there’s no need to fork, which would waste precious booting time.

def action(name)
  print(name)
  begin
    yield
  rescue => e
    print(' (error: %s)' % e)
  end
  puts
end

With this helper, we can safely run chunks of code, and if they fail, the error is reported to the user.

Initialization

This is the bulk of the code; the instructions you don’t want to type every time. This is mostly tedious stuff, you can skim or skip this section safely.

def mount(type, device, dir, opts)
  Dir.mkdir(dir) unless File.directory?(dir)
  system('mount', '-t', type, device, dir, '-o', opts)
end

action 'Mounting virtual file-systems' do
  mount('proc', 'proc', '/proc', 'nosuid,noexec,nodev')
  mount('sysfs', 'sys', '/sys', 'nosuid,noexec,nodev')
  mount('tmpfs', 'run', '/run', 'mode=0755,nosuid,nodev')
  mount('devtmpfs', 'dev', '/dev', 'mode=0755,nosuid')
  mount('devpts', 'devpts', '/dev/pts', 'mode=0620,gid=5,nosuid,noexec')
  mount('tmpfs', 'shm', '/dev/shm', 'mode=1777,nosuid,nodev')
end

And set the hostname.

action 'Setting hostname' do
  hostname = File.read('/etc/hostname').chomp
  File.write('/proc/sys/kernel/hostname', hostname)
end

Notice that many things can go wrong, for example the file ‘/etc/hostname’ might not exist, however, that would cause an exception, and our init would continue just fine.

Another thing we would want to do is kill all the processes, otherwise we might not be able to unmount the file-systems. We could do killall5, but we wouldn’t have much control over the processes, and that would require a fork. Instead we can rely on the kernel to do the right thing, and all we have to do is wait for the results.

def killall

  def allgone?()
    Dir.glob('/proc/*').each do |e|
      pid = File.basename(e).to_i
      begin
        next if pid < 2
        # Is it a kernel process?
        next if File.read('/proc/%i/cmdline' % pid).empty?
      rescue Errno::ENOENT
      end
      return false
    end
    return true
  end

  def wait_until(timeout = 2, interval = 0.25)
    start = Time.now
    begin
      break true if yield
      sleep(interval)
    end while (Time.now - start) < timeout
  end

  ok = false

  action 'Sending SIGTERM to processes' do
    Process.kill(:SIGTERM, -1)
    ok = wait_until(10) { allgone? }
    raise 'Failed' unless ok
  end

  return if ok

  action 'Sending SIGKILL to processes' do
    Process.kill(:SIGKILL, -1)
    ok = wait_until(15) { allgone? }
    raise 'Failed' unless ok
  end

end

Time to mount real file-systems:

NETFS = %w[nfs nfs4 smbfs cifs codafs ncpfs shfs fuse fuseblk glusterfs davfs fuse.glusterfs]
VIRTFS = %w[proc sysfs tmpfs devtmpfs devpts]

action 'Mounting local filesystems' do
  except = NETFS.map { |e| 'no' + e }.join(',')
  system('mount', '-a', '-t', except, '-O', 'no_netdev')
end

# On shutdown

action 'Unmounting real filesystems' do
  except = (NETFS + VIRTFS).map { |e| 'no' + e }.join(',')
  system('umount', '-a', '-t', except, '-O', 'no_netdev')
end

If you are using a modern distribution, chances are your /run and /tmp directories are cleared up on every boot, so many files and directories need to be re-created. We could do this by hand, but we could also use the systemd-tmpfiles utility which uses the configuration already provided by your distribution in tmpfiles.d directories.

action 'Manage temporary files' do
  system('systemd-tmpfiles', '--create', '--remove', '--clean')
end

begin
  File.delete('/run/nologin')
rescue Errno::ENOENT
end

Unless you are using a custom kernel with modules built-in, chances are you are going to need udev, so fire it up:

action 'Starting udev daemon' do
  system('/usr/lib/systemd/systemd-udevd', '--daemon')
end

action 'Triggering udev uevents' do
  system('udevadm', 'trigger', '--action=add', '--type=subsystems')
  system('udevadm', 'trigger', '--action=add', '--type=devices')
end

action 'Waiting for udev uevents to be processed' do
  system('udevadm', 'settle')
end

# On shutdown

action 'Shutting down udev' do
  system('udevadm', 'control', '--exit')
end

Finally

After all this initialization stuff, your system is most likely very usable already, and in fact I was able to start a display manager (SLiM) at this point, which was my main goal while writing this. But we are just getting started.

In control

Another thing init should do is keep track of launched daemons. Each time we do that we store the PID, and when the child exists, we remove it from the list.

def start(id, cmd)
  $daemons[id] = Process.spawn(*cmd)
end

start('agetty1', %w[agetty tty1])

# On SIGCHLD
key = $daemons.key(status)
$daemons.delete(key) if key

Once we have this it’s trivial to report the status of them (e.g. init status agetty1).

ctl.puts($daemons[args.first] ? 'ok' : 'dead')

At this point we actually have a feature that SysVinit doesn’t have. Not bad for 200 lines of code!

cgroups

cgroups is a feature that is often misunderstood, probably because there are no good tools to make use of them, but they are not that hard. Lennart Pottering went to a lot of trouble trying to explain exactly what systemd does with them and it does not, but I don’t think he did a very good job of clarifying anything. Basically systemd is not doing anything with them Normally systemd is not doing anything with them (by default), simply labeling processes so you can see how they are grouped by using visualization tools like systemd-cgls, but that’s it.

The single most important way you can take advantage of cgroups is for scheduling purposes, so for example your web browser is a control group, and your heavy compilation is in another, then Linux scheduler would isolate the two processes from stealing resources from each other without the need of adjusting the nice level. Basically with cgroups there’s no need for nice (although you can use alongside).

But you don’t have to move a finger to get this benefit, the kernel already does that if you have CONFIG_SCHED_AUTOGROUP, which you should. Then, cgroups would be created for each session in the system, if you don’t know what sessions are, you can run ‘ps f -eo pid,sid,cmd‘ to find out to which session id each process belongs to.

To prove this I wrote a little script that finds out the auto-grouping as reported by the Linux kernel, and you can find groups like:

------------------------------------------------------------------------------
503	slim -nodaemon
895	/bin/sh /etc/xdg/xfce4/xinitrc -- /etc/X11/xinit/xserverrc
901	dbus-launch --sh-syntax --exit-with-session
938	xfce4-session
948	xfwm4
952	xfce4-panel
954	Thunar --daemon
956	xfdesktop
958	conky -q
964	nm-applet
------------------------------------------------------------------------------

This is exactly what you would expect, the session leader (SLiM) starts a bunch of processes, and all of them belong to the same session, and if I compile a Linux kernel, I get:

------------------------------------------------------------------------------
14584	zsh
17920	make
20610	make -f scripts/Makefile.build obj=arch/x86
20661	make -f scripts/Makefile.build obj=kernel
20715	make -f scripts/Makefile.build obj=mm
20734	make -f scripts/Makefile.build obj=arch/x86/kernel
20736	make -f scripts/Makefile.build obj=fs
20750	make -f scripts/Makefile.build obj=arch/x86/kvm
20758	make -f scripts/Makefile.build obj=arch/x86/mm
21245	make -f scripts/Makefile.build obj=ipc
21274	make -f scripts/Makefile.build obj=security
21281	make -f scripts/Makefile.build obj=security/keys
21376	/bin/sh -c set -e; 	   echo '  CC      mm/mmu_context.o'; ...
21378	gcc -Wp,-MD,mm/.mmu_context.o.d ...
21387	/bin/sh -c set -e; 	   echo '  CC      ipc/msg.o'; ...
21390	gcc -Wp,-MD,ipc/.msg.o.d ...
21395	/bin/sh -c set -e; 	   echo '  CC      kernel/extable.o'; ...
21399	/bin/sh -c set -e; 	   echo '  CC [M]  arch/x86/kvm/pmu.o'; ...
21400	gcc -Wp,-MD,kernel/.extable.o.d ...
21403	gcc -Wp,-MD,arch/x86/kvm/.pmu.o.d .
21405	/bin/sh -c set -e; 	   echo '  CC      arch/x86/kernel/probe_roms.o'; ...
21407	gcc -Wp,-MD,arch/x86/kernel/.probe_roms.o.d ...
21413	/bin/sh -c set -e; 	   echo '  CC      fs/inode.o'; ...
21415	/bin/sh -c set -e; 	   echo '  CC      arch/x86/mm/srat.o'; ...
21418	/bin/sh -c set -e; 	   echo '  CC      security/keys/keyctl.o'; ...
------------------------------------------------------------------------------

This group will contain a lot of processes that take a lot of resources, but the scheduler knows they belong to the same group. If somebody logs in to my machine and starts running folding@home we would have two cgroups trying to use 100% of the CPU, so the scheduler would assign 50% to one, and 50% to the other, even though the first one has many more processes. Without the grouping, the scheduler would be unfair against folding@home, giving it as much time as it gives each one of the compilation processes.

All this without you moving a finger. Well, almost.

def start(id, cmd)
  pid = fork do
    Process.setsid()
    exec(*cmd)
  end
  $daemons[id] = pid
end

Socket activation

systemd has made a lot of fuss about socket activation, and how it’s the next best thing after sliced bread. I agree it’s a great idea, but the idea didn’t come from systemd, AFAIK it came from OSX. But, do we need systemd to get the same in Linux?

def start_with_socket(id, stream, cmd)

  server = TCPServer.new(stream)

  Thread.new do
    loop do
      socket = server.accept
      system(*cmd, :in => socket, : out => socket)
    end
  end

end

start_with_socket('sshd', 22, %w[/usr/bin/sshd -i])

Believe it or not, this simple code achieves socket activation. We create a socket, and a new thread that waits for connections, if nobody connects, nothing happens, we have an idle thread, each time somebody connects, we launch ssh -i, which as far as I can tell is the same thing xinetd does, and systemd.

But hey, this is the simple socket activation, it’s not the really fancy one.

Thread.new do
  if managed
    IO.select([server])
    pid = fork do
      env = {}
      env['LISTEN_PID'] = $$.to_s
      env['LISTEN_FDS'] = 1.to_s
      Process.setsid()
      exec(env, *cmd, 3 => server)
    end
    $daemons[id] = pid
  else
    loop do
      socket = server.accept
      system(*cmd, :in => socket, : out => socket)
    end
  end
end

There, this does exactly the same thing as systemd (at least for one socket, multiple ones are easy too), so yeah, we have socket activation.

But wait, there’s more

Hopefully this covers the basics of what an init system should do, and how it’s not rocket science, nor voodoo. It is actually something very straightforward; start the system, keep it running, simple. Of course there’s many other things an operating system should do, but those things don’t belong to the init system, don’t let anyone tell you otherwise.

I have more changes on top of this that bring my little toy init system almost up-to-par to Arch Linux’s initscripts, which is what they used before moving to systemd, so chances are if you use my init, you would have little to no problems in your own system.

Unlike systemd and others, this code is actually very readable, so you can add and remove code as you like very easily, and of course, the less code you have, the faster you boot.

Personally when I hear somebody saying “Oh! but OpenRC doesn’t have socket activation, we need systemd!”, I just roll my eyes.

If you want to give it a try, get the code from GitHub:

https://github.com/felipec/finit

Cheers.

finit

Announcing git-fc; a friendly fork of Git

I’ll start with the obvious question; why a fork? Well, the short answer is; my patches are not being applied, the long answer is convoluted and would require long explanation of how Git development works, principles and guidelines, but more importantly the culture of the core developers, and I’m not going to get into that, maybe in the comments section if somebody is interested.

So what is git-fc? It is a friendly fork, and by that I mean that it’s a fork that won’t deviate from the mainline, it is more like a branch in Git terms. This branch will move forward close to Git’s mainline, and it could be merged at any point in time, if the maintainer wished to do so.

git-fc doesn’t include experimental code, or half-assed features, so you can expect the same level of stability as Git’s mainline. Also, it doesn’t remove any feature, or do any backwards incompatible changes, so you can replace git with git-fc and you wouldn’t notice the difference. The delta comes in the extra features that I’ll describe in detail below, that is all.

Who am I? I’ve contributed many patches to Git, mainly the git-remote-hg/bzr two-way bridges, but many many other things. Here’s a list of the top 10 contributors to Git since last year by number of patches:

% git shortlog --since='1 year ago' --no-merges -n -s | head -n 10
   388	Junio C Hamano
   308	Felipe Contreras
   230	Jeff King
   161	Nguyễn Thái Ngọc Duy
   122	Michael Haggerty
   103	Ramkumar Ramachandra
    96	John Keeping
    69	Eric Sunshine
    59	Thomas Rast
    51	René Scharfe

More info in ohloh.

As you see, I’ve done a lot of work for Git’s mainline, so chances are you have already benefited from my code one way or the other.

However, the most interesting patches are not merged. I wrote a summary of my 160 patches, explaining their status, so Git developers would prioritize them, but I think it’s fair to say they are just not going to apply them.

So, what do you get if you use git-fc?

@ shortcut

Many people have suggested a shortcut for the non-particularly-intuitive “HEAD”, but none of these suggestions seemed very appealing, or feasible.

Because Git already has an ref@op revision syntax, where if you remove the ref, HEAD is implied, I thought @ could be thought as HEAD.

This change was welcome and accepted by the Git mainline, and it even was on track for v1.8.4 but it was dropped last minute because of some issues that are fixed now, and you probably will see it in v1.8.5. But why wait? 🙂

Nice ‘branch -v’

If you have configured the upstream tracking branch for your branches (I wrote a blog post about them), when you do ‘git branch -v’ you see something like this:

  fc/branch/fast      177dcad [ahead 2] branch: reorganize verbose options
  fc/stage            abb6ad5 [ahead 14] completion: update 'git reset' ...
  fc/transport/improv eb4d3c7 [ahead 10] transport-helper: don't update ...

While that provides useful information, it doesn’t show the upstream tracking branch, just says “ahead 2” but “ahead 2” compared to what?

If you do ‘git branch -vv’, then you see the answer:

  fc/branch/fast      177dcad [master: ahead 2] branch: reorganize ...
  fc/stage            abb6ad5 [master: ahead 14] completion: update ...
  fc/transport/improv eb4d3c7 [master: ahead 10] transport-helper: don't ...

Unfortunately both options take a lot of time (relative to most Git commands which are instantaneous), because computing the “ahead 2” takes a lot of time. So I decided to switch things around, so ‘git branch -v’ gives you:

  fc/branch/fast      177dcad [master] branch: reorganize verbose options
  fc/stage            abb6ad5 [master] completion: update 'git reset' new ...
  fc/transport/improv eb4d3c7 [master] transport-helper: don't update refs ...

And it does so instantaneously.

Default aliases

Many (if not all) version control system tools have shortcuts for their most common operations; hg ci, svn co, cvs st. But not Git. You can configure your own aliases manually, but you might have some trouble if you use somebody else’s machine.

Adding default aliases is trivial, it helps everyone, and it doesn’t hurt anyone, yet the patch to do so was rejected.

For now, there are only four aliases, but more can be added later if they are requested.

co = checkout
ci = commit
rb = rebase
st = status

If you have already these aliases, or mapped to something else, your aliases would take precedence over the default ones, so you won’t have any problems.

Streamlined remote helpers

I have spent a lot of time working on git-remote-hg and git-remote-bzr, and although they are relatively new, they have proven to be quite stable and solid, yet they are only part of the “contrib” area side by side with much simpler and way less solid scripts.

In order these in Git mainline you might need a bit of tinkering, and it’s not straight-forward to package them for distributions.

With git-fc they are installed by default, and in the right way, making things easier for distributions.

Improvements to the transport helper

The two way bridges between Git and Mercurial/Bazaar already work quite well, but they lack some features, specifically you cannot do –force, or –dry-run, or use an old:new refspec. If you are not familiar with the old:new refspec; you can do ‘git push master:my-master’, which would push your ‘master’ branch, as if it was named ‘my-master’ in the remote repository.

This is extremely useful if you are really serious about using Git as a transparent client to access a Mercurial repository.

New core.mode configuration

Git is already preparing users for the v2.0 release which would bring minor backward compatibility breakage, but some people would rather get rid of the warnings which are going to stay probably for many releases more and just move to the new behavior already.

Testing Git v2.0 behavior today would not only help git-fc, but also the Git mainline, and you can do that by setting core.mode = next, so if you do this and provide feedback about any issues, that would be greatly appreciated. Unfortunately you cannot test the v2.0 behavior in Git mainline because they rejected the patches, but you can in git-fc.

Please note that the v2.0 behavior might change in the future, before v2.0 is released, so if you enable this mode you need to be aware of that. Chances are you are not going to notice any difference anyway.

In addition to the “next” (v2.0) mode, there’s the “progress” mode. This mode enables “next” plus other configurations that have been proposed to change by default in v2.0, but hasn’t yet been agreed.

In particular, you get these:

merge.defaulttoupstream = true
branch.autosetupmerge = always
mergetool.prompt = false

There might be more in the future, and suggestions are welcome.

It is recommended that you setup this mode for git-fc:

git config --global core.mode progress

Non-ff pulls rejected by default

Even in the Git project everybody has agreed this is the way to go in order to avoid the typical Git newbie making the mistake of doing a merge, when perhaps (s)he wanted to do git reset, or git rebase. With this change git complains that that a non-fast-forward branch is being pulled, so the user has to decide what to do.

The user would have to do either ‘git pull --merge‘ or ‘git pull --rebase‘, the former being what Git mainline currently does.

The user can of course choose the old behavior, which is easy to configure:

git config --global pull.mode merge

Official staging area

Everybody already uses the term “staging area” already, and Git developers also agreed it the best term to what is officially referred to as “the index”. So git-fc has new options for all commands that modify the staging area (e.g. git grep –staged, git rm –staged), and also adds a new git stage command that makes it easier to work with the staging area.

'git stage' [options] [--] [...]
'git stage add' [options] [--] [...]
'git stage reset' [-q|--patch] [--] [...]
'git stage diff' [options] [] [--] [...]
'git stage rm' [options] [--] [...]
'git stage apply' [options] [--] [...]
'git stage edit'

Without any command, git stage adds files to the stage, same as git add, same as in Git mainline.

New fetch.default configuration

When you have configured the upstream tracking branch for all your branches, you will probably have tracking branches that point to a local branch, for example feature-a pointing to master, in which case you would get something like:

% git fetch
From .
 * branch            master     -> FETCH_HEAD

Which makes absolutely no sense, since the ‘.’ repository is not even documented, and FETCH_HEAD is a marginally known concept. In this case git fetch is basically doing nothing from the user’s point of view.

So the user can configure fetch.default = simple to get a simple sensible default; ‘git fetch‘ will always use origin by default, which is not ideal for everyone, but it’s better than the current alternative.

If you use the “progress” mode, this option is also enabled.

Publish tracking branch

Git mainline doesn’t have the greatest support for triangular workflows, a good solution for that is to introduce a second “upstream” tracking branch which is for the reverse; the branch you normally push to.

Say you clone a repository (libgit2) in GitHub, then create a branch (feature-a) and push it to your personal repository, you would want to track two branches (origin/master), and (mine/feature-a), but Git mainline only provides support for a single upstream tracking branch.

If you setup your upstream tracking branch to origin/master, then you can just do git rebase without arguments and git will pick the right branch (origin/master) to rebase to. However, git push by default will also try to push to origin/master, which is not what you want. Plus git branch -v will show how ahead/behind your branch is compared to origin/master, not mine/feature-a.

If you set up your upstream to mine/feature-a, then git push will work, but git rebase won’t.

With this option, git rebase uses the upstream branch, and git push uses the publish branch.

Setting the publish tracking branch is easy:

git push --set-publish mine feature-a

Or:

git branch --set-publish mine/feature-a

And git branch -v will show it as well:

fc/branch/fast      177dcad [master, gh/fc/branch/fast] branch: ...
fc/stage            abb6ad5 [master, gh/fc/stage] completion: ...
fc/transport/improv eb4d3c7 [master, gh/fc/transport/improv] ...

Support for Ruby

By far the most complex and interesting feature, but unfortunately also the one that is not yet 100% complete.

There is partial optional support for Ruby. Git already has tooling so any language can use it’s plumbing and achieve plenty of tasks:

IO.popen(%w[git for-each-ref]) do |io|
io.each do |line|
sha1, kind, name = line.split()
# stuff
end
end

However, this a) requires a process fork, and b) requires I/O communication to get the desired data. While this is not a big deal on many systems, it is in Windows systems where forks are slow, and many Git core programs don’t work as well as they do in Linux.

Git has a goal to replace all the core scripts with native C versions, but it’s a goal only in name that is not actually pursued. In addition, that still leaves out any third party tools since Git doesn’t provide a shared libgit library, which is why an independent libgit2 was needed in the first place.

Ruby bindings solve these problems:

for_each_ref() do |name, sha1, flags|
# stuff
end

The command ‘git ruby‘ can use this script by providing the bindings for many Git’s internal C functions (though not all), which makes it easier to write Ruby programs that take full advantage of Git without any need of forks, or I/O communication.

Conclusion

As you might guess, I’ve spent a lot of time working on all these features, plus all the ones that are already merged in Git’s mainline. Hopefully they are useful to some people.

It’s easy to compile and install:

make install

By default git will be installed in your home directory, but you can also do what I do: ‘make prefix=/opt/git install‘, and add ‘/opt/git/bin’ to your $PATH. All you need is a few development packages; zlib, curl, expat, openssl.

The code is in Github, the home page is in Google code, and the mailing list in Google groups. All comments and patches are welcome.

You can find future comments and releases in this blog, under the git-fc tag.

git-fc