dot-team: a new way of managing dotfiles

After more than 20 years of configuring Linux systems I’ve become increasingly sensitive to configuring yet another system from scratch. That’s why about a decade ago I decided to store my configurations in a git repository: felipec/dotfiles.

The way to do this properly is with a little trick I learned from a blog post I can’t recall anymore:

alias config='git --git-dir=$DOTFILES/.git/ --work-tree=$HOME'

With this trick it’s possible to have your git repository in a different directory than your work tree. That means your files are in $HOME, while your dotfiles’ git repository somewhere else. When you do config diff in your home directory, git knows where your dotfiles’ repository actually is.

You can issue all git commands with the config command instead, and everything works almost magically (e.g. config diff, config status, config add, etc.).

While this method works for me–and apparently it works for other people as well–it appears quite redundant for thousands of people to repeat what are essentially the same obvious configurations (how many people do you know that maintain their dotfiles in GitHub?).

Let’s take a simple example:

alias ls='ls --color=auto'

This is a simple configuration that most people would want (in fact coreutils should probably make it the default). Why do thousands of people need to set this up?

That’s why I’ve created dot-team, a collection of collaborative dotfiles. The idea is to share the most common configurations most of us would want, so that we don’t have to do the same thing over and over. Our personal configurations (stuff not everyone necessarily wants) can then sit on top of the master dot-team configuration.

For an example of what shared configurations can achieve, take a look at the oh-my-zsh project.


Let’s look at some basic configurations in dot-team.


alias ls='ls --color=auto'
alias grep='grep --color=auto'
alias diff='diff --color=auto'

Colors are very important to me–and I bet for most people–so it makes sense that all the basic utilities have color enabled by default. The reason why --color=auto is used instead of --color is that the latter is essentially --color=always, which would output color if you do something like ls -1 > list.txt, which is not something we want. The auto option uses color only on a tty (when not outputting to a file, or another program).


PS1='\[\e[1;34m\]\h \[\e[0;37m\]\w \[\e[0;32m\]\$\[\e[0m\] '

host dir $ command

While the prompt is something most people will probably want to change, it makes sense to have a nice simple default.


While dot-team is meant to be editor-agnostic, vim is the editor I use, and I know what is a good minimal configuration:

syntax on
filetype plugin indent on

set hlsearch
set clipboard^=unnamedplus

We could load $VIMRUNTIME/defaults.vim, but I find most of those configurations unnecessary (and many are personal preferences of Bram Moolenaar).

I find syntax and file type detection to be indispensable, and so is highlighted search.

As for the clipboard, we obviously want to yank text automatically to the X11 CLIPBOARD selection, not PRIMARY, so we want to copy to the + register, not the *. That’s what unnamedplus does. Also, we want selecting text to copy to the PRIMARY selection, autoselect does that, but that’s the default, so in order to not override the default, I prepend the new clipboard value (^=).

If you have some good defaults for emacs, feel free to send a pull request.


export PATH="$HOME/bin:$HOME/.local/bin:$PATH"

It’s very typical to have executables in ~/bin, and ~/.local/bin (for example when installing tools with pip or npm).


alias v='$VISUAL'
alias e='$EDITOR'
alias o='xdg-open'

The very essential aliases in my opinion are those needed to open and edit files.

xdg-open is the most standard way to open files (or URLs) in Linux. To edit files the $VISUAL configuration is used (e.g. vim), however there’s a difference between VISUAL and EDITOR: VISUAL is supposed to be a full-fledged editor (like gvim) while EDITOR is supposed to be a simple editor (like ed or vi). I personally use graphical vim as my full-fledged editor (opens a new window), and terminal vim as my simple editor (same terminal window). You can set both as the same editor if you like.



setopt auto_cd

autoload -U compinit && compinit

dot-team is shell-agnostic, that means both bash and zsh are supported.

To make zsh minimally work correctly, I’ve found history needs to be enabled, and the completion system (which is amazing in zsh). Moreover, the auto_cd option is very nice.


    "Return": exo-open --launch TerminalEmulator
    "b": exo-open --launch WebBrowser
    left_workspace: "Left"
    right_workspace: "Right"
    up_workspace: "Up"
    down_workspace: "Down"
    workspace_1: "1"
    workspace_2: "2"
    workspace_3: "3"
    workspace_4: "4"
    "/general/easy_click": Super
    "/general/cycle_tabwin_mode": 1
    "/compat/LaunchGNOME": true
    "/panels/dark-mode": true
- size: 30
  length: 100
  length-adjust: false
  position: p=10;x=0;y=0
  position-locked: true
  icon-size: 16
  - - whiskermenu
  - - tasklist
  - - separator
    - expand: true
      style: 0
  - - windowmenu
  - - pager
  - - systray
    - icon-size: 16
      square-icons: true
  - - pulseaudio
  - - power-manager-plugin
  - - notification-plugin
  - - clock
    - digital-format: "%R"

Throughout the years I’ve found it’s a hassle to maintain Xfce configuration files, so I created a tool to load and dump entire configurations in an easy way: xfce-config-helper.

With this tool it’s possible to load an entire configuration in one go, like the simple configuration above. It’s also easy to make changes, maintain it in a git repository, and share it with other people.


This is just a taste, there’s plenty of other basic configurations, like aliases for common ls shortcuts (e.g. ll), aliases for common git commands (e.g. git st), colorized man pages, code to change the title of windows (in both bash and zsh), sane readline defaults, and more.

Here I’m just showing examples of configurations I think most people would want, or at least the least common denominator.

If you think some basic configuration is missing, simply send a pull request so we all can benefit from it.


It’s easy to configure a system, what is hard is to keep the configuration repository up to date. The git trick mentioned above helps a lot, but it’s better to have some standard tools to maintain dotfiles.

That’s why I created dot-tools: a set of tools to maintain dotfiles using git.

While I created dot-tools to work together with dot-team, you don’t need to: dot-tools can be used in any dotfiles repository, independently of dot-team.

Getting started

To get started all you need to do is:


This will create an empty git repository in ~/.local/share/dot-files, which is the standard location used in dot-tools.

Then, to add a file and keep track of it, you need to first add the file, then create a commit (just like in git):

dot-config add .bashrc
dot-config commit

dot-config is the main command of dot-files, and will be used a lot, that’s why I recommend you create an alias (e.g. alias c=dot-config)–this is already done by default in dot-team.

This is enough to get started. Each time you want to update your tracked file (e.g. .bashrc), you simply do the same commands again, or even simpler:

c commit $file

Since this commit command is going to be used a lot, the dot-team project has an alias: co.


The above is enough to get you started and keep track of your configuration files, however, there’s at least two very important commands that will help you maintain clean dotfiles.


Let’s say you’ve made changes to your .bashrc file, to see those changes you can run the following command:

% c diff .bashrc
diff --git a/.bashrc b/.bashrc
index 41f62c2..8e94f75 100644
--- a/.bashrc
+++ b/.bashrc
@@ -1,6 +1,6 @@
 source ~/.aliases
-PS1='\[\e[1;34m\]\h \[\e[0;37m\]\w \[\e[0;32m\]\$\[\e[0m\] '
+PS1='|\[\e[1;35m\]\t\[\e[m\]|\[\e[1m\]\u\[\e[m\]@\[\e[1;36m\]\h\[\e[m\]:\[\e[1;32m\]\W>\[\e[m\] '
 # Change the window title of X terminals
 PROMPT_COMMAND='echo -ne "\e]2;${PWD/#$HOME/\~}\a"'

This is the typical diff format, it should be clear which line was removed (red), and which line was added (green).

If you run the command without specifying any file (c diff), then you can see all the changes to all the tracked files.

Then you can choose to commit changes to specific files (c commit $file), or even specify which specific chunks you want in your commit (c commit --patch $file).

All of this comes from git.


% c status --short
 M .bashrc
?? .config/discord/

The other useful command is c status which gives you an overview of all the files that have been modified, and also the new files that haven’t been tracked yet.

When you see an untracked file you have two options: either add the file to the repository in order to track it, or add it to the .gitignore file in order to ignore it.

Once you have all your tracked files updated and no untracked files, c status will show nothing.


It is straightforward to keep an existing configuration up-to-date, but what happens the next time you want to apply some dotfiles into a new machine?

dot-checkout is a tool that will essentially copy all the files from the dotfiles repository (~/.local/share/dot-files) into your home directory. In theory you only need to do this once.

However, sometimes you might want to do other stuff, like for example downloading some vim plugins, or some themes, this is the job of the bootstrap script, which lives inside .config/dot-tools/bootstrap.

dot-setup is a tool that will do all that for you: clone the dotfiles repository, and then call dot-bootstrap, which simply calls the bootstrap script.

For an example of a bootstrap script, check the one of the dot-team project: bootstrap.

The idea is that with just one command you go from a blank home directory into a fully updated and managed setup.

Just do it

Hopefully at this point it should be clear why using a standard way to manage dotfiles using dot-tools might be desirable. And also why sharing dotfiles using the dot-team repository as a basis might be helpful.

So where do you start?

Simply clone the dot-tools repository and run make install.

git clone
cd dot-tools
make install

Alternatively if you are using Arch Linux, you can install the AUR package:

yay -Sy dot-tools

Note: somebody removed the AUR package, so this isn’t available right now



And that’s it.

You don’t have to worry about your existing configuration being overridden, dot-checkout will only copy files that don’t already exist.

Then, when you do dot-config diff you’ll see the delta between your configuration, and the dot-team configuration. Then you can choose to adopt some of the dot-team configurations, or update the repository to use your current stuff.

Eventually you might want to update the basis version of dot-team, rebase your personal configuration, or perhaps even send a pull request upstream, but that’s for another blog post.

If you have any question feel free to add a comment here, or add an issue in the project’s tracker: dot-team issues.

I hope you find this useful đź‘Ť

xfce-config-helper 0.1 released

Over the years I’ve tried many ways to store my Xfce configuration in a way that is easier to read and maintain, and ultimately I ended up creating my own format and developed a tool to make use of it.


Xfce’s configurations are handled with Xfconf, a hierarchical configuration system. Essentially configurations are grouped in “channels” and each channel has a tree of configurations, for example:

<?xml version="1.0" encoding="UTF-8"?>

<channel name="mychannel" version="1.0">
  <property name="group" type="empty">
    <property name="property" type="string" value="foobar"/>

Here is the configuration for a channel called “mychannel” which has a single property: “/group/property”. This file is stored as “mychannel.xml”.

To fetch this configuration we can use the standard tool xfconf-query:

xfconf-query --channel mychannel --property /group/property

In theory we could just store our configuration and a sequence of xfconf-query commands, but it turns out different applications use this configuration store in different ways and the underlying data isn’t human-friendly.

Let’s see how a simple panel looks like:

<?xml version="1.0" encoding="UTF-8"?>

<channel name="xfce4-panel" version="1.0">
  <property name="configver" type="int" value="2"/>
  <property name="plugins" type="empty">
    <property name="plugin-1" type="string" value="whiskermenu"/>
    <property name="plugin-2" type="string" value="tasklist"/>
    <property name="plugin-3" type="string" value="pager"/>
  <property name="panels" type="array">
    <value type="int" value="1"/>
    <property name="panel-1" type="empty">
      <property name="plugin-ids" type="array">
        <value type="int" value="1"/>
        <value type="int" value="2"/>
        <value type="int" value="3"/>

This is just one panel with three plugins. When you have dozens of plugins it gets pretty ugly, pretty fast.

I used to edit these XML files by hand, and it’s hell, especially after I’ve used the UI to change the configuration, and after a while of not committing changes I end up reorganizing the whole file just to see what changes I actually made.

The second problem is that you are not supposed to edit these XML files while Xfce is running, so you have to logout and go the console before editing these files, or do some other tricks.

YAML to the rescue

I’m a big fan of configurations using YAML, not only are they much less verbose than XML from the get-go, but they are intended to be read and modified by humans.

With YAML our first configuration is much simpler:

    "/group/property": "foobar"

And the panel:

- plugins:
  - - whiskermenu
  - - tasklist
  - - pager

All we have to do is use the tool xfconf-load, and pass this file as argument. The configuration will be loaded on the fly using D-Bus.

Moreover, if we’ve made changes to the configuration using the UI, there’s another tool to grab the current configuration xfconf-dump, and it will ignore default values in order to present a configuration that is easy to maintain.

Since the entire configuration is in a single file, you can have layers of configuration, so for example you could have common configurations in a common.yml file, and then on top of that have a configuration for your home PC, and another for your work laptop. All you have to do is load first the common configuration, and then the specific one.

Why not give a try? The installation instructions are in the GitHub repository, and there’s an AUR package for Arch Linux.

Here’s a very useful initial configuration:

    "Return": xfce4-terminal
    "b": chromium
    "r": rlaunch
    "r/startup-notify": false
    left_workspace: "Left"
    right_workspace: "Right"
    up_workspace: "Up"
    down_workspace: "Down"
    workspace_1: "1"
    workspace_2: "2"
    workspace_3: "3"
    workspace_4: "4"
    "/general/easy_click": Super
    "/general/cycle_tabwin_mode": 0
    "/general/cycle_preview": false
    "/general/theme": Arc-Darker
    "/compat/LaunchGNOME": true
    "/general/SaveOnExit": false
    "/panels/dark-mode": true
    "/Default/KeyRepeat/Delay": 200
    "/Default/KeyRepeat/Rate": 50
    "/desktop-icons/style": 0
    "/Net/ThemeName": Arc-Darker
    "/Net/IconThemeName": Arc-X-D
    "/Xft/DPI": 108
- size: 30
  length: 100
  length-adjust: false
  position: p=10;x=0;y=0
  position-locked: true
  icon-size: 16
  - - whiskermenu
  - - tasklist
    - grouping: 0
      sort-order: 0
      flat-buttons: true
  - - separator
    - expand: true
      style: 0
  - - windowmenu
  - - separator
    - style: 0
  - - pager
    - rows: 2
  - - separator
    - style: 0
  - - systray
    - icon-size: 16
      square-icons: true
  - - pulseaudio
  - - power-manager-plugin
  - - notification-plugin
  - - clock
    - digital-format: "%R"

Or checkout my current full configuration.


Refactoring JavaScript promise tests

I’ve been writing some tests for promises in QUnit, and I noticed many areas of opportunity to simplify the code.

Tests in QUnit are simple:

test('error check', assert => {
  const error = 'Bad Request';
  assert.equal(error, 'Bad Request');

Not so much when promises are involved:

test('promise', assert => {
  const done = assert.async();

  /* get promise */

  promise.then(result => {
    assert.equal(result, 'success');
  .finally(() => {

We need to call assert.async() in order to let QUnit know that a promise is pending, and call done() after we have processed it.

There’s another way to do it though, by returning a Promise that processes the promise, with then():

test('promise', assert => {
  /* get promise */

  return promise.then(result => {
    assert.equal(result, 'success');

This way QUnit knows to wait for the promise itself. But there’s a simpler way of returning such Promise:

test('promise', async assert => {
  /* get promise */
  assert.equal(await promise, 'success');

Although it may not look like it, this code returns exactly the same Promise as before, but by using async and await the code is much easier to read.

However, sometimes I don’t want the code to succeed, I want it to fail, and with a particular error, in such cases the code becomes complicated again:

test('failing promise', async assert => {
  /* get promise */

  try {
    await promise;
    assert.pushResult({ result: false });
  } catch(e) {
    assert.equal(e, 'Bad Request');

In this case if the promise fails, that’s actually good, and we check the specific error. If the promise succeeds, then we invariably return a failure with assert.pushResult() (which is basically an assert.true(false)).

This code is too convoluted for my taste, fortunately QUnit provides a helper for these cases:

test('bad promise', assert => {
  /* get promise */
  assert.rejects(promise, e => e == 'Bad Request');

There’s two problems with this approach. First, the tests are not symmetrical anymore, some use async, and others use assert.rejects(), if there was a companion assert.resolves(), then I could use the same format for all my tests. The second problem is that if the error is not precisely "Bad Request" QUnit doesn’t show what was expected.

Diving in

Let’s take a look at QUnit’s rejects() method:

rejects( promise, expected, message ) {

  [ expected, message ] = validateExpectedExceptionArgs( expected, message, "rejects" );

  const currentTest = ( this instanceof Assert && this.test ) || config.current;

  const then = promise && promise.then;
  if ( objectType( then ) !== "function" ) {
    const message = "The value provided to `assert.rejects` in " +
      "\"" + currentTest.testName + "\" was not a promise.";

    currentTest.assert.pushResult( {
      result: false,
      message: message,
      actual: promise
    } );


  const done = this.async();

    function handleFulfillment() {
      const message = "The promise returned by the `assert.rejects` callback in " +
      "\"" + currentTest.testName + "\" did not reject.";

      currentTest.assert.pushResult( {
        result: false,
        message: message,
        actual: promise
      } );


    function handleRejection( actual ) {
      let result;
      [ result, expected, message ] = validateException( actual, expected, message );

      currentTest.assert.pushResult( {

        // leave rejection value of undefined as-is
        actual: actual && errorString( actual ),
      } );

Let’s get rid of the noise. First, validateExpectedExceptionArgs simply checks the type of expected, currentTest isn’t really used except to get the tests’s name, we are going to provide a valid then function so no need for an extra check, and in validateException we will only handle one case to simplify the code.

Then it’s much easier to see the main logic.

rejects( promise, expected, message ) {
  const assert = this;
  const done = this.async();

    function handleFulfillment() {
      const message = "The promise returned by the `assert.rejects` callback did not reject.";

      assert.pushResult( {
        result: false,
        message: message,
        actual: promise
      } );


    function handleRejection( actual ) {
      let result;

      try {
        result = {}, actual ) === true;
        expected = null;
      } catch ( e ) {
        expected = String( e );

      assert.pushResult( {

        // leave rejection value of undefined as-is
        actual: actual && String( actual ),
      } );

Let’s cleanup the style and do some trivial simplifications:

rejects(promise, expected, message) {
  const assert = this;
  const done = this.async();

    () => {
      const message = "The promise returned by the `assert.rejects` callback did not reject.";
      assert.pushResult({ result: false, message: message, actual: promise });

    (actual) => {
      let result;

      try {
        result ={}, actual) === true;
        expected = null;
      } catch(e) {
        expected = e;

      assert.pushResult({ result, actual, expected, message });

One very simple simplification is that{}, actual) doesn’t pass a real object, so we can just do expected(actual), and removing the unnecessary catch, we get:

rejects(promise, expected, message) {
  const assert = this;
  const done = this.async();

    () => {
      const message = "The promise returned by the `assert.rejects` callback did not reject.";
      assert.pushResult({ result: false, message: message, actual: promise });

    (actual) => {
      const result = expected(actual);
      assert.pushResult({ result, actual, message });

Using a more standard promise style:

rejects(promise, expected, message) {
  const done = this.async();

  return promise
    .then(() => {
      const message = "The promise returned by the `assert.rejects` callback did not reject.";
      this.pushResult({ result: false, message: message, actual: promise });
    .catch((actual) => {
      const result = expected(actual);
      this.pushResult({ result, actual, message });
    .finally(() => {

Since we are not using the promise as the first argument of the call() function, this isn’t overridden, and therefore there’s no need to save it in the assert variable.

Finally, using async and await:

async rejects(promise, expected, message) {
  const done = this.async();

  try {
    await promise;
    const message = "The promise returned by the `assert.rejects` callback did not reject.";
    this.pushResult({ result: false, message: message, actual: promise });
  } catch(actual) {
    const result = expected(actual);
    this.pushResult({ result, actual, message });


Now it’s much easier to see what the code is actually doing.

The first thing that is obvious is that “actual: promise” doesn’t really make sense, but much more important is the calling logic.

In our test, we would call this function like:

assert.rejects(promise, e => e == 'Bad Request');

That works, but if the exception isn’t exactly "Bad Request", we will simply see “expected: undefined“, because the rejects() function doesn’t know what is the expected value.

There’s an easy way to improve this, simply let the callback throw the assertion itself:

assert.rejects(promise, e => assert.equal(e, 'Bad Request'));

This way the expected value will be shown, and not only that, but the stack trace as well.

not ok 1 original assert.rejects
  message: failed
  severity: failed
  actual  : Not Found
  expected: null
not ok 2 improved assert.rejects
  message: failed
  severity: failed
  actual  : Not Found
  expected: Bad Request
  stack: |
        at test.js:97:39
        at Assert.QUnit.assert.rejects (test.js:27:19)

And the handling code is even simpler:

async rejects(promise, expected) {
  const done = this.async();

  try {
    await promise;
    const message = "The promise returned by the `assert.rejects` callback did not reject.";
    this.pushResult({ result: false, message: message });
  } catch(actual) {


Much much simpler, more useful, and does the exact same thing we would do ourselves with a try...catch, but that’s not all.

If we don’t care about the expected exception and just do assert.rejects(promise), the test would fail because no assertion was performed. All we have to do is check for presence of the callback function, and return true if it isn’t there. While we are at it, name it callback instead of expected, and do some final cleanups.

async rejects(promise, callback) {
  const done = this.async();

  try {
    await promise;
    this.pushResult({ result: false, message: 'The promise did not reject.' });
  } catch(e) {
    if (callback) callback(e);
    else this.pushResult({ result: true });


This is superior to QUnit’s method, and now that it’s clear what it does, the companion is easy to derive:

async resolves(promise, callback) {
  const done = this.async();

  try {
    const actual = await promise;
    if (callback) callback(actual);
    else this.pushResult({ result: true });
  } catch(e) {
    this.pushResult({ result: false, message: e.message });


Now we can do both kinds of tests in a very similar way:

assert.resolves(run_code('ok'), actual => assert.equal(actual, 'success'));
assert.rejects(run_code('error'), e => assert.equal(e, 'Bad Request'));

You can even do both assertions in the same test, while normally you would have to do something like:

const done_resolves = assert.async();

  .then(actual => assert.equal(actual, 'success'))
  .catch(e => assert.pushResult({ result: false, message: e.message }))
  .finally(() => done_resolves());

const done_rejects = assert.async();

  .then(() => assert.pushResult({ result: false }))
  .catch(e => assert.equal(e, 'Bad Request'))
  .finally(() => done_rejects());

Ew! No thank you.

This is why I like refactoring code. Once the code is clean and simple, improving it is actually pleasant.

How not to ban a prolific git developer

On the 28th of July 2021 I received an email from Git’s Project Leadership Committee saying that my recent behavior on the mailing list was found to be in violation of the Code of Conduct. This was a complete surprise to me.

What was particularly surprising to me is the fact that I’m very familiar with the Code of Conduct, not only have I reported CoC violations in the past, but I’m the only person who has sent a patch to try to improve it, not only to the git mailing list, but the upstream project Contributor Covenant as well. It is because of this that I know the document doesn’t demand the examples of positive behavior in the section titled “Our Standards”, so the only objective reason why a project leader could argue that I violated the CoC would be if I had committed an action on the list of examples of unacceptable behavior:

  • The use of sexualized language or imagery, and sexual attention or advances of any kind
  • Trolling, insulting or derogatory comments, and personal or political attacks
  • Public or private harassment
  • Publishing others’ private information, such as a physical or email address, without their explicit permission
  • Other conduct which could reasonably be considered inappropriate in a professional setting

Did I do any of those things? You will be the judge.

If you don’t like drama then don’t read this post. Although there’s some technical discussion, it’s mostly an analysis of my alleged CoC violations and their context. According to the Project Leadership Committee the following ten emails–which I will list as exhibits–violated the CoC.

Exhibit 1

This is a common debate tactic known as “shifting the burden of proof”.

Ævar does not need to prove that your patch is undesirable, you have to prove that it is desirable.

You have the burden of proof, so you should answer the question.

Felipe Contreras

In this mail I am telling Johannes Schindelin my opinion about what is the default position regarding any patch: the patch is not needed. This is not even contentious. The first thing Junio C Hamano (the maintainer) does is ask: why is this patch needed? If this question is not answered to his satisfaction the patch is rejected. And I’m not the only one that has argued precisely this point:

I don’t need that data. You are proposing a change so it is your duty to support your claim that the change is worthwhile.
Otherwise it’s a change just for the sake of change.

Michal Suchánek

How is my comment remotely close to any of the examples of unacceptable behavior? For that matter, isn’t Michal’s behavior worse? Not to mention that it’s a response to Johannes’s comment, which is objectively way more aggressive:

> Why put this in an ifdef?

Why not? What benefit does this question bring to improving this patch series?

Johannes Schindelin

Johanness is avoiding a well-intentioned question by doubting the good faith of Ævar, and it became even more clear later on in the thread:

You still misunderstand. This is not about any “opinion” of yours, it is about your delay tactics to make it deliberately difficult to finish this patch series, by raising the bar beyond what is reasonable for a single patch series.

And you keep doing it. I would appreciate if you just stopped with all those tangents and long and many replies that do not seem designed to help the patch series stabilize, but do the opposite.

Johannes Schindelin

How am I the bad guy here? And for the record, Ævar is part of the Project Leadership Committee.

Exhibit 2

This is loaded language. You are inserting your opinion into the text.

Don’t. The guidelines are not a place to win arguments.

Note that this sounds ungrammatical and unnatural to some people.

And it sounds ungrammatical because it is ungramatical, not only to native English speakers, but professional linguists.

Felipe Contreras

This is part of a long discussion in which Derrick Stolee attempted to change the guidelines to explicitly avoid any and all gendered language (he/she) for ungendered (they).

One argument that Derrick kept repeating is that only non-native English speakers find the singlar “they” ungrammatical, but I kept repeating that’s not true, and as example I used the test the American Heritage Dictionary used in their usage note regarding “they”:

We thank the anonymous reviewer for their helpful comments.

58% of the panel found the sentence unacceptable. The Usage Panel is comprised of writers, professors, linguists, editors, and multiple Pulitzer Prize winners.

Derrick ignored all my feedback, and for the record he also ignored all the feedback from Ævar, and in fact native speakers in the mailing list stated that they found these sentences ungrammatical too.

Regardless of on which side you are on, Derrick tried to win the argument by adding this to the guidelines:

Note that this sounds ungrammatical and unnatural to readers who learned English in a way that dictated “they” as always plural, especially those who learned English as a second language.

All I said is that he shouldn’t insert his opinion into the guideline, and instead should use something neutral he knows everyone can agree with:

Note that this sounds ungrammatical and unnatural to some people.

Once again, how is stating my opinion “unacceptable”?

Exhibit 3

Yeah, now you are starting to see the problem.

How many more failed attempts do you need to go through before accepting that the approach you thought was feasible is in fact not feasible?

The solution is simple and self-documenting:


Felipe Contreras

In my 8,200-word article git update: the odyssey for a sensible git pull I explored around 13 years of discussions in the mailing list regarding problems with git pull, and the first and more obvious solution is my proposed pull.mode configuration which was a direct competitor to different solutions proposed by Junio C Hamano.

For some reason Junio didn’t accept this proposal in 2013 even though other people were in favor of it. I tried again in 2020 multiple times, and after trying all the approaches other people proposed I became convinced it was the only feasible solution.

Elijah Newren–who we will later see is an important actor in this story–decided to try to fix the problem by himself, but every patch series he sent struck a dead end. It became clear to me that Elijah started to see the problem:

However, even if the above table is filled out, it may be complicated enough that I’m at a bit of a loss about how to update the documentation to explain it short of including the table in the documentation.

Elijah Newren

Yes, that’s precisely the reason why I opted for a solution that was a) easy to understand, b) easy to document, c) easy to test, and d) easy to program: pull.mode.

Elijah ignored all my feedback to all his patch series, which is why he still hasn’t realized his patches will break behavior current users rely on. I ran a poll on r/git and 19% of users responded that they rely on the behavior Elijah plans to break, and 15% said even though they don’t use it, they know what it should do.

All Elijah is achieving by ignoring me is at best wasting time, and at worst hurting users. Either way it’s not my fault.

How is my comment unacceptable? I’m just stating facts and my opinion. Even if my opinion is wrong that doesn’t make my comment “unacceptable”.

Exhibit 4

That is a problem specific for your shop.

The defaults are meant for the majority of users. If a minority of users (who happen to be working under the same umbrella) have a problem with the defaults, they can change the defaults.

Felipe Contreras

I understand why some people might think this is an attack on Randall S. Becker, but it’s really not.

Randall is a contributor that often sends reports when the latest git version breaks something in his platform: NonStop OS. I think it’s fair to say this is a relatively obscure OS. Even a smaller subset are the people that work with git inside his company.

Now, of course git should consider the NonStop platform, and of course it should consider all the users inside Randall’s company, but they are probably not even 0.01% of all git users, so why should 99.99% of users suffer at their expense?

No. The defaults are for 99.99% of users, if people in Randall’s company have a problem with the defaults and want to avoid git rebase like the plague, they can configure git anyway they want. That’s what the configurations are for: for the minority.

My opinion is that the defaults are for the majority.

If anyone has a problem with my opinion, we can debate it, but how is stating my opinion “unacceptable behavior”?

Exhibit 5

I’m sending this stub series because 1. it’s still in ‘seen’ [1], 2. my conflicting version is not in ‘seen’ [2], and 3. brian has not responded to my constructive criticism of his version [3].

Felipe Contreras

In the cover letter of my patch series: doc: asciidoctor: direct man page creation and fixes (brian’s version), I explained the reasons why I was sending it, but mainly it’s because Junio continued to refuse to drop brian’s version and include mine, even though brian himself told Junio to drop his version.

Now, if you look at my comments on the patches you could say I trashed them, but this is not my fault: brian stated plenty of things are just not true. Just on the first patch:

  • We generally require Asciidoctor 1.5, but versions before 1.5.3 didn’t contain proper handling of the apostrophe… Not true
  • [GNU_ROFF] for the DocBook toolchain, as well as newer versions of Asciidoctor, makes groff output an ASCII apostrophe instead of a Unicode apostrophe in text… Not true
  • These newer versions of Asciidoctor (1.5.3 and above) detect groff and do the right thing in all cases… Not true
  • Because Asciidoctor versions before 2.0 had a few problems with man page output… Not true

The changes were correct, but I was the one that originally wrote the patch, brian merely changed the commit message and took authorship because I objected to his text.

I understand that some people have trouble hearing that they are wrong, but this is not my problem. If you send a patch to the mailing list you should be prepared to hear all the ways in which it’s wrong. And brian m. carlson is not some rookie, he is the top #5 contributor to the Git project in the past five years, he shouldn’t need training wheels.

I do not blame brian for all these inaccuracies tough, there’s way too many details in the documentation toolchain, and the only reason why I know he was wrong is that I spent several man-days investigating these details. I explained part of the story in my post: Adventures with man color.

Moreover, if you think I’m lacking tact, remember that this is the second time I’m bringing these issues, the first time brian argued back all my suggestions and did not implement a single one. Eventually he just stopped responding to me.

Now, regarding integration: the “seen” branch is an integration branch which is maintained by Junio and contains all the topic branches that are currently being discussed and Junio is considering merging. Junio picked brian’s branch even though it was full of inaccuracies and in my opinion the commits were not properly split (each commit was doing four to five different things at once).

On the other hand my commit message did not contain inaccuracies, I wrote the original patch, my patches were properly split, contained patches from other people–including Jeff King and Martin Ă…gren–and in addition contained plenty of cleanups and more fixes to the output of the documentation.

If that was not enough, just the commit message of first patch (doc: remove GNU troff workaround) took me several hours of investigation to write.

It’s not my fault that brian decided to stop arguing any further, it’s not my fault that Junio decided to carry brian’s version for two months and ignore my version which was objectively superior. Those are the facts, and I was merely stating them.

Suppose that I’m wrong, let’s suppose that brian’s version is superior, in that case my statement of fact is incorrect. OK, but how is that “unacceptable behavior”?

Exhibit 6

I meant that I meant what he said I meant.

Felipe Contreras

I understand how this clarification I sent to Junio might look like to some people, but it’s simply a convoluted way of saying “what he said”. SZEDER Gábor said “I think you meant X”, I replied “yes, I meant Y, which in practical terms means what you said (X)”, and Junio (who has a habit of rewriting what I write) asked if my previous statement said “yes, yours is better and I’ll use it in an update, thanks”, but I don’t agree with Junio’s restatement.

I meant Y, which in practical terms means what SZEDER said I meant (X). So I meant either of these:

  • Y: Otherwise commands like ‘for-each-ref’ are not completed correctly by __gitcomp_builtin.
  • X: Otherwise options of commands like ‘for-each-ref’ are not completed.

Which one of these is really “better”? I don’t know, and to be honest I don’t care. I’ve been sending this patch for nine months now and in truth the original “Otherwise commands like ‘for-each-ref’ are not completed” is good enough for me.

The important thing is the fix, and the code continues to be broken to this day.

Additionally, it’s not uncommon for Junio to update the commit messages himself, in fact he did so for my last patch (doc: pull: fix rebase=false documentation), even though I sent an updated commit message he ignored my suggested modification and used his own. So why can’t he simply do the same here with a simple “s/commands/options of commands/” as I suggested?

To me this simply looks like an excuse not to merge the series, which I’ve sent eleven times already.

To avoid problems I re-sent the patch series nine minutes after Junio sent his message, and I used SZEDER’s suggestion. That was on June 8, and to this day Junio keeps repeating this on his status mails:

* fc/completion-updates (2021-06-07) 4 commits
 - completion: bash: add correct suffix in variables
 - completion: bash: fix for multiple dash commands
 - completion: bash: fix for suboptions with value
 - completion: bash: fix prefix detection in branch.*

 Command line completion updates.

 Expecting a reroll.
 cf. <60be6f7fa4435_db80d208f2@natae.notmuch>

I’ve already replied multiple times that I did the reroll immediately after, and after that I’ve re-sent the series four times since then.

Now, I understand if Junio took my reply in a way that I did not intend, but why should git users suffer? The problems these patches fix are real. SZEDER Gábor has already reviewed part of the series, and David Aguilar has tested it and he confirmed the problems exist, and the fixes work.

I have plenty more fixes on top of these (41), the reason why I kept this series small was to maximize the possibility of them getting merged, and now it turns out the reason Junio hasn’t merged them is because he didn’t like one comment I said?

To me this seems like pettiness. We are adults, if he found one of my replies objectionable, he could have simply stated so on the mailing list, or he could have sent me a personal reply (he has never done so). He could have said “I don’t like your tone, so I’m going to drop this series”, but instead he made me waste my time resending patches he was never going to merge. He kept his disapproval for himself, and only used it for ammunition to justify a future ban.

The slowness from Junio to accept these and other patches is why I chose to start the git-completion project, which is a fork of the bash and zsh completion stuff. For the record I was the one that started the zsh completion, and I started the bash completion tests in order for the completion stuff to be first-class citizens of the git project, which Junio has refused to accept as well.

Choosing to knowingly hurt git users because he didn’t find one comment palatable does not seem to me to be a behavior fit for a project leader.

I foresee that some people will conclude that I’m being petty too, but I don’t think that’s the case. I found the problems, I wrote the patches, I sent the patches, I addressed the feedback, and I updated the patches with the feedback. I’ve been trying to get them merged for nine months. What more do you want? If Junio has any further problems with the patches, he can just let me know and I’ll address them. But instead he says nothing.

To exemplify even more how Junio’s pettiness is hurting users, Harrison McCullough reported a regression with the __git_complete helper (which I wrote) on June 16, and it was caused by a change from Denton Liu which introduced a variable __git_cmd_idx, but he forgot to initialize it in __git_complete. Initially I fixed the problem by initializing __git_cmd_idx to 1, and Harrison reported that my fix indeed got rid of the issue. Two days later Fabian Wermelinger reported the same issue but sent a patch that initialized __git_cmd_idx to 0. Initially I thought he made a mistake, but upon further reflection I realized that 0 was more correct, so I updated my patch.

My patch is superior because it fixes the regression not only for bash, but for zsh too. In addition it mentions Harrison McCullough reported the issue, and it’s also simpler and more maintainable too. Junio picked Fabian’s patch and ignored my patch, and my feedback, therefore the regression is still present for zsh users. This is a fact.

Ignoring me is objectively hurting users. I’ve resent my fix on top of Fabian’s patch, so all Junio has to do to fix the regression is pick it.

How does the fact that Junio is primed against me make my convoluted statement “unacceptable behavior”?

Exhibit 7

That makes me think we might want a converter that translates (local)main -> (remote)master, and (remote)master -> (local)mail everywhere, so if your eyes have trouble seeing one, you can configure git to simply see the other… Without bothering the rest of the word.

Felipe Contreras

Once again I can see how people might misinterpret what I said here, but it’s nothing nefarious.

The rename of the “master” branch has been one of the most hotly debated topics of late (this has nothing to do with me as I didn’t even participate in the discussion). Even today it’s not entirely clear what’s the future of this proposal. If you are not familiar with the debate, you can read my blog post: Why renaming Git’s master branch is a terrible idea.

But what I attempted to do was achieve what the original poster–Antoine BeauprĂ©–wanted to achieve but in another way. I listened to Antoine, and I proposed a different solution, that’s all.

What Antoine wanted (as I understand it) was to change all his “master” branches to “main” so that he didn’t have to see “master” everywhere. But he wanted to do this properly, so he wrote a python script to address as many renaming issues as he could.

I see some value in what Antoine wanted to do, but he literally said: “I am tired of seeing the name “master” everywhere“, if that was literally the problem, then a mapping of branch names would fix it. And this is not even that foreign to me.

When I wrote git-remote-hg and git-remote-bzr one of the main features people wanted was a way to map branch names. So for example a branch named “default” in Mercurial could be mapped to “master” in Git (this was done by default, but you get the point).

Even if this didn’t help Antoine, I thought this would help other people, say for example that some Linux developers couldn’t manage to convince Linus Torvalds to rename the master branch to “main”, but for some reason they found the name “master” offensive. Well, with my patch they didn’t have to convince Linus, they could simply configure a branch mapping so they “never had to see the name “master”“.

After listening to my reply Antoine did a more adult approach than Junio, and actually replied his discontent to my comment:

I guess that I’ll take that as a “no, not welcome here” and move on…

Antoine Beaupré

This saddened me. It was never my intention to shit on Antoine’s idea, and in fact I never intended to act a representative of the Git community, so I explained very clearly to Antoine that he shouldn’t just give up:

Do not take my response as representative of the views community.

I do believe there’s value in your patch, I’m just not personally interested in exploring it. I don’t see much value in renaming branches, especially on a distributed SCM where the names are replicated in dozens, potentially thousands of repositories. But that’s just me.

Felipe Contreras

However, brian m. carlson either didn’t read my response, or chose to not factor it in, because he replied:

There is a difference between being firm and steadfast, such as when responding to someone who repeatedly advocates an inadvisable technical approach, and being rude and sarcastic, especially to someone who is genuinely trying to improve things, and I think this crosses the line.

brian m. carlson

I wasn’t trying to be “rude and sarcastic”, I was simply trying to suggest a different approach. It did’t even need to be a competing approach, because both approaches could be implemented at the same time. I explained that to brian, did he respond back? No.

Now, even if you remove me from the picture nobody else responded to Antoine, so how exactly did my responses hinder in any way the community?

If you think my response was “rude and sarcastic” I would love to debate that, but the fact of the matter is that only I know what my intentions were, and they were definitely not that.

Moreover, I don’t believe in being offended by proxy. If Antoine BeauprĂ© had a problem with my comment, I would listen to his objection, and even though I think I have already clarified what I meant, I would even consider apologizing to him. But I will not apologize to brian m. carlson who I’m pretty sure got offended by proxy.

Also, for the record, reductio ad absurdum arguments are not uncommon in the Git mailing list, and they don’t necessarily imply anything nefarious. Here’s one recent example from Junio:

I am somewhat puzzled. What does “can imagine” exactly mean and justify this change? A script author may imagine “git cat-file” can be expected to meow, but the command actually does not meow and end up disappointing the author, but that wouldn’t justify a rename of “cat-file” to something else.

Junio C Hamano

Can Junio’s response be considered “rude and sarcastic”? Yes. But you can also assume good faith and presume he didn’t intend to offend anyone and simply tried to prove a point using a ridiculous example.

Can my comment be considered “unacceptable behavior”? In this case I’d say yes, but not necessarily so. You can also give me the benefit of the doubt and simply not assume bad faith (and I can tell you no bad faith was intended).

Elijah Newren

So far the incidents have been pretty sporadic and could easily be reduced to misunderstandings, but the following are not. Elijah Newren decided to wage a personal vendetta against me (literally), and that context is necessary to understand the rest of the evidence.

Spring cleanup challenge

It all started when I sent my git spring cleanup challenge in which I invited git developers to get rid of their carefully crafted git configuration for a month. The idea was to force ourselves to experience git as a newcomer does, and figure out which are the most important configurations we rely on.

This challenge was a success and very quickly we figured out at least two configurations virtually every experienced git developer enables: merge.conflictstyle=diff3 and rerere.enabled=true. Additionally my contention is that git should also have default aliases (e.g br => branch), but there was no clear consensus on that.

Take for example the configuration merge.defaulttoupstream. Back in 2010 while exploring issues with git pull I proposed that git merge should by default merge the upstream branch. I later sent the patch in 2011 which received pretty universal support, and there were interesting comments, like:

I totally agree — this would be a good change[*] — but this suggestion has been made before, and always gets shot down for vaguely silly reasons…

Miles Bader

Just a few days later Jared Hance sent his own version of the patch by introducing merge.defaultupstream to activate the new behavior. Jared sent in total five versions and implemented all the suggestions from everyone, including Junio, but in the end Junio decided to rewrite the whole thing, take authorship, not mention that Jared wrote the original version of the patch (see the final commit), nor the fact that originally the idea came from me.

I personally do not care too much. I had an idea and the idea got implemented, that’s mostly all I cared about. However, it wouldn’t have costed anything to Junio to add “Original-idea-by: Felipe Contreras” as it’s customary (here’s an example).

Then in 2014 I realized that before git 2.0 (which was meant to break backwards compatibility) it was the perfect time to enable merge.defaulttoupstream by default (literally no one disabled it anyway), so I sent a patch for that and other defaults. Junio disagreed and excluded these from 2.0, but in the end he ended up merging them.

So finally in git 2.1 git merge ended up merging the upstream branch by default.

What I find interesting is that in 2021 (7 years later), Ævar–a prominent git developer–didn’t even know that merge.defaulttoupstream changed it’s default value, so he still had merge.defaulttoupstream=true in his configuration. Thanks to my challenge he argued it should be true by default, and thus realized that was already the case. Additionally I noticed the default value was not mentioned in the documentation, so I sent a fix for that.

At that point my relationship with Elijah was amicable, and he decided to join the challenge in his words “to join in on the fun“.


The problems started when I took it upon myself to try to enable merge.conflictstyle=diff3 by default. This task turned out to be much more complicated than I initially thought. Flipping the switch is extremely easy, but once you do that a ton of tests that expect the diff2 format start to fail. No problem, I thought by simply changing merge.conflictstyle to the old value at the beginning of each test file that fails would solve it. Later on each test file can be updated to expect the diff3 format.

It turned out that didn’t work. Many commands completely ignored the configuration merge.conflictstyle, and those are clearly bugs. So even before attempting to change merge.conflictstyle we have to fix those bugs first.

This is what I attempted to do on my patch series: Make diff3 the default conflict style.

Immediately Johannes Sixt pointed out that such change resulted in a very convoluted output for him. This however I think is just bad luck, since never in all my years of using diff3 have I ever seen an output similar to that, and that’s because recursive merges are involved.

Jeff King provided a link to a discussion from 2013 regarding the output of diff3, and that discussion itself lead to other discussions from 2008. It took me a while to read all those discussions, but essentially it boils down to a disagreement between Junio C Hamano and Jeff King (both part of the leadership committee) about whether or not a level higher than XDL_MERGE_EAGER (1) made sense for diff3. Junio argued that the output wasn’t proper, but Jeff argued that even though it wasn’t proper, it was still useful. That lead to Uwe Kleine-König–a Linux kernel developer–to implement zdiff3, which is basically diff3 but without artificially capping the level as Junio did in order for the output to be proper.

When I said maybe we should consider adding this zdiff3 mode, Jeff King mentioned his experience with it:

I had that patch in my daily build for several years, and I would occasionally trigger it when seeing an ugly conflict. IIRC, it segfaulted on me a few times, but I never tracked down the bug. Just a caution in case anybody wants to resurrect it.

Jeff King

My interpretation of that message is extremely crucial. I am very precise with language, both when writing, and reading. So I read what Jeff said exactly how he said it. First, he said in “several years” (1+ years) he occasionally would try this. Then he said of of the times he tried it, it crashed a few times, but crucially he said “if I recall correctly”, so this means he is not sure. Maybe it crashed more than a few times, or maybe it crashed less than a few times, he is not sure.

Whatever the issue was, Jeff did not find it serious enough to track the bug, since he never tracked down the bug in the several years he was using the patch.

Jeff sent his message on a Thursday, on Friday Elijah Newren said he might investigate the zdiff3 stuff, and on Sunday I re-sent the 2013 patch from Uwe. I added a note to the patch stating why I was sending it, and what I did to test it. Essentially I ran the entire test suite using zdiff3 instead of diff3 and everything passed. This implies that if there’s any issue with zdiff3 it probably is not that serious.

I sent the patch at 9:30. One hour later Jeff King replied “I take it you didn’t investigate the segfault I mentioned”. I don’t know how I was supposed to investigate that, other than what I already did: run the whole test suite with zdiff3. Jeff had the idea to recreate all the merges of the git repository using zdiff3 and after 2500 merges you can find one that crashes. This is a good idea, but it never occurred to me to do that.

By 13:00 I replied with a command to replicate the issue very simply using a git merge-file command. By 16:24 I had found the issue and sent a fix.

The next Monday Elijah Newren started to complain:

This is going to sound harsh, but people shouldn’t waste (any more) time reviewing the patches in this thread or the “merge: cleanups and fix” series submitted elsewhere. They should all just be rejected.

Elijah Newren

Elijah provided a list of reasons, none of which were true–like “no attempt was made to test” (I did attempt to test, as I explained in the note of the patch). Additionally he stated that in his opinion my submissions were “egregiously cavalier”.

Egregiously cavalier? I spent several hours on a Sunday trying to find the fix for a patch one of the most prolific git developers didn’t bother to fix for several years, and I actually did it… the very same day.

This was the start of Elijah’s personal vendetta against me. He urged Junio not only to drop Uwe’s patch that I resent, but to drop all my patches:

If I were in charge, at this point I would drop all of Felipe’s patches on the floor (not just the ones from these threads), and not accept any further ones. I am not in charge, though, and you have more patience than me. But I will not be reviewing or responding to Felipe’s patches further. Please do not trust future submissions from him that have an “Acked-by” or “Reviewed-by” from me, including resubmissions of past patches.

Elijah Newren

If that wasn’t enough, Elijah accused me of fabricating his endorsement of my patches:

Troubled enough that I do not want my name used to endorse your changes, particularly when I already pointed out that you have used my name in a false endorsement of your patch and you have now responded to but not corrected that problem (including again just now in this thread), making it appear to me that this was no mistake.

Elijah Newren

This is blatantly false. Elijah very clearly said “Yes, very nice! Thanks.” to one of my patches, and that can be considered endorsement by many people. I do not take those kinds of accusations lightly, so that prompted me to start an investigation into how many “reviewed-by” commit trailers are explicitly given as opposed to inferred, and I found that 38% of them are not explicit.

Not only that, but I found one example a month earlier in which Derrick Stolee took a “Looks good to me” comment from Elijah and used it as “reviewed-by”. Did Elijah accuse Derrick of “false endorsement”? No.

I don’t know what happened to Elijah, nor why from one day to the next he decided to make me his enemy, but clearly he is not being objective. If he was he would be objecting to Derrick’s behavior as well, since he did exactly the same thing as I. Elijah’s responses are clearly emotional, as can be seen from this reply in which he cites nineteen references, some going back to 2014–a dubious debating technique called Gish gallop, but writing this response he didn’t even pause to see that the links matched what he was referring to. Moreover, the first thing he said: “attacking a random bystander like Alex is rather uncalled for”, was completely off-base because my reply wasn’t even addressed to Alex.

I have absolutely nothing against Elijah, but clearly he does have something against me, and the rest of the reports (and probably all the previous ones) are entirely the result of that antagonism.

Exhibit 8

If you didn’t mean this patch to be applied then perhaps add the RFC prefix.

Felipe Contreras

When Alex Henrie took me up on the challenge to try to fix git pull, he created a patch that broke 11 test files, each one with many unit tests broken. This is not a huge deal, some people send patches that break the test suite, but generally when they do that they add the “RFC” (request for comments) prefix to make sure this patch is not picked by Junio.

So, assuming good faith there’s two options a) Alex didn’t see the tests breaking, or b) he knew the tests were breaking, but didn’t know he had to add RFC in that case. I simply said if this was case b), RFC should have been added.

Only a person already primed to presume malice would have seen a problem with my comment.

Exhibit 9

Wouldn’t you consider sending a patch without running ‘make test’ “cavalier”?

Felipe Contreras

This response was directed to Elijah Newren, not Alex Henrie, and I wrote it because I was genuinely curious to see if Elijah was capable of seeing the obvious discrepancy between his response to Alex, and his response to me.

The patch I sent (which wasn’t even my patch):

  • Did not break any tests by itself
  • Did not break any tests by forcing diff3 to be zdiff3
  • The patch was not going to land on the integration branch “seen”
  • The patch doesn’t change anything, so everyone using diff3 wouldn’t see the crashes
  • If the patch was applied, and a person manually enables zdiff3, out of the 16,000 merges in git.git it crashes on 13 of them (0.08%)
  • Has already been tested by multiple people for many years
  • Was fixed hours later

So the patch was relatively safe by any standard.

On the other hand Alex’s patch:

  • Broke a ton of tests by default
  • Broke the existing user interface
  • Hasn’t been used by anyone
  • Was intended to be integrated

Elijah’s response to my patch is that it was “egregiously cavalier”, and his response to Alex’s patch was “thanks for working on this”.

If this is not double standards I don’t know what is.

I did not criticize Alex, I was asking a question to Elijah.

Exhibit 10

When Alex responded to my response, he thought it was directed to him. I explained to him that it wasn’t and I made sure he knew I didn’t blame him for anything:

I do appreciate all contributions, especially if they were done pro bono.

Felipe Contreras

I explained that Elijah’s personal attacks towards me are a different subject that he should probably not attempt to touch, because I didn’t think anything productive could come from that (it’s a matter between Elijah and me).

And just to be crystal clear, I thanked him again:

This has absolutely nothing to do with you, again… I appreciate you used some of your free time to try to improve git pull.

Felipe Contreras

In my opinion all the reports that have anything to do with Elijah Newren are primed and could only be judged “unacceptable” if you assume bad faith (and probably not even in that case).


I have been a moderator on pretty big online communities, therefore I’m familiar on how justice is supposed to be dispensed outside of the justice system.

The single most important thing is the presumption of innocence. The accused should not be considered guilty until evidence against him has been presented and he has had a chance to defend himself. Even in something as silly as Twitch chat, the people that get banned have the opportunity to appeal those decisions.

The leadership committee not only did not allow me to present my case, they weren’t interested in the least, when I asked them about presenting my case their response was “this isn’t a trial”. I was presumed guilty and that’s that.

My punishment was to avoid all interaction with 17 people for three months. That included Junio C Hamano and the entire leadership committee. If I engage in any “unwanted interaction” with any of these people or if I say something similar to the alleged violations above, then I’ll get banned.

Ævar ArnfjörĂ° Bjarmason offered to be my “point of contact”, that means he was the only person I was allowed to ask questions to. While initially Ævar did a good job of keeping an open dialog with me, eventually he stopped responding because he went on vacation.

Other than the initial email I only received one response from the leadership committee.

However Ævar did manage to respond to a few questions for this blog post (as himself):

  1. Do you believe there’s a possibility the PLC might have made a mistake?

Sure, but I think that’s more in the area of how we’d deal with clashes like this generally than that nothing should have been done in this case. Did we do the optimal thing? I don’t know.

  1. Do you believe the PLC could have done more to address this particular situation?

Yes, I think it’s the first case (at least that I’m aware of) under the CoC framework that’s gone this far.

That’s a learning process, one particular thing in this case I think could have gone better is more timely responses to everyone involved. That’s collectively on us (the PLC), harder when there’s little or no experience in dealing with these cases, everyone volunteering their time across different time zones etc.

  1. Do you believe the identify of the person(s) who reported the complaints had an effect on the final verdict?

I wouldn’t mind honestly answering this for my part at least (and not for other PLC members), but feel I can’t currently due to our promise that CoC reports and identities of reporters etc. not be made public unless on their request.

While I’m thankful for Ævar’s responses these did not answer the particulars of what I asked, especially the last one, which he could very well be answered without revealing the identity of the people who reported the “violations”.

At the end of the day I’m still fairly certain that the person who did the vast majority of the reports (if not all of them) was Elijah Newren, and it’s only because he is a big name (#7 contributor in the past 5 years) that those reports were taken seriously. Additionally he sent as many as he could as an eristic technique–Gish gallop–because he knew that way nobody in the leadership team would investigate any single one at depth.

Ultimately I do not blame Elijah: if he thinks my comments violated the code of conduct, he is entitled to believe so and report them. But I do blame the leadership team because they didn’t do their job. They didn’t investigate any of the reports carefully enough, and they did not do the minimum job any judge should do (inside or outside a real courtroom); hear the defendant.

Not only did they not do their job, but they didn’t even want to attempt to do it, and even more… it seems they don’t know what their minimum job is.


Linus Torvalds once said in a TED interview:

What I’m trying to say is we are different. I’m not a people person; it’s not something I’m particularly proud of, but it’s part of me. And one of the things I really like about open source is it really allows different people to work together. We don’t have to like each other — and sometimes we really don’t like each other. Really — I mean, there are very, very heated arguments. But you can, actually, you can find things that — you don’t even agree to disagree, it’s just that you’re interested in really different things.

Linus Torvalds

You may not like my style of communication, but “being abrasive” is not a crime. The question is not “can Felipe be nicer?”, the question is “did Felipe violate the code of conduct?”. I believe any objective observer who carefully investigates any of the reports would have to conclude that no violation actually took place. You would have to believe that I was trolling, insulted people, threw personal attacks, or harassed people; none of that actually took place.

I debate with people all the time, and this particular issue comes up very often, but it’s a fallacy called tone policing. Any time anybody focuses on the way somebody states an argument instead if what the argument itself is, that person is being unproductive.

Paul Graham’s hierarchy of disagreement

I often bring Paul Graham’s hierarchy of disagreement, where tone policing is the third lowest level of “argumentation”, right after ad hominem attacks.

Nobody benefits by my tone being policed, especially the Git community. My patches not only provide much needed improvements to the user interface, but fixes clear bugs that have already been verified and tested, not to mention regressions. Ignoring me only hurts git users.

And even if in your opinion the tone I used in the reports above is not acceptable, it’s worth nothing that these are merely the worst ten instances out of two thousand emails, that’s just 0.5%.

And let’s remember that unlike the vast majority of git developers: I’m doing this work entirely for free.

Moreover, if anyone is violating the code of conduct I would venture to say it’s other members of the community:

  • Being respectful of differing opinions, viewpoints, and experiences

It is very clear other members of the community do not respect my opinions. I have suggested that this particular point should be tolerance, not respect (see Cambridge University votes to safeguard free speech), but even if you lower the requirement from respect to tolerance, not even that is happening: other members do not tolerate my opinions.

Why am I being punished because other members (who happen to be big names) can’t tolerate my opinions?

The Git community claims to be inclusive, but as this incident shows that’s not truly the case since the most important diversity is not welcomed: diversity of thought.

The git staging area, the term literally everyone agrees with

The concept of the staging area is one which many newcomers struggle with when starting to learn git, and the fact that this concept is barely mentioned in the official documentation (i.e. man pages) doesn’t help at all. Instead the official name is “the index” which has absolutely nothing to do with the way users interact with it. That’s the reason most people who teach git prefer to use the term “staging area” instead, and in fact that’s the term used in the best available documentation: the Pro Git book (which is not official), and also pretty much in all online documentation, including tutorials and blog posts (e.g. Atlassian saving changes, code refinery).

Attempts to move away from the incorrect term “the index” towards one that most native and non-native English speakers can grasp without the need for further explanation have been attempted for more than a decade, and even though there’s universal consensus that “staging area” is by far the best alternative, attempts to use it officially in the documentation and user interface have been blocked because one person cannot be convinced.

This is a summary of 13 years of discussions regarding the term “staging area”, similar to my previous post about 13 years of discussions regarding git pull.

git stage and --stage

The first thread happened in 2008, when David Symonds suggested the addition of a new command: git staged. This command was basically an alias for git diff --cached, and it was a result of a discussion in GitTogether ’08. While this command was not considered seriously (and in fact it wasn’t proposed seriously), it did result in the git diff --staged alias. This is the first time I suggested a git stage command.

Soon after Scott Chacon–a famous git trainer and author of the Pro Git book–suggested a git stage command that is basically an alias for git add. This time the proposal was serious and Scott offered a good rationale:

This continues the movement to start referring to the index as a staging area (eg: the –staged alias to ‘git diff’). Also added a doc file for ‘git stage’ that basically points to the docs for ‘git add’.

Neither Jeff King nor Junio C Hamano (the maintainer) objected, so it was immediately merged.

In 2009 David Abrahams suggested the “index” should be called “staging area” as that’s a more friendly name, and he expressed that git stage would have helped his learning curve. Shawn O. Pearce–who unfortunately passed away–gave a pretty good summary of the history of the “index”, which basically resumes to: it’s historical baggage.

Only late last October at the GitTogether did we start to talk about creating a command called “git stage”, because people have started to realize we seem to call it a “staging area” as we train newcomers…

Shawn O. Pearce

Junio objected to Shawn’s history lesson claiming it was “a bit misleading, if not completely incorrect”, and also objected to the way “the outside” referred to the index:

Yeah, you may have to consider the possibility that that particular training lingo is inconsistent with the rest of the system, exactly because it came from outside.

Junio C Hamano

Jakub Narebski–a prolific developer and community member–also said it was a pity git stage wasn’t there from the start, but Junio quickly disagreed.

I simply agreed that “staging area” is a more friendly name.

Felipe’s git stage

Unlike Scott’s git stage, I proposed a more comprehensive command with subcommands, so instead of doing git diff --caced, the user could do git stage diff.

Junio did not like the idea at all:

I do not think these are good ideas at all, as it just spreads more confusion, not less.

Junio C Hamano

I argued that there’s a lot of confusion already: stage, cache, index, etc. And it’s not just me the only one who sees that confusion.

Perhaps not spreading “stage” even wider? That is the newest confusing term that caused the most harm.

Junio C Hamano

Junio proceeded to blame the “git training industry” (e.g. Scott Chacon) for using the term “staging area” instead of “index”:

Later, some outside people started “git training industry” without talking with the git development community and started using a new term “to stage” as a verb to describe “add to the index”. Addition of “git diff –staged” was supposed to lesson the confusion resulted from this mess, but as we can see from your patch it had a reverse effect.

Junio C Hamano

In his opinion to avoid confusion --cached should not have been existed at all, and the command should have been git diff --index-only. He also opined that he should have rejected “stage” as an option name (and a command) (in favor of –index-only).

I thanked Junio for his clarification, but I also explained that there’s no porcelain command (intended for common users) that uses “index” in any way. Moreover, I explored what “cache”,”index”, and “stage” mean in English, and the high-level notion of most people do not match what “cache” and “index” actually mean.

In git it is barely used, mostly on the “documentation industry” probably because it’s easier to understand for most people (even non-native-English speakers).

Felipe Contreras

Markus Heidelberg argued that my proposal didn’t match his mental model:

Not for me. If I want to GET a diff, I want to use a command “diff”, so “git diff” is more obvious.

Markus Heidelberg

I argued that this was a matter of preference, but his preference would not be hindered in any way: we could have both git stage diff and git diff --staged. Additionally I argued that git diff is used in two fundamentally different ways, moreover we already have command/subcommand instances (e.g. git remote add).

Well, it’s a matter of preference, and you would not loose the option to do it the way you like. But actually, “git diff –cached” is a different action; you can’t do “git diff –cached HEAD^..” for example.

Felipe Contreras

Curiously enough when Markus argued back that you could not do git stage diff HEAD^.. either, it was Sverre Rabbelier–another prominent developer back then–who did the mic drop for me:

I rest my case ;). That’s the whole point Felipe is trying to make here.
$ git diff --cached
$ git diff HEAD^..

That’s two different modes of operation with the only difference being a switch (‘–cached’), which changes what is, and what is not valid after that.

Sverre Rabbelier

David Aguilar mentioned an old thread in which an alternative style for git diff was proposed by using STAGE and WORTREE as pseudo commits. Unfortunately a lot of the following discussion was about this proposal, and not mine.

Junio disagreed with this proposal as well, and stated he didn’t think there’s any usability issue:

I do not think there is any usability issue. Why do you think saying STAGE in all capital makes it easier to use instead of saying –cached (or –index-only)? In either way, you need to understand the underlying concept, such as:

Junio C Hamano

Matthieu Moy did think there was usability issue, and so did Octavio Alvarez, and Stefan Karpinski:

There is most definitely a usability issue here. I use git every day and I cannot for the life of me remember all the inconsistent stage-related oddball commands. I have a number of aliases for them (similar to what Felipe is proposing) which are the only way I can remember them. Whenever I find myself using a git repo without those aliases, I have to fire up the man pages. Trying to explain all of this to coworkers that use git—honestly, I don’t even try to go there.

Stefan Karpinski

Nothing materialized out of this discussion.


In a side-thread when I tried to improve the user-manual, and argued that color should be enabled by default (color.ui=auto), Jonathan Nieder–the most prominent developer–asked me what were some of my UI patches that might have been overlooked:

Could you list some UI patches that were overlooked or not properly addressed? Maybe people just forgot about them or were waiting for an updated version, or maybe the problems some solve weren’t articulated clearly yet. I would be glad to help out in any way I can.

Jonathan Nieder

I mentioned my patch for git stage and attempts to standardize --index and --cache onto --stage as examples where there’s clearly a UI problem. But the real problem is that even though it’s clear by virtually everyone that there’s a problem, a path forward doesn’t seem to be there.

Michael J Gruber argued back that basically the reason he decided to disengage with this particular patch series is that “I didn’t seem to be willing to accept advice” (which obviously was not the case, since I had already included all his advice):

Regarding this specific patch series: I took part in the initial discussion, and got frustrated by the original poster’s seemingly unwillingness to accept advice, so I left. I’m not drawing any general conclusions, and please don’t take this as an ad hominem argument. Sometimes it’s simply a matter of mismatching participants.

Michael J Gruber

Additionally he argued that --index and --cache are fine, because two options are needed on some commands, which although true that still doesn’t address the issue of the names, as I argued:

“good” is a very subjective term; I don’t think “they are different” is a good reason. By that logic –only-index and –index-and-working-dir serve the same purpose, just like –gogo and –dance.

Felipe Contreras

Nanako Shiraishi piled on the ad hominem angle:

I don’t think Felipe seriously wants to change them to –gogo vs –dance, but if he made a more constructive proposal, instead of making such a comment whose intended effect is only to annoy people, we may see an improved UI at the end. Proposing “–index-only” vs “–index-too” or even “–stage-only” vs “–stage-too” would have helped him appear to be more serious and constructive and I think your expression “mismatching participants” was a great way to say this.

Nanako Shiraishi

I don’t know how anyone can think that assuming bad faith is a more constructive way to achieve anything, clearly my intention was not to annoy people, but to demonstrate a point. It is a common rhetorical device called reductio ad absurdum, it’s not uncommon on the git mailing list to use it, and even Junio uses it regularly.

My point was that any two names would solve Michael’s concern, the whole issue is that names would those be, and before exploring any names, developers–and in particular Junio–need to accept there is a problem with the existing names. When I explained that to Nanako she didn’t back down:

You have a funny way of saying “I’m sorry, I wasn’t constructive, and my attitude repelled many participants from the discussion”.

Nanako Shiraishi

Which of course is not what I was trying to say at all. Moreover, she said that because “stage” is not very user friendly to her, the word we use is irrelevant, and everyone would agree with her:

I think a proposal to replace the word “index” with “stage” will sound nothing but bike-shedding to anybody, especially after getting familiar with “index” and seeing it taught on many web pages and books.

Nanako Shiraishi

Instead of my usual “blunt” style of communication I tried to appeal to Nanako’s sensitivity and empathy, in order explore the possibility that even though she doesn’t particularly see a useful difference between the words “stage” and “index”, that doesn’t mean other people don’t either:

I’m going to change my usual blunt style and try to be sensitive here: I’m not trying to blame you, or disregard your background. I’m not a native English speaker either (although my tongue language is a romance one, so perhaps I have some advantage), but to me, English is a language of short words, and therefore, the exact word you pick makes a world of difference, and this is something I feel many non-english speakers don’t appreciate. Since we all are communicating in English, I think we should not disregard “subtle” differences in words such as “cached” and “stage” that might not mean much to you, but I think it would to the thousands (or millions) of git users who do understand immediately the meaning of “stage” regardless of their git (or any other SCM) background.

Felipe Contreras

I then explored at great length all the different meanings of the word “stage”, and also to some extent the word “index”. As a reward for my efforts my reply was censored and removed from the mailing list archive. I’ve uploaded the email in its entirety so you can see for yourself if there’s anything objectionable there.

Regardless of on which side you are on, Nanako did not object to the word “stage”, all she did is state that to her the word is irrelevant, and in her opinion I was bikeshedding. As we will later see Nanajo was wrong: most people did not consider it bikeshedding.

Nothing particularly productive came from the rest of the discussion.

Git User’s Survey

Since I pretty much gave up on convincing Junio using arguments, I thought it was a good idea to ask users directly if they thought the term “staging area” was better. So I proposed to Jakub Narebski who spearheaded the Git User’s Surveys, and he thought it was idea. Of course we couldn’t ask the question directly, that would bias the responses, so I proposed simply asking if users used the “index/cache/stage”.

Unfortunately Jakub forgot to add that question.

Consistent terminology

In 2011 Piotr Krukowiecki asked if there was a plan to use a consistent term, instead of three.

Junio once again explained why he did not like “stage”:

In short, “stage” is an unessential synonym that came much later, and that is why we avoid advertising it even in the document of “git diff” too heavily. Unlike the hypothetical –index-only synonym for –cached I mentioned earlier that adds real value by being more descriptive, “staged” does not add much value over what it tried to replace.

Junio C Hamano

Pete Harlan explained why “staging area” is better:

FWIW, when teaching Git I have found that users immediately understand “staging area”, while “index” and “cache” confuse them.

Pete Harlan

Aghiles also agreed with Pete.

Once again I brought up my proposed git stage command with subcommands. This time Michael J Gruber expressed that he liked the idea a lot, but stated that the design deviates from the common form. Of course this isn’t true since we have many commands with subcommands: git branch, git tag, git remote, git stash, git submodule, etc.

Piotr explained that the main point of his email was to have a single name, and although the name was not that important, he advocated for stage:

I’m new to git and a non-native English speaker. “Staging” seems most clear of all of the terms. You may find it differently, but please take into consideration that you are accustomed to it.

Piotr Krukowiecki

Pete Harlan explained that the problem to him was that developers working with Git internals were biased towards the term “index”, while everyone else see absolutely no reason to use the low-level details:

Part of the issue could be that one intimately familiar with Git’s internals may find a process oriented interface irritating (“Why must it say ‘staging area’ when it’s just updating the index?”), while one unfamiliar with the internals has the opposite reaction (“Why must it make me use the internal name of the staging area?”).

Pete Harlan

Pete even suggested a git command developed from scratch with an interface that hides all the nuts and bolts from the user.

Once again Drew Northup explored what the definitions of “cache” and “index” mean in English and compared them what git actually does internally, but I argued that what happens internally is irrelevant for the end user:

Branches and tags are “rthetorical” devices as well. But behind scenes they are just refs. Shall we disregard ‘branch’ and ‘tag’?

No. What Git does behind scenes is irrelevant to the user. What matters is what the device does, not how it is implemented; the implementation might change. “Stage” is the perfect word; both verb and a noun that express a temporary space where things are prepared for their final form.

Felipe Contreras

Additionally Miles Bader argued that magit already uses the label “staging area”, and the shortcut to add changes there is “s”.

Jonathan Nieder pushed back on new terms because they were “established”, however I mentioned that the whole point of git 1.8.0 (which later became 2.0) was to rethink git from scratch, so why not rethink these terms? Junio argued that while rethinking is fine, consensus so far has not been achieved, and thus it could not be considered for 1.8.0.

Jeff King argued that as a native English speaker “staging area” makes perfect sense to him:

So the term “staging area” makes perfect sense to me; it is where we collect changes to make a commit. I am willing to accept that does not to others (native English speakers or no), and that we may need to come up with a better term. But I think just calling it “the stage” is even worse; it loses the concept that it is a place for collecting and organizing.

Jeff King

Drew Northup argued that although the term “staging area” has value, it has limited use for somebody trying to learn how git works.

Phil Hord also agreed “staging area” makes perfect sense:

When we pack up our kayak club for a trip, we stage equipment we’re bringing. Eventually we make a decision about which equipment is going and which is staying. The decision is codified by the equipment we leave in the staging area versus the equipment we remove to local storage. Everyone seems to understand the term when we use it in this context.

Phil Hord

Additionally Jeff King agreed that the internals were not relevant for the mental model of the end user:

But note that it is a mental model. The fact that it is implemented inside the index, along with the stat cache, doesn’t need to be relevant to the user. And the fact that the actual content is in the object store, with sha1-identifiers in the index, is not relevant either. At least I don’t think so, and I am usually of the opinion that we should expose the data structures to the user, so that their mental model can match what is actually happening. But in this case, I think they can still have a pretty useful but simpler mental model.

Jeff King

There were other proposals, such as “the bucket”, or “precommit area”. Matthieu Moy said that as a non-native speaker anything “foo area” helps, not necessarily “staging area”, but Alexey Feldgendler disagreed, at least in Russian “staging area” is better.

Alexei Sholik argued that as a new git user, he thinks term “index” only confuses unprepared readers.

Jonathan Nieder decided to write a summary of the situation:

To summarize: everyone knows what the staging area is, no one seems to know what the index is, and the –cached options are confusing.

Jonathan Nieder

When I tried to explore other proposals like “commit preparation area” I eventually arrived back to the conclusion that there’s no better alternative than “staging area”, Miles Bader agreed:

I don’t why so many people seem to be trying so hard to come with alternatives to “staged” and “staging area”, when the latter are actually quite good; so far all the suggestions have been much more awkward and less intuitive.

Miles Bader

Jonathan Nieder agreed and proposed that we stop discussing, in order to start coding. Curiously enough Junio’s argument for not considering this for 1.8.0 is that there was no consensus, but consensus was reached only a couple of messages later.

However, nothing was coded. refresh

When Scott Chacon presented the new look of in 2012, Junio objected to the fact “staging area” was used prominently (presumably in the Pro Git book):

It seems that you are trying to advocate “staging area” as some sort of official term. I think “it is like a staging area” is a good phrase to use when answering “what is the index?”, but I think repeating it million times without telling the casual readers what its official name is is counterproductive. Don’t do that. It will confuse these same people when they start reading manuals.

Junio C Hamano

Scott disagreed with Junio. The term “staging area” is already quite popular, the only place where it’s not used is in the official documentation (i.e. man pages):

I’m not really trying to advocate it as much as using terminology that is already quite popular. It’s true that it’s not what is used in the man pages, but neither is ‘index’ used consistently – there is ‘cache’ too, in addition to ‘index’ having two meanings – packfile and cache. I’m open to making things clearer, but I just don’t think that changing the terminology to something more technical and vague would be overall less confusing to people.

Scott Chacon

While Junio and Scott mostly discussed where the location of git diff should be, it’s worth noting that merely because Scott disagreed, Junio told him that he should “know better”:

As you are supposed to be one of the top-level Git Teachers, I wish you knew better.  Here is a free Git lesson.

Junio C Hamano

Which Scott understandably found it condescending and unnecessary:

There is absolutely no reason to be this condescending.

Scott Chacon


Since nothing materialized from previous discussions, I decided to write a serious proposal for 1.8.0 (which later became 2.0). The proposal was simple: avoid “index” and “cache” in favor of “stage”.

Philip Oakley stated that he didn’t like the current terms, Matthieu Moy agreed something needs to be done, Zbigniew JÄ™drzejewski-Szmek that it was a very good idea, Mark Lodato agreed with me, and Sebastien Douche that it was an extremely good idea:

+1000. An anecdote: many attendees said to me “I didn’t understand until you explained it that way”. Now I use always the term “stage” in my training (and banned the term index). Far better.

Sebastien Douche

I agree with Felipe that “staging” is the most appropriate term for “adding to the index” in git. As a native English speaker, I have never thought of “to stage” as relating to shipping in any way. To me, by far the most common usage is in real estate. The seller of a home “stages” it by setting up furniture and decorations to make the home as appealing to prospective buyers as possible. Just search on Google for “home staging” and you will get plenty of hits. This usage clearly originates from theater but can be found in other contexts as well.

Mark Lodato

Yeah, I think that this is a very good idea. Having three different terms for this great but relatively obscure idea adds an unnecessary cognitive burden for newcomers to git. ‘stage’ is certainly the best of the three options.

Zbigniew Jędrzejewski-Szmek

Junio decided to find another objection to the word “stage”: that it might not be directly translated:

I didn’t necessarily wanted to use “stage”, it is “sad” because a new word-hunt may be needed for a replacement to “index” (as “stage” may not be a good word for i18n audience), and then we would need to keep “index”, “stage” and that third word as interchangeable terms.

Junio C Hamano

Ævar ArnfjörĂ° Bjarmason–a very prominent developer–did not think that made sense at all:

I don’t think that line of reasoning makes sense at all. We shouldn’t be picking terms in the original English translation of Git that we think would be more amenable to translation.

We should be picking terms that make the most sense in English, and then translators will translate those into whatever makes the most sense in their language.

Ævar Arnfjörð Bjarmason

This is of course a great point, and Junio simply had to agreed and said “OK“.

Since Junio explained what “index” means in Japanese, and did so in a way that only made sense internally, but not how end users actually use it, I decided to point that out:

That’s what git has, internally, but that’s not how high-level users interact with it.

Felipe Contreras

Matthieu Moy agreed with me.

Thiago Farina argued that neither “cache” or “index” helped him understand what the actual concept was, but “precommit” did.

When Jonathan Nieder proposed some concrete changes Junio objected stating that the name shouldn’t be important:

I personally think it is a wrong way of thinking to focus too much on the “name”, though.

Junio C Hamano

I disagreed:

Names are important. Name it ‘jaberwocky’ and people would have a harder time trying to understand what you are talking about. Maybe just by hearing the name they would give up, thinking it’s too advanced, or too obscure, or what not.

Felipe Contreras

Junio kept referring to the concept “that thing”, although not only is “jabberwocky” not very useful, but “index” is in fact worse than “jabberwocky” because it has an understood meaning that does not match how the user interacts with “that thing”. Moreover, at this point is had become extremely clear that everyone agrees that “staging area” is the term in English that most closely resembles what it would be used for, and translators have already explained why that term is fine.

Junio did not respond back.

Official move

Since nothing happened in 2011, and nothing happened in 2012, in 2013 I took it upon myself to actually write the patches to actually start moving to the term “staging area”. While doing so it occurred to me that the difference between --index and --cached in git apply could be determined with a separate option, so --index becomes --staged --work, and --cached becomes --staged --no-work. Since --work would be enabled by default, git apply --index simply becomes git apply --staged. Moreover, I realized the same could be done for git reset, so the unintuitive differences between --mixed, --soft and --hard become much clearer.

Matthieu Moy thanked me for working on this, found interesting my addition of git stage edit which is currently not possible to do with any git command or option, and agreed that --soft, --mixed, and --hard are terrible.

Junio objected, stating:

IIRC, when this was discussed, many non-native speakers had trouble with the verb “to stage”, not just from i18n/l10n point of view.

Junio C Hamano

I explained to Junio that was not the case. Ævar was the only non-native speaker that had a problem with the term “stage”, but not “staging area”, and in fact he had already translated “index” to “the commit area”, not that it really mattered because as Ævar argued: what should be picked is what makes the most sense in English, and how that is best translated is up to the translators. Even Junio eventually agreed.

Again, everyone has agreed that index needs to be renamed, and “staging area” is the best option.

Felipe Contreras

I asked Junio if he could re-read the threads (I provided the links) so I did not have to list the opinion of every person who participated. Junio did not respond to me, so I started writing the list, but asked him once again if he was planing to do that himself so I did not have to waste my time. He did not reply to that either.

So I had to finish the list. Only one person–Drew Northup–did not see the point in changing the name (but did not find any problem with the proposed name either), the rest, nineteen people were in favor of changing things.

If that was not enough, more people joined on this round:

I realize Git is not a democracy, but if the vote of a humble user counts for anything, I agree that “index” is a terrible name.

I was very excited when Felipe first started this thread, since I thought Git might finally fix one of it’s biggest long-standing usability problems. Calling this thing the “index” is like calling an important variable “someValue.” While the name may be technically correct, it’s way too generic to be useful. A name like “staging area” may not capture the whole idea, but at least it provides a good clue about what it does and how you might use it.

If we change this, I’m pretty sure most of the Internet will rejoice. Only a few old-timers will be grumpy, but that’s just because they don’t like change in general. I have never met anybody (outside this thread) who thought the current name was a good idea.

William Swanson

+1 for staging area

Ping Yin

As yet another “just a user”, I’d like to add my enthusiastic support for “to stage” and “staging area”.

Hilco Wijbenga

All three: William, Ping, and Hilco expressed that they were very excited when I started the thread, because of the possibility of finally fixing this long-standing usability problem.

Junio did not reply again.


That’s the point where I stopped trying, but the discussions continued for years:

Is “staging area” still considered as the correct term or has time proven that index is better?

Lars Vogel

Why is “index” better? It is a confusing name, one that has many other unrelated meanings. In particular, many projects managed by git also have an index, but few have a staging area.

David A. Wheeler

That’s an absurd argument. A database product that wants to be used in library systems are forbidden to have “index” because that may be confused with library index cards?

Junio C Hamano

The list


  • Junio C Hamano: “staging area” is a near-sighted and narrow minded term


  • Nanako Shiraishi: the name doesn’t matter
  • Drew Northup: the name doesn’t matter

In favor

  • Felipe Contreras: “staging area” is the correct description
  • Scott Chacon: we should move from “index” to “staging area”
  • Jay Soffian: staging area is better
  • Pete Harlan: “staging area” is good for teaching
  • Aghiles: “staging area” is good for teaching
  • Piotr Krukowiecki: “staging area” makes sense
  • Jonathan Nieder: “staging area” is better than “index”
  • Jeff King: “staging area” makes perfect sense
  • Miles Bader: “staging area” is good
  • Phil Hord: “staging area” is better than index/cache
  • Victor Engmark: maybe “git bucket”
  • David (bouncingcats): maybe “precommit”
  • Alexey Feldgendler: “staging area” translates better into Russian (than precommit)
  • Alexei Sholik: “staging area” is better
  • Zbigniew JÄ™drzejewski-Szmek: “staging area” is better
  • Ævar ArnfjörĂ° Bjarmason: In Icelandic “index/stage” is translated to “the commit area”
  • Sebastien Douche: “stage” is better than “cache”/”index”
  • Thiago Farina: “precommit” is better
  • Mark Lodato: “staging” is the most appropriate
  • Philip Oakley: “staging area” is OK
  • Matthieu Moy: something needs to be done
  • William Swanson: “index” is a terrible name
  • Ping Yin: +1 for staging area
  • Hilco Wijbenga: I enthusiastically support “staging area”
  • Lars Vogel: “staging area” is better
  • David A. Wheeler: “staging area” is less confusing


Even though I started to contribute again in 2019 and I managed to land 81 patches, Junio once again has decided to ignore all my patches.

I decided to update and cleanup the patches from 2013, if you want to give them a try you can do so on my fork: git-fc. It’s just 7 patches that add a new command and useful options:

  • git stage
  • git unstage
  • git stage --add
  • git stage --remove
  • git stage --diff
  • git stage --edit

Additionally a description of the staging area is added to git help stage. What is missing from the 2013 patches are all the --stage options that are basically aliases to --cached. It is mostly legwork but I will eventually add those too.

Even though the Git project is not a democracy, you can cast your vote on this poll: Should git officially use the term “staging area” instead of “the index”? But if you want this to be done there’s only one way: send an email to the Git mailing list and try to convince Junio C Hamano.

Freedom of speech in online communities

I have debated freedom of speech countless times, and it is my contention that today (in 2021) the meaning of that concept is lost.

The idea of freedom of speech didn’t exist as such until censorship started to be an issue, and that was after the invention of the printing press. It was after people starting to argue in favor of censorship that other people started to argue against censorship. Freedom of speech is an argument against censorship.

Today that useful meaning is pretty much lost. Now people wrongly believe that freedom of speech is a right, and only a right, and worse: they equate freedom of speech with the First Amendment, even though freedom of speech existed before such law, and exists in countries other than USA. I wrote about this fallacy in my other blog in the article: The fatal freedom of speech fallacy.

The first problem when considering freedom of speech a right is that it shuts down discussion about what it ought to be. This is the naturalistic fallacy (confusing what is to what ought to be). If we believed that whatever laws regarding cannabis are what we ought to have, then we can’t discuss any changes to the laws, because the answer to everything would be “that’s illegal”. The question is not if X is illegal currently, the question is should it? When James Damore was fired by Google for criticizing Google’s ideological echo chamber, a lot of people argued that Google was correct in firing him, because it was legal, but that completely misses the point: the fact that something is legal doesn’t necessarily mean it’s right (should be illegal).

Today people are not discussing what freedom of speech ought to be.

Mill’s argument

In the past people did debate what freedom of speech ought to be, not in terms of rights, but in terms of arguments. The strongest argument comes from John Stuart Mill which he presented in his seminal work On Liberty.

Mill’s argument consists on three parts:

  1. The censored idea may be right
  2. The censored idea may not be completely wrong
  3. If the censored idea is wrong, it strengthens the right idea

It’s obvious that if an idea is right, society will benefit from hearing it, but if an idea is wrong, Mill argues that it still benefits society.

Truth is antifragile. Like the inmune system it benefits from failed attacks. Just like a bubble boy which is shielded from the environment becomes weak, so do ideas. Even if an idea is right, shielding it from attacks makes the idea weak, because people forget why the idea was right in the first place.

I often put the example of the idea of flat-Earth. Obviously Earth is round, and flat-Earthers are wrong, but is that a justification for censoring them? Mill argues that it’s not. I’ve seen debates with flat-Earthers, and what I find interesting are the arguments trying to defend the round Earth, but even more interesting are the people that fail to demonstrate that the Earth is round. Ask ten people that you know how would they demonstrate that the Earth is round. Most would have less knowledge about the subject than a flat-Earther.

The worst reason to believe something is dogma. If you believe Earth is round because science says so, then you have a weak justification for your belief.

My notion of the round Earth only became stronger after flat-Earth debates.

Censorship hurts society, even if the idea being censored is wrong.

The true victim

A common argument against freedom of speech is that you don’t have the right to make others listen to your wrong ideas, but this commits all the fallacies I mentioned above, including confusing the argument of freedom of speech with the right, and ignores Mill’s argument.

When an idea is being censored, the person espousing this idea is not the true victim. When the idea was that Earth was circling the Sun (and not the other way around as it was believed), Galileo Galilei was not the victim: he already knew the truth: the victim was society. Even when the idea is wrong, like in the case of flat-Earth, the true victim is society, because by discussing wrong ideas everyone can realize by themselves precisely why they are wrong.

XKCD claims the right to free speech means the government can't arrest you for what you say.
XKCD doesn’t know what freedom of speech is

The famous comic author Randall Munroe–creator of XKCD–doesn’t get it either. Freedom of speech is an argument against censorship, not a right. The First Amendment came after freedom of speech was already argued, or in other words: after it was already argued that censorship hurts society. The important point is not that the First Amendment exists, the important point is why.

This doesn’t change if the censorship is overt, or the idea is merely ignored by applying social opprobrium. The end result for society is the same.

Censorship hides truth and weakens ideas. A society that holds wrong and weak ideas is the victim.

Different levels

Another wrong notion is that freedom of speech only applies in public spaces (because that’s where the First Amendment mostly applies), but if you follow Mill’s argument, when Google fired James Damore, the true victim was Google.

The victims of censorship are at all levels: society, organization, group, family, couple.

Even at the level of a couple, what would happen to a couple that just doesn’t speak about a certain topic, like say abortion?

What happens if the company you work for bans the topic of open spaces? Who do you think suffers? The people that want to criticize open spaces, or the whole company?

The First Amendment may apply only at a certain level, but freedom of speech, that is: the argument against censorship, is valid at every level.

Online communities

Organizations that attempt to defend freedom of speech struggle because while they want to avoid censorship, some people simply don’t have anything productive to say (e.g. trolls), and trying to achieve a balance is difficult, especially if they don’t have a clear understanding of what freedom of speech even is.

But my contention is that most of the struggle comes from the misunderstandings about freedom of speech.

If there’s a traditional debate between two people, there’s an audience of one hundred people, and one person in the audience starts to shout facts about the flat-Earth, would removing that person from the venue be a violation of freedom of speech? No. It’s just not part of the format. In this particular format an audience member can ask a question at the end in the Q&A part of the debate. It’s not the idea that is being censored, it’s the manner in which the idea was expressed that is the problem.

The equivalent of society in this case is not hurt by a disruptive person being removed.

Online communities decide in what format they wish to have discussions in, and if a person not following the format is removed, that doesn’t hide novel ideas nor weakens existing ideas. In order words: the argument against censorship doesn’t apply.

But in addition the community can decide which topics are off-topic. It makes no sense to talk about flat-Earth in a community about socialism.

But when a person is following the format, and talking about something that should be on-topic, but such discussion is hindered either by overt censorship (e.g. ban), or social opprobrium (e.g. downvotes), then it is the community that suffers.

Ironically when online communities censor the topic of vaccine skepticism, the only thing being achieved is that the idea becomes weak, that is: the people that believe in vaccines do so for the wrong reasons (even if correct), so they become easy targets for anti-vaxxers. In other words: censorship creates the exact opposite of what it attempts to solve.

Online communities should fight ideas with ideas, not censorship.

How a proper git commit is made (by a git developer)

Throughout my career of 20 years as an open source software developer I’ve seen all kinds of projects with all kinds of standards, but none with higher standards as the Git project (although the Linux kernel comes pretty close).

Being able to land a commit on git.git is a badge of honor that not all developers would be able to get, since it entails a variety of rare skills, like for example the ability to receive criticism after criticism, and the persistence to try many times–sometimes reaching double-digit attempts, not to mention the skill necessary to implement the solution to a problem nobody else saw before–or managed to implement properly–in C, and sometimes the creativity to come up with alternative solutions that address all the criticism received previously.

The purpose of this post is not to say “this is why Git is better than your project” (the Git project has many issues), the purpose is to showcase parts of a fine-tuned development process so you as a developer might consider adopting some yourself.


The first part of the process (if done correctly) is inevitably discussion. The git mailing list is open, anyone can comment on it, and you don’t even need to be subscribed, just send a mail to the address and you will be Cc’ed in all the replies.

This particular story starts with a mail by Mathias Kunter in which he asks why git push fails with certain branches, but only if you don’t specify them. So for example if you are in a branch called “topic”, this would fail:

git push

But not this:

git push origin topic

Why? His question is a valid one.

The first reply comes from me, and I explain to him how the code views the two commands, for which a basic understanding of refspecs is needed. Briefly, a refspec has the form of src:dst, so “test:topic” would take the “test” branch and push it to the remote as “topic”. In the first command above, the refspec would be the equivalent of “topic:” (no destination), while the second command would be “topic:topic”. In other words: git does’t assume what name you want as destination.

Did I know this from the top of my head? No. I had to look at the code to understand what it’s doing before synthesizing my understanding in as succinct as a reply as I possibly could write. I often don’t look at the official documentation because I find it very hard to understand (even for experts), and it’s often inaccurate, or ambiguous.

Notice that I simply answered his question of why the first command fails, and in addition I offered him a solution (with push.default=current git would assume the name of the destination to be the same as the source), but at no point did I express any value judgement as of what was my opinion of what git ought to actually do.

Mathias thanked me for my reply, and pushed back on the solution to use the “current” mode because he thought the “simple” mode (which is the default), should behave the same way in this particular case. For his argument he used the documentation about the simple mode:

When pushing to a remote that is different from the remote you normally pull from, work as current.

This is where the nuance starts to kick in. If the “topic” branch has an upstream branch configured (e.g. “origin/topic”), then git push would behave the same in both the “simple” and “current” modes, however, if no upstream branch is configured (which is usually the case), then it depends on the remote. According to Mathias, if he has no upstream configured, then there’s no “remote that is different from the remote you normally pull from” (the remote you normally pull from in this case is “origin”, because the upstream branch is “origin/topic”), so “simple” should work like “current”.

In my opinion he is 100% right.

Additionally, I have to say we need more users like Mathias, who even though he knows how to fix the issue for himself, he is arguing that this should be fixed for everyone.

Elijah Newren suggested that perhaps the documentation could be changed to explain that this only happens when “you have a remote that you normally pull from”, in other words: when you have configured an upstream branch. But this doesn’t make sense for two reasons. If you don’t have configured an upstream branch, then the other mode that would be used is “upstream”, but “upstream” fails if you don’t have configured an upstream branch, so it would always fail. Secondly, the reason why it doesn’t always fail is that the former is not possible: when you don’t have configured an upstream branch, “origin” is used, therefore you always have a “remote that you normally pull from”.

I explained to Elijah that the remote is never null, since by default it’s “origin”, and suggested a trivially small modification to the documentation to mention that (since even experts like him miss that), but he suggested an even bigger one:

If you have a default remote configured for the current branch and are pushing to a remote other than that one (or if you have no default remote configured and are pushing to a remote other than ‘origin’), then work as ‘current’.

Oh boy! I cannot even being to explain why I find this explanation so wrong on so many levels, but let me start by saying I find this completely unparseable. And this is the biggest problem the official git documentation has: it’s simply translating what the code is doing, but if the code is convoluted, then the documentation is convoluted as well. I shred this paragraph piece by piece and showed why changing the logic would make it much more understandable.

But at this point I got tired of reading the same spaghetti code over and over again. It’s not just that I found the code hard to follow, it’s that I wasn’t actually sure of what it was doing. And so my reorganization patch series began.

The series

I am of the philosophy of code early, and code often. I don’t believe in design as separate from code, I believe the design should evolve as the code evolves. So I just went ahead and wrote the thing. The first version of my patch series reorganized the code so that the “simple” mode was not defined in terms of “current” and “upstream”, but as a separate mode, and then once it became clear what “simple” actually did, redefine “current” in terms of “simple”, rather than the other way around. It consisted of 11 patches:

  1. push: hedge code of default=simple
  2. push: move code to setup_push_simple()
  3. push: reorganize setup_push_simple()
  4. push: simplify setup_push_simple()
  5. push: remove unused code in setup_push_upstream()
  6. push: merge current and simple
  7. push: remove redundant check
  8. push: fix Yoda condition
  9. push: remove trivial function
  10. push: flip !triangular for centralized
  11. doc: push: explain default=simple correctly

I could describe each one of these patches in great detail, and in fact I did: in the commit messages of each of these patches, but just to show an example of the refactorization these patches do, let’s look at patch #7, #8, and #9.

push: remove redundant check

 static int is_workflow_triangular(struct remote *remote)
-	struct remote *fetch_remote = remote_get(NULL);
-	return (fetch_remote && fetch_remote != remote);
+	return remote_get(NULL) != remote;

There is no need to do two checks: A && A != B, because if A is NULL, then NULL != B (B is never NULL), so A != B suffices.

push: fix Yoda condition

 static int is_workflow_triangular(struct remote *remote)
-	return remote_get(NULL) != remote;
+	return remote != remote_get(NULL);

There’s a lot of Yoda conditions in the git code, and it’s very hard for many people to parse what that code is supposed to do in those situations. Here we want to check that the remote we are pushing to is not the same as the remote of the branch, so the order is the opposite of what’s written.

push: remove trivial function

-static int is_workflow_triangular(struct remote *remote)
-	return remote != remote_get(NULL);
 static void setup_default_push_refspecs(struct remote *remote)
 	struct branch *branch = branch_get(NULL);
-	int triangular = is_workflow_triangular(remote);
+	int triangular = remote != remote_get(NULL);

Now that the code is very simple, there’s no need to have a separate function for it.

The rest

Notice that there’s absolutely no functional changes in the three patches above: the code after the patches ends up doing the exactly same as before. The same applies for all the other patches.

There’s three things that I think are important of the overall result: 1. the new function setup_push_simple is a standalone function, so to understand the behavior of the “simple” mode, all you have to do is read that function, 2. the existing function is_workflow_triangular is unnecessary, has the wrong focus, and is inaccurate, and 3. now that it’s easy to follow setup_push_simple, the documentation becomes extremely clear:

pushes the current branch with the same name on the remote.

If you are working on a centralized workflow (pushing to the same repository you pull from, which is typically origin), then you need to configure an upstream branch with the same name.

Now we don’t need to argue about what the “simple” mode does, there’s no confusion, and the documentation while still describing what the code does is easy to understand by anyone, including the original reporter: Mathias.

Great. Our work is done…

Not so fast. This is just the first step. Even though this patch series is perfectly good enough, in the Git project the first version is very rarely accepted; there’s always somebody with a comment.

The first comment comes from Elijah, he mentioned that in patch #3 I mentioned that I merely moved code around, but that wasn’t strictly true, since I did remove some dead code too, and that made it harder to review the patch for him. That wasn’t his only comment, additionally he stated that in patch #4 I should probably add a qualification to explain why branch->refname cannot be different from branch->merge[0]->src, and in patch #10 he said the commit message seemed “slightly funny to [him]” but he agreed it made the code easier to read, and finally in the last patch #11 which updates the documentation:

Much clearer. Well done.

I replied to all of Elijah’s feedback mostly stating that I’m fine with implementing his suggestions, but additionally I explained the reasoning behind changing “!triangular” to “centralized”, and that’s because I heard Linus Torvalds state that the whole purpose of git was to have a decentralized version control system so it makes no sense to have a conditional if (triangular) when that conditional is supposed to be true most of the time.

Additionally Bagas Sanjaya stated that we was fine with the changes to the documentation, since the grammar was fine.


OK, so the first version seemed to be a success in the sense that everyone who reviewed it didn’t find any fatal flaws in the approach, but that doesn’t mean the series is done. In fact, most of the work is yet ahead.

For the next version I decided to split the patch series into two parts. The first part includes the patches necessary to reach the documentation update which was the main goal, and the second part would be everything that makes the code more readable but which is not necessary for the documentation update. For example changing !triangular to centralized is a nice change, but not strictly necessary.

Now, v2 requires more work than v1 because not only do I have to integrate all the suggestions from Elijah, but I have to do it in a way that it’s still a standalone patch series, so anyone who chooses to review v2 but hasn’t seen v1, can understand it. So essentially v2 has to be recreated.

This is why git rebase is so essential, because it allows me to choose which commits to update, and the end result would look like all the suggestions from Elijah had always been there.

But not only that, I have to update the description of the series (the cover letter in git lingo), to explain what the patch series does for the people that haven’t reviewed it before, and in addition explain what changed from v1 to v2 for the people that have already reviewed the first version.

This is where one tool which I think is mostly unknown comes into play: git rangediff. This tool compares two versions of a patch series (e.g. v1 and v2), and then generates a format similar to a diff delineating what changed. It takes into consideration all kinds of changes, including changes to the commit message of every commit, and if there are no changes either in the code or the commit message, that’s shown too.

So essentially people who have already reviewed v1 can simply look at the rangediff for v2 and based on that figure out if they are happy with the new version. The reason why it’s called rangediff is because it receives two ranges as arguments (e.g. master..topic-v1 master..topic-v2).

This time the result was 6 patches. If you check the rangediff you can see that I made no changes in the code, whoever, I changed some of the commit messages, and 5 patches were dropped.

Let’s look at one change from the rangediff for patch #3.

3: de1b621b7e ! 3: d66a442fba push: reorganize setup_push_simple()
    @@ Metadata
      ## Commit message ##
         push: reorganize setup_push_simple()
    -    Simply move the code around.
    +    Simply move the code around and remove dead code. In particular the
    +    'trivial' conditional is a no-op since that part of the code is the
    +    !trivial leg of the conditional beforehand.
         No functional changes.
    +    Suggestions-by: Elijah Newren <>
         Signed-off-by: Felipe Contreras <>
      ## builtin/push.c ##

It should be pretty straightforward to see what I changed to the commit message.

The second part is what is most interesting. Not only does it includes better versions of the 5 patches that were dropped from the previous series, but it includes 10 patches more. So this is something that has to be reviewed from scratch, since it’s completely new.

  1. push: create new get_upstream_ref() helper
  2. push: return immediately in trivial switch case
  3. push: reorder switch cases
  4. push: factor out null branch check
  5. push: only get the branch when needed
  6. push: make setup_push_* return the dst
  7. push: trivial simplifications
  8. push: get rid of all the setup_push_* functions
  9. push: factor out the typical case
  10. push: remove redundant check
  11. push: fix Yoda condition
  12. push: remove trivial function
  13. push: only get triangular when needed
  14. push: don’t get a full remote object
  15. push: rename !triangular to same_remote

I’m not going go through all the different attempts I made locally to arrive to this series, but it was way more than one. I used git rebase over and over to create a sequence of commits to arrive to a clean result, where each commit implemented a logically-independent change that I thought would be easy to review, would not break previous functionality, and would make the code easier to read from the previous state.

The highlights from this series is that I added a new function get_upstream_ref that would be used by both the “simple” and “upstream” modes, but additionally by reorganizing the code I was able to remove all the setup_push_* functions, including the one I introduced: setup_push_simple. By moving everything into the same parent function now it’s easy to see what all the modes do, not only the “simple” mode. Additionally, instead of changing !triangular to centralized, I decided to change it to same_remote. Upon further analysis I realized that you could be in a triangular workflow and yet be pushing into the same repository (that you pull from), and the latter is what the code cared about, not triangular versus centralized. More on that later.

This time much more people commented: Mathias Kunter, Philip Oakley, Ævar Arnfjörð Bjarmason, and Junio C Hamano himself (the maintainer).

Philip Oakley pointed out that the documentation currently doesn’t explain what a triangular workflow is, but that isn’t a problem of this patch series. Ævar ArnfjörĂ° Bjarmason provided suggestions for the first part of the series, but those were not feasible, and overridden by the second part of the series. Mathias Kunter said he actually preferred the explanation of the “simple” mode provided in another part of the documentation, but he incorrectly thought my patches changed the behavior to what he suggested, but that wasn’t the case.

Junio provided comments like:

  • Much cleaner. Nice.
  • Simpler and nicer. Good.
  • Nice.
  • Nicely done.
  • Well done.

If you want to take a look at the change from the most significant patch, check: push: get rid of all the setup_push_* functions.

His only complaints were my use of the word “hedge” (which he wasn’t familiar with), of “move” (when in fact it’s duplicated), “reorder” (when in his opinion “split” is better), and that he prefers adding an implicit break even if the last case of a switch is empty.

I explained to Junio that the word “hedge” has other meanings:

* to enclose or protect with or as if with a dense row of shrubs or low trees: ENCIRCLE

* to confine so as to prevent freedom of movement or action

Some synonyms: block, border, cage, confine, coop, corral, edge, fence, restrict, ring, surround.

So my use of it fits.

For most of his comments I saw no issue implementing them as most didn’t even involve changing the code, but additionally he suggested changing the order so the rename of triangular is done earlier in the series. While I have no problem doing this and in fact I already tried in an interim version of the series, I knew it would entail resolving a ton of conflicts, and worse than that, if later on somebody decides they don’t like the name same_remote I would have to resolve the same amount of conflicts yet again. I said I would would consider this.

Moreover, I recalled the reason why I chose same_remote instead of centralized, and I explained that in detail. Essentially it’s possible to have a triangular workflow that is centralized, if you pull and push to different branches but of the same repository. The opposite of triangular is actually two-way, not centralized.

centralized = ~decentralized
triangular = ~two-way

This triggered a discussion about what actually is a triangular workflow, and that’s one of the benefits of reviewing patches by email: discussions turn into patches, and patches turn into discussions. In total there were 36 comments in this thread (not counting the patches).


For the next version I decided not just to move the same_remote rename sooner in the series as Junio suggested, but actually in the very first patch. Although a little risky, I thought there was a good chance nobody would have a problem with the name same_remote, since my rationale of triangular versus two-way seemed to be solid.

Part 1:

  1. push: rename !triangular to same_remote
  2. push: hedge code of default=simple
  3. push: copy code to setup_push_simple()
  4. push: reorganize setup_push_simple()
  5. push: simplify setup_push_simple()
  6. push: remove unused code in setup_push_upstream()
  7. doc: push: explain default=simple correctly

Part 2:

  1. push: create new get_upstream_ref() helper
  2. push: return immediately in trivial switch case
  3. push: split switch cases
  4. push: factor out null branch check
  5. push: only get the branch when needed
  6. push: make setup_push_* return the dst
  7. push: trivial simplifications
  8. push: get rid of all the setup_push_* functions
  9. push: factor out the typical case
  10. push: remove redundant check
  11. push: remove trivial function
  12. push: only check same_remote when needed
  13. push: don’t get a full remote object

Moving the same_remote patch to the top changed basically the whole series, as can be seen from the rangediff, but the end result is exactly the same, except for one little break.

Junio mentioned that now the end result matches what he had prepared in his repository and soon included then in his “what’s cooking” mails.

* fc/push-simple-updates (2021-06-02) 7 commits

 Some code and doc clarification around "git push".

 Will merge to 'next'.

* fc/push-simple-updates-cleanup (2021-06-02) 13 commits

 Some more code and doc clarification around "git push".

 Will merge to 'next'.

So there it is, unless somebody finds an issue, they will be merged.

To be honest this is not representative of a typical patch series. Usually it takes more than 3 tries to got a series merged, many more. And it’s also not common to find so many low-hanging fruit.

switch (push_default) {
    if (!same_remote)
    if (strcmp(branch->refname, get_upstream_ref(branch, remote->name)))
        die_push_simple(branch, remote);

    if (!same_remote)
        die(_("You are pushing to remote '%s', which is not the upstream of\n"
              "your current branch '%s', without telling me what to push\n"
              "to update which remote branch."),
            remote->name, branch->name);
    dst = get_upstream_ref(branch, remote->name);


refspec_appendf(&rs, "%s:%s", branch->refname, dst);
Documentation/config/push.txt | 13 +++----
builtin/push.c | 91 +++++++++++++++++++------------------------
2 files changed, 47 insertions(+), 57 deletions(-)

The commit

These two patch series cleaned up the code and improved the documentation, but they didn’t actually change any functionality. Now that it’s clear what the code is actually doing, Mathias Kunter’s question is easily answered: why is git push failing with “simple”? Because get_upstream_ref always fails if there’s no configured upstream branch, but should it?

All we have to do is make it not fail. In other words: if there’s no upstream branch configured, just go ahead, and thus it would behave as “current”.

Then this:

If you are working on a centralized workflow (pushing to the same repository you pull from, which is typically origin), then you need to configure an upstream branch with the same name.

Becomes this:

If you are working on a centralized workflow (pushing to the same repository you pull from, which is typically origin), and you have configured an upstream branch, then the name must be the same as the current branch, otherwise this action will fail as a precaution.

Which makes much more sense.

Right now pushing to master works by default:

git clone $central .
git push

But not pushing to a new topic branch:

git clone $central .
git checkout -b topic
git push

It makes no sense to allow pushing “master” which is much more dangerous, but fail pushing “topic”.

This is the patch that changes the behavior to make it sensible, where I explain all of the above, and the change in the code is simple, basically don’t die().

It took 47 days from the moment Mathias sent his question to the point where Junio merged my patches to master, but we are close the finish line, right?

No, we are not.

Convincing Junio of some code refactoring is relatively easy, because it’s simply a matter of showing that 2 + 2 = 4; it’s a technical matter. But convincing him of what git ought to do is much more difficult because it requires changing his opinion using arguments, this part is not technical, but rhetorical.

For reference, even though several years ago I managed to convince everyone that @ is a good shortcut for HEAD, Junio still complains that it “looks ugly both at the UI level and at the implementation level“, so for some reason my arguments failed to convince him.

So can Junio be convinced of this obvious fix? Well, everything is possible, but I wouldn’t be holding my breath, especially since he has decided to ignore me and all my patches.

Either way the patch is good. It’s simple thanks to the many hours I spent cleaning up the code, and benefited from all the input from many reviewers. And of course if anyone still has any comments on it they are free to state them on the mailing list and I’d be happy to address them.

Why is git pull broken?

A lot of people complained that my previous post–git update: the odyssey for a sensible git pull–was too long (really? an article documenting 13 years of discussions was long?), and that a shorter version would be valuable. The problem is that the short version is actually too short:

Do not use git pull.

That’s it, really.

But why? Even thought it’s obvious for me, and many other developers why git pull is broken and should not be used by most users, presumably a lot of people don’t know that, since they continue to use it.

Here it is.


Let’s start by explaining where git pull is not broken.

It was created for maintainers; when you send a pull request, a maintainer is supposed to run git pull on the other side. For this git pull works perfectly fine.

If you are a developer (non-maintainer), and use a topic branch workflow, then you don’t even need git pull.

That leaves developers who work on a centralized workflow (e.g. trunk-based development). The rest of the article is with them in mind, who unfortunately are the vast majority of users, especially novices.

It creates merge commits

What most people want to do is synchronize their local branch (e.g. “master”) with the corresponding remote branch (e.g. “origin/master”), in particular because if they don’t, git push fails with:

To origin
! [rejected] master -> master (non-fast-forward)
error: failed to push some refs to 'origin'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. Integrate the remote changes (e.g.
hint: 'git pull …') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

OK, so we need to “integrate” the remote changes with git pull, so presumably git pull is the mirror of git push, so it makes sense.

Except it’s not, and git pull was never designed with this use case in mind. The mirror of git push is git fetch which simply pulls the remote changes locally so you can decide later on how to integrate those changes. In Mercurial hg pull is the equivalent of git fetch, so in Mercurial hg push and hg pull are symmetric, but not in git.

At some point in time a path to make git pull be symmetric to git push was delineated, but the maintainer of the Git project considered it “mindless mental masturbation“, so forget about it.

After you have pulled the changes with git fetch, then there’s two possibilities: fast-forward and diverging.


A fast-forward is simple; if the local branch and the remote branch have not diverged, then the former can be easily updated to the latter.

In this case “master” (A) can be fast-forwarded to “origin/master” (C) (only possible if the branches have not diverged).


However, if the branches have diverged, it’s not that easy:

In this case “master” (D) has split of “origin/master” (C) so a new commit (E) is needed to synchronize both.


There’s another more advanced possibility if the branches have diverged:

In this case the diverging commit of the local branch “master” (D) is recreated on top of “origin/master” (C) so the resulting history is linear (as if it never had diverged in the first place and the base of the local branch was C).


OK, so if the branches have diverged you have two options (merge or rebase), which one should you pick? The answer is: it depends.

Some projects prefer a linear history, in those cases you must rebase. Other projects prefer to keep the history intact, so it’s fine if you merge. If you don’t do many changes then most of the time you can fast-forward.

Most experts would do a rebase, but if you are new to git a merge is easier.

We are still nowhere near a universal answer, and what do most people do when the answer is not clear? Nothing. By default git pull does a merge, so that’s what most people end up doing by omission, but that’s not always right.

So that’s the first problem: git pull creates a merge commit by default, when it shouldn’t. People should be doing git fetch instead and then decide whether to merge or rebase if the branches have diverged (a fast-forward is not possible).

Merges are created in the wrong order

Let’s say the project allows merges, in that case it’s OK to just do git pull (since the default action is merge) right?


This is what git pull does by default: a merge commit. However, it’s merging “origin/master” (C) into “master” (D), but upstream is the remote repository, not the local one.

The order is wrong:

This is a correct merge: the local “master” (D) is merged into the remote “origin/master” (C). A similar result would happen if you had created a topic branch for D, and then merged that into “master”

In git, merge commits are commits with more than one parent, and the order matters. In the example above the first parent of E is C, and the second one is D. To refer to the first parent you do master^1, the second is master^2.

Proper history

Does it really matter which is the first parent? Yes it does.

Correct vs. incorrect order

In the correct history (left) it’s clear how the different topic branches are integrated into “master” (blue). Visualization tools (e.g. gitk) are able to represent such history nicely. Additionally you can do git log --first-parent to traverse only the main commits (blue).

In the incorrect history (right) the merges are a mess. It’s not clear what merged into what, visualization tools will show a mess, and git log --first-parent will traverse the wrong commits (green ones).

Better conflict resolution

If that wasn’t enough, at the time of resolving conflicts it makes more sense to think of integrating your changes to upstream (“origin/master”) rather than the other way around. Mergetools like meld would present the flow correctly: from right to the middle.


Update: In the original version of the article I only concentrated on the facts, and I didn’t include the opinion of other developers, but since there seems to be a lot of people ignoring the facts, and distrusting my judgement, I’ve decided to list some of the developers who agree git pull isn’t doing what it should be doing (at least by default, for non-maintainers).


So every time you do a merge, you do it wrong. The only way to use git pull correctly is to configure it to always do a rebase, but since most newcomers don’t know what a rebase is, that’s hardly a universal solution.

The proper solution is my proposal for a git update command that creates merge commits with the correct order of the parents, does only fast-forwards by default, and can be properly configured.

So there you have it. Now you know why git pull is definitely broken and should not be used. It was never intended to be used by normal users, only maintainers.

Do git fetch instead, and then decide how to integrate your changes to the remote branch if necessary.

git update: the odyssey for a sensible git pull

While git pull kind of works, over the years a slew of people have pointed out many flaws, and proposed several fixes. Some have even suggested that newcomers should be discouraged from using the command (as they often use it wrongly), and others to remove it entirely.

I spent several days digging through the entire history of the Git mailing list in order to document all the discussions related to git pull and its default mode. It’s a long story, so grab some coffee (a whole pot of it).

Update: I wrote a shorter article that only talks about why git pull is broken.


The entire post is going to be talking about fast-forward merges so I’ll attempt to explain them briefly.

Basically when your history and the remote history have diverged, like:


a fast-forward merge is not possible: you need to create a real merge that will have two parents: C and B.

The fast-forward case is much simpler:


While you can create a merge, in this case it’s not necessary: the “master” branch pointer can simply move forward to B.

The whole story is about what happens when a novice does git pull when a fast-forward is not possible.


The story begins in 2008 with a request from Sverre Hvammen Johansen to add the --ff-only option to git merge (then called git-merge), since in his workflow a certain branch was only allowed to fast-forward.

A few weeks later we get the first patch to implement the --ff=only merge strategy. Unfortunately the patch was more complicated that it needed to be, and it wasn’t merged.

Sverre gave it another go a few months later. Junio C. Hamano (the maintainer) said “with vastly improved documentation and justification compared to the previous rounds, I am beginning to actually like this series”. Unfortunately even though Sverre sent several rounds, they were never merged. Good attempt by Sverre.

Another attempt was made at the beginning of 2009 by Yuval Kogman, this time only adding the --ff-only option. What it did was simple: if a fast-forward is possible then do it, if not then fail. Everyone was in favor. Initial versions of the patch had some issues, those were fixed, but unfortunately the last iteration received no feedback. Thanks for trying though.

A few months later, the question was asked “how do you fast-forward your local dev to sync up with origin/dev?“. If the previous patch was merged, the answer would have been simple: git merge --ff-only, but since that wasn’t the case, Junio replied with:

git push . origin/dev dev

This is not user-friendly if you ask me, and later on turned out it wasn’t even correct.

Months later Randal L. Schwartz brings up the same issue and demands a simple command that can be told to users in the #git IRC channel who constantly asked this question. It is pointed out such thing has been implemented twice already, but never merged.

To which Junio replied:

Do you mean twice they were both found lacking, substandard, useless, uninteresting, buggy, incorrect, or all of the above?

Geez! Eyvind was just trying to help. They clearly were not useless nor uninteresting, since many people had rallied in favor of such a feature, and yet more joined saying +1, even in this very thread.

The first big one

At the end of the year appears the first big discussion about the asymmetry of pull and push. Thomas Rast makes very bold suggestions, like deprecating git fetch completely. These drastic changes would have worked using the suggestion of Wesley J. Landaker to add a configuration, and deprecation warnings.

This proposal was not taken seriously; Junio considered the suggestion to turn git pull into git fetchmindless mental masturbation“, and did not see how git pull and git push could ever be symmetric.

On the other hand an interesting piece of data was provided by Björn Steinbrink. He investigated 10 days of IRC activity in the #git channel and found plenty of users confused by git pull, many expected not only there to be a --ff-only option, but this behavior to be the default.

Aside from --ff-only, other main ideas were:

  1. Add a --merge option to git pull
  2. Add a pull.merge configuration

Björn Gustavsson used this big thread as inspiration to implement --ff-only for both git merge and git pull.

Finally. After three implementations, Junio accepted the idea and merged the patches in October 2009.

Begging for a configuration

Not long after the introduction of git pull --ff-only (6 months later), the first request for a configuration to make it the default appears in 2010. Peter Kjellerstedt argues that pull.options and merge.options configurations would fit him well. Not only that, but he thinks --ff-only should be the default. Nothing comes of this.

In 2013 people in the Linux Kernel Mailing List found issues with git pull and Linus Torvalds brought them to the Git mailing list. Linus argued that depending on whether you are a maintainer or a submaintainer you might want to do either --ff-only, or a true merge. An alias could do the trick (for pull --ff-only), but asked the Git community for ideas.

One of the interesting ideas came from Theodore Ts’o; have a per-repository configuration. Junio did follow this thread closely, but nothing happened.

Only days later, Ramkumar Ramachandra introduces the idea of a pull.default configuration which ideally could be specified on a repository-specific basis. Ramkumar doesn’t seem to have pursued his idea further.

Second big one

A couple of months later Andreas Krey triggered a huge discussion when he pointed out that the order of the parents when doing a merge was wrong. What does that mean?

So this is what git pull does:

The new merge commit D merges B into C, this is fine if you are a maintainer, and instead of “origin/master” you are pulling something like “john/topic” into your “master” branch, in that case your “master” (C) is the first parent. But it’s not OK for normal developers (i.e. non-maintainers).

A normal developer would want to do the opposite: to merge his “master” (C) to the upstream “origin/master” (B), so the first parent is “origin/master” (B). This is not a trivial thing. The order of the parents completely changes how history is visualized, and even traversed, plus it’s important when resolving conflicts, along many other things.

Junio’s response:

Don’t do that, then.

For the record: there’s no way to tell git pull not to reverse the parents, and the four people that expressed an opinion said they would rather have the order reversed (John Szakmeister, Jeremy Rosen, John Keeping, and of course Andreas Krey).

Junio said he would not be opposed to an option to reverse the order of the parents, but as you’ll see later on, when I sent a proposal to do precisely that he ignored it.

Yet again another suggestion to make --ff-only the default, this time from John Keeping. This round I chimed in, and so did many others. Junio was not convinced this was the way to go, because integrators would be affected:

It is not about passing me, but about not hurting users like kernel folks we accumulated over 7-8 years.

Since I have worked on the Linux kernel, I explained to Junio that most lieutenants (submaintainers) don’t do git pull; they send pull requests to Linus, so essentially Linus is the only person that does git pull as an integrator (as Junio does). Linus is certainly capable of configuring git pull to something other than --ff-only (which I don’t know why Junio switched for --rebase in his responses).

That is not something I can agree or disagree without looping somebody whose judgement I can trust from the kernel circle ;-).

That somebody was none other than Linus Torvalds, who agreed with us:

It would be a horrible mistake to make “rebase” the default, because it’s so much easier to screw things up that way.

That said, making “[--ff-only]” the default, and then if that fails, saying [a big warning message], [would make sense].

Linus suggested it would be even better to make this configuration per-branch, which is similar to the suggestion of Theodore Ts’o’s idea of making it per-repository.

Eventually, using Mercurial as inspiration, I suggested the following error message:

The pull was not fast-forward, please either merge or rebase.

This seemed to land on Junio, who finally saw why --ff-only made sense:

Initially I said limiting “git pull” to “--ff-only” by default did not make sense, but when we view it that way, I now see how such a default makes sense.

John Keeping came with another good idea by using a hypothetical command git update as an example. This command, unlike git pull would have the correct order of the parents, and would be similar to svn update, and others. While John didn’t propose such command, eventually I did.

After Junio realized the value of --ff-only he event sent a patch integrating Linus’ suggestions and mine. However, I came up with a simpler approach that in my opinion was better, and that triggered another big discussion.

Third big one

My response to Junio’s patch triggered yet another big discussion.

Richard Hansen argued that git pull was straight up dangerous, and he recommended this instead:

git remote update -p
git merge --ff-only @{u}'

He suggested people have that as a git up alias.

In September of 2013 I sent a patch series to add a pull.mode configuration which can be set to merge-ff-only. It was completely ignored, however Junio soon after made a similar proposal on the other thread: pull.integrate with an option of fail back. I argued that my proposal of pull.mode was better, and other people agreed.

Another point I argued is that we needed a deprecation warning before changing the default mode, something like “in the future you would have to choose a merge or a rebase, merging automatically for now”.

Jeff King provided some useful information from GitHub trainers, essentially they didn’t like the rebase option by default because most people don’t know what a rebase is. This is what prompted me to add the following line to the warning:

If unsure, run “git pull --merge“.

v3 of my patch series did receive some feedback, but it got stuck with trivial details. I tried to fix them for v4, but didn’t receive any feedback at all.

Moving on

At the beginning of 2014 David Aguilar sent a couple of patches to add the pull.ff configuration, and Junio simply merged them with no comment.

This ignored all the previous feedback, and doesn’t add a configuration per-repository, or per-branch, but well… At least there’s a configuration now (sort of).

Pull is Evil (first huge one)

In April of 2014 Marat Radchenko pointed out a list of problems 20 newcomers in his organization faced after trying everyday tasks. Most of the problems came from the order of the parents in git pull. Marc Branchaud pretty early on renamed the thread to “Pull is Evil” since in his opinion git pull is pretty much broken, due to the fact that different people use it in different ways.

I think the pull command should be deprecated and quietly retired as a failed experiment.

Junio brought up again the idea of a git update command, which would demarcate the behavior meant for integrators, and normal developers.

Another interesting idea came from W. Trevor King: add an --interactive option (like git add or git rebase) so the user can choose to a) merge, b) rebase, or c) fast-forward.

I pointed that this has been discussed before, everyone agrees the default should change, and I had already sent the patches. Matthieu Moy rejected the notion that everyone was in favor, since he himself wasn’t, but I pointed that he was the only person not in favor:

But in fact, he was in favor, not with changing the default, but with the configuration I proposed. So he wasn’t against all my patches, just one.

Either way I still think Matthieu Moy is wrong. Yes he is right in questioning why ask if to merge or rebase if most newcomers would not want to know at that point in time, but the answer is simple: they don’t have to know, they can simply do git pull --merge and find out later why that --merge is necessary. Exactly the same thing happens with git commit --all.

Pull is Mostly Evil (Pull is Evil part 2)

Marc Branchaud din’t find a good subthread to reply to, so he started yet another thread. In it he suggested that since git pull can only do the right thing if configured properly, it should be an advanced command that must be configured to be used, and when it’s not configured it does nothing by default.

Junio once more pondered about a git update command that would separate the two clearly distinct ways of using git pull.

Being frustrated with the lack of follow-up I made the prediction that the consensus and the proposals will likely be ignored and nothing would change:

And it has been discussed before. If history is any indication, it will be discussed again.

Spoiler: I was right.

Jeff King did not like my comment, and also stated that he wasn’t sure the majority of users would want the parents on a merge done with git pull to be the other way. This despite many big threads started precisely because the order of the parents was wrong, and many developers stating that this was the only reason why they didn’t use git pull, nor did they recommend newcomers to use it. I offered to point him to the many endless discussions in the mailing list.

Richard Hansen wrote a comprehensive description of the issues, and how they could be solved with different new commands: git update and git integrate, as well as other options.

The rest of the discussion was pretty much Richard Hansen and me hashing out different use-cases and ways to deal with them.


The discussion did not go without hiccups. In particular David Kastrup and James Denholm threw some personal attacks at me that had absolutely nothing to do with the discussion. Back then I was much more abrasive that I am today, in many instances throughout the entire history of the mailing list (including these discussions) the present me would have reacted differently, but that doesn’t change the fact that I’ve always focused on the code, not so much the feelings of participants of the discussion.

I quickly ignored them and focused on the code itself.

One more try

Since this issue popped yet again, I sent v5, and a v6 of my pull.mode approach. They did receive some amount of feedback, but in particular there was a comment from Richard Hansen that showed yet another issue with git pull.

Supposing we are currently on the branch “master”, and the upstream branch is “origin/master”.

  1. git pull --rebase (rebase master onto origin/master)
  2. git pull --merge (merge origin/master into master)
  3. git pull --rebase github john (rebase master onto github/john)
  4. git pull --merge github john (merge github/john into master)

Half of these are wrong. #1 is correct, since when we do git pull --rebase we want to update our “master” to “origin/master”, so we rebase on top of, but #2 is wrong, because when we use --merge we want a similar thing: we want to merge “master” to “origin/master”, not the other way around. This is what Andreas Krey pointed out, which started the second big thread. But in addition #3 is wrong too, because when we specify a repository and a branch to pull from, we want that branch to rebase on top of master, not the other way around.

So basically it’s all wrong.


After listening and responding to everyone, all the issues with git pull were clear on my mind, but I realized not everyone had the patience, the stamina, or the time to go through all the responses, so I wrote a summary.

I reduced the main problems with git pull to two:

  1. It does unwanted merges
  2. It does merges with reversed parents

The deeper issue is that git pull is in fact used in two very different ways: to integrate and to update. So when you try to fix the defaults for one, you stumble into unintended consequences for the other. While I do see clearly in my mind how git pull can be disentangled in a way that the defaults would work correctly for everyone, it is pointless if I’m the only one who sees that, because I can’t convince anyone (except perhaps Richard Hansen).

Instead I chose an escape hatch, a completely new command: git update, which would integrate the input from everyone in order to do what git pull can’t do cleanly.

By default git update would be essentially the same as git pull --ff-only (or more like my pull.mode=ff-only), which does solve the main issue: newcomers won’t do real merges by mistake. But it also solves the issue about the order of the parents:

  • git update --merge (merge “master” into “origin/master”)
  • git pull --merge github john (merge “github/john” into “master)

Not only that, but we could fix the order of the parents for rebase too:

  • git update --rebase (rebase “master” onto “origin/master”)
  • git pull --rebase github john (rebase “github/john” onto “master”)

Now everything is fixed. Additionally the default mode for update could be configured with update.mode, and the mode for pull with pull.mode (e.g. update.mode=rebase, pull.mode=merge). Not only that, but this could be configured per-branch (as Linus Torvalds and Theodore Ts’o suggested) with branch.master.updatemode and branch.master.pullmode. This was not only a proposal, I wrote the patches for all of that.

All users agreed my proposed git update was good, as Philippe Vaucher stated:

I know patches from Felipe are sensitive subject and that a lot of people probably simply ignore him entirely, however I think this patch deserves more attention.

No developer commented on it.

Moving on

Nothing resulted from the big Pull is Evil thread, except after it was finished, Jim Garrison asked how would pull do the wrong thing? Junio sent a comprehensive reply and Stephen P. Smith’s proposed to pretty much copy-paste the email from Junio into the list of howtos, and was accepted without any pushback. Not only was the fist version accepted, but Junio himself fixed the patch so no v2 was necessary from Stephen.

Unnecessary merge

At the beginning of 2018 Linus Torvalds once again contacted the mailing list due to an “unnecessary merge” in the Linux kernel. He explained the difference between a merge and a “back-merge”, and why in one you want a fast-forward, but not the other. Unfortunately git doesn’t know which situation we are in.

He even suggested the following configuration:

   git config --global alias.update pull --ff-only

This is precisely what my proposed git update did by default. Linus Torvalds even used the exact same name.

Junio then proceeds to suggest a per-repository configuration pullFF=only, when in fact Linus had already proposed a per-branch configuration back in 2013, and I had already implemented it. Nothing happens.

Why merge?

In 2020 Robert Dailey asks why does git pull use merge by default, he argues most people prefer rebases. Once again Elijah Newren argues that it would be desirable to consider a transition to ff-only by default. Once again Konstantin Tokarev tells Junio that in reality what newbies often do is wrong merges, and that ff-only would be better. Robert Dailey also gets convinced that ff-only is better. Alex Henrie mentions a patch he wrote but was reluctant to send it in order to avoid a big discussion.

The warning

Alex Henrie finally decided to send his patch which adds a warning telling users that running git pull without specifying to either merge or rebase is discouraged. Initially Alex wrote the warning to state that the default behavior would change in a future version of git, but Junio objected to that because there was “no consensus”:

Sorry for not catching this in the earlier round, but I do not think anybody argued, let alone the community came to a concensus on doing, such a strong move. Am I mistaken?

After five rounds of comments from Junio, he accepted the patch.

Interestingly enough this change immediately caused people to ask what it was:

Also, we have this lovely comment on GitHub:

Thanks for confusing everyone instead of just doing the default strategy when no parameter is given.

Imagine when every (CLI) application warns the user about not having each default value within the configuration file or given as parameter, despite the user is OK with (or does not care about) the default behaviour.

And we have a bug report from Wolfgang Fahl completely in German. Roughly translated: the bug is the warning.

Clearly people did not like this warning.

Pick the right default

On November of 2020 VĂ­t Ondruch complained about the warning and stated that the Git project should choose a sensible default and not bother users with a warning.

I’d like to keep my trust in Git upstream, but this is was not trustworthy decision IMO.

This started because of a bug report on Fedora.

The main argument from Junio is that there was no perfect default:

There is no clear “default that would suit everybody”.

I pointed out that this is a false dichotomy because the options are not 1) implement the perfect solution, and 2) do nothing; we have another option 3) implement a better solution than the current one, and we all know what that solution is: ff-only. Plus we had the patches for that.

Jeff King pointed out that now there’s a pull.ff=only configuration, and he didn’t see how my pull.mode=ff-only was different. However, I explained that my version provides a more friendly error message, also works correctly with pull.rebase=true, and in addition allows a default mode that throws a more correct warning than the current one.

Junio proposed to make git pull --rebase honor pull.ff=only and fail if the pull is not fast-forward:

And then making an unconfigured pull.ff to default to pull.ff=only may give a proper failure whether you merge or rebase. I dunno.

This is obviously the wrong thing to do. As I explained to both Junio and Jeff, if you make git pull --rebase respect that, and you make pull.ff=only the default, then git pull --rebase will always fail (unless it’s a fast-forward case).

Once again I stated what should be the default.

  • git pull (fail by default unless it’s a fast-forward)
  • git pull --merge (force a merge [unless it’s a fast-forward,
    depending on pull.ff])
  • git pull --rebase (force a rebase [unless it’s a fast-forward,
    depending on pull.ff])

VĂ­t Ondruch agreed.

Both Jeff and Alex Henrie argued that what my pull.mode does could be achieved with pull.rebase and pull.ff, and that’s the direction Junio wanted to take. I wasn’t convinced, and at this point I started to convert my patches from 2014 which was no easy task because before git pull was written in shell, but now it had been converted to C, so essentially I had to rewrite everything from scratch. Alex started to write his approach too.

Junio welcomed the competition:

Let’s see how well the comparison between two approaches play out.

Pretty quickly I realized what they suggested was not possible. If I removed my pull.mode and made pull.ff=only the default, then git pull --merge would fail. We want git pull fail by default, but not git pull --merge. To make this approach possible we would need to change the semantics of pull.ff=only, and that didn’t seem a clean solution to me.


In order to avoid all the hurdles from 2013 I decided to write the simplest patch possible that would achieve the desired result. Instead of creating a new pull.mode=ff-only, I reused an existing configuration, and named the mode pull.rebase=ff-only. This obviously was not ideal–since it left the interface in an inconsistent state (didn’t fix all the issues my original patch did)–but it was better than the current situation.

Junio responded that the name looked funny:

It looks quite funny for that single variable that controls the way how pull works, whether rebase or merge is used, is pull.REBASE.

Yes Junio, that’s why I suggested pull.mode instead, but that patch was never merged.

Raymond E. Pasco also agreed ff-only was the way to go.

pull.mode revampled

My first attempt to revive my old series recreated most of the same old behavior, but not all. It only received one comment for a tiny correction on the documentation.

Since my patches for pull.mode were ignored again, on my v2 I sent a bunch of improvements that were not 100% related to my approach, and instead of using my proposed pull.mode, I changed the semantics of --ff-only, which was not my preferred solution, and I clearly demarcated that particular patch with “tentative“. Elijah Newren provided a ton of valuable feedback, and I incorporated most of it, but there was one point where I did not agree:

If unsure, run “git pull --merge“.

I believe that escape hatch is useful, even if it undermines the value of the warning because 1) the warning must be temporary anyway 2) people are complaining about the warning already, 3) input from GitHub trainers already showed many people don’t even know what a rebase is, and 4) the merge is happening already anyway.

I explained my point of view to Elijah using an analogy. If a passenger in a plane is ignoring the safety demonstration he is doing something wrong, however, he should be completely free to do it, it’s his volition. The crew should allow him to ignore the demonstration. Similarly git users should be able to ignore the warning, and they can skip it by simply doing git pull --merge.

I argued that even if he didn’t agree on that particular line of that particular patch, the rest of the changes in the series don’t suddenly make the situation worse. He said he didn’t agree.

Junio used my analogy to liken users doing git pull --merge when they shouldn’t to be the same as passengers that should be removed from a plane:

A team member who runs “pull --[no-]rebase” pushes the result, which may be a new history in a wrong shape, back to the shared repository probably falls into a different category. … Or perhaps in the same public hazard category?

I don’t see how one line in a temporary warning can potentially create a “public hazard”. It’s users doing wrong merges what creates problems, and removing one line from the warning wouldn’t magically fix that. The warning–which if we take Stack Overflow and reddit as indication–isn’t even properly understood by many users, and doesn’t stop the code from creating a merge anyway. If these merges did indeed create a “public hazard”, shouldn’t git simply not do them by default? In other words: make ff-only the default.

Sadly I did not think of that argument at that time, I merely responded that what constitutes a public hazard is up to each individual project to determine, not git. Which is true.

On the other hand he said something that was completely untrue:

I hope “his comment” does not refer to what I said. Redefining the semantics of --ff-only (but not --ff=no and friends) to make it mutually incompatible with merge/rebase was what Felipe wanted to do, not me.

That’s not true. I wanted to introduce pull.mode=ff-only, not change the semantics of --ff-only. But as I had already explained before, --ff-only cannot be made the default without changing its semantics. The patch that introduced pull.mode was ignored (again), so I took a shot at making --ff-only work, and the only way is changing its semantics, which I did in a patch clearly labeled as “tentative” because I was not advocating for that.

And to make it crystal clear I repeated it yet again:

I don’t want to change the semantics of --ff-only; I want to introduce a new “pull.mode=ff-only“.

Moreover, I explained once again that this patch series was merely part 1. In part 2 I would introduce pull.mode. And there’s more parts after that.

For some reason Junio didn’t seem to understand what I was proposing. I explained that with pull.mode=ff-only I wanted to introduce a simple error message, but with pull.mode=default a warning to give users a heads up.

I quite don’t get it. They say the same thing. At that point the user needs to choose between merge and rebase to move forward and ff-only would not be a sensible choice to offer the user.

Other people like Jacob Keller did not have any issue understanding my proposal.

After multiple rounds eventually Junio gets my proposal, but now he doesn’t see the point of taking it slow instead of flipping the default to ff-only right away. Junio said that he assumed everyone has already configured pull.rebase therefore nobody would notice the change. I disagreed and pointed why this whole thread started: because a user refused to configure git and expected git to do the right thing–choosing a useful default.

Junio responded as he did at the start:

Which is basically asking for impossible, and I do not think it is a good idea to focus our effort to satisfy such a request in general. There is no useful default that suits everybody in this particular case.

This is once again the Nirvana fallacy. Nobody asked for a perfect solution, merely a better default.

Additionally he argued the pull.mode=ff-only would only be useful for users that don’t use git “for real”:

But for anybody who uses git for real (read: produces his or her own history), it would be pretty much a useless default that forces the user to say rebase or merge every time ‘git pull’ is run.

I mean… didn’t he just argue that he wanted to make ff-only the default right away? Literally in his last reply. Very confusing.

I showed to him a real example of me using git pull --ff-only “for real”. Additionally I asked him in the most polite way I could a question that should be very valid at this point:

How often do you send pull requests?

He didn’t answer.

Since once again Junio stated that he didn’t see a good reason why certain classes of users would want to configure pull.mode=ff-only, I offered to do some mail archaeology which lead to the creation of this blog post. It is obvious this is needed.

Once again Junio misinterpreted what I was proposing, and once again I explained in great detail. This time however Junio correctly identified that without pull.mode the user cannot explicitly ask for the default behavior, and that in my opinion is a crucial thing the user must be able to do in order to test this proposed new default. Moreover, I explained that the name is irrelevant, it could be pull.mode=ff-only, or pull.rebase=ff-only, or pull.ff=only, but one option should turn on the new behavior we are proposing to users, and currently no option does.

To cut to the chase I explained that we want:

  1. git pull to eventually fail in non-fast-forward cases
  2. A grace period before that switch is flipped

And so far no other proposal but mine achieved that.

Since Junio insisted pull.mode was not necessary (which I had already demonstrated multiple times it is), I asked him a poignant question to unequivocally demonstrate that:

So, after your hypothetical patch, there would be no difference between:

git -c pull.rebase=no -c pull.ff=only pull


git -c advice.pullnonff=false pull


Junio answered that both would error out, which is correct. But then he has to concede my point, because:

  • git -c pull.ff=only pull (fail)
  • git -c pull.ff=only pull --merge (fail)

If both fail we cannot tell the user:

Not a fast-forward, either merge or rebase.

Because the user would do git pull --merge and that command would fail as well because it will try to do git merge --ff-only and it’s not a fast-forward. That’s it. pull.ff=only can’t be used for this. Period.

Junio did not reply.

Part 2 was completely ignored.

When Junio sent his regular mail with the status of all patch series he didn’t include any of my proposals. Since Junio himself was the one that asked for Alex and me to provide our competing proposals I replied to him asking for the status of the ff-only topic. He didn’t reply.

Alex’s approach

Junio asked for both my approach and Alex’s approach which he sent in two patches. One changing the warning to state that pull.ff=only will be the new default, and the other actually changing the default.

Junio did not think the approach was correct, and neither did I. When I explained why, Junio mis-read my response, but this time he himself caught the mistake, and then said the following:

If we instead introduced a separate command, say “git update“, that is “fetch followed by rebase” (just like “git pull” is “fetch followed by merge”), to rebase your work on top of updated upstream, there wouldn’t be a need for us to be having this discussion.

Yes, if my suggestion to add git update was heard we wouldn’t be having this discussion (yet again).

But to move forward I argued that there’s three things to do:

  1. Improve the annoying warning
  2. Consider changing the semantics of --ff-only, or implement pull.mode=ff-only
  3. Consider a new “git update” command

Since my patches for pull.mode were not being reviewed, I suggested to focus only on #1 for now.

Jacob Keller chimed in and also agreed the order of the parents with git pull was a significant problem.

I also reminded Junio of the purpose of the patch, to which he replied:

Sorry, I know you keep repeating that “keep in mind”, but I do not quite see why anybody needs to keep that in mind. Has a concensus that the repurposed --ff-only should be the default been established?

This is the title of the patch from Alex we were supposed to be reviewing “pull: default pull.ff to "only" when pull.rebase is not set either“. The whole point is to make pull.ff=only the default, and in that case git pull --merge would fail, which is not what we want. Unless the semantics of pull.ff=only are changed, which is not my proposal, my proposal (pull.mode=ff-only) is better because it doesn’t need that change.

He did not reply again.

Alex stated that he had trouble following the discussion and simply left it at the table.

A fast-forward

In v3 I focused on fixing all the issues without committing to any particular approach. However, Junio objected to my use of the word “fast-forward”:

… I find the phrase “in a fast-forward way” a bit awkward. Perhaps use the ‘fast-forward’ as a verb

I argued that if fast-forward was a verb, we would have git fast-forward command (which I actually think is a good idea), but currently we don’t have it, so fast-forward is a modifier. In particular if git merge is a verb, then --ff is an adverb.

Elijah disagreed:

A square is a special type of a rectangle, but that doesn’t make “square” an adjective; both square and rectangle are nouns.

Since I’m deeply interested in language, this is a discussion I could really weigh in:

Words have multiple definitions. The word “square” is both a noun, and an adjective. It’s perfectly fine to say “square corner”.

Moreover, I found plenty of instances in the documentation which use fast-forward as an adverb. Even the git-merge man page uses “fast-forward merge”.

  • non-fast-forward update
  • fast-forward merges
  • fast-forward check
  • “fast-forward-ness”
  • fast-forward situation
  • fast-forward push
  • non-fast-forward pushes
  • non-fast-forward reference
  • fast-forward cases
  • “fast-forward” merge

This time I managed to convince him.

However it didn’t escape to me the fact that the documentation already talks of “fast-forward update”, so this suggests that the git update command could fill a whole already present in the documentation.

Everything at once

Since Junio had trouble figuring out my proposal in full, I decided to send the 19 patches together at once, and in addition I wrote a detailed explanation of what I was proposing to happen at every step of the way.

Is it more clear what is my proposal?

No reply.

Improve the default warning

Since I didn’t see much hope in fixing git pull‘s default, for v5 I decided to drop everything except what fixes the most urgent issue: the warning on every single pull.

Junio objected to the name of a variable I used: default_mode. He said the word “mode” was overused, and he proposed a “more focused” word, he came with rebase_unspecified. Additionally he didn’t want this variable to be global, even though there were already 38 global variables, so why did he feel that 39 was too much? I have no idea.

Moreover he made some comments on the test changes:

We are merely allowing fast-forward merges without unnecessary merge commits, but we are faced to merge c1 into c2, which is not ff. The command goes ahead and merges anyway, but why shouldn’t we be seeing the message? I am puzzled.

Now, for the most part I haven’t gone into code in this post, but I think it’s important you feel a little bit of my pain here.

git reset --hard c2 &&
test_config pull.ff true &&
git pull . c1 2>err &&
test_i18ngrep ! "Pulling without specifying how to reconcile" err

The name of the test gives a hint to what it’s trying to do “pull.rebase not set and pull.ff=true“. As you can see there’s no pull.rebase configuration (or --rebase), but there is pull.ff=true. So clearly it has something to do with that pull.ff.

What this test is doing is trying to pull c1 into c2, and since the history has diverged the merge cannot be a fast-forward, therefore the code should throw a warning…

except that only happens if the user has not specified any --ff* option (--ff, --no-ff, or --ff-only), either through to command line, or from the pull.ff configuration, and the test does set pull.ff=true. Therefore there’s no warning.

This is behavior Alex Henrie added and Junio accepted with no comment. In my opinion this behavior is wrong, but that’s what the code does at this point, and even Junio did not expect this, that’s why he was puzzled.

I explained to him that the code only checks for opt_ff, and I tried to fix that behavior in v1, v2, v3, and v4 of my patch series, but he did not comment on those patches.

So the tests used to do a fast-forward from c0 to c1, and that would normally throw a warning (as all pulls did), but because pull.ff=true is set, it doesn’t. This test written by Alex Henrie ensures that pull.ff=true doesn’t generate a warning. But after my patch doing a fast-forward from c0 to c1 would not generate a warning anyway, so the pull.ff=true doesn’t do anything, and therefore the test doesn’t do anything. That’s why I changed the test to start from c2, because then the merge is not fast-forward, and a warning is thrown unless pull.ff=true is set, which is what the test was supposed to be testing in the first place.

To show Junio that the tests are doing what they originally intended to do, I modified all the tests to remove the pull.ff configurations:

git reset --hard c2 &&
git pull . c1 2>err &&
test_i18ngrep ! "Pulling without specifying how to reconcile" err

Now the test fails.

Junio was not convinced. He stated that the original test was checking that the fast-forward would work fine, but that’s not true since from the test we can see that the only thing it’s doing is checking there’s no error message. The code could very well have done a real merge instead of a fast-forward, and the test would still pass. I explained to him that the tests that actually check for the fast-forward are in another file. This particular test is only about the warning message.

Junio claimed to know what the original author of the test intended to do and what he cared about, and that was to test that pulling c1 into c0 issued a message, except none of the tests actually did that, they checked the opposite: that the message is not there. So now that my patch skips the message, the tests pass, but for the wrong reason.

git reset --hard c0 &&
git pull . c1 2>err &&
test_i18ngrep ! "Pulling without specifying how to reconcile" err

If instead of starting from c2 we start from c0 as the original code, then the pull is fast forward, and there’s no message, so this test pass. But this test was supposed to be checking the output with pull.ff=true, that’s why it was called “pull.rebase not set and pull.ff=true“, if you remove the configuration it still passes. So in the fast-forward case what is this test supposed to be doing if it cannot possibly fail now? Nothing.

I explained that and in addition I said that I’m not in the habit of attempting to read minds, what the test are trying to do is very clear from the code.

Junio replied.

You do not have to read mind. What is written in the test is clear enough: run “git pull --rebase . c1” into c0 and see what happens.

Yes, that was precisely my point: no need to read minds (although my reading of the code is different).

Do you feel my pain? I don’t mind going as many rounds as necessary to improve a patch series, even ten or more, but these rounds are improving nothing. The name of the variable doesn’t matter, because in a few patches later I remove it, and even these tests for the warning don’t matter because eventually we want to remove the warning completely.

And this is only one test, of one the patches, of one of simplest patch series.

But fine, if Junio wants to keep a bunch of tests that literally test nothing, I’ll oblige. For v6 I left the useless tests untouched and added the real ones separately.

Junio isn’t entirely happy

Junio sent v7 himself, stating he wasn’t happy with the way the --ff* options were handled. He suggested that these options should imply a merge, but I showed situations in which that would become tricky.

Additionally he wrote a patch to remove my global variable, and seized the opportunity to state:

There is no reason to make the code deliberately worse by ignoring improvement offered on the list.

Really? I did not ignore the comments, I responded to all of them, but I disagreed with some. And so if I disagree with any of Junio’s suggestions for improvement I’m making the code “deliberately worse”? If I have to do what he says, then those are not suggestions, those are orders. Not to mention the v7 he is sending contains the patches I sent in v6, which have the useless tests–a change he suggested–and even though I was 100% against, I included it.

Anyway, since Junio didn’t want to discuss his proposal to imply merges any further, finally he merged this patch series.

After this patch the warning becomes much less annoying. So that’s progress.

What happened to the rest of the patches? Nothing.


After this latest attempt to fix git pull, I distanced myself from the project for a few months, and then came back. I decided to resend some of my patches, but now separately, to minimize the possibility of them being rejected.

  1. Cleanups
  2. Documentation
  3. --merge

Of these only #1 was merged (though not yet in “master”). #2 generated yet another discussion of the semantics of “fast-forward” but nobody was against the changes themselves. #3 everyone is in favor, but no comment from Junio.

If you think any of these will eventually be merged you are more hopeful than I am. Also, there’s this comment from Junio: (emphasis mine):

If some contributors get too annoying to be worth our time interacting with, it is OK to ignore them.

He seems to be referring to me.

So where does that leave us? The order of the parents is still wrong. And the default mode still isn’t fast-forward-only. Regardless of what you may think of my communication style, I’m the one that has actually sent patches to fix all these issues. And unlike Junio and many other git developers–I’m not being paid to work on this, I’m doing this altruistically (I don’t even use git pull myself). If I’m not the person to fix this, who is? I left the project for about 6 years and nobody fixed it.

Recently I challenged Alex Henrie to send a patch to try to fix the situation. He took me up on the challenge, and when I reviewed his patch he immediately realized that more work is needed (as I predicted he would find).

What to do now?

The final solution

While writing this post I realized just how many suggestions were proposed, but not implemented, so I decided to try once again to fix the situation, but now consider absolutely all the feedback in a holistic way.

Here’s there result.

git fast-forward

The documentation about what a fast-forward is, and how to attempt one, is scattered all over. Additionally the simplest way to attempt one is with git merge --ff-only which isn’t user-friendly, and some people don’t even consider a fast-forward to be a modifier of a merge (i.e. adverb).

By having git fast-forward command we solve both problems and fast-forward in the git context now becomes a verb: something independent of git merge.

Additionally there’s an advice for newcomers:

hint: Diverging branches can't be fast-forwarded, you need to either:
hint:   git merge
hint: or:
hint:   git rebase
hint: For more information check "git help fast-forward".
hint: Disable this message with "git config advice.diverging false"

Advanced users can disable this specific advice (or all advice).

As the advice says, anyone who wants to learn more about fast-forwards can simply do git help fast-forward.

git update

I converted my old tool from shell to C, and although it doesn’t have yet all the features, it does have plenty of functionality.

git update without arguments simply does git fetch + git fast-forward. If the fast-forward fails, we get the above advice. This solves the first problem that newcomers often do merges by mistake.

Also, it’s possible to do git update --merge (or configure update.mode=merge), and on those cases the order of the parents is reversed properly, as everyone wants (everyone that is not an maintainer).

There’s also git update --rebase, and also a per-branch a configuration branch.<name>.updateMode.


And of course all the other patches that improve git pull, including pull.mode=fast-forward, a better warning that shows that in the future git pull will do fast-forwards by default, along with a per-branch configuration branch.<name>.pullMode.

Also, thanks to git fast-forward we also get the same divergence advice, which will be permanent, but easy to turn off.

In total it’s 33 patches, and one additional to flip the default in the future (after users have had a chance to configure pull.mode to whatever they prefer).

The patches have been sent, but I wouldn’t hold my breath.

Sadly this is where the story ends, even though everyone has agreed on what is the path forward “for some reason” the patches will not be merged, and new users will be forever doomed to keep making mistakes because git pull was not designed for them.

Wait a second…

Git is a distributed version control system. You are not forced to use Junio’s repository, you can use mine. Everything I’ve talked about is fixed there. If you are interested why not give it a try?

Note: Some people are thinking that the purpose of this article is to point fingers, but that could not farther from the truth. The only reason names are named is to be able to follow the discussions (or at least attempt to). I am not seeking to assign any blame to any one person for any action or inaction. In any discussion disagreements are to be expected, and it doesn’t have to mean anything more than that.

Adventures with man color

As it’s usually the case with me, a simple fix sends me to an unending rabbit hole of complex issues. And this was no exception.

It all started when I tried to help the Git project to move towards asciidoctor, a program that generates documentation from text files using a markdown language. The initial project was asciidoc, but it’s in a bit of a rot. The original asciidoc is written in Python (a language I detest), and the new asciidoctor in Ruby (a language I love), so I clearly saw an edge.

The problem starts with a feature asciidoctor has, but not asciidoc: generate man pages. Of course asciidoc can generate man pages, but it does so by first generating a docbook XML, and then docbook stylesheets can be used to convert those to man pages. The same can be done with asciidoctor, but additionally there’s an option to generate man pages directly.

Using docbook is very slow; generating man pages directly is way faster.

When I sent the initial patch (very trivial), a Git developer mentioned some “groff issues”. Apparently if your system uses groff (GNU troff), you are supposed to build git with GNU_ROFF=1:

Define GNU_ROFF if your target system uses GNU groff. This forces apostrophes to be ASCII so that cut&pasting examples to the shell will work.

It’s actually not “GNU groff”, but “GNU troff”, so this option is wrongly named, but regardless; virtually nobody is using it.

My Arch Linux system doesn’t build git with GNU_ROFF, it does use groff, and yet I don’t see any issues with apostrophes… So what’s going on?

The context

The original commit “Quote ‘ as \(aq in manpages” comes from 2009, and mentions the problem comes from “docbook/xmlto”, and apparently only affects groff. In the original thread “quote in help code example” you can see they mention the output of the git filter-branch command specifically, but when I look at that man page, it looks fine.

So I started to dig in, and for starters I don’t even know what groff is.

I tried different versions of asciidoc, they worked fine. Then I tried different versions of docbook stylesheets, they worked fine. I even tried different versions of asciidoctor, to see when the fixed the issue; the all worked fine. Weird.

So I decided to compile groff myself and find the point where it started to fail. First I tried a version two years ago, and it did not compile correctly, so I had to do some hacking to make it compile, after I did, everything worked fine. I continued going more and more into the past, fixing the compilation issues, and not finding any problem. I reached 2006, and still did not see any change.

This was strange. Clearly at some point in time, with some combination of tools there was a problem, but I couldn’t find either of those.

Then I decided to manually modify a man page, and put quotes directly… Bingo. I could see that ' was actually rendered as `, but what caused it?

I then moved forward in versions and to my surprise they all had this issue, even the most recent version of groff.

What on Earth is going on here?

While doing this I noticed something different from my compiled version of groff, and Arch Linux’s version. In the compiled version of groff I saw links rendered as blue. When I saw the generated man pages I saw \m[blue], but I assumed that was for some other kind of troff program or something totally unrelated. But no, here I was seeing blue, but not with all the groff binaries.

So I tried to build groff with the same options as Arch Linux… Still blue. After trying a few things I eventually found a difference: Arch Linux installs a file /usr/share/groff/site-tmac/man.local, if I remove that file the blue color returns. Inside the file there’s:

\" Shut off SGR by default (groff colors)
\" Require GROFF_SGR envvar defined to turn it on
if '\V[GROFF_SGR]'' \
  output x X tty: sgr 0

That’s it! If you export GROFF_SGR=1 on Arch Linux, you see man pages with colors, just like my compiled version. The reason my compiled version does this by default is that it doesn’t have Arch Linux’s man.local file.


If you google GROFF_SGR you find that it’s not properly documented. Some distributions such as Debian and Arch Linux do disable groff’s colors, but they don’t document doing so. Debian “fixed it”. However, I don’t think most people are going to read the entirety of grotty’s man page, not even a little bit, so that doesn’t help, even if you are running Debian–where it’s documented.

However, if you read the man page, you will find another variable: GROFF_NO_SGR. Unlike GROFF_SGR, this one is standard, and it’s respected in all distributions.

This reminded me of trick I learned while reading Arch Linux’s installation guide:

man() {
    LESS_TERMCAP_md=$'\e[01;31m' \
    LESS_TERMCAP_me=$'\e[0m' \
    LESS_TERMCAP_so=$'\e[01;44;33m' \
    LESS_TERMCAP_se=$'\e[0m' \
    LESS_TERMCAP_us=$'\e[01;32m' \
    LESS_TERMCAP_ue=$'\e[0m' \
    command man "$@"

This code automatically converts parts of man pages to color (e.g. bold and underline), which looks much better than normal man pages, but it turns out it only works if you have groff’s SGR disabled, so… In other distributions you need to do GROFF_NO_SGR=1, for the above to work.

Cool. We found something.

Back to apostrophes

There’s something else in Arch Linux’s man.local:

char \' \N'39'

This converts \' to ', instead of groff’s default: \(aa (acute accent: ´). This is the reason why I could not reproduce the problem: Arch Linux was hiding it. However, this is the wrong way of fixing it. There is a reason groff developers decided to pick \(aa; they know better. Distribution packagers should not be overriding this.

Why did they do this? The change came due to task FS#9643 – man PKGBUILD shows slanted single quotes. This happened in 2008, which suggests there was indeed an issue around that time, and pacman documentation was built with asciidoc too.

Arch Linux fixed the issue in the wrong way, though. Debian chose a different path. In their bug report #507673 Shouldn’t parse ‘ to \’ they discussed the issue at length, and they correctly identified that the issue was in docbook-xsl (not groff), and if you are going to convert ' it should be to \(aq, but that would only work in groff. They also found that Pod::man had a portable solution:

.ie \n(.g .ds Aq (aq
.el .ds Aq '

This creates an alias: from Aq to (aq, but only when the program is groff, in all other programs it gets converted to '.

This is the correct solution in the correct layer, and generates the proper output everywhere.

But to check that it is the correct solution it would behoove us to understand what groff actually is. groff (or GNU troff) is a document formatting system; it receives text mixed with commands, and generates documents. A man page is just one of the many types of documents it can generate. It can for example generate a PDF.

So, let’s write a simple groff document:

single quotes: 'text'
single quoted quotes: \'text\'
apostrophe quote: \(aqtext\(aq

We can generate a PDF document using groff -T pdf test.groff > test.pdf. But this of course is not what we ultimately want, we want to generate a man page, in the same way as man does. To do that we need to specify the output device as utf8: groff -T utf8 test.groff > test.txt. This generates the following:

single quotes ': ’text’
single quoted quotes \': ´text´
apostrophe quote \(aq: 'text'

As you can see the output text is quite different from the input text; that’s what groff does. But this is only the case on a utf-8 system, if you specify the ascii output, then all the quotes above get translated to simply '.

The output with \(aq is correct in both utf-8 and ascii. And if we add the Aq alias:

apostrophe quote alias \(Aq: 'text'

That is indeed correct. The Debian fix seems to work. To make sure we would need a non-GNU troff, like in Solaris, but alas, I don’t have access to something like that, so I’m just going to assume it works in other systems (as other people reported it did).

This proper fix was eventually picked by docbook in 2010: Fixed bug #2412738 (apostrophe escaping) by applying the submitted patch. If you take a look at the code of git-filter-branch.1 you can see the fixed code in action:

git filter-branch --tree-filter *\(Aqrm filename*\(Aq HEAD

Therefore both Arch Linux’s and git’s workarounds are not necessary anymore. Yet they remain there ten years later.

Digging deeper

OK, so we found out what the issue was, and how it got fixed: in docbook, and also unnecessarily in git and Arch Linux (three different levels). But what caused it? Going back to groff from 2006 didn’t cause the issue, so what happened?

By looking back at docbook stylesheet’s history with git blame, I found out the commit that caused the issue: Reverted necessary escaping of backslash, dot, and dash. This commit happened in 2007, and it was made due to an internal limitation of docbook’s architecture.

So from 2007 to 2010 docbook stylesheets were generating wrong man pages, different projects worked-around the issue in different ways, but today–in 2021–these workarounds are not needed, and yet they remain in place.

Back to Git

After all this investigation I sent a patch to the Git project (doc: remove GNU troff workaround) to remove the GNU_ROFF option which clearly was not needed since at least ten years. But I also sent a comment about what I found regarding GROFF_SGR, and the trick to colorize man pages. In reply I received a suggestion to implement the LESS_TERMCAP trick into git help (which is basically an alias for man).

So I sent a patch (help: colorize man pages), and a big discussion propped up (typical due to the bike-shed effect). In that thread it was mentioned: “why not let the user configure man to do this?”. The problem is that you have too many moving parts; groff, man, git, less, distribution configurations, environment variables, aliases, workarounds, docbook and asciidoc bugs… And of course the thing that started it all: asciidoctor.

But it made me think: what is indeed the best way to configure man to do this?

After several days of investigation, and several days of trying options I arrived to what I think is the actual solution.


export MANPAGER="less -R --use-color -Dd+r -Du+b"
export MANROFFOPT="-c"

Unlike Arch Linux’s hack, the -D arguments to less are much more succinct, and they allow adding color (in addition to the style (e.g. underlined)), not removing information. So --color=d+r (long option for -D) converts d (Bold text) to r (red), and the + signifies add color (i.e. don’t remove the bold attribute). Moreover --use-color adds other colors to the less interface; the prompt is cyan, searches are in green, and warnings in yellow.

And instead of the the ugly GROFF_SGR=1, we can tell man to pass -c to groff.

So the full command is:

groff -T utf8 -m man -c git-filter-branch.1 | less -R --use-color -Dd+r -Du+b

No man involved. This is way simpler… Why is nobody using this?

After I updated my patch and other people tested it, it became clear it didn’t always work. In particular older versions of less did not have the -D options (at least not for Linux). So I checked the history of less and I found out that they enabled -D for Linux systems in 2021.

No wonder everyone is still using the LESS_TERMCAP_* variables. Nobody knows of the new option, because it’s too new!

So the patch to remove the GNU_ROFF option in Git (totally necessary in 2021) is there. And so are the updated Arch Linux instructions to use the new -D flags of less.

If you want to properly colorize man pages, you do this:

export MANPAGER="less -R --use-color -Dd+r -Du+b"
export MANROFFOPT="-c"

If you want to colorize other similar documentation (like Ruby’s documentation):

export RI_PAGER="less -R --use-color -Dd+r -Du+b"

And so on. And if you want less to format everthing:

export LESS='-RXF --use-color -Dd+r$Du+b'

That’s it. If you are running a recent enough version of less, everything works perfectly with a simple configuration.

Oh, also, I realized asciidoctor didn’t have the portable fix, so I sent them a patch that is now merged. I found an issue with less colors and searches that is fixed now. And I reported their unnecessary workaround to Arch Linux too.