Changes in income distribution in Mexico (2018-2020)

Gross Domestic Product is often the only metric people look at to measure a country’s economic progress, but any analysis at any level shows that it is not enough; even to size a pair of pants you need at least two numbers. To make matters worse, when this metric falls, many tend to blame a single person (the president), when it is the responsibility of the whole country and of countless external variables.

In Mexico’s case, three independent situations coincided in the period from 2018 to 2020: a global economic slowdown, a significant pandemic, and a dissident president.

My main argument against GDP, and changes in GDP, is that because of the inequality present in every country in the world, increases or decreases do not affect the whole population in the same way, even more so in Mexico, which is one of the most unequal countries in the world.

Thanks to the Encuesta Nacional de Ingresos y Gastos de los Hogares (ENIGH, the National Survey of Household Income and Expenditure) of the Instituto Nacional de Estadística y Geografía (INEGI) we can see how these situations have affected different groups within the country.

Global

Before looking at the changes specific to Mexico, we have to understand that there were global changes that affected every country. Globally, the COVID-19 pandemic caused world GDP to contract by 3.2%, so although Mexico’s GDP contracted 8.3% in 2020, relative to the rest of the world the contraction was 5.3%, and compared with other Latin American countries, whose contraction was 6.5%, it was 1.9% in relative terms.
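The relative figures above are consistent with dividing growth factors rather than subtracting percentages. The following is a minimal sketch under that assumption; the function name `relative_change` is made up for illustration and the article does not state the exact method used.

```python
def relative_change(country_pct: float, benchmark_pct: float) -> float:
    """Change of a country relative to a benchmark, in percent.

    Both arguments are percentage changes (negative for a contraction).
    Assumption: "relative" means the ratio of the two growth factors.
    """
    country_factor = 1 + country_pct / 100
    benchmark_factor = 1 + benchmark_pct / 100
    return (country_factor / benchmark_factor - 1) * 100


mexico_vs_world = relative_change(-8.3, -3.2)
mexico_vs_latam = relative_change(-8.3, -6.5)

print(f"{mexico_vs_world:.1f}%")  # -5.3%
print(f"{mexico_vs_latam:.1f}%")  # -1.9%
```

A plain subtraction (8.3 - 3.2 = 5.1) gives a slightly different number, which is why the ratio is used in this sketch.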

A 1.9% contraction is still not desirable, but how much of it is Mexico’s fault, and how much is due to external circumstances? Nobody can know.

GDP changes in several Latin American countries (2021 estimated)

A common argument is that Mexico’s problems were already visible in 2019, but by 2019 there was already a global slowdown: global GDP growth was only 1.7%, and in Latin America it was -0.2%. Mexico followed exactly the same pattern as the region: -0.2%.

Source: World Bank – June 2021 Global Economic Prospects.

Inequality and average

Most people don’t know the difference between mean and median, which normally isn’t a problem because many distributions are normal, or symmetric, and the mean and the median are the same. That is not the case with income, however, which is distributed unequally.

For example, if 10 people each have an income of $10,000, the average (mean) is $10,000, and the median is also $10,000. That doesn’t change if one person receives $9,000 and another $11,000.

However, if 10 people have an income of $10,000 and one person has $100,000, the average is about $18,182 but the median is still $10,000. To find the median you sort the 11 people from lowest to highest, and the person in the middle is the median. That is, half of the people have a lower income, and the other half a higher one.

What happens if the income of the person receiving $100,000 is reduced? Assuming it drops to $80,000, the average falls to about $16,364, but the median doesn’t: it is still $10,000. In this case the income of the other 10 people doesn’t change in the slightest, even though the average changed significantly.
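The eleven-person example can be checked with Python’s standard `statistics` module. This is only a sketch of the arithmetic; the variable names are arbitrary.

```python
from statistics import mean, median

# Ten people earning $10,000 and one person earning $100,000.
incomes = [10_000] * 10 + [100_000]
mean_before = mean(incomes)
median_before = median(incomes)
print(round(mean_before), median_before)  # 18182 10000

# The top earner drops to $80,000; nobody else changes.
incomes[-1] = 80_000
mean_after = mean(incomes)
median_after = median(incomes)
print(round(mean_after), median_after)    # 16364 10000
```

The mean moves by roughly 10% while the median, and the income of ten of the eleven people, stays exactly where it was.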

That is what happens in real distributions: a 5% contraction of GDP does not translate into a 5% decrease in income for everyone.

Quantiles

The median divides a population into two groups: those with a lower income and those with a higher income. But the population can be divided into more groups, for example with quartiles. The median is 1 number that divides the population into 2 groups; the quartiles are 3 numbers that divide the population into 4 groups. The first quartile marks off the 25% of the population that receives the least income, so 75% receive more.

If we divide the population into 10 groups we need 9 numbers, and those numbers are called deciles. For example, if the first decile is $5,000, that means 10% of the population has a lower income and 90% a higher one. If the last decile is $30,000, that means 90% of the population has a lower income and 10% a higher one.

Nine deciles that divide the population into ten groups
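To make the counting concrete (3 quartiles for 4 groups, 9 deciles for 10 groups), here is a small sketch using `statistics.quantiles` from the Python standard library. The income list is invented purely for illustration and has nothing to do with the ENIGH data.

```python
from statistics import median, quantiles

# Hypothetical monthly household incomes, invented for illustration only.
incomes = [2_500, 4_000, 5_500, 7_000, 8_500, 10_000, 12_000, 13_500,
           15_000, 17_500, 20_000, 24_000, 29_000, 36_000, 48_000, 90_000]

quartiles = quantiles(incomes, n=4)   # 3 cut points -> 4 groups
deciles = quantiles(incomes, n=10)    # 9 cut points -> 10 groups

print(len(quartiles), len(deciles))     # 3 9
print(quartiles[1] == median(incomes))  # True: the second quartile is the median
```

Passing `n=100` would return the 99 centiles in the same way, although a sample this small is far too coarse for that to be meaningful.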

It is possible to divide the population with more granularity, for example with centiles:

99 centiles that divide the population into 100 groups

The more granularity, the more visible the inequality, especially in the groups with the highest incomes. In this case the 1% centile is $2,074, while the 99% centile is $83,616. The coarser quantiles are still visible: for example decile 9 ($33,596) is the 90% centile, and the median ($12,803) is the 50% centile.

Splitting the group above decile 9 with 9 more centiles, we can see that the difference between the 90% ($33,596) and the 99% ($83,616) is quite significant, and the granularity doesn’t end there, since the top 1% can in turn be divided further: the top 10% of the top 1% (99.9%) starts at $206,082. And so on.

Income distribution is like a fractal: you can always zoom in further, and the more you zoom in the more inequality you will see (until you reach the level of individuals). No chart and no metric can show you the true situation; for that you need continuous formulas, but that is beyond the scope of this article. Here we will only explore the changes that the different quantiles went through, which is inevitably a limited analysis.

Survey

INEGI conducts a survey to determine statistically the income and expenses of the population. It is carried out throughout the country, randomly and anonymously, every two years. The latest survey was conducted in 2020 and has been the largest to date, interviewing more than 89,000 households.

I have been analyzing these surveys in my free time for many years, and since I’m a programmer I have been developing tools that let me speed up these analyses using the raw data. The results from my tools agree with the results INEGI publishes, but I also make more granular calculations and certain corrections.

For example, INEGI calculates the Gini index using deciles and the result is 41.48, but using centiles the result is 42.56, and using the whole sample it is 42.60. For more details about the discrepancies I wrote another blog post. The more granularity, the higher the index, and that means more inequality.
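The effect of granularity on the Gini index can be illustrated with synthetic data. The sketch below is not the author’s tooling and does not use ENIGH data: it draws a hypothetical log-normal sample, ignores survey weights (an assumption; the real survey has expansion factors), and compares the index computed from decile averages, centile averages, and the full sample.

```python
import random


def gini(values):
    """Gini index (0-100) of non-negative values, unweighted."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    # Rank-weighted sum; equivalent to the area between the Lorenz curve
    # and the line of perfect equality.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 100 * (2 * weighted / (n * total) - (n + 1) / n)


def grouped_gini(values, groups):
    """Gini index computed only from the mean of equal-sized groups."""
    xs = sorted(values)
    size = len(xs) // groups
    means = [sum(xs[i * size:(i + 1) * size]) / size for i in range(groups)]
    return gini(means)


random.seed(0)
# Hypothetical incomes; the log-normal parameters are arbitrary.
sample = [random.lognormvariate(9.5, 0.8) for _ in range(10_000)]

by_deciles = grouped_gini(sample, 10)
by_centiles = grouped_gini(sample, 100)
full = gini(sample)
print(f"deciles={by_deciles:.2f} centiles={by_centiles:.2f} full={full:.2f}")
```

Averaging within a group hides the inequality inside that group, so the coarser the grouping, the lower the resulting index; the full sample gives the highest value of the three.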

Another change is that I adjust for inflation. For example, INEGI reports an average monthly household income of $16,770, but that is at 2020 prices; adjusting for 4.87% inflation up to today (July 2021), it is $17,586.
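The adjustment is a single multiplication. A minimal sketch, assuming the 4.87% figure is applied as one cumulative factor (the helper name is hypothetical):

```python
def adjust_for_inflation(amount: float, inflation_pct: float) -> float:
    """Express an amount at later prices, given cumulative inflation in percent."""
    return amount * (1 + inflation_pct / 100)


adjusted = adjust_for_inflation(16_770, 4.87)
print(f"${adjusted:,.2f}")  # $17,586.70
```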

Unfortunately INEGI does not report the most important metric: the median, which is $12,802 (already adjusted for inflation). That is, in 2020 half of the households in Mexico received an income below $12,802, and half received more.

Another problem is that INEGI does not report the true deciles, but the average per decile group. The first decile (bottom 10%) is $4,878, which means that 10% of households in Mexico have a lower income, but as we already saw, inequality doesn’t end there: there are households with much less income, which is why the average of the poorest 10% of households is $3,474. That number doesn’t help us much.

If you want to know whether your household is among the richest or the poorest you need the median (decile 5), but INEGI gives you ten groups, not nine cut points, so the median falls between group 5 ($11,665) and group 6 ($14,020), and neither of those two groups is of any use to you.

The reason it is better to use households rather than persons (per capita) is the concept called economies of scale. It is more convenient to live with other people and share expenses, for example buying a single microwave that two people use, or cooking once for four people instead of four people each cooking their own meal. An income of $20,000 in a household of 4 people does not literally translate into an income of $5,000 per person, especially if two of them are children.

The two charts above, of deciles and centiles, show monthly household income in 2020 at 2021 prices. They are not precise numbers, since they depend on the randomness of the sample and the accuracy of the survey answers. But it is the closest thing we have to reality.

Here you can see INEGI’s presentation of the results.

Changes

From 2018 to 2020 Mexico lost 8.5% of its GDP, but what does that translate to in real terms for the people who live in the country?

We already saw that the top 10% received an income of $33,595 in 2020, but in 2018 it was $35,178 (at 2021 prices), that is, a decrease of 4.5%. But it is not the same in every decile: the bottom 10% went down from $4,891 to $4,878, that is, only 0.26% ($13). And that’s not all: as we already saw, not everyone in the bottom 10% is the same, they in turn have their own inequality. Their average went from $3,388 to $3,474, a decrease of -2.53%… That is not a decrease: it went up.
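These comparisons are plain percentage changes between the 2018 and 2020 figures. A short sketch, using the numbers quoted above and a hypothetical helper name, shows how the sign flips for the average of the poorest 10%:

```python
def pct_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after` (negative means a drop)."""
    return (after - before) / before * 100


top_decile = pct_change(35_178, 33_595)      # decile 9 cut point
bottom_decile = pct_change(4_891, 4_878)     # decile 1 cut point
bottom_average = pct_change(3_388, 3_474)    # average of the poorest 10%

print(f"{top_decile:+.1f}%")      # -4.5%
print(f"{bottom_decile:+.1f}%")   # -0.3%
print(f"{bottom_average:+.1f}%")  # +2.5%
```

The output is rounded to one decimal, so it can differ in the last digit from figures quoted with two decimals.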

How is it possible that with an 8.5% drop in GDP the average income of the poorest households increased?

Social policy.

Thanks to the policies of Andrés Manuel López Obrador, Mexico’s Gini index fell from 43.78 to 42.60. That means inequality decreased. It is hard to grasp what that means in real terms, but it is something the poorest people do notice.

But as we saw before, the analysis doesn’t end there, because there is always more granularity:

The poorest centiles saw their incomes rise by 8%, while the richest saw a decrease of 13%, from $165,284 to $143,846.

Is anyone still surprised that the ultra-rich class is against Obrador?

The average fell from $18,445 to $17,586 (-4.66%), but as already explained, that does not represent what most Mexicans experienced. The median fell from $13,201 to $12,802 (-3.02%), which is more representative, but still doesn’t paint the full picture.

It is worth mentioning that this is during a period of crisis. GDP is projected to grow 5% in 2021, and just as the top 1% suffered disproportionate losses (poor things, how did they survive on only $144K a month?), they will enjoy disproportionate gains (although probably not the same ones they would have enjoyed under a neoliberal administration).

So no, the neoliberal myth that a rising tide lifts all boats is simply false. Each boat rises or falls independently, and even when all the boats rise, they don’t all do so in the same way.

More differences

The differences between 2018 and 2020 are not limited to the quantiles; they also show up in the type of income, the type of locality, and more. For example, income from retirement, pensions, scholarships and government benefits increased significantly, while income from rents decreased.

However, the most drastic change is that urban localities experienced a decrease of 8.0%, while rural localities saw an increase of 3.6%. This can be broken down even further, because rural localities can also be divided by deciles, and their bottom 10% experienced an increase of 16.6%.

The differences don’t end there. We can also see that people who studied up to secondary school saw their income increase, while everyone else saw a decrease.

There are also differences by state. CDMX, Baja California Sur, Jalisco and Quintana Roo experienced a drastic decrease, while Baja California, Chihuahua, Durango and Zacatecas saw increases.

Anyone who tries to paint the country’s economic situation with a broad brush is going to be imprecise, because if there is something that defines the country it is its cultural richness, variation, and diversity. Averages are simple, but imprecise.

Conclusion

Did Mexico do well or badly in the period from 2018 to 2020? If the options are black or white, the answer is badly. But once we start looking at nuances we can see that it didn’t do as badly as Argentina, and did just as badly as Ecuador. Even though it did badly, it didn’t do that badly, especially compared with the rest of the region. When we look at shades of gray we can see that in relative terms it only did a little badly, and if we consider the global slowdown before the pandemic, and the economic crisis caused by the pandemic, that is not necessarily the country’s fault.

Did all Mexicans do equally badly? If we look at a single number like GDP, by definition there is no other answer than yes, but GDP is a crude metric that doesn’t paint the full picture. Considering the inequality that already existed, and the distribution of income, we can see that the correct answer is no: the rich did badly, but the poor did well, or at least not as badly. It also depends on the state, the locality, the type of locality, and even the level of education.

We all live in a bubble. If you are reading this article you are probably in the bubble equivalent to the top 10%, which is a household with an income of $34K a month; your role models are the top 1% that receives $84K, and you are being manipulated by the top 0.1% that receives more than $206K. But 90% of the people in Mexico are in different bubbles where the result of the various crises hasn’t been that bad (-3%), despite a global economic crisis. That is why presidential approval is still at 61%, above Peña Nieto, Vicente Fox, and Ernesto Zedillo at the same point in their presidencies.

Using the 8.5% drop in GDP in the first two years of Obrador’s administration as the basis for evaluating performance and determining what will happen in the following 4 years is a shoddy analysis that has absolutely nothing to do with the experience of the great majority of the population in Mexico, and it ignores a global situation that is not only going to change in the future, but has already changed.

How not to ban a prolific git developer

On the 28th of July 2021 I received an email from Git’s Project Leadership Committee saying that my recent behavior on the mailing list was found to be in violation of the Code of Conduct. This was a complete surprise to me.

What was particularly surprising is that I’m very familiar with the Code of Conduct: not only have I reported CoC violations in the past, but I’m the only person who has sent a patch to try to improve it, both to the git mailing list and to the upstream project, Contributor Covenant. Because of this I know the document doesn’t demand the examples of positive behavior in the section titled “Our Standards”, so the only objective reason why a project leader could argue that I violated the CoC would be if I had committed an action on the list of examples of unacceptable behavior:

  • The use of sexualized language or imagery, and sexual attention or advances of any kind
  • Trolling, insulting or derogatory comments, and personal or political attacks
  • Public or private harassment
  • Publishing others’ private information, such as a physical or email address, without their explicit permission
  • Other conduct which could reasonably be considered inappropriate in a professional setting

Did I do any of those things? You will be the judge.

If you don’t like drama then don’t read this post. Although there’s some technical discussion, it’s mostly an analysis of my alleged CoC violations and their context. According to the Project Leadership Committee the following ten emails–which I will list as exhibits–violated the CoC.

Exhibit 1

This is a common debate tactic known as “shifting the burden of proof”.

Ævar does not need to prove that your patch is undesirable, you have to prove that it is desirable.

You have the burden of proof, so you should answer the question.

https://www.logicallyfallacious.com/logicalfallacies/Shifting-of-the-Burden-of-Proof

Felipe Contreras

In this mail I am telling Johannes Schindelin my opinion about what is the default position regarding any patch: the patch is not needed. This is not even contentious. The first thing Junio C Hamano (the maintainer) does is ask: why is this patch needed? If this question is not answered to his satisfaction the patch is rejected. And I’m not the only one that has argued precisely this point:

I don’t need that data. You are proposing a change so it is your duty to support your claim that the change is worthwhile.
Otherwise it’s a change just for the sake of change.

Michal Suchánek

How is my comment remotely close to any of the examples of unacceptable behavior? For that matter, isn’t Michal’s behavior worse? Not to mention that it’s a response to Johannes’s comment, which is objectively way more aggressive:

> Why put this in an ifdef?

Why not? What benefit does this question bring to improving this patch series?

Johannes Schindelin

Johannes is avoiding a well-intentioned question by doubting the good faith of Ævar, and it became even clearer later on in the thread:

You still misunderstand. This is not about any “opinion” of yours, it is about your delay tactics to make it deliberately difficult to finish this patch series, by raising the bar beyond what is reasonable for a single patch series.

And you keep doing it. I would appreciate if you just stopped with all those tangents and long and many replies that do not seem designed to help the patch series stabilize, but do the opposite.

Johannes Schindelin

How am I the bad guy here? And for the record, Ævar is part of the Project Leadership Committee.

Exhibit 2

This is loaded language. You are inserting your opinion into the text.

Don’t. The guidelines are not a place to win arguments.

Note that this sounds ungrammatical and unnatural to some people.

And it sounds ungrammatical because it is ungramatical, not only to native English speakers, but professional linguists.

Felipe Contreras

This is part of a long discussion in which Derrick Stolee attempted to change the guidelines to explicitly avoid any and all gendered language (he/she) for ungendered (they).

One argument that Derrick kept repeating is that only non-native English speakers find the singular “they” ungrammatical, but I kept repeating that’s not true, and as an example I used the test the American Heritage Dictionary used in their usage note regarding “they”:

We thank the anonymous reviewer for their helpful comments.

58% of the panel found the sentence unacceptable. The Usage Panel is comprised of writers, professors, linguists, editors, and multiple Pulitzer Prize winners.

Derrick ignored all my feedback, and for the record he also ignored all the feedback from Ævar, and in fact native speakers in the mailing list stated that they found these sentences ungrammatical too.

Regardless of which side you are on, Derrick tried to win the argument by adding this to the guidelines:

Note that this sounds ungrammatical and unnatural to readers who learned English in a way that dictated “they” as always plural, especially those who learned English as a second language.

All I said is that he shouldn’t insert his opinion into the guideline, and instead should use something neutral he knows everyone can agree with:

Note that this sounds ungrammatical and unnatural to some people.

Once again, how is stating my opinion “unacceptable”?

Exhibit 3

Yeah, now you are starting to see the problem.

How many more failed attempts do you need to go through before accepting that the approach you thought was feasible is in fact not feasible?

The solution is simple and self-documenting:

pull.mode={fast-forward,merge,rebase}

Felipe Contreras

In my 8,200-word article git update: the odyssey for a sensible git pull I explored around 13 years of discussions in the mailing list regarding problems with git pull, and the first and most obvious solution is my proposed pull.mode configuration, which was a direct competitor to different solutions proposed by Junio C Hamano.

For some reason Junio didn’t accept this proposal in 2013 even though other people were in favor of it. I tried again in 2020 multiple times, and after trying all the approaches other people proposed I became convinced it was the only feasible solution.

Elijah Newren–who we will later see is an important actor in this story–decided to try to fix the problem by himself, but every patch series he sent hit a dead end. It became clear to me that Elijah started to see the problem:

However, even if the above table is filled out, it may be complicated enough that I’m at a bit of a loss about how to update the documentation to explain it short of including the table in the documentation.

Elijah Newren

Yes, that’s precisely the reason why I opted for a solution that was a) easy to understand, b) easy to document, c) easy to test, and d) easy to program: pull.mode.

Elijah ignored all my feedback to all his patch series, which is why he still hasn’t realized his patches will break behavior current users rely on. I ran a poll on r/git and 19% of users responded that they rely on the behavior Elijah plans to break, and 15% said even though they don’t use it, they know what it should do.

All Elijah is achieving by ignoring me is at best wasting time, and at worst hurting users. Either way it’s not my fault.

How is my comment unacceptable? I’m just stating facts and my opinion. Even if my opinion is wrong that doesn’t make my comment “unacceptable”.

Exhibit 4

That is a problem specific for your shop.

The defaults are meant for the majority of users. If a minority of users (who happen to be working under the same umbrella) have a problem with the defaults, they can change the defaults.

Felipe Contreras

I understand why some people might think this is an attack on Randall S. Becker, but it’s really not.

Randall is a contributor who often sends reports when the latest git version breaks something in his platform: NonStop OS. I think it’s fair to say this is a relatively obscure OS. An even smaller subset is the people who work with git inside his company.

Now, of course git should consider the NonStop platform, and of course it should consider all the users inside Randall’s company, but they are probably not even 0.01% of all git users, so why should 99.99% of users suffer at their expense?

No. The defaults are for 99.99% of users. If people in Randall’s company have a problem with the defaults and want to avoid git rebase like the plague, they can configure git any way they want. That’s what the configurations are for: for the minority.

My opinion is that the defaults are for the majority.

If anyone has a problem with my opinion, we can debate it, but how is stating my opinion “unacceptable behavior”?

Exhibit 5

I’m sending this stub series because 1. it’s still in ‘seen’ [1], 2. my conflicting version is not in ‘seen’ [2], and 3. brian has not responded to my constructive criticism of his version [3].

Felipe Contreras

In the cover letter of my patch series: doc: asciidoctor: direct man page creation and fixes (brian’s version), I explained the reasons why I was sending it, but mainly it’s because Junio continued to refuse to drop brian’s version and include mine, even though brian himself told Junio to drop his version.

Now, if you look at my comments on the patches you could say I trashed them, but this is not my fault: brian stated plenty of things that are just not true. Just on the first patch:

  • We generally require Asciidoctor 1.5, but versions before 1.5.3 didn’t contain proper handling of the apostrophe… Not true
  • [GNU_ROFF] for the DocBook toolchain, as well as newer versions of Asciidoctor, makes groff output an ASCII apostrophe instead of a Unicode apostrophe in text… Not true
  • These newer versions of Asciidoctor (1.5.3 and above) detect groff and do the right thing in all cases… Not true
  • Because Asciidoctor versions before 2.0 had a few problems with man page output… Not true

The changes were correct, but I was the one that originally wrote the patch, brian merely changed the commit message and took authorship because I objected to his text.

I understand that some people have trouble hearing that they are wrong, but this is not my problem. If you send a patch to the mailing list you should be prepared to hear all the ways in which it’s wrong. And brian m. carlson is not some rookie: he is the #5 contributor to the Git project in the past five years, so he shouldn’t need training wheels.

I do not blame brian for all these inaccuracies though: there are way too many details in the documentation toolchain, and the only reason why I know he was wrong is that I spent several man-days investigating these details. I explained part of the story in my post: Adventures with man color.

Moreover, if you think I’m lacking tact, remember that this is the second time I’m bringing up these issues; the first time brian argued against all my suggestions and did not implement a single one. Eventually he just stopped responding to me.

Now, regarding integration: the “seen” branch is an integration branch which is maintained by Junio and contains all the topic branches that are currently being discussed and Junio is considering merging. Junio picked brian’s branch even though it was full of inaccuracies and in my opinion the commits were not properly split (each commit was doing four to five different things at once).

On the other hand my commit message did not contain inaccuracies, I wrote the original patch, my patches were properly split, contained patches from other people–including Jeff King and Martin Ågren–and in addition contained plenty of cleanups and more fixes to the output of the documentation.

If that was not enough, just the commit message of the first patch (doc: remove GNU troff workaround) took me several hours of investigation to write.

It’s not my fault that brian decided to stop arguing any further, it’s not my fault that Junio decided to carry brian’s version for two months and ignore my version which was objectively superior. Those are the facts, and I was merely stating them.

Suppose that I’m wrong, let’s suppose that brian’s version is superior, in that case my statement of fact is incorrect. OK, but how is that “unacceptable behavior”?

Exhibit 6

I meant that I meant what he said I meant.

Felipe Contreras

I understand how this clarification I sent to Junio might look to some people, but it’s simply a convoluted way of saying “what he said”. SZEDER Gábor said “I think you meant X”, I replied “yes, I meant Y, which in practical terms means what you said (X)”, and Junio (who has a habit of rewriting what I write) asked if my previous statement said “yes, yours is better and I’ll use it in an update, thanks”, but I don’t agree with Junio’s restatement.

I meant Y, which in practical terms means what SZEDER said I meant (X). So I meant either of these:

  • Y: Otherwise commands like ‘for-each-ref’ are not completed correctly by __gitcomp_builtin.
  • X: Otherwise options of commands like ‘for-each-ref’ are not completed.

Which one of these is really “better”? I don’t know, and to be honest I don’t care. I’ve been sending this patch for nine months now and in truth the original “Otherwise commands like ‘for-each-ref’ are not completed” is good enough for me.

The important thing is the fix, and the code continues to be broken to this day.

Additionally, it’s not uncommon for Junio to update the commit messages himself, in fact he did so for my last patch (doc: pull: fix rebase=false documentation), even though I sent an updated commit message he ignored my suggested modification and used his own. So why can’t he simply do the same here with a simple “s/commands/options of commands/” as I suggested?

To me this simply looks like an excuse not to merge the series, which I’ve sent eleven times already.

To avoid problems I re-sent the patch series nine minutes after Junio sent his message, and I used SZEDER’s suggestion. That was on June 8, and to this day Junio keeps repeating this on his status mails:

* fc/completion-updates (2021-06-07) 4 commits
 - completion: bash: add correct suffix in variables
 - completion: bash: fix for multiple dash commands
 - completion: bash: fix for suboptions with value
 - completion: bash: fix prefix detection in branch.*

 Command line completion updates.

 Expecting a reroll.
 cf. <60be6f7fa4435_db80d208f2@natae.notmuch>

I’ve already replied multiple times that I did the reroll immediately after, and I’ve re-sent the series four times since then.

Now, I understand if Junio took my reply in a way that I did not intend, but why should git users suffer? The problems these patches fix are real. SZEDER Gábor has already reviewed part of the series, and David Aguilar has tested it and he confirmed the problems exist, and the fixes work.

I have plenty more fixes on top of these (41). The reason why I kept this series small was to maximize the possibility of them getting merged, and now it turns out the reason Junio hasn’t merged them is because he didn’t like one comment I made?

To me this seems like pettiness. We are adults: if he found one of my replies objectionable, he could have simply stated so on the mailing list, or he could have sent me a personal reply (he has never done so). He could have said “I don’t like your tone, so I’m going to drop this series”, but instead he made me waste my time resending patches he was never going to merge. He kept his disapproval to himself, and only used it as ammunition to justify a future ban.

The slowness from Junio to accept these and other patches is why I chose to start the git-completion project, which is a fork of the bash and zsh completion stuff. For the record I was the one that started the zsh completion, and I started the bash completion tests in order for the completion stuff to be first-class citizens of the git project, which Junio has refused to accept as well.

Choosing to knowingly hurt git users because he didn’t find one comment palatable does not seem to me to be a behavior fit for a project leader.

I foresee that some people will conclude that I’m being petty too, but I don’t think that’s the case. I found the problems, I wrote the patches, I sent the patches, I addressed the feedback, and I updated the patches with the feedback. I’ve been trying to get them merged for nine months. What more do you want? If Junio has any further problems with the patches, he can just let me know and I’ll address them. But instead he says nothing.

To exemplify even more how Junio’s pettiness is hurting users, Harrison McCullough reported a regression with the __git_complete helper (which I wrote) on June 16, and it was caused by a change from Denton Liu which introduced a variable __git_cmd_idx, but he forgot to initialize it in __git_complete. Initially I fixed the problem by initializing __git_cmd_idx to 1, and Harrison reported that my fix indeed got rid of the issue. Two days later Fabian Wermelinger reported the same issue but sent a patch that initialized __git_cmd_idx to 0. Initially I thought he made a mistake, but upon further reflection I realized that 0 was more correct, so I updated my patch.

My patch is superior because it fixes the regression not only for bash, but for zsh too. In addition it mentions that Harrison McCullough reported the issue, and it’s also simpler and more maintainable. Junio picked Fabian’s patch and ignored my patch, and my feedback, therefore the regression is still present for zsh users. This is a fact.

Ignoring me is objectively hurting users. I’ve resent my fix on top of Fabian’s patch, so all Junio has to do to fix the regression is pick it.

How does the fact that Junio is primed against me make my convoluted statement “unacceptable behavior”?

Exhibit 7

That makes me think we might want a converter that translates (local)main -> (remote)master, and (remote)master -> (local)main everywhere, so if your eyes have trouble seeing one, you can configure git to simply see the other… Without bothering the rest of the world.

Felipe Contreras

Once again I can see how people might misinterpret what I said here, but it’s nothing nefarious.

The rename of the “master” branch has been one of the most hotly debated topics of late (this has nothing to do with me as I didn’t even participate in the discussion). Even today it’s not entirely clear what’s the future of this proposal. If you are not familiar with the debate, you can read my blog post: Why renaming Git’s master branch is a terrible idea.

But what I attempted to do was achieve what the original poster–Antoine Beaupré–wanted to achieve but in another way. I listened to Antoine, and I proposed a different solution, that’s all.

What Antoine wanted (as I understand it) was to change all his “master” branches to “main” so that he didn’t have to see “master” everywhere. But he wanted to do this properly, so he wrote a python script to address as many renaming issues as he could.

I see some value in what Antoine wanted to do, but he literally said: “I am tired of seeing the name ‘master’ everywhere”. If that was literally the problem, then a mapping of branch names would fix it. And this is not even that foreign to me.

When I wrote git-remote-hg and git-remote-bzr one of the main features people wanted was a way to map branch names. So for example a branch named “default” in Mercurial could be mapped to “master” in Git (this was done by default, but you get the point).

Even if this didn’t help Antoine, I thought it would help other people. Say, for example, that some Linux developers found the name “master” offensive but couldn’t manage to convince Linus Torvalds to rename the master branch to “main”. Well, with my patch they wouldn’t have to convince Linus; they could simply configure a branch mapping so they “never had to see the name ‘master’”.

After reading my reply Antoine took a more adult approach than Junio, and actually voiced his discontent with my comment:

I guess that I’ll take that as a “no, not welcome here” and move on…

Antoine Beaupré

This saddened me. It was never my intention to shit on Antoine’s idea, and in fact I never intended to act as a representative of the Git community, so I explained very clearly to Antoine that he shouldn’t just give up:

Do not take my response as representative of the views of the community.

I do believe there’s value in your patch, I’m just not personally interested in exploring it. I don’t see much value in renaming branches, especially on a distributed SCM where the names are replicated in dozens, potentially thousands of repositories. But that’s just me.

Felipe Contreras

However, brian m. carlson either didn’t read my response, or chose to not factor it in, because he replied:

There is a difference between being firm and steadfast, such as when responding to someone who repeatedly advocates an inadvisable technical approach, and being rude and sarcastic, especially to someone who is genuinely trying to improve things, and I think this crosses the line.

brian m. carlson

I wasn’t trying to be “rude and sarcastic”, I was simply trying to suggest a different approach. It didn’t even need to be a competing approach, because both approaches could be implemented at the same time. I explained that to brian. Did he respond back? No.

Now, even if you remove me from the picture nobody else responded to Antoine, so how exactly did my responses hinder in any way the community?

If you think my response was “rude and sarcastic” I would love to debate that, but the fact of the matter is that only I know what my intentions were, and they were definitely not that.

Moreover, I don’t believe in being offended by proxy. If Antoine Beaupré had a problem with my comment, I would listen to his objection, and even though I think I have already clarified what I meant, I would even consider apologizing to him. But I will not apologize to brian m. carlson who I’m pretty sure got offended by proxy.

Also, for the record, reductio ad absurdum arguments are not uncommon in the Git mailing list, and they don’t necessarily imply anything nefarious. Here’s one recent example from Junio:

I am somewhat puzzled. What does “can imagine” exactly mean and justify this change? A script author may imagine “git cat-file” can be expected to meow, but the command actually does not meow and end up disappointing the author, but that wouldn’t justify a rename of “cat-file” to something else.

Junio C Hamano

Can Junio’s response be considered “rude and sarcastic”? Yes. But you can also assume good faith and presume he didn’t intend to offend anyone and simply tried to prove a point using a ridiculous example.

Can my comment be considered “unacceptable behavior”? In this case I’d say yes, but not necessarily so. You can also give me the benefit of the doubt and simply not assume bad faith (and I can tell you no bad faith was intended).

Elijah Newren

So far the incidents have been pretty sporadic and could easily be reduced to misunderstandings, but the following are not. Elijah Newren decided to wage a personal vendetta against me (literally), and that context is necessary to understand the rest of the evidence.

Spring cleanup challenge

It all started when I sent my git spring cleanup challenge in which I invited git developers to get rid of their carefully crafted git configuration for a month. The idea was to force ourselves to experience git as a newcomer does, and figure out which are the most important configurations we rely on.

This challenge was a success and very quickly we figured out at least two configurations virtually every experienced git developer enables: merge.conflictstyle=diff3 and rerere.enabled=true. Additionally my contention is that git should also have default aliases (e.g. br => branch), but there was no clear consensus on that.
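As a rough illustration, this is how those two settings and one alias could be enabled. This is only a sketch: it uses a throwaway repository created with mktemp (an assumption made for the example) so that nothing in your real configuration is touched. Adding --global to each git config call would apply the settings to every repository instead.

```shell
#!/bin/sh
# Scratch repository so the example does not modify any real configuration.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# The two settings mentioned above, plus the example alias br => branch.
git config merge.conflictstyle diff3
git config rerere.enabled true
git config alias.br branch

# Print the values back to confirm they were stored.
git config --get merge.conflictstyle
git config --get rerere.enabled
git config --get alias.br
```

With the alias in place, git br behaves like git branch in that repository.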

Take for example the configuration merge.defaulttoupstream. Back in 2010 while exploring issues with git pull I proposed that git merge should by default merge the upstream branch. I later sent the patch in 2011 which received pretty universal support, and there were interesting comments, like:

I totally agree — this would be a good change[*] — but this suggestion has been made before, and always gets shot down for vaguely silly reasons…

Miles Bader

Just a few days later Jared Hance sent his own version of the patch by introducing merge.defaultupstream to activate the new behavior. Jared sent in total five versions and implemented all the suggestions from everyone, including Junio, but in the end Junio decided to rewrite the whole thing, take authorship, not mention that Jared wrote the original version of the patch (see the final commit), nor the fact that originally the idea came from me.

I personally do not care too much. I had an idea and the idea got implemented, that’s mostly all I cared about. However, it wouldn’t have cost Junio anything to add “Original-idea-by: Felipe Contreras” as is customary (here’s an example).

Then in 2014 I realized that the run-up to git 2.0 (which was meant to break backwards compatibility) was the perfect time to enable merge.defaulttoupstream by default (literally no one disabled it anyway), so I sent a patch for that and other defaults. Junio disagreed and excluded these from 2.0, but he eventually merged them.

So finally in git 2.1 git merge ended up merging the upstream branch by default.

What I find interesting is that in 2021 (7 years later), Ævar, a prominent git developer, didn’t even know that merge.defaulttoupstream had changed its default value, so he still had merge.defaulttoupstream=true in his configuration. Thanks to my challenge he argued it should be true by default, and thus realized that was already the case. Additionally I noticed the default value was not mentioned in the documentation, so I sent a fix for that.

At that point my relationship with Elijah was amicable, and he decided to join the challenge, in his words, “to join in on the fun”.

zdiff3

The problems started when I took it upon myself to try to enable merge.conflictstyle=diff3 by default. This task turned out to be much more complicated than I initially thought. Flipping the switch is extremely easy, but once you do that a ton of tests that expect the diff2 format start to fail. No problem, I thought: simply changing merge.conflictstyle back to the old value at the beginning of each failing test file would solve it. Later on each test file could be updated to expect the diff3 format.

It turned out that didn’t work. Many commands completely ignored the configuration merge.conflictstyle, and those are clearly bugs. So even before attempting to change merge.conflictstyle we have to fix those bugs first.
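For readers unfamiliar with the two formats, the difference can be seen with git merge-file, which works on plain files outside of any repository. This is a minimal sketch; the file names and contents are made up for the illustration.

```shell
#!/bin/sh
# Scratch directory; base.txt, ours.txt and theirs.txt are hypothetical files.
dir=$(mktemp -d)
cd "$dir" || exit 1

printf 'a\nbase\nz\n'   > base.txt
printf 'a\nours\nz\n'   > ours.txt
printf 'a\ntheirs\nz\n' > theirs.txt

# -p prints the merge result instead of overwriting ours.txt.
# merge-file exits with the number of conflicts, so a non-zero status is expected here.
two_way=$(git merge-file -p ours.txt base.txt theirs.txt)
three_way=$(git merge-file -p --diff3 ours.txt base.txt theirs.txt)

printf '%s\n' "--- default style ---" "$two_way"
printf '%s\n' "--- diff3 style ---" "$three_way"
```

The diff3 output contains an extra section, introduced by a line of `|` characters, showing what the common ancestor looked like. That extra section is what tests written for the old format do not expect.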

This is what I attempted to do on my patch series: Make diff3 the default conflict style.

Immediately Johannes Sixt pointed out that such change resulted in a very convoluted output for him. This however I think is just bad luck, since never in all my years of using diff3 have I ever seen an output similar to that, and that’s because recursive merges are involved.

Jeff King provided a link to a discussion from 2013 regarding the output of diff3, and that discussion itself led to other discussions from 2008. It took me a while to read all those discussions, but essentially it boils down to a disagreement between Junio C Hamano and Jeff King (both part of the leadership committee) about whether or not a level higher than XDL_MERGE_EAGER (1) made sense for diff3. Junio argued that the output wasn’t proper, but Jeff argued that even though it wasn’t proper, it was still useful. That led Uwe Kleine-König, a Linux kernel developer, to implement zdiff3, which is basically diff3 but without artificially capping the level as Junio did in order for the output to be proper.

When I said maybe we should consider adding this zdiff3 mode, Jeff King mentioned his experience with it:

I had that patch in my daily build for several years, and I would occasionally trigger it when seeing an ugly conflict. IIRC, it segfaulted on me a few times, but I never tracked down the bug. Just a caution in case anybody wants to resurrect it.

Jeff King

My interpretation of that message is extremely crucial. I am very precise with language, both when writing, and reading. So I read what Jeff said exactly how he said it. First, he said in “several years” (1+ years) he occasionally would try this. Then he said that of the times he tried it, it crashed a few times, but crucially he said “if I recall correctly”, so this means he is not sure. Maybe it crashed more than a few times, or maybe it crashed less than a few times, he is not sure.

Whatever the issue was, Jeff did not find it serious enough to track the bug, since he never tracked down the bug in the several years he was using the patch.

Jeff sent his message on a Thursday, on Friday Elijah Newren said he might investigate the zdiff3 stuff, and on Sunday I re-sent the 2013 patch from Uwe. I added a note to the patch stating why I was sending it, and what I did to test it. Essentially I ran the entire test suite using zdiff3 instead of diff3 and everything passed. This implies that if there’s any issue with zdiff3 it probably is not that serious.

I sent the patch at 9:30. One hour later Jeff King replied “I take it you didn’t investigate the segfault I mentioned”. I don’t know how I was supposed to investigate that, other than what I already did: run the whole test suite with zdiff3. Jeff had the idea to recreate all the merges of the git repository using zdiff3 and after 2500 merges you can find one that crashes. This is a good idea, but it never occurred to me to do that.

By 13:00 I replied with a command to replicate the issue very simply using a git merge-file command. By 16:24 I had found the issue and sent a fix.

The next Monday Elijah Newren started to complain:

This is going to sound harsh, but people shouldn’t waste (any more) time reviewing the patches in this thread or the “merge: cleanups and fix” series submitted elsewhere. They should all just be rejected.

Elijah Newren

Elijah provided a list of reasons, none of which were true–like “no attempt was made to test” (I did attempt to test, as I explained in the note of the patch). Additionally he stated that in his opinion my submissions were “egregiously cavalier”.

Egregiously cavalier? I spent several hours on a Sunday trying to fix a crash in a patch that one of the most prolific git developers didn’t bother to fix for several years, and I actually did it… the very same day.

This was the start of Elijah’s personal vendetta against me. He urged Junio not only to drop Uwe’s patch that I re-sent, but to drop all my patches:

If I were in charge, at this point I would drop all of Felipe’s patches on the floor (not just the ones from these threads), and not accept any further ones. I am not in charge, though, and you have more patience than me. But I will not be reviewing or responding to Felipe’s patches further. Please do not trust future submissions from him that have an “Acked-by” or “Reviewed-by” from me, including resubmissions of past patches.

Elijah Newren

If that wasn’t enough, Elijah accused me of fabricating his endorsement of my patches:

Troubled enough that I do not want my name used to endorse your changes, particularly when I already pointed out that you have used my name in a false endorsement of your patch and you have now responded to but not corrected that problem (including again just now in this thread), making it appear to me that this was no mistake.

Elijah Newren

This is blatantly false. Elijah very clearly said “Yes, very nice! Thanks.” to one of my patches, and that can be considered endorsement by many people. I do not take those kinds of accusations lightly, so that prompted me to start an investigation into how many “reviewed-by” commit trailers are explicitly given as opposed to inferred, and I found that 38% of them are not explicit.

Not only that, but I found one example a month earlier in which Derrick Stolee took a “Looks good to me” comment from Elijah and used it as “reviewed-by”. Did Elijah accuse Derrick of “false endorsement”? No.

I don’t know what happened to Elijah, nor why from one day to the next he decided to make me his enemy, but clearly he is not being objective. If he was he would be objecting to Derrick’s behavior as well, since he did exactly the same thing as I. Elijah’s responses are clearly emotional, as can be seen from this reply in which he cites nineteen references, some going back to 2014 (a dubious debating technique called Gish gallop), yet while writing this response he didn’t even pause to check that the links matched what he was referring to. Moreover, the first thing he said: “attacking a random bystander like Alex is rather uncalled for”, was completely off-base because my reply wasn’t even addressed to Alex.

I have absolutely nothing against Elijah, but clearly he does have something against me, and the rest of the reports (and probably all the previous ones) are entirely the result of that antagonism.

Exhibit 8

If you didn’t mean this patch to be applied then perhaps add the RFC prefix.

Felipe Contreras

When Alex Henrie took me up on the challenge to try to fix git pull, he created a patch that broke 11 test files, each one with many unit tests broken. This is not a huge deal, some people send patches that break the test suite, but generally when they do that they add the “RFC” (request for comments) prefix to make sure this patch is not picked by Junio.

So, assuming good faith there are two options: a) Alex didn’t see the tests breaking, or b) he knew the tests were breaking, but didn’t know he had to add RFC in that case. I simply said if this was case b), RFC should have been added.

Only a person already primed to presume malice would have seen a problem with my comment.

Exhibit 9

Wouldn’t you consider sending a patch without running ‘make test’ “cavalier”?

Felipe Contreras

This response was directed to Elijah Newren, not Alex Henrie, and I wrote it because I was genuinely curious to see if Elijah was capable of seeing the obvious discrepancy between his response to Alex, and his response to me.

The patch I sent (which wasn’t even my patch):

  • Did not break any tests by itself
  • Did not break any tests by forcing diff3 to be zdiff3
  • The patch was not going to land on the integration branch “seen”
  • The patch doesn’t change anything, so everyone using diff3 wouldn’t see the crashes
  • If the patch was applied, and a person manually enables zdiff3, out of the 16,000 merges in git.git it crashes on 13 of them (0.08%)
  • Has already been tested by multiple people for many years
  • Was fixed hours later

So the patch was relatively safe by any standard.

On the other hand Alex’s patch:

  • Broke a ton of tests by default
  • Broke the existing user interface
  • Hasn’t been used by anyone
  • Was intended to be integrated

Elijah’s response to my patch is that it was “egregiously cavalier”, and his response to Alex’s patch was “thanks for working on this”.

If this is not double standards I don’t know what is.

I did not criticize Alex, I was asking a question to Elijah.

Exhibit 10

When Alex responded to my response, he thought it was directed to him. I explained to him that it wasn’t and I made sure he knew I didn’t blame him for anything:

I do appreciate all contributions, especially if they were done pro bono.

Felipe Contreras

I explained that Elijah’s personal attacks towards me are a different subject that he should probably not attempt to touch, because I didn’t think anything productive could come from that (it’s a matter between Elijah and me).

And just to be crystal clear, I thanked him again:

This has absolutely nothing to do with you, again… I appreciate you used some of your free time to try to improve git pull.

Felipe Contreras

In my opinion all the reports that have anything to do with Elijah Newren are primed and could only be judged “unacceptable” if you assume bad faith (and probably not even in that case).

Judgement

I have been a moderator on pretty big online communities, therefore I’m familiar with how justice is supposed to be dispensed outside of the justice system.

The single most important thing is the presumption of innocence. The accused should not be considered guilty until evidence against him has been presented and he has had a chance to defend himself. Even in something as silly as Twitch chat, the people that get banned have the opportunity to appeal those decisions.

The leadership committee not only did not allow me to present my case, they weren’t interested in the least. When I asked them about presenting my case their response was “this isn’t a trial”. I was presumed guilty and that’s that.

My punishment was to avoid all interaction with 17 people for three months. That included Junio C Hamano and the entire leadership committee. If I engage in any “unwanted interaction” with any of these people or if I say something similar to the alleged violations above, then I’ll get banned.

Ævar Arnfjörð Bjarmason offered to be my “point of contact”, meaning he was the only person I was allowed to ask questions. While initially Ævar did a good job of keeping an open dialog with me, eventually he stopped responding because he went on vacation.

Other than the initial email I only received one response from the leadership committee.

However Ævar did manage to respond to a few questions for this blog post (as himself):

  1. Do you believe there’s a possibility the PLC might have made a mistake?

Sure, but I think that’s more in the area of how we’d deal with clashes like this generally than that nothing should have been done in this case. Did we do the optimal thing? I don’t know.

  2. Do you believe the PLC could have done more to address this particular situation?

Yes, I think it’s the first case (at least that I’m aware of) under the CoC framework that’s gone this far.

That’s a learning process, one particular thing in this case I think could have gone better is more timely responses to everyone involved. That’s collectively on us (the PLC), harder when there’s little or no experience in dealing with these cases, everyone volunteering their time across different time zones etc.

  3. Do you believe the identity of the person(s) who reported the complaints had an effect on the final verdict?

I wouldn’t mind honestly answering this for my part at least (and not for other PLC members), but feel I can’t currently due to our promise that CoC reports and identities of reporters etc. not be made public unless on their request.

While I’m thankful for Ævar’s responses, these did not answer the particulars of what I asked, especially the last one, which could very well be answered without revealing the identity of the people who reported the “violations”.

At the end of the day I’m still fairly certain that the person who did the vast majority of the reports (if not all of them) was Elijah Newren, and it’s only because he is a big name (#7 contributor in the past 5 years) that those reports were taken seriously. Additionally he sent as many as he could as an eristic technique–Gish gallop–because he knew that way nobody in the leadership team would investigate any single one at depth.

Ultimately I do not blame Elijah: if he thinks my comments violated the code of conduct, he is entitled to believe so and report them. But I do blame the leadership team because they didn’t do their job. They didn’t investigate any of the reports carefully enough, and they did not do the minimum job any judge should do (inside or outside a real courtroom): hear the defendant.

Not only did they not do their job, but they didn’t even want to attempt to do it, and even more… it seems they don’t know what their minimum job is.

Conclusion

Linus Torvalds once said in a TED interview:

What I’m trying to say is we are different. I’m not a people person; it’s not something I’m particularly proud of, but it’s part of me. And one of the things I really like about open source is it really allows different people to work together. We don’t have to like each other — and sometimes we really don’t like each other. Really — I mean, there are very, very heated arguments. But you can, actually, you can find things that — you don’t even agree to disagree, it’s just that you’re interested in really different things.

Linus Torvalds

You may not like my style of communication, but “being abrasive” is not a crime. The question is not “can Felipe be nicer?”, the question is “did Felipe violate the code of conduct?”. I believe any objective observer who carefully investigates any of the reports would have to conclude that no violation actually took place. You would have to believe that I was trolling, insulted people, threw personal attacks, or harassed people; none of that actually took place.

I debate with people all the time, and this particular issue comes up very often, but it’s a fallacy called tone policing. Any time anybody focuses on the way somebody states an argument instead of what the argument itself is, that person is being unproductive.

Paul Graham’s hierarchy of disagreement

I often bring up Paul Graham’s hierarchy of disagreement, where tone policing is the third lowest level of “argumentation”, right after ad hominem attacks.

Nobody benefits from my tone being policed, least of all the Git community. My patches not only provide much needed improvements to the user interface, but also fix clear bugs that have already been verified and tested, not to mention regressions. Ignoring me only hurts git users.

And even if in your opinion the tone I used in the reports above is not acceptable, it’s worth noting that these are merely the worst ten instances out of two thousand emails, that’s just 0.5%.

And let’s remember that, unlike the vast majority of git developers, I’m doing this work entirely for free.

Moreover, if anyone is violating the code of conduct I would venture to say it’s other members of the community:

  • Being respectful of differing opinions, viewpoints, and experiences

It is very clear other members of the community do not respect my opinions. I have suggested that this particular point should be tolerance, not respect (see Cambridge University votes to safeguard free speech), but even if you lower the requirement from respect to tolerance, not even that is happening: other members do not tolerate my opinions.

Why am I being punished because other members (who happen to be big names) can’t tolerate my opinions?

The Git community claims to be inclusive, but as this incident shows that’s not truly the case since the most important diversity is not welcomed: diversity of thought.

The git staging area, the term literally everyone agrees with

The concept of the staging area is one which many newcomers struggle with when starting to learn git, and the fact that this concept is barely mentioned in the official documentation (i.e. man pages) doesn’t help at all. Instead the official name is “the index” which has absolutely nothing to do with the way users interact with it. That’s the reason most people who teach git prefer to use the term “staging area” instead, and in fact that’s the term used in the best available documentation: the Pro Git book (which is not official), and also pretty much in all online documentation, including tutorials and blog posts (e.g. Atlassian saving changes, code refinery).

Attempts to move away from the incorrect term “the index” towards one that most native and non-native English speakers can grasp without the need for further explanation have been made for more than a decade, and even though there’s universal consensus that “staging area” is by far the best alternative, attempts to use it officially in the documentation and user interface have been blocked because one person cannot be convinced.

This is a summary of 13 years of discussions regarding the term “staging area”, similar to my previous post about 13 years of discussions regarding git pull.

git stage and --stage

The first thread happened in 2008, when David Symonds suggested the addition of a new command: git staged. This command was basically an alias for git diff --cached, and it was a result of a discussion in GitTogether ’08. While this command was not considered seriously (and in fact it wasn’t proposed seriously), it did result in the git diff --staged alias. This is the first time I suggested a git stage command.

Soon after Scott Chacon–a famous git trainer and author of the Pro Git book–suggested a git stage command that is basically an alias for git add. This time the proposal was serious and Scott offered a good rationale:

This continues the movement to start referring to the index as a staging area (eg: the --staged alias to ‘git diff’). Also added a doc file for ‘git stage’ that basically points to the docs for ‘git add’.

Neither Jeff King nor Junio C Hamano (the maintainer) objected, so it was immediately merged.
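To make the two aliases concrete, here is a small sketch in a throwaway repository. The file name and the identity settings are placeholders chosen for the example; the point is only that git stage acts like git add, and that git diff --staged shows the same thing as git diff --cached.

```shell
#!/bin/sh
# Scratch repository; file.txt and the identity below are placeholders.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

echo one > file.txt
git stage file.txt            # same effect as: git add file.txt
git commit -q -m "first"

echo two >> file.txt
git stage file.txt            # the change is now in the staging area

# Both spellings report the staged change.
cached=$(git diff --cached)
staged=$(git diff --staged)
[ "$cached" = "$staged" ] && echo "--staged and --cached show the same diff"
```

After the second git stage, a plain git diff prints nothing, because the working tree and the staging area are identical.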

In 2009 David Abrahams suggested the “index” should be called “staging area” as that’s a more friendly name, and he expressed that git stage would have helped his learning curve. Shawn O. Pearce, who unfortunately passed away, gave a pretty good summary of the history of the “index”, which basically boils down to: it’s historical baggage.

Only late last October at the GitTogether did we start to talk about creating a command called “git stage”, because people have started to realize we seem to call it a “staging area” as we train newcomers…

Shawn O. Pearce

Junio objected to Shawn’s history lesson claiming it was “a bit misleading, if not completely incorrect”, and also objected to the way “the outside” referred to the index:

Yeah, you may have to consider the possibility that that particular training lingo is inconsistent with the rest of the system, exactly because it came from outside.

Junio C Hamano

Jakub Narebski–a prolific developer and community member–also said it was a pity git stage wasn’t there from the start, but Junio quickly disagreed.

I simply agreed that “staging area” is a more friendly name.

Felipe’s git stage

Unlike Scott’s git stage, I proposed a more comprehensive command with subcommands, so instead of doing git diff --cached, the user could do git stage diff.

Junio did not like the idea at all:

I do not think these are good ideas at all, as it just spreads more confusion, not less.

Junio C Hamano

I argued that there’s a lot of confusion already: stage, cache, index, etc. And I’m not the only one who sees that confusion.

Perhaps not spreading “stage” even wider? That is the newest confusing term that caused the most harm.

Junio C Hamano

Junio proceeded to blame the “git training industry” (e.g. Scott Chacon) for using the term “staging area” instead of “index”:

Later, some outside people started “git training industry” without talking with the git development community and started using a new term “to stage” as a verb to describe “add to the index”. Addition of “git diff --staged” was supposed to lessen the confusion resulted from this mess, but as we can see from your patch it had a reverse effect.

Junio C Hamano

In his opinion, to avoid confusion --cached should not have existed at all, and the command should have been git diff --index-only. He also opined that he should have rejected “stage” as an option name (and a command) in favor of --index-only.

I thanked Junio for his clarification, but I also explained that there’s no porcelain command (intended for common users) that uses “index” in any way. Moreover, I explored what “cache”, “index”, and “stage” mean in English, and most people’s high-level notions do not match what “cache” and “index” actually mean.

In git it is barely used, mostly on the “documentation industry” probably because it’s easier to understand for most people (even non-native-English speakers).

Felipe Contreras

Markus Heidelberg argued that my proposal didn’t match his mental model:

Not for me. If I want to GET a diff, I want to use a command “diff”, so “git diff” is more obvious.

Markus Heidelberg

I argued that this was a matter of preference, but his preference would not be hindered in any way: we could have both git stage diff and git diff --staged. Additionally I argued that git diff is used in two fundamentally different ways, moreover we already have command/subcommand instances (e.g. git remote add).

Well, it’s a matter of preference, and you would not lose the option to do it the way you like. But actually, “git diff --cached” is a different action; you can’t do “git diff --cached HEAD^..” for example.

Felipe Contreras

Curiously enough when Markus argued back that you could not do git stage diff HEAD^.. either, it was Sverre Rabbelier–another prominent developer back then–who did the mic drop for me:

I rest my case ;). That’s the whole point Felipe is trying to make here.
$ git diff --cached
$ git diff HEAD^..

That’s two different modes of operation with the only difference being a switch (‘--cached’), which changes what is, and what is not valid after that.

Sverre Rabbelier

David Aguilar mentioned an old thread in which an alternative style for git diff was proposed by using STAGE and WORKTREE as pseudo commits. Unfortunately a lot of the following discussion was about this proposal, and not mine.

Junio disagreed with this proposal as well, and stated he didn’t think there’s any usability issue:

I do not think there is any usability issue. Why do you think saying STAGE in all capital makes it easier to use instead of saying --cached (or --index-only)? In either way, you need to understand the underlying concept, such as:

Junio C Hamano

Matthieu Moy did think there was a usability issue, and so did Octavio Alvarez, and Stefan Karpinski:

There is most definitely a usability issue here. I use git every day and I cannot for the life of me remember all the inconsistent stage-related oddball commands. I have a number of aliases for them (similar to what Felipe is proposing) which are the only way I can remember them. Whenever I find myself using a git repo without those aliases, I have to fire up the man pages. Trying to explain all of this to coworkers that use git—honestly, I don’t even try to go there.

Stefan Karpinski

Nothing materialized out of this discussion.

user-manual

In a side-thread when I tried to improve the user-manual, and argued that color should be enabled by default (color.ui=auto), Jonathan Nieder–the most prominent developer–asked me what were some of my UI patches that might have been overlooked:

Could you list some UI patches that were overlooked or not properly addressed? Maybe people just forgot about them or were waiting for an updated version, or maybe the problems some solve weren’t articulated clearly yet. I would be glad to help out in any way I can.

Jonathan Nieder

I mentioned my patch for git stage and attempts to standardize --index and --cached onto --stage as examples where there’s clearly a UI problem. But the real problem is that even though it’s clear to virtually everyone that there’s a problem, a path forward doesn’t seem to be there.

Michael J Gruber argued back that basically the reason he decided to disengage with this particular patch series is that “I didn’t seem to be willing to accept advice” (which obviously was not the case, since I had already included all his advice):

Regarding this specific patch series: I took part in the initial discussion, and got frustrated by the original poster’s seemingly unwillingness to accept advice, so I left. I’m not drawing any general conclusions, and please don’t take this as an ad hominem argument. Sometimes it’s simply a matter of mismatching participants.

Michael J Gruber

Additionally he argued that --index and --cached are fine, because two options are needed on some commands. Although that is true, it still doesn’t address the issue of the names, as I argued:

“good” is a very subjective term; I don’t think “they are different” is a good reason. By that logic --only-index and --index-and-working-dir serve the same purpose, just like --gogo and --dance.

Felipe Contreras

Nanako Shiraishi piled on the ad hominem angle:

I don’t think Felipe seriously wants to change them to --gogo vs --dance, but if he made a more constructive proposal, instead of making such a comment whose intended effect is only to annoy people, we may see an improved UI at the end. Proposing “--index-only” vs “--index-too” or even “--stage-only” vs “--stage-too” would have helped him appear to be more serious and constructive and I think your expression “mismatching participants” was a great way to say this.

Nanako Shiraishi

I don’t know how anyone can think that assuming bad faith is a more constructive way to achieve anything. Clearly my intention was not to annoy people, but to demonstrate a point. It is a common rhetorical device called reductio ad absurdum; it’s not uncommon to use it on the git mailing list, and even Junio uses it regularly.

My point was that any two names would solve Michael’s concern; the whole issue is what those names would be, and before exploring any names, developers–and in particular Junio–needed to accept there is a problem with the existing names. When I explained that to Nanako she didn’t back down:

You have a funny way of saying “I’m sorry, I wasn’t constructive, and my attitude repelled many participants from the discussion”.

Nanako Shiraishi

Which of course is not what I was trying to say at all. Moreover, she said that because “stage” is not very user friendly to her, the word we use is irrelevant, and everyone would agree with her:

I think a proposal to replace the word “index” with “stage” will sound nothing but bike-shedding to anybody, especially after getting familiar with “index” and seeing it taught on many web pages and books.

Nanako Shiraishi

Instead of my usual “blunt” style of communication I tried to appeal to Nanako’s sensitivity and empathy, in order to explore the possibility that even though she doesn’t particularly see a useful difference between the words “stage” and “index”, that doesn’t mean other people don’t either:

I’m going to change my usual blunt style and try to be sensitive here: I’m not trying to blame you, or disregard your background. I’m not a native English speaker either (although my tongue language is a romance one, so perhaps I have some advantage), but to me, English is a language of short words, and therefore, the exact word you pick makes a world of difference, and this is something I feel many non-english speakers don’t appreciate. Since we all are communicating in English, I think we should not disregard “subtle” differences in words such as “cached” and “stage” that might not mean much to you, but I think it would to the thousands (or millions) of git users who do understand immediately the meaning of “stage” regardless of their git (or any other SCM) background.

Felipe Contreras

I then explored at great length all the different meanings of the word “stage”, and also to some extent the word “index”. As a reward for my efforts my reply was censored and removed from the mailing list archive. I’ve uploaded the email in its entirety so you can see for yourself if there’s anything objectionable there.

Regardless of which side you are on, Nanako did not object to the word “stage”; all she did was state that to her the word is irrelevant, and that in her opinion I was bikeshedding. As we will later see, Nanako was wrong: most people did not consider it bikeshedding.

Nothing particularly productive came from the rest of the discussion.

Git User’s Survey

Since I pretty much gave up on convincing Junio using arguments, I thought it was a good idea to ask users directly if they thought the term “staging area” was better. So I proposed this to Jakub Narebski, who spearheaded the Git User’s Surveys, and he thought it was a good idea. Of course we couldn’t ask the question directly, since that would bias the responses, so I proposed simply asking if users used the “index/cache/stage”.

Unfortunately Jakub forgot to add that question.

Consistent terminology

In 2011 Piotr Krukowiecki asked if there was a plan to use a consistent term, instead of three.

Junio once again explained why he did not like “stage”:

In short, “stage” is an unessential synonym that came much later, and that is why we avoid advertising it even in the document of “git diff” too heavily. Unlike the hypothetical --index-only synonym for --cached I mentioned earlier that adds real value by being more descriptive, “staged” does not add much value over what it tried to replace.

Junio C Hamano

Pete Harlan explained why “staging area” is better:

FWIW, when teaching Git I have found that users immediately understand “staging area”, while “index” and “cache” confuse them.

Pete Harlan

Aghiles also agreed with Pete.

Once again I brought up my proposed git stage command with subcommands. This time Michael J Gruber expressed that he liked the idea a lot, but stated that the design deviates from the common form. Of course this isn’t true since we have many commands with subcommands: git branch, git tag, git remote, git stash, git submodule, etc.

Piotr explained that the main point of his email was to have a single name, and although the name was not that important, he advocated for stage:

I’m new to git and a non-native English speaker. “Staging” seems most clear of all of the terms. You may find it differently, but please take into consideration that you are accustomed to it.

Piotr Krukowiecki

Pete Harlan explained that the problem to him was that developers working with Git internals were biased towards the term “index”, while everyone else sees absolutely no reason to use the low-level details:

Part of the issue could be that one intimately familiar with Git’s internals may find a process oriented interface irritating (“Why must it say ‘staging area’ when it’s just updating the index?”), while one unfamiliar with the internals has the opposite reaction (“Why must it make me use the internal name of the staging area?”).

Pete Harlan

Pete even suggested a git command developed from scratch with an interface that hides all the nuts and bolts from the user.

Once again Drew Northup explored what the definitions of “cache” and “index” mean in English and compared them to what git actually does internally, but I argued that what happens internally is irrelevant for the end user:

Branches and tags are “rthetorical” devices as well. But behind scenes they are just refs. Shall we disregard ‘branch’ and ‘tag’?

No. What Git does behind scenes is irrelevant to the user. What matters is what the device does, not how it is implemented; the implementation might change. “Stage” is the perfect word; both verb and a noun that express a temporary space where things are prepared for their final form.

Felipe Contreras

Additionally Miles Bader argued that magit already uses the label “staging area”, and the shortcut to add changes there is “s”.

Jonathan Nieder pushed back on new terms because the existing ones were “established”. However, I mentioned that the whole point of git 1.8.0 (which later became 2.0) was to rethink git from scratch, so why not rethink these terms? Junio argued that while rethinking is fine, consensus so far had not been achieved, and thus it could not be considered for 1.8.0.

Jeff King argued that as a native English speaker “staging area” makes perfect sense to him:

So the term “staging area” makes perfect sense to me; it is where we collect changes to make a commit. I am willing to accept that does not to others (native English speakers or no), and that we may need to come up with a better term. But I think just calling it “the stage” is even worse; it loses the concept that it is a place for collecting and organizing.

Jeff King

Drew Northup argued that although the term “staging area” has value, it has limited use for somebody trying to learn how git works.

Phil Hord also agreed “staging area” makes perfect sense:

When we pack up our kayak club for a trip, we stage equipment we’re bringing. Eventually we make a decision about which equipment is going and which is staying. The decision is codified by the equipment we leave in the staging area versus the equipment we remove to local storage. Everyone seems to understand the term when we use it in this context.

Phil Hord

Additionally Jeff King agreed that the internals were not relevant for the mental model of the end user:

But note that it is a mental model. The fact that it is implemented inside the index, along with the stat cache, doesn’t need to be relevant to the user. And the fact that the actual content is in the object store, with sha1-identifiers in the index, is not relevant either. At least I don’t think so, and I am usually of the opinion that we should expose the data structures to the user, so that their mental model can match what is actually happening. But in this case, I think they can still have a pretty useful but simpler mental model.

Jeff King

There were other proposals, such as “the bucket”, or “precommit area”. Matthieu Moy said that as a non-native speaker anything “foo area” helps, not necessarily “staging area”, but Alexey Feldgendler disagreed: at least in Russian, “staging area” is better.

Alexei Sholik argued that as a new git user, he thinks the term “index” only confuses unprepared readers.

Jonathan Nieder decided to write a summary of the situation:

To summarize: everyone knows what the staging area is, no one seems to know what the index is, and the --cached options are confusing.

Jonathan Nieder

When I tried to explore other proposals like “commit preparation area” I eventually arrived back at the conclusion that there’s no better alternative than “staging area”. Miles Bader agreed:

I don’t why so many people seem to be trying so hard to come with alternatives to “staged” and “staging area”, when the latter are actually quite good; so far all the suggestions have been much more awkward and less intuitive.

Miles Bader

Jonathan Nieder agreed and proposed that we stop discussing, in order to start coding. Curiously enough Junio’s argument for not considering this for 1.8.0 is that there was no consensus, but consensus was reached only a couple of messages later.

However, nothing was coded.

git-scm.com refresh

When Scott Chacon presented the new look of git-scm.com in 2012, Junio objected to the fact “staging area” was used prominently (presumably in the Pro Git book):

It seems that you are trying to advocate “staging area” as some sort of official term. I think “it is like a staging area” is a good phrase to use when answering “what is the index?”, but I think repeating it million times without telling the casual readers what its official name is is counterproductive. Don’t do that. It will confuse these same people when they start reading manuals.

Junio C Hamano

Scott disagreed with Junio. The term “staging area” is already quite popular, the only place where it’s not used is in the official documentation (i.e. man pages):

I’m not really trying to advocate it as much as using terminology that is already quite popular. It’s true that it’s not what is used in the man pages, but neither is ‘index’ used consistently – there is ‘cache’ too, in addition to ‘index’ having two meanings – packfile and cache. I’m open to making things clearer, but I just don’t think that changing the terminology to something more technical and vague would be overall less confusing to people.

Scott Chacon

While Junio and Scott mostly discussed where the location of git diff should be, it’s worth noting that merely because Scott disagreed, Junio told him that he should “know better”:

As you are supposed to be one of the top-level Git Teachers, I wish you knew better.  Here is a free Git lesson.

Junio C Hamano

Which Scott understandably found condescending and unnecessary:

There is absolutely no reason to be this condescending.

Scott Chacon

1.8.0

Since nothing materialized from previous discussions, I decided to write a serious proposal for 1.8.0 (which later became 2.0). The proposal was simple: avoid “index” and “cache” in favor of “stage”.

Philip Oakley stated that he didn’t like the current terms, Matthieu Moy agreed something needs to be done, Zbigniew Jędrzejewski-Szmek said that it was a very good idea, Mark Lodato agreed with me, and Sebastien Douche said that it was an extremely good idea:

+1000. An anecdote: many attendees said to me “I didn’t understand until you explained it that way”. Now I use always the term “stage” in my training (and banned the term index). Far better.

Sebastien Douche

I agree with Felipe that “staging” is the most appropriate term for “adding to the index” in git. As a native English speaker, I have never thought of “to stage” as relating to shipping in any way. To me, by far the most common usage is in real estate. The seller of a home “stages” it by setting up furniture and decorations to make the home as appealing to prospective buyers as possible. Just search on Google for “home staging” and you will get plenty of hits. This usage clearly originates from theater but can be found in other contexts as well.

Mark Lodato

Yeah, I think that this is a very good idea. Having three different terms for this great but relatively obscure idea adds an unnecessary cognitive burden for newcomers to git. ‘stage’ is certainly the best of the three options.

Zbigniew Jędrzejewski-Szmek

Junio decided to find another objection to the word “stage”: that it might not be directly translated:

I didn’t necessarily wanted to use “stage”, it is “sad” because a new word-hunt may be needed for a replacement to “index” (as “stage” may not be a good word for i18n audience), and then we would need to keep “index”, “stage” and that third word as interchangeable terms.

Junio C Hamano

Ævar Arnfjörð Bjarmason–a very prominent developer–did not think that made sense at all:

I don’t think that line of reasoning makes sense at all. We shouldn’t be picking terms in the original English translation of Git that we think would be more amenable to translation.

We should be picking terms that make the most sense in English, and then translators will translate those into whatever makes the most sense in their language.

Ævar Arnfjörð Bjarmason

This is of course a great point, and Junio simply had to agree and said “OK“.

Since Junio explained what “index” means in Japanese, and did so in a way that only made sense internally, but not how end users actually use it, I decided to point that out:

That’s what git has, internally, but that’s not how high-level users interact with it.

Felipe Contreras

Matthieu Moy agreed with me.

Thiago Farina argued that neither “cache” nor “index” helped him understand what the actual concept was, but “precommit” did.

When Jonathan Nieder proposed some concrete changes Junio objected stating that the name shouldn’t be important:

I personally think it is a wrong way of thinking to focus too much on the “name”, though.

Junio C Hamano

I disagreed:

Names are important. Name it ‘jaberwocky’ and people would have a harder time trying to understand what you are talking about. Maybe just by hearing the name they would give up, thinking it’s too advanced, or too obscure, or what not.

Felipe Contreras

Junio kept referring to the concept as “that thing”. Not only is “jabberwocky” not very useful, but “index” is in fact worse than “jabberwocky” because it has an understood meaning that does not match how the user interacts with “that thing”. Moreover, at this point it had become extremely clear that everyone agreed that “staging area” is the term in English that most closely resembles what it would be used for, and translators had already explained why that term is fine.

Junio did not respond back.

Official move

Since nothing happened in 2011, and nothing happened in 2012, in 2013 I took it upon myself to write the patches to start moving to the term “staging area”. While doing so it occurred to me that the difference between --index and --cached in git apply could be determined with a separate option, so --index becomes --staged --work, and --cached becomes --staged --no-work. Since --work would be enabled by default, git apply --index simply becomes git apply --staged. Moreover, I realized the same could be done for git reset, so the unintuitive differences between --mixed, --soft and --hard become much clearer.
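As a rough illustration of the git apply mapping described above, here is a minimal Python sketch. The names translate_apply_args and PROPOSED_APPLY_FLAGS are hypothetical, invented for this example; the --staged, --work and --no-work options belong to the proposal and are not accepted by upstream git. The git reset mapping is left out because the text does not spell it out.

```python
# Hypothetical helper: rewrite `git apply` arguments from the existing flags
# to the spelling proposed in the text. Upstream git does NOT understand
# --staged / --work / --no-work for `git apply`; this only shows the mapping.
PROPOSED_APPLY_FLAGS = {
    "--index": ["--staged", "--work"],      # staging area and working tree
    "--cached": ["--staged", "--no-work"],  # staging area only
}


def translate_apply_args(args, explicit_defaults=False):
    """Return a copy of args with --index/--cached replaced.

    --work is the default in the proposal, so it is omitted unless
    explicit_defaults is True.
    """
    translated = []
    for arg in args:
        replacement = PROPOSED_APPLY_FLAGS.get(arg)
        if replacement is None:
            translated.append(arg)  # not a flag we rewrite; keep it as-is
            continue
        for flag in replacement:
            if flag == "--work" and not explicit_defaults:
                continue  # implied by default, so it can be dropped
            translated.append(flag)
    return translated


print(translate_apply_args(["apply", "--index", "fix.patch"]))
print(translate_apply_args(["apply", "--cached", "fix.patch"]))
```

With the default dropped, the first call yields the plain --staged form mentioned above, and the second keeps --no-work because it differs from the default.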

Matthieu Moy thanked me for working on this, found my addition of git stage edit interesting (it does something that is currently not possible with any git command or option), and agreed that --soft, --mixed, and --hard are terrible.

Junio objected, stating:

IIRC, when this was discussed, many non-native speakers had trouble with the verb “to stage”, not just from i18n/l10n point of view.

Junio C Hamano

I explained to Junio that was not the case. Ævar was the only non-native speaker that had a problem with the term “stage”, but not “staging area”, and in fact he had already translated “index” to “the commit area”, not that it really mattered because as Ævar argued: what should be picked is what makes the most sense in English, and how that is best translated is up to the translators. Even Junio eventually agreed.

Again, everyone has agreed that index needs to be renamed, and “staging area” is the best option.

Felipe Contreras

I asked Junio if he could re-read the threads (I provided the links) so I did not have to list the opinion of every person who participated. Junio did not respond to me, so I started writing the list, but asked him once again if he was planning to do that himself so I did not have to waste my time. He did not reply to that either.

So I had to finish the list. Only one person–Drew Northup–did not see the point in changing the name (but did not find any problem with the proposed name either); the rest, nineteen people, were in favor of changing things.

If that was not enough, more people joined on this round:

I realize Git is not a democracy, but if the vote of a humble user counts for anything, I agree that “index” is a terrible name.

I was very excited when Felipe first started this thread, since I thought Git might finally fix one of it’s biggest long-standing usability problems. Calling this thing the “index” is like calling an important variable “someValue.” While the name may be technically correct, it’s way too generic to be useful. A name like “staging area” may not capture the whole idea, but at least it provides a good clue about what it does and how you might use it.

If we change this, I’m pretty sure most of the Internet will rejoice. Only a few old-timers will be grumpy, but that’s just because they don’t like change in general. I have never met anybody (outside this thread) who thought the current name was a good idea.

William Swanson

+1 for staging area

Ping Yin

As yet another “just a user”, I’d like to add my enthusiastic support for “to stage” and “staging area”.

Hilco Wijbenga

All three: William, Ping, and Hilco expressed that they were very excited when I started the thread, because of the possibility of finally fixing this long-standing usability problem.

Junio did not reply again.

Afterwards

That’s the point where I stopped trying, but the discussions continued for years:

Is “staging area” still considered as the correct term or has time proven that index is better?

Lars Vogel

Why is “index” better? It is a confusing name, one that has many other unrelated meanings. In particular, many projects managed by git also have an index, but few have a staging area.

David A. Wheeler

That’s an absurd argument. A database product that wants to be used in library systems are forbidden to have “index” because that may be confused with library index cards?

Junio C Hamano

The list

Against

  • Junio C Hamano: “staging area” is a near-sighted and narrow minded term

Neutral

  • Nanako Shiraishi: the name doesn’t matter
  • Drew Northup: the name doesn’t matter

In favor

  • Felipe Contreras: “staging area” is the correct description
  • Scott Chacon: we should move from “index” to “staging area”
  • Jay Soffian: staging area is better
  • Pete Harlan: “staging area” is good for teaching
  • Aghiles: “staging area” is good for teaching
  • Piotr Krukowiecki: “staging area” makes sense
  • Jonathan Nieder: “staging area” is better than “index”
  • Jeff King: “staging area” makes perfect sense
  • Miles Bader: “staging area” is good
  • Phil Hord: “staging area” is better than index/cache
  • Victor Engmark: maybe “git bucket”
  • David (bouncingcats): maybe “precommit”
  • Alexey Feldgendler: “staging area” translates better into Russian (than precommit)
  • Alexei Sholik: “staging area” is better
  • Zbigniew Jędrzejewski-Szmek: “staging area” is better
  • Ævar Arnfjörð Bjarmason: In Icelandic “index/stage” is translated to “the commit area”
  • Sebastien Douche: “stage” is better than “cache”/”index”
  • Thiago Farina: “precommit” is better
  • Mark Lodato: “staging” is the most appropriate
  • Philip Oakley: “staging area” is OK
  • Matthieu Moy: something needs to be done
  • William Swanson: “index” is a terrible name
  • Ping Yin: +1 for staging area
  • Hilco Wijbenga: I enthusiastically support “staging area”
  • Lars Vogel: “staging area” is better
  • David A. Wheeler: “staging area” is less confusing

Today

Even though I started to contribute again in 2019 and I managed to land 81 patches, Junio once again has decided to ignore all my patches.

I decided to update and clean up the patches from 2013. If you want to give them a try you can do so on my fork: git-fc. It’s just 7 patches that add a new command and useful options:

  • git stage
  • git unstage
  • git stage --add
  • git stage --remove
  • git stage --diff
  • git stage --edit

Additionally a description of the staging area is added to git help stage. What is missing from the 2013 patches are all the --stage options that are basically aliases to --cached. It is mostly legwork but I will eventually add those too.

Even though the Git project is not a democracy, you can cast your vote on this poll: Should git officially use the term “staging area” instead of “the index”? But if you want this to be done there’s only one way: send an email to the Git mailing list and try to convince Junio C Hamano.

Freedom of speech in online communities

I have debated freedom of speech countless times, and it is my contention that today (in 2021) the meaning of that concept is lost.

The idea of freedom of speech didn’t exist as such until censorship started to be an issue, and that was after the invention of the printing press. It was after people started to argue in favor of censorship that other people started to argue against censorship. Freedom of speech is an argument against censorship.

Today that useful meaning is pretty much lost. Now people wrongly believe that freedom of speech is a right, and only a right, and worse: they equate freedom of speech with the First Amendment, even though freedom of speech existed before that law, and exists in countries other than the USA. I wrote about this fallacy in my other blog in the article: The fatal freedom of speech fallacy.

The first problem with considering freedom of speech a right is that it shuts down discussion about what it ought to be. This is the naturalistic fallacy (confusing what is with what ought to be). If we believed that whatever laws exist regarding cannabis are the ones we ought to have, then we couldn’t discuss any changes to the laws, because the answer to everything would be “that’s illegal”. The question is not whether X is currently illegal; the question is whether it should be. When James Damore was fired by Google for criticizing Google’s ideological echo chamber, a lot of people argued that Google was correct in firing him, because it was legal, but that completely misses the point: the fact that something is legal doesn’t necessarily mean it’s right (perhaps it should be illegal).

Today people are not discussing what freedom of speech ought to be.

Mill’s argument

In the past people did debate what freedom of speech ought to be, not in terms of rights, but in terms of arguments. The strongest argument comes from John Stuart Mill, who presented it in his seminal work On Liberty.

Mill’s argument consists of three parts:

  1. The censored idea may be right
  2. The censored idea may not be completely wrong
  3. If the censored idea is wrong, it strengthens the right idea

It’s obvious that if an idea is right, society will benefit from hearing it, but if an idea is wrong, Mill argues that it still benefits society.

Truth is antifragile. Like the immune system, it benefits from failed attacks. Just like a bubble boy who is shielded from the environment becomes weak, so do ideas. Even if an idea is right, shielding it from attacks makes the idea weak, because people forget why the idea was right in the first place.

I often use the example of flat-Earth. Obviously Earth is round, and flat-Earthers are wrong, but is that a justification for censoring them? Mill argues that it’s not. I’ve seen debates with flat-Earthers, and what I find interesting are the arguments trying to defend the round Earth, but even more interesting are the people that fail to demonstrate that the Earth is round. Ask ten people that you know how they would demonstrate that the Earth is round. Most would have less knowledge about the subject than a flat-Earther.

The worst reason to believe something is dogma. If you believe Earth is round because science says so, then you have a weak justification for your belief.

My notion of the round Earth only became stronger after flat-Earth debates.

Censorship hurts society, even if the idea being censored is wrong.

The true victim

A common argument against freedom of speech is that you don’t have the right to make others listen to your wrong ideas, but this commits all the fallacies I mentioned above, including confusing the argument of freedom of speech with the right, and ignores Mill’s argument.

When an idea is being censored, the person espousing this idea is not the true victim. When the idea was that Earth was circling the Sun (and not the other way around as it was believed), Galileo Galilei was not the victim, since he already knew the truth; the victim was society. Even when the idea is wrong, like in the case of flat-Earth, the true victim is society, because by discussing wrong ideas everyone can realize by themselves precisely why they are wrong.

XKCD claims the right to free speech means the government can't arrest you for what you say.
XKCD doesn’t know what freedom of speech is

The famous comic author Randall Munroe–creator of XKCD–doesn’t get it either. Freedom of speech is an argument against censorship, not a right. The First Amendment came after freedom of speech was already argued, or in other words: after it was already argued that censorship hurts society. The important point is not that the First Amendment exists, the important point is why.

This doesn’t change if the censorship is overt, or the idea is merely ignored by applying social opprobrium. The end result for society is the same.

Censorship hides truth and weakens ideas. A society that holds wrong and weak ideas is the victim.

Different levels

Another wrong notion is that freedom of speech only applies in public spaces (because that’s where the First Amendment mostly applies), but if you follow Mill’s argument, when Google fired James Damore, the true victim was Google.

The victims of censorship are at all levels: society, organization, group, family, couple.

Even at the level of a couple, what would happen to a couple that just doesn’t speak about a certain topic, like say abortion?

What happens if the company you work for bans the topic of open spaces? Who do you think suffers? The people that want to criticize open spaces, or the whole company?

The First Amendment may apply only at a certain level, but freedom of speech, that is: the argument against censorship, is valid at every level.

Online communities

Organizations that attempt to defend freedom of speech struggle because while they want to avoid censorship, some people simply don’t have anything productive to say (e.g. trolls), and trying to achieve a balance is difficult, especially if they don’t have a clear understanding of what freedom of speech even is.

But my contention is that most of the struggle comes from the misunderstandings about freedom of speech.

If there’s a traditional debate between two people, there’s an audience of one hundred people, and one person in the audience starts to shout facts about the flat-Earth, would removing that person from the venue be a violation of freedom of speech? No. It’s just not part of the format. In this particular format an audience member can ask a question at the end in the Q&A part of the debate. It’s not the idea that is being censored, it’s the manner in which the idea was expressed that is the problem.

The equivalent of society in this case is not hurt by a disruptive person being removed.

Online communities decide in what format they wish to have discussions, and if a person not following the format is removed, that neither hides novel ideas nor weakens existing ideas. In other words: the argument against censorship doesn’t apply.

But in addition the community can decide which topics are off-topic. It makes no sense to talk about flat-Earth in a community about socialism.

But when a person is following the format, and talking about something that should be on-topic, but such discussion is hindered either by overt censorship (e.g. ban), or social opprobrium (e.g. downvotes), then it is the community that suffers.

Ironically when online communities censor the topic of vaccine skepticism, the only thing being achieved is that the idea becomes weak, that is: the people that believe in vaccines do so for the wrong reasons (even if correct), so they become easy targets for anti-vaxxers. In other words: censorship creates the exact opposite of what it attempts to solve.

Online communities should fight ideas with ideas, not censorship.

How a proper git commit is made (by a git developer)

Throughout my career of 20 years as an open source software developer I’ve seen all kinds of projects with all kinds of standards, but none with higher standards than the Git project (although the Linux kernel comes pretty close).

Being able to land a commit on git.git is a badge of honor that not all developers would be able to get, since it entails a variety of rare skills, like for example the ability to receive criticism after criticism, and the persistence to try many times–sometimes reaching double-digit attempts, not to mention the skill necessary to implement the solution to a problem nobody else saw before–or managed to implement properly–in C, and sometimes the creativity to come up with alternative solutions that address all the criticism received previously.

The purpose of this post is not to say “this is why Git is better than your project” (the Git project has many issues), the purpose is to showcase parts of a fine-tuned development process so you as a developer might consider adopting some yourself.

Discussion

The first part of the process (if done correctly) is inevitably discussion. The git mailing list is open, anyone can comment on it, and you don’t even need to be subscribed, just send a mail to the address and you will be Cc’ed in all the replies.

This particular story starts with a mail by Mathias Kunter in which he asks why git push fails with certain branches, but only if you don’t specify them. So for example if you are in a branch called “topic”, this would fail:

git push

But not this:

git push origin topic

Why? His question is a valid one.

The first reply comes from me, and I explain to him how the code views the two commands, for which a basic understanding of refspecs is needed. Briefly, a refspec has the form of src:dst, so “test:topic” would take the “test” branch and push it to the remote as “topic”. In the first command above, the refspec would be the equivalent of “topic:” (no destination), while the second command would be “topic:topic”. In other words: git doesn’t assume what name you want as destination.
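To make the src:dst split concrete, here is a minimal sketch in plain shell. The helper split_refspec is hypothetical (it is not a git command), and it only handles refspecs that contain a colon, which is all the explanation above needs.

```shell
# split_refspec is a hypothetical helper for illustration, not part of git.
# It assumes the refspec contains a colon (src:dst).
split_refspec() {
    refspec=$1
    src=${refspec%%:*}   # everything before the first colon
    dst=${refspec#*:}    # everything after the first colon (may be empty)
    printf 'src=%s dst=%s\n' "$src" "$dst"
}

split_refspec test:topic    # push "test" to the remote as "topic"
split_refspec topic:topic   # what "git push origin topic" amounts to
split_refspec topic:        # no destination: git has nothing to assume
```

The last call is the interesting one: the source is known, but the destination is empty, which is the situation the first command ends up in.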

Did I know this off the top of my head? No. I had to look at the code to understand what it’s doing before synthesizing my understanding into as succinct a reply as I could possibly write. I often don’t look at the official documentation because I find it very hard to understand (even for experts), and it’s often inaccurate, or ambiguous.

Notice that I simply answered his question of why the first command fails, and in addition I offered him a solution (with push.default=current git would assume the name of the destination to be the same as the source), but at no point did I express any value judgement about what git ought to actually do.
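The difference between the two modes can be reproduced in a throwaway setup. This is only a sketch: the temporary directories, the "demo" identity, and the repository names are assumptions made for the example, and it assumes push.autoSetupRemote is not enabled in your configuration.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)

# A throwaway "central" repository with one commit, plus a clone of it.
git init -q "$tmp/seed"
git -C "$tmp/seed" commit -q --allow-empty -m "initial"
git clone -q --bare "$tmp/seed" "$tmp/central.git"
git clone -q "$tmp/central.git" "$tmp/work"
cd "$tmp/work"
git checkout -q -b topic
git commit -q --allow-empty -m "topic work"

# "simple" (the default): "topic" has no upstream branch, so this fails.
if git -c push.default=simple push >/dev/null 2>&1; then
    simple=ok
else
    simple=failed
fi
echo "simple: $simple"

# "current": the destination is assumed to have the same name as the source.
git -c push.default=current push -q
git ls-remote --heads origin topic
```

Passing the mode with -c keeps the experiment from touching any configuration file; git config push.default current would make the choice permanent.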

Mathias thanked me for my reply, and pushed back on the solution to use the “current” mode because he thought the “simple” mode (which is the default), should behave the same way in this particular case. For his argument he used the documentation about the simple mode:

When pushing to a remote that is different from the remote you normally pull from, work as current.

This is where the nuance starts to kick in. If the “topic” branch has an upstream branch configured (e.g. “origin/topic”), then git push would behave the same in both the “simple” and “current” modes, however, if no upstream branch is configured (which is usually the case), then it depends on the remote. According to Mathias, if he has no upstream configured, then there’s no “remote that is different from the remote you normally pull from” (the remote you normally pull from in this case is “origin”, because the upstream branch is “origin/topic”), so “simple” should work like “current”.

In my opinion he is 100% right.

Additionally, I have to say we need more users like Mathias, who, even though he knows how to fix the issue for himself, is arguing that this should be fixed for everyone.

Elijah Newren suggested that perhaps the documentation could be changed to explain that this only happens when “you have a remote that you normally pull from”, in other words: when you have configured an upstream branch. But this doesn’t make sense for two reasons. First, if you haven’t configured an upstream branch, then the other mode that would be used is “upstream”, but “upstream” fails if you haven’t configured an upstream branch, so it would always fail. Secondly, the reason why it doesn’t always fail is that the former is not possible: when you haven’t configured an upstream branch, “origin” is used, therefore you always have a “remote that you normally pull from”.

I explained to Elijah that the remote is never null, since by default it’s “origin”, and suggested a trivially small modification to the documentation to mention that (since even experts like him miss that), but he suggested an even bigger one:

If you have a default remote configured for the current branch and are pushing to a remote other than that one (or if you have no default remote configured and are pushing to a remote other than ‘origin’), then work as ‘current’.

Oh boy! I cannot even begin to explain why I find this explanation so wrong on so many levels, but let me start by saying I find this completely unparseable. And this is the biggest problem the official git documentation has: it’s simply translating what the code is doing, but if the code is convoluted, then the documentation is convoluted as well. I shredded this paragraph piece by piece and showed why changing the logic would make it much more understandable.

But at this point I got tired of reading the same spaghetti code over and over again. It’s not just that I found the code hard to follow, it’s that I wasn’t actually sure of what it was doing. And so my reorganization patch series began.

The series

I am of the philosophy of code early, and code often. I don’t believe in design as separate from code, I believe the design should evolve as the code evolves. So I just went ahead and wrote the thing. The first version of my patch series reorganized the code so that the “simple” mode was not defined in terms of “current” and “upstream”, but as a separate mode, and then, once it became clear what “simple” actually did, redefined “current” in terms of “simple”, rather than the other way around. It consisted of 11 patches:

  1. push: hedge code of default=simple
  2. push: move code to setup_push_simple()
  3. push: reorganize setup_push_simple()
  4. push: simplify setup_push_simple()
  5. push: remove unused code in setup_push_upstream()
  6. push: merge current and simple
  7. push: remove redundant check
  8. push: fix Yoda condition
  9. push: remove trivial function
  10. push: flip !triangular for centralized
  11. doc: push: explain default=simple correctly

I could describe each one of these patches in great detail, and in fact I did: in the commit messages of each of these patches, but just to show an example of the refactoring these patches do, let’s look at patches #7, #8, and #9.

push: remove redundant check

 static int is_workflow_triangular(struct remote *remote)
 {
-	struct remote *fetch_remote = remote_get(NULL);
-	return (fetch_remote && fetch_remote != remote);
+	return remote_get(NULL) != remote;
 }

There is no need to do two checks: A && A != B, because if A is NULL, then NULL != B (B is never NULL), so A != B suffices.

push: fix Yoda condition

 static int is_workflow_triangular(struct remote *remote)
 {
-	return remote_get(NULL) != remote;
+	return remote != remote_get(NULL);
 }

There are a lot of Yoda conditions in the git code, and it’s very hard for many people to parse what that code is supposed to do in those situations. Here we want to check that the remote we are pushing to is not the same as the remote of the branch, so the order is the opposite of what’s written.

push: remove trivial function

-static int is_workflow_triangular(struct remote *remote)
-{
-	return remote != remote_get(NULL);
-}
-
 static void setup_default_push_refspecs(struct remote *remote)
 {
 	struct branch *branch = branch_get(NULL);
-	int triangular = is_workflow_triangular(remote);
+	int triangular = remote != remote_get(NULL);

Now that the code is very simple, there’s no need to have a separate function for it.

The rest

Notice that there are absolutely no functional changes in the three patches above: the code after the patches ends up doing exactly the same as before. The same applies to all the other patches.

There are three things that I think are important about the overall result: 1. the new function setup_push_simple is a standalone function, so to understand the behavior of the “simple” mode, all you have to do is read that function, 2. the existing function is_workflow_triangular is unnecessary, has the wrong focus, and is inaccurate, and 3. now that it’s easy to follow setup_push_simple, the documentation becomes extremely clear:

pushes the current branch with the same name on the remote.

If you are working on a centralized workflow (pushing to the same repository you pull from, which is typically origin), then you need to configure an upstream branch with the same name.

Now we don’t need to argue about what the “simple” mode does, there’s no confusion, and the documentation while still describing what the code does is easy to understand by anyone, including the original reporter: Mathias.

Great. Our work is done…

Not so fast. This is just the first step. Even though this patch series is perfectly good enough, in the Git project the first version is very rarely accepted; there’s always somebody with a comment.

The first comment came from Elijah: he pointed out that in patch #3 I said I merely moved code around, but that wasn’t strictly true, since I removed some dead code too, and that made the patch harder for him to review. That wasn’t his only comment: he also stated that in patch #4 I should probably add a qualification to explain why branch->refname cannot be different from branch->merge[0]->src, and in patch #10 he said the commit message seemed “slightly funny to [him]” but he agreed it made the code easier to read. Finally, about the last patch, #11, which updates the documentation, he said:

Much clearer. Well done.

I replied to all of Elijah’s feedback mostly stating that I’m fine with implementing his suggestions, but additionally I explained the reasoning behind changing “!triangular” to “centralized”, and that’s because I heard Linus Torvalds state that the whole purpose of git was to have a decentralized version control system so it makes no sense to have a conditional if (triangular) when that conditional is supposed to be true most of the time.

Additionally Bagas Sanjaya stated that he was fine with the changes to the documentation, since the grammar was fine.

v2

OK, so the first version seemed to be a success in the sense that everyone who reviewed it didn’t find any fatal flaws in the approach, but that doesn’t mean the series is done. In fact, most of the work is yet ahead.

For the next version I decided to split the patch series into two parts. The first part includes the patches necessary to reach the documentation update which was the main goal, and the second part would be everything that makes the code more readable but which is not necessary for the documentation update. For example changing !triangular to centralized is a nice change, but not strictly necessary.

Now, v2 requires more work than v1 because not only do I have to integrate all the suggestions from Elijah, but I have to do it in a way that it’s still a standalone patch series, so anyone who chooses to review v2 but hasn’t seen v1, can understand it. So essentially v2 has to be recreated.

This is why git rebase is so essential, because it allows me to choose which commits to update, and the end result would look like all the suggestions from Elijah had always been there.
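One common way to fold a reviewer’s suggestion back into an earlier commit is a fixup commit followed by an autosquash rebase. The sketch below is not necessarily how this series was reworked; the repository, file names, and commit subjects are made up for the example.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q .
git commit -q --allow-empty -m "base"

# v1 of a two-patch series.
echo "first" > a.txt;  git add a.txt; git commit -q -m "push: first change"
echo "second" > b.txt; git add b.txt; git commit -q -m "push: second change"

# A reviewer asks for a tweak to the first patch: record it as a fixup...
echo "first, with the reviewer's suggestion" > a.txt
git commit -q -a --fixup=HEAD~1

# ...and let rebase fold it into place without opening an editor.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~3

git log --format=%s
```

After the rebase there are still only two patches on top of the base, and the first one contains the suggestion as if it had been there from the start.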

But not only that, I have to update the description of the series (the cover letter in git lingo), to explain what the patch series does for the people that haven’t reviewed it before, and in addition explain what changed from v1 to v2 for the people that have already reviewed the first version.

This is where one tool which I think is mostly unknown comes into play: git range-diff. This tool compares two versions of a patch series (e.g. v1 and v2), and then generates a format similar to a diff delineating what changed. It takes into consideration all kinds of changes, including changes to the commit message of every commit, and if there are no changes either in the code or the commit message, that’s shown too.

So essentially people who have already reviewed v1 can simply look at the range-diff for v2 and based on that figure out if they are happy with the new version. The reason why it’s called range-diff is because it receives two ranges as arguments (e.g. master..topic-v1 master..topic-v2).
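A small sketch of comparing two versions of a one-patch series, where only the commit message changed between v1 and v2. The branch names follow the example ranges above; the repository, file, and messages are invented for the demonstration.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "base"

# v1: one patch on top of master.
git checkout -q -b topic-v1
echo "code" > push.c; git add push.c
git commit -q -m "push: reorganize" -m "Simply move the code around."

# v2: same code, reworded commit message.
git checkout -q -b topic-v2
git commit -q --amend -m "push: reorganize" \
    -m "Simply move the code around and remove dead code."

# Compare the two ranges; a "!" between the commit ids marks a changed patch.
out=$(git range-diff master..topic-v1 master..topic-v2)
printf '%s\n' "$out"
```

The output pairs each commit of the first range with its counterpart in the second, followed by a diff of whatever differs, in this case just the commit message.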

This time the result was 6 patches. If you check the range-diff you can see that I made no changes in the code; however, I changed some of the commit messages, and 5 patches were dropped.

Let’s look at one change from the rangediff for patch #3.

3: de1b621b7e ! 3: d66a442fba push: reorganize setup_push_simple()
    @@ Metadata
      ## Commit message ##
         push: reorganize setup_push_simple()
     
    -    Simply move the code around.
    +    Simply move the code around and remove dead code. In particular the
    +    'trivial' conditional is a no-op since that part of the code is the
    +    !trivial leg of the conditional beforehand.
     
         No functional changes.
     
    +    Suggestions-by: Elijah Newren <newren@gmail.com>
         Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
     
      ## builtin/push.c ##

It should be pretty straightforward to see what I changed in the commit message.

The second part is the most interesting. Not only does it include better versions of the 5 patches that were dropped from the previous series, but it includes 10 more patches. So this is something that has to be reviewed from scratch, since it’s completely new.

  1. push: create new get_upstream_ref() helper
  2. push: return immediately in trivial switch case
  3. push: reorder switch cases
  4. push: factor out null branch check
  5. push: only get the branch when needed
  6. push: make setup_push_* return the dst
  7. push: trivial simplifications
  8. push: get rid of all the setup_push_* functions
  9. push: factor out the typical case
  10. push: remove redundant check
  11. push: fix Yoda condition
  12. push: remove trivial function
  13. push: only get triangular when needed
  14. push: don’t get a full remote object
  15. push: rename !triangular to same_remote

I’m not going to go through all the different attempts I made locally to arrive at this series, but it was way more than one. I used git rebase over and over to create a sequence of commits to arrive at a clean result, where each commit implemented a logically-independent change that I thought would be easy to review, would not break previous functionality, and would make the code easier to read from the previous state.

The highlight of this series is that I added a new function get_upstream_ref that would be used by both the “simple” and “upstream” modes, but additionally by reorganizing the code I was able to remove all the setup_push_* functions, including the one I introduced: setup_push_simple. By moving everything into the same parent function now it’s easy to see what all the modes do, not only the “simple” mode. Additionally, instead of changing !triangular to centralized, I decided to change it to same_remote. Upon further analysis I realized that you could be in a triangular workflow and yet be pushing into the same repository (that you pull from), and the latter is what the code cared about, not triangular versus centralized. More on that later.

This time many more people commented: Mathias Kunter, Philip Oakley, Ævar Arnfjörð Bjarmason, and Junio C Hamano himself (the maintainer).

Philip Oakley pointed out that the documentation currently doesn’t explain what a triangular workflow is, but that isn’t a problem of this patch series. Ævar Arnfjörð Bjarmason provided suggestions for the first part of the series, but those were not feasible, and overridden by the second part of the series. Mathias Kunter said he actually preferred the explanation of the “simple” mode provided in another part of the documentation, and he incorrectly thought my patches changed the behavior to what he suggested, which wasn’t the case.

Junio provided comments like:

  • Much cleaner. Nice.
  • Simpler and nicer. Good.
  • Nice.
  • Nicely done.
  • Well done.

If you want to take a look at the change from the most significant patch, check: push: get rid of all the setup_push_* functions.

His only complaints were my use of the word “hedge” (which he wasn’t familiar with), of “move” (when in fact the code is duplicated), of “reorder” (when in his opinion “split” is better), and that he prefers adding an explicit break even if the last case of a switch is empty.

I explained to Junio that the word “hedge” has other meanings:

* to enclose or protect with or as if with a dense row of shrubs or low trees: ENCIRCLE

* to confine so as to prevent freedom of movement or action

Some synonyms: block, border, cage, confine, coop, corral, edge, fence, restrict, ring, surround.

So my use of it fits.

For most of his comments I saw no issue implementing them as most didn’t even involve changing the code, but additionally he suggested changing the order so the rename of triangular is done earlier in the series. While I have no problem doing this and in fact I already tried in an interim version of the series, I knew it would entail resolving a ton of conflicts, and worse than that, if later on somebody decides they don’t like the name same_remote I would have to resolve the same amount of conflicts yet again. I said I would consider this.

Moreover, I recalled the reason why I chose same_remote instead of centralized, and I explained that in detail. Essentially it’s possible to have a triangular workflow that is centralized, if you pull and push to different branches but of the same repository. The opposite of triangular is actually two-way, not centralized.

centralized = ~decentralized
triangular = ~two-way

This triggered a discussion about what actually is a triangular workflow, and that’s one of the benefits of reviewing patches by email: discussions turn into patches, and patches turn into discussions. In total there were 36 comments in this thread (not counting the patches).

v3

For the next version I decided not just to move the same_remote rename sooner in the series as Junio suggested, but actually in the very first patch. Although a little risky, I thought there was a good chance nobody would have a problem with the name same_remote, since my rationale of triangular versus two-way seemed to be solid.

Part 1:

  1. push: rename !triangular to same_remote
  2. push: hedge code of default=simple
  3. push: copy code to setup_push_simple()
  4. push: reorganize setup_push_simple()
  5. push: simplify setup_push_simple()
  6. push: remove unused code in setup_push_upstream()
  7. doc: push: explain default=simple correctly

Part 2:

  1. push: create new get_upstream_ref() helper
  2. push: return immediately in trivial switch case
  3. push: split switch cases
  4. push: factor out null branch check
  5. push: only get the branch when needed
  6. push: make setup_push_* return the dst
  7. push: trivial simplifications
  8. push: get rid of all the setup_push_* functions
  9. push: factor out the typical case
  10. push: remove redundant check
  11. push: remove trivial function
  12. push: only check same_remote when needed
  13. push: don’t get a full remote object

Moving the same_remote patch to the top changed basically the whole series, as can be seen from the rangediff, but the end result is exactly the same, except for one little break.

Junio mentioned that now the end result matches what he had prepared in his repository, and soon included them in his “what’s cooking” mails.

* fc/push-simple-updates (2021-06-02) 7 commits

 Some code and doc clarification around "git push".

 Will merge to 'next'.

* fc/push-simple-updates-cleanup (2021-06-02) 13 commits

 Some more code and doc clarification around "git push".

 Will merge to 'next'.

So there it is, unless somebody finds an issue, they will be merged.

To be honest this is not representative of a typical patch series. Usually it takes more than 3 tries to get a series merged, many more. And it’s also not common to find so much low-hanging fruit.

switch (push_default) {
default:
case PUSH_DEFAULT_UNSPECIFIED:
case PUSH_DEFAULT_SIMPLE:
    if (!same_remote)
        break;
    if (strcmp(branch->refname, get_upstream_ref(branch, remote->name)))
        die_push_simple(branch, remote);
    break;

case PUSH_DEFAULT_UPSTREAM:
    if (!same_remote)
        die(_("You are pushing to remote '%s', which is not the upstream of\n"
              "your current branch '%s', without telling me what to push\n"
              "to update which remote branch."),
            remote->name, branch->name);
    dst = get_upstream_ref(branch, remote->name);
    break;

case PUSH_DEFAULT_CURRENT:
    break;
}

refspec_appendf(&rs, "%s:%s", branch->refname, dst);

Documentation/config/push.txt | 13 +++----
builtin/push.c | 91 +++++++++++++++++++------------------------
2 files changed, 47 insertions(+), 57 deletions(-)

The commit

These two patch series cleaned up the code and improved the documentation, but they didn’t actually change any functionality. Now that it’s clear what the code is actually doing, Mathias Kunter’s question is easily answered: why is git push failing with “simple”? Because get_upstream_ref always fails if there’s no configured upstream branch, but should it?

All we have to do is make it not fail. In other words: if there’s no upstream branch configured, just go ahead, and thus it would behave as “current”.

Then this:

If you are working on a centralized workflow (pushing to the same repository you pull from, which is typically origin), then you need to configure an upstream branch with the same name.

Becomes this:

If you are working on a centralized workflow (pushing to the same repository you pull from, which is typically origin), and you have configured an upstream branch, then the name must be the same as the current branch, otherwise this action will fail as a precaution.

Which makes much more sense.

Right now pushing to master works by default:

git clone $central .

git push

But not pushing to a new topic branch:

git clone $central .
git checkout -b topic

git push

It makes no sense to allow pushing “master” which is much more dangerous, but fail pushing “topic”.

This is the patch that changes the behavior to make it sensible, where I explain all of the above, and the change in the code is simple, basically don’t die().

It took 47 days from the moment Mathias sent his question to the point where Junio merged my patches to master, but we are close to the finish line, right?

No, we are not.

Convincing Junio of some code refactoring is relatively easy, because it’s simply a matter of showing that 2 + 2 = 4; it’s a technical matter. But convincing him of what git ought to do is much more difficult because it requires changing his opinion using arguments; this part is not technical, but rhetorical.

For reference, even though several years ago I managed to convince everyone that @ is a good shortcut for HEAD, Junio still complains that it “looks ugly both at the UI level and at the implementation level”, so for some reason my arguments failed to convince him.

So can Junio be convinced of this obvious fix? Well, everything is possible, but I wouldn’t be holding my breath, especially since he has decided to ignore me and all my patches.

Either way the patch is good. It’s simple thanks to the many hours I spent cleaning up the code, and benefited from all the input from many reviewers. And of course if anyone still has any comments on it they are free to state them on the mailing list and I’d be happy to address them.

Why is git pull broken?

A lot of people complained that my previous post–git update: the odyssey for a sensible git pull–was too long (really? an article documenting 13 years of discussions was long?), and that a shorter version would be valuable. The problem is that the short version is actually too short:

Do not use git pull.

That’s it, really.

But why? Even though it’s obvious to me and to many other developers why git pull is broken and should not be used by most users, presumably a lot of people don’t know that, since they continue to use it.

Here it is.

Caveat

Let’s start by explaining where git pull is not broken.

It was created for maintainers; when you send a pull request, a maintainer is supposed to run git pull on the other side. For this git pull works perfectly fine.

If you are a developer (non-maintainer), and use a topic branch workflow, then you don’t even need git pull.

That leaves developers who work on a centralized workflow (e.g. trunk-based development). The rest of the article is written with them in mind; unfortunately they are the vast majority of users, especially novices.

It creates merge commits

What most people want to do is synchronize their local branch (e.g. “master”) with the corresponding remote branch (e.g. “origin/master”), in particular because if they don’t, git push fails with:

To origin
! [rejected] master -> master (non-fast-forward)
error: failed to push some refs to 'origin'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. Integrate the remote changes (e.g.
hint: 'git pull …') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

OK, so we need to “integrate” the remote changes with git pull, so presumably git pull is the mirror of git push, so it makes sense.

Except it’s not, and git pull was never designed with this use case in mind. The mirror of git push is git fetch which simply pulls the remote changes locally so you can decide later on how to integrate those changes. In Mercurial hg pull is the equivalent of git fetch, so in Mercurial hg push and hg pull are symmetric, but not in git.

At some point in time a path to make git pull be symmetric to git push was delineated, but the maintainer of the Git project considered it “mindless mental masturbation”, so forget about it.

After you have pulled the changes with git fetch, there are two possibilities: fast-forward and diverging.

fast-forward

A fast-forward is simple; if the local branch and the remote branch have not diverged, then the former can be easily updated to the latter.

In this case “master” (A) can be fast-forwarded to “origin/master” (C) (only possible if the branches have not diverged).

merge

However, if the branches have diverged, it’s not that easy:

In this case “master” (D) has split off from “origin/master” (C) so a new commit (E) is needed to synchronize both.

rebase

There’s another more advanced possibility if the branches have diverged:

In this case the diverging commit of the local branch “master” (D) is recreated on top of “origin/master” (C) so the resulting history is linear (as if it never had diverged in the first place and the base of the local branch was C).

Choices

OK, so if the branches have diverged you have two options (merge or rebase), which one should you pick? The answer is: it depends.

Some projects prefer a linear history; in those cases you must rebase. Other projects prefer to keep the history intact, so it’s fine if you merge. If you don’t make many changes then most of the time you can fast-forward.

Most experts would do a rebase, but if you are new to git a merge is easier.

We are still nowhere near a universal answer, and what do most people do when the answer is not clear? Nothing. By default git pull does a merge, so that’s what most people end up doing by omission, but that’s not always right.

So that’s the first problem: git pull creates a merge commit by default, when it shouldn’t. People should be doing git fetch instead and then decide whether to merge or rebase if the branches have diverged (a fast-forward is not possible).
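As a rough sketch of that fetch-then-decide flow: the setup below uses throwaway repositories (the directory names, the commit helper, and the commit names A, C and D are assumptions for the example), fetches, checks whether the branches have diverged, and only then picks an integration method, here a rebase.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)
# Hypothetical helper: create a file named after the commit and commit it.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

git init -q "$tmp/central"
cd "$tmp/central"
git symbolic-ref HEAD refs/heads/master
commit A
git clone -q "$tmp/central" "$tmp/bob"
commit C                # somebody else lands C upstream

cd "$tmp/bob"
commit D                # local work, now diverging from C

# Step 1: bring the remote changes locally, without integrating anything.
git fetch -q origin

# Step 2: look before deciding. Prints "<ahead> <behind>" for master.
counts=$(git rev-list --left-right --count master...origin/master)
echo "ahead/behind: $counts"

# Step 3: both numbers are non-zero, so the branches diverged. Here: rebase.
git rebase -q origin/master
git log --format=%s
```

If the first number were zero a fast-forward would be enough, and if the project prefers merges the last step would be a merge instead of a rebase.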

Merges are created in the wrong order

Let’s say the project allows merges, in that case it’s OK to just do git pull (since the default action is merge) right?

Wrong.

This is what git pull does by default: a merge commit. However, it’s merging “origin/master” (C) into “master” (D), but upstream is the remote repository, not the local one.

The order is wrong:

This is a correct merge: the local “master” (D) is merged into the remote “origin/master” (C). A similar result would happen if you had created a topic branch for D, and then merged that into “master”.

In git, merge commits are commits with more than one parent, and the order matters. In the example above the first parent of E is C, and the second one is D. To refer to the first parent you do master^1, the second is master^2.
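The parent order can be checked directly. In this sketch (throwaway repositories; the commit helper and the branch names “pulled” and “proper” are made up for the example) the same two commits are merged in both directions and the parents of each merge are printed.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)
# Hypothetical helper: create a file named after the commit and commit it.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

git init -q "$tmp/central"
cd "$tmp/central"
git symbolic-ref HEAD refs/heads/master
commit A
git clone -q "$tmp/central" "$tmp/work"
commit C                          # upstream moves on to C
cd "$tmp/work"
commit D                          # local work D, diverging from C
git fetch -q origin
C=$(git rev-parse origin/master)
D=$(git rev-parse master)

# What a default "git pull" amounts to: merge origin/master INTO master.
git checkout -q -b pulled master
git merge -q --no-edit origin/master

# The other order: merge the local work INTO origin/master.
git checkout -q -b proper origin/master
git merge -q --no-edit "$D"

for b in pulled proper; do
    echo "$b: first parent $(git log -1 --format=%s "$b^1")," \
         "second parent $(git log -1 --format=%s "$b^2")"
done
```

Both merges contain exactly the same files; the only difference is which commit ends up as ^1 and which as ^2.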

Proper history

Does it really matter which is the first parent? Yes it does.

Correct vs. incorrect order

In the correct history (left) it’s clear how the different topic branches are integrated into “master” (blue). Visualization tools (e.g. gitk) are able to represent such history nicely. Additionally you can do git log --first-parent to traverse only the main commits (blue).

In the incorrect history (right) the merges are a mess. It’s not clear what merged into what, visualization tools will show a mess, and git log --first-parent will traverse the wrong commits (green ones).
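The effect on git log --first-parent can be seen in a tiny local repository. This is only an illustration: the branch names, the commit helper, and the merge messages are invented, and a single repository stands in for the local/remote pair.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
# Hypothetical helper: create a file named after the commit and commit it.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

git init -q .
git symbolic-ref HEAD refs/heads/master
commit A
git checkout -q -b topic;  commit T1    # work on a topic branch
git checkout -q master;    commit B     # meanwhile the mainline moves on

# Wrong order: the mainline is merged INTO the topic work.
git checkout -q -b wrong topic
git merge -q --no-edit -m "merge (wrong order)" master

# Correct order: the topic branch is merged INTO the mainline.
git checkout -q master
git merge -q --no-edit -m "merge (correct order)" topic

right_log=$(git log --first-parent --format=%s master)
wrong_log=$(git log --first-parent --format=%s wrong)
printf 'correct:\n%s\n\nwrong:\n%s\n' "$right_log" "$wrong_log"
```

With the correct order the first-parent walk stays on the mainline (the merge, B, A); with the wrong order it wanders into the topic commit T1 and never shows B.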

Better conflict resolution

If that wasn’t enough, at the time of resolving conflicts it makes more sense to think of integrating your changes to upstream (“origin/master”) rather than the other way around. Mergetools like meld would present the flow correctly: from right to the middle.

Consensus

Update: In the original version of the article I only concentrated on the facts, and I didn’t include the opinion of other developers, but since there seems to be a lot of people ignoring the facts, and distrusting my judgement, I’ve decided to list some of the developers who agree git pull isn’t doing what it should be doing (at least by default, for non-maintainers).

Conclusion

So every time you do a merge, you do it wrong. The only way to use git pull correctly is to configure it to always do a rebase, but since most newcomers don’t know what a rebase is, that’s hardly a universal solution.
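For reference, configuring the rebase behavior is a one-line setting. The sketch below applies it to a throwaway repository so that nothing outside of it is modified.

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"

# Per-repository setting: "git pull" in this repository will rebase.
git -C "$repo" config pull.rebase true
git -C "$repo" config --get pull.rebase
```

Using git config --global pull.rebase true instead would apply the same setting to every repository of the current user.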

The proper solution is my proposal for a git update command that creates merge commits with the correct order of the parents, does only fast-forwards by default, and can be properly configured.

So there you have it. Now you know why git pull is definitely broken and should not be used. It was never intended to be used by normal users, only maintainers.

Do git fetch instead, and then decide how to integrate your changes to the remote branch if necessary.

git update: the odyssey for a sensible git pull

While git pull kind of works, over the years a slew of people have pointed out many flaws, and proposed several fixes. Some have even suggested that newcomers should be discouraged from using the command (as they often use it wrongly), and others to remove it entirely.

I spent several days digging through the entire history of the Git mailing list in order to document all the discussions related to git pull and its default mode. It’s a long story, so grab some coffee (a whole pot of it).

Update: I wrote a shorter article that only talks about why git pull is broken.

Preamble

The entire post is going to be talking about fast-forward merges so I’ll attempt to explain them briefly.

Basically when your history and the remote history have diverged, like:

[Figure: non-fast-forward history]

a fast-forward merge is not possible: you need to create a real merge that will have two parents: C and B.

The fast-forward case is much simpler:

[Figure: fast-forward history]

While you can create a merge, in this case it’s not necessary: the “master” branch pointer can simply move forward to B.

The whole story is about what happens when a novice does git pull when a fast-forward is not possible.
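To make the two cases concrete, here is a small sketch. It assumes a local branch called "upstream" standing in for "origin/master" (a simplification, so no real remote is needed) and reuses the commit names A, B and C from the explanation above. A fast-forward is possible exactly when your branch is an ancestor of the other one:

```shell
# Throwaway repository; the "upstream" branch is a stand-in for origin/master.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

git commit -q --allow-empty -m A
git branch upstream
git checkout -q upstream
git commit -q --allow-empty -m B      # the "remote" moves ahead to B
git checkout -q master

# Fast-forward case: master (A) is an ancestor of upstream (B).
if git merge-base --is-ancestor master upstream; then ff_before=yes; else ff_before=no; fi

# Diverged case: a local commit C means master is no longer an ancestor of upstream.
git commit -q --allow-empty -m C
if git merge-base --is-ancestor master upstream; then ff_after=yes; else ff_after=no; fi

echo "fast-forward possible before local commit: $ff_before"
echo "fast-forward possible after local commit:  $ff_after"
```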

--ff-only

The story begins in 2008 with a request from Sverre Hvammen Johansen to add the --ff-only option to git merge (then called git-merge), since in his workflow a certain branch was only allowed to fast-forward.

A few weeks later we get the first patch to implement the --ff=only merge strategy. Unfortunately the patch was more complicated than it needed to be, and it wasn’t merged.

Sverre gave it another go a few months later. Junio C. Hamano (the maintainer) said “with vastly improved documentation and justification compared to the previous rounds, I am beginning to actually like this series”. Unfortunately even though Sverre sent several rounds, they were never merged. Good attempt by Sverre.

Another attempt was made at the beginning of 2009 by Yuval Kogman, this time only adding the --ff-only option. What it did was simple: if a fast-forward is possible then do it, if not then fail. Everyone was in favor. Initial versions of the patch had some issues, those were fixed, but unfortunately the last iteration received no feedback. Thanks for trying though.
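The behavior described ("if a fast-forward is possible then do it, if not then fail") is what git merge --ff-only does today. A quick sketch, again assuming a local "upstream" branch as a stand-in for the remote branch and made-up commit names:

```shell
# Throwaway repository; the "upstream" branch is a stand-in for origin/master.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

git commit -q --allow-empty -m A
git branch upstream
git checkout -q upstream
git commit -q --allow-empty -m B
git checkout -q master

# Fast-forward possible: master simply moves from A to B, no merge commit.
git merge -q --ff-only upstream
tip_after_ff=$(git log -1 --format=%s)

# Make the histories diverge: C locally, D on "upstream".
git commit -q --allow-empty -m C
git checkout -q upstream
git commit -q --allow-empty -m D
git checkout -q master

# Fast-forward impossible: the command refuses and leaves master untouched.
if git merge --ff-only upstream >/dev/null 2>&1; then result=merged; else result=refused; fi
tip_after_refusal=$(git log -1 --format=%s)

echo "after fast-forward: $tip_after_ff"
echo "diverged attempt:   $result (master still at $tip_after_refusal)"
```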

A few months later, the question was asked “how do you fast-forward your local dev to sync up with origin/dev?”. If the previous patch had been merged, the answer would have been simple: git merge --ff-only, but since that wasn’t the case, Junio replied with:

git push . origin/dev dev

This is not user-friendly if you ask me, and later on it turned out it wasn’t even correct.

Months later Randal L. Schwartz brings up the same issue and demands a simple command that can be told to users in the #git IRC channel who constantly asked this question. It is pointed out such thing has been implemented twice already, but never merged.

To which Junio replied:

Do you mean twice they were both found lacking, substandard, useless, uninteresting, buggy, incorrect, or all of the above?

Geez! Eyvind was just trying to help. They clearly were not useless nor uninteresting, since many people had rallied in favor of such a feature, and yet more joined saying +1, even in this very thread.

The first big one

At the end of the year appears the first big discussion about the asymmetry of pull and push. Thomas Rast makes very bold suggestions, like deprecating git fetch completely. These drastic changes would have worked using the suggestion of Wesley J. Landaker to add a configuration, and deprecation warnings.

This proposal was not taken seriously; Junio considered the suggestion to turn git pull into git fetch “mindless mental masturbation”, and did not see how git pull and git push could ever be symmetric.

On the other hand an interesting piece of data was provided by Björn Steinbrink. He investigated 10 days of IRC activity in the #git channel and found plenty of users confused by git pull, many expected not only there to be a --ff-only option, but this behavior to be the default.

Aside from --ff-only, other main ideas were:

  1. Add a --merge option to git pull
  2. Add a pull.merge configuration

Björn Gustavsson used this big thread as inspiration to implement --ff-only for both git merge and git pull.

Finally. After three implementations, Junio accepted the idea and merged the patches in October 2009.

Begging for a configuration

Not long after the introduction of git pull --ff-only (6 months later), the first request for a configuration to make it the default appears in 2010. Peter Kjellerstedt argues that pull.options and merge.options configurations would fit him well. Not only that, but he thinks --ff-only should be the default. Nothing comes of this.

In 2013 people in the Linux Kernel Mailing List found issues with git pull and Linus Torvalds brought them to the Git mailing list. Linus argued that depending on whether you are a maintainer or a submaintainer you might want to do either --ff-only, or a true merge. An alias could do the trick (for pull --ff-only), but he asked the Git community for ideas.

One of the interesting ideas came from Theodore Ts’o; have a per-repository configuration. Junio did follow this thread closely, but nothing happened.

Only days later, Ramkumar Ramachandra introduces the idea of a pull.default configuration which ideally could be specified on a repository-specific basis. Ramkumar doesn’t seem to have pursued his idea further.

Second big one

A couple of months later Andreas Krey triggered a huge discussion when he pointed out that the order of the parents when doing a merge was wrong. What does that mean?

So this is what git pull does:

The new merge commit D merges B into C. This is fine if you are a maintainer and, instead of “origin/master”, you are pulling something like “john/topic” into your “master” branch; in that case your “master” (C) is rightly the first parent. But it’s not OK for normal developers (i.e. non-maintainers).

A normal developer would want to do the opposite: to merge his “master” (C) into the upstream “origin/master” (B), so the first parent is “origin/master” (B). This is not a trivial thing. The order of the parents completely changes how history is visualized, and even traversed, plus it’s important when resolving conflicts, among many other things.
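The parent order can be inspected directly. The sketch below assumes a local "upstream" branch standing in for "origin/master" and uses a plain git merge to mimic the merge step of git pull. The second half is a manual workaround for getting the opposite order, shown only for illustration; it is not a built-in option:

```shell
# Throwaway repository; the "upstream" branch is a stand-in for origin/master.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

git commit -q --allow-empty -m A
git branch upstream
git checkout -q upstream
git commit -q --allow-empty -m B      # upstream work
git checkout -q master
git commit -q --allow-empty -m C      # local work

# What the merge step of git pull does: merge upstream (B) into master (C).
git merge -q -m "D (order produced by pull)" upstream
pull_first=$(git log -1 --format=%s HEAD^1)
pull_second=$(git log -1 --format=%s HEAD^2)
echo "pull order:     first parent $pull_first, second parent $pull_second"

# Manual workaround: undo, start from upstream, and merge the local work into it.
git reset -q --hard HEAD^1
git checkout -q --detach upstream
git merge -q -m "D (parents reversed)" master
wanted_first=$(git log -1 --format=%s HEAD^1)
wanted_second=$(git log -1 --format=%s HEAD^2)
git checkout -q -B master             # point master at the new merge
echo "reversed order: first parent $wanted_first, second parent $wanted_second"
```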

Junio’s response:

Don’t do that, then.

For the record: there’s no way to tell git pull not to reverse the parents, and the four people that expressed an opinion said they would rather have the order reversed (John Szakmeister, Jeremy Rosen, John Keeping, and of course Andreas Krey).

Junio said he would not be opposed to an option to reverse the order of the parents, but as you’ll see later on, when I sent a proposal to do precisely that he ignored it.

Yet again another suggestion to make --ff-only the default, this time from John Keeping. This round I chimed in, and so did many others. Junio was not convinced this was the way to go, because integrators would be affected:

It is not about passing me, but about not hurting users like kernel folks we accumulated over 7-8 years.

Since I have worked on the Linux kernel, I explained to Junio that most lieutenants (submaintainers) don’t do git pull; they send pull requests to Linus, so essentially Linus is the only person that does git pull as an integrator (as Junio does). Linus is certainly capable of configuring git pull to something other than --ff-only (I don’t know why Junio replaced --ff-only with --rebase in his responses).

That is not something I can agree or disagree without looping somebody whose judgement I can trust from the kernel circle ;-).

That somebody was none other than Linus Torvalds, who agreed with us:

It would be a horrible mistake to make “rebase” the default, because it’s so much easier to screw things up that way.

That said, making “[--ff-only]” the default, and then if that fails, saying [a big warning message], [would make sense].

Linus suggested it would be even better to make this configuration per-branch, which is similar to Theodore Ts’o’s idea of making it per-repository.

Eventually, using Mercurial as inspiration, I suggested the following error message:

The pull was not fast-forward, please either merge or rebase.

This seemed to land with Junio, who finally saw why --ff-only made sense:

Initially I said limiting “git pull” to “--ff-only” by default did not make sense, but when we view it that way, I now see how such a default makes sense.

John Keeping came up with another good idea by using a hypothetical command git update as an example. This command, unlike git pull, would have the correct order of the parents, and would be similar to svn update, and others. While John didn’t propose such a command, eventually I did.

After Junio realized the value of --ff-only he even sent a patch integrating Linus’ suggestions and mine. However, I came up with a simpler approach that in my opinion was better, and that triggered another big discussion.

Third big one

My response to Junio’s patch triggered yet another big discussion.

Richard Hansen argued that git pull was straight up dangerous, and he recommended this instead:

git remote update -p
git merge --ff-only @{u}

He suggested people have that as a git up alias.
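Such an alias might be defined as follows. This is a sketch: the alias body is the two commands quoted above joined into one shell alias, and it is set in a throwaway repository (local config) rather than with --global, so it doesn't touch your real configuration.

```shell
# Throwaway repository so the alias is stored in local config only.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# The leading "!" makes git run the alias body through the shell.
git config alias.up '!git remote update -p; git merge --ff-only @{u}'

# Read the alias back to confirm what "git up" would run.
alias_body=$(git config --get alias.up)
echo "git up -> $alias_body"
```

With a configured upstream branch, git up would then fetch and only ever fast-forward, failing instead of creating a merge when the histories have diverged.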

In September of 2013 I sent a patch series to add a pull.mode configuration which can be set to merge-ff-only. It was completely ignored, however Junio soon after made a similar proposal on the other thread: pull.integrate with an option of fail back. I argued that my proposal of pull.mode was better, and other people agreed.

Another point I argued is that we needed a deprecation warning before changing the default mode, something like “in the future you would have to choose a merge or a rebase, merging automatically for now”.

Jeff King provided some useful information from GitHub trainers, essentially they didn’t like the rebase option by default because most people don’t know what a rebase is. This is what prompted me to add the following line to the warning:

If unsure, run “git pull --merge”.

v3 of my patch series did receive some feedback, but it got stuck with trivial details. I tried to fix them for v4, but didn’t receive any feedback at all.

Moving on

At the beginning of 2014 David Aguilar sent a couple of patches to add the pull.ff configuration, and Junio simply merged them with no comment.

This ignored all the previous feedback, and doesn’t add a configuration per-repository, or per-branch, but well… At least there’s a configuration now (sort of).

Pull is Evil (first huge one)

In April of 2014 Marat Radchenko pointed out a list of problems 20 newcomers in his organization faced after trying everyday tasks. Most of the problems came from the order of the parents in git pull. Marc Branchaud pretty early on renamed the thread to “Pull is Evil” since in his opinion git pull is pretty much broken, due to the fact that different people use it in different ways.

I think the pull command should be deprecated and quietly retired as a failed experiment.

Junio brought up again the idea of a git update command, which would demarcate the behavior meant for integrators, and normal developers.

Another interesting idea came from W. Trevor King: add an --interactive option (like git add or git rebase) so the user can choose to a) merge, b) rebase, or c) fast-forward.

I pointed out that this had been discussed before, that everyone agreed the default should change, and that I had already sent the patches. Matthieu Moy rejected the notion that everyone was in favor, since he himself wasn’t, but I pointed out that he was the only person not in favor.

But in fact, he was in favor, not with changing the default, but with the configuration I proposed. So he wasn’t against all my patches, just one.

Either way I still think Matthieu Moy is wrong. Yes, he is right in questioning why we should ask whether to merge or rebase if most newcomers would not want to know at that point in time, but the answer is simple: they don’t have to know, they can simply do git pull --merge and find out later why that --merge is necessary. Exactly the same thing happens with git commit --all.

Pull is Mostly Evil (Pull is Evil part 2)

Marc Branchaud didn’t find a good subthread to reply to, so he started yet another thread. In it he suggested that since git pull can only do the right thing if configured properly, it should be an advanced command that must be configured to be used, and when it’s not configured it does nothing by default.

Junio once more pondered about a git update command that would separate the two clearly distinct ways of using git pull.

Being frustrated with the lack of follow-up I made the prediction that the consensus and the proposals would likely be ignored and nothing would change:

And it has been discussed before. If history is any indication, it will be discussed again.

Spoiler: I was right.

Jeff King did not like my comment, and also stated that he wasn’t sure the majority of users would want the parents on a merge done with git pull to be the other way. This despite many big threads started precisely because the order of the parents was wrong, and many developers stating that this was the only reason why they didn’t use git pull, nor did they recommend newcomers to use it. I offered to point him to the many endless discussions in the mailing list.

Richard Hansen wrote a comprehensive description of the issues, and how they could be solved with different new commands: git update and git integrate, as well as other options.

The rest of the discussion was pretty much Richard Hansen and me hashing out different use-cases and ways to deal with them.

Noise

The discussion did not go without hiccups. In particular David Kastrup and James Denholm threw some personal attacks at me that had absolutely nothing to do with the discussion. Back then I was much more abrasive than I am today, and in many instances throughout the entire history of the mailing list (including these discussions) the present me would have reacted differently, but that doesn’t change the fact that I’ve always focused on the code, not so much the feelings of participants of the discussion.

I quickly ignored them and focused on the code itself.

One more try

Since this issue popped up yet again, I sent v5, and a v6 of my pull.mode approach. They did receive some amount of feedback, but in particular there was a comment from Richard Hansen that showed yet another issue with git pull.

Supposing we are currently on the branch “master”, and the upstream branch is “origin/master”.

  1. git pull --rebase (rebase master onto origin/master)
  2. git pull --merge (merge origin/master into master)
  3. git pull --rebase github john (rebase master onto github/john)
  4. git pull --merge github john (merge github/john into master)

Half of these are wrong. #1 is correct, since when we do git pull --rebase we want to update our “master” to “origin/master”, so we rebase on top of it, but #2 is wrong, because when we use --merge we want a similar thing: we want to merge “master” to “origin/master”, not the other way around. This is what Andreas Krey pointed out, which started the second big thread. But in addition #3 is wrong too, because when we specify a repository and a branch to pull from, we want that branch to rebase on top of master, not the other way around.

So basically it’s all wrong.

Summary

After listening and responding to everyone, all the issues with git pull were clear in my mind, but I realized not everyone had the patience, the stamina, or the time to go through all the responses, so I wrote a summary.

I reduced the main problems with git pull to two:

  1. It does unwanted merges
  2. It does merges with reversed parents

The deeper issue is that git pull is in fact used in two very different ways: to integrate and to update. So when you try to fix the defaults for one, you stumble into unintended consequences for the other. While I do see clearly in my mind how git pull can be disentangled in a way that the defaults would work correctly for everyone, it is pointless if I’m the only one who sees that, because I can’t convince anyone (except perhaps Richard Hansen).

Instead I chose an escape hatch, a completely new command: git update, which would integrate the input from everyone in order to do what git pull can’t do cleanly.

By default git update would be essentially the same as git pull --ff-only (or more like my pull.mode=ff-only), which does solve the main issue: newcomers won’t do real merges by mistake. But it also solves the issue about the order of the parents:

  • git update --merge (merge “master” into “origin/master”)
  • git pull --merge github john (merge “github/john” into “master”)

Not only that, but we could fix the order of the parents for rebase too:

  • git update --rebase (rebase “master” onto “origin/master”)
  • git pull --rebase github john (rebase “github/john” onto “master”)

Now everything is fixed. Additionally the default mode for update could be configured with update.mode, and the mode for pull with pull.mode (e.g. update.mode=rebase, pull.mode=merge). Not only that, but this could be configured per-branch (as Linus Torvalds and Theodore Ts’o suggested) with branch.master.updatemode and branch.master.pullmode. This was not only a proposal, I wrote the patches for all of that.

All users agreed my proposed git update was good, as Philippe Vaucher stated:

I know patches from Felipe are sensitive subject and that a lot of people probably simply ignore him entirely, however I think this patch deserves more attention.

No developer commented on it.

Moving on

Nothing resulted from the big Pull is Evil thread, except that after it was finished, Jim Garrison asked how pull would do the wrong thing. Junio sent a comprehensive reply, and Stephen P. Smith proposed to pretty much copy-paste the email from Junio into the list of howtos, which was accepted without any pushback. Not only was the first version accepted, but Junio himself fixed the patch so no v2 was necessary from Stephen.

Unnecessary merge

At the beginning of 2018 Linus Torvalds once again contacted the mailing list due to an “unnecessary merge” in the Linux kernel. He explained the difference between a merge and a “back-merge”, and why in one you want a fast-forward, but not the other. Unfortunately git doesn’t know which situation we are in.

He even suggested the following configuration:

   git config --global alias.update 'pull --ff-only'

This is precisely what my proposed git update did by default. Linus Torvalds even used the exact same name.

Junio then proceeds to suggest a per-repository configuration pullFF=only, when in fact Linus had already proposed a per-branch configuration back in 2013, and I had already implemented it. Nothing happens.

Why merge?

In 2020 Robert Dailey asks why git pull uses merge by default; he argues most people prefer rebases. Once again Elijah Newren argues that it would be desirable to consider a transition to ff-only by default. Once again Konstantin Tokarev tells Junio that in reality what newbies often do is wrong merges, and that ff-only would be better. Robert Dailey also gets convinced that ff-only is better. Alex Henrie mentions a patch he wrote but was reluctant to send it in order to avoid a big discussion.

The warning

Alex Henrie finally decided to send his patch which adds a warning telling users that running git pull without specifying to either merge or rebase is discouraged. Initially Alex wrote the warning to state that the default behavior would change in a future version of git, but Junio objected to that because there was “no consensus”:

Sorry for not catching this in the earlier round, but I do not think anybody argued, let alone the community came to a concensus on doing, such a strong move. Am I mistaken?

After five rounds of comments from Junio, he accepted the patch.

Interestingly enough this change immediately caused people to ask what it was.

Also, we have this lovely comment on GitHub:

Thanks for confusing everyone instead of just doing the default strategy when no parameter is given.

Imagine when every (CLI) application warns the user about not having each default value within the configuration file or given as parameter, despite the user is OK with (or does not care about) the default behaviour.

And we have a bug report from Wolfgang Fahl completely in German. Roughly translated: the bug is the warning.

Clearly people did not like this warning.

Pick the right default

In November of 2020 Vít Ondruch complained about the warning and stated that the Git project should choose a sensible default and not bother users with a warning.

I’d like to keep my trust in Git upstream, but this is was not trustworthy decision IMO.

This started because of a bug report on Fedora.

The main argument from Junio is that there was no perfect default:

There is no clear “default that would suit everybody”.

I pointed out that this is a false dichotomy because the options are not 1) implement the perfect solution, and 2) do nothing; we have another option 3) implement a better solution than the current one, and we all know what that solution is: ff-only. Plus we had the patches for that.

Jeff King pointed out that now there’s a pull.ff=only configuration, and he didn’t see how my pull.mode=ff-only was different. However, I explained that my version provides a more friendly error message, also works correctly with pull.rebase=true, and in addition allows a default mode that throws a more correct warning than the current one.

Junio proposed to make git pull --rebase honor pull.ff=only and fail if the pull is not fast-forward:

And then making an unconfigured pull.ff to default to pull.ff=only may give a proper failure whether you merge or rebase. I dunno.

This is obviously the wrong thing to do. As I explained to both Junio and Jeff, if you make git pull --rebase respect that, and you make pull.ff=only the default, then git pull --rebase will always fail (unless it’s a fast-forward case).

Once again I stated what should be the default.

  • git pull (fail by default unless it’s a fast-forward)
  • git pull --merge (force a merge [unless it’s a fast-forward,
    depending on pull.ff])
  • git pull --rebase (force a rebase [unless it’s a fast-forward,
    depending on pull.ff])

Vít Ondruch agreed.

Both Jeff and Alex Henrie argued that what my pull.mode does could be achieved with pull.rebase and pull.ff, and that’s the direction Junio wanted to take. I wasn’t convinced, and at this point I started to convert my patches from 2014, which was no easy task because git pull used to be written in shell but had since been converted to C, so essentially I had to rewrite everything from scratch. Alex started to write his approach too.

Junio welcomed the competition:

Let’s see how well the comparison between two approaches play out.

Pretty quickly I realized what they suggested was not possible. If I removed my pull.mode and made pull.ff=only the default, then git pull --merge would fail. We want git pull to fail by default, but not git pull --merge. To make this approach possible we would need to change the semantics of pull.ff=only, and that didn’t seem a clean solution to me.

pull.rebase=ff-only

In order to avoid all the hurdles from 2013 I decided to write the simplest patch possible that would achieve the desired result. Instead of creating a new pull.mode=ff-only, I reused an existing configuration, and named the mode pull.rebase=ff-only. This obviously was not ideal–since it left the interface in an inconsistent state (didn’t fix all the issues my original patch did)–but it was better than the current situation.

Junio responded that the name looked funny:

It looks quite funny for that single variable that controls the way how pull works, whether rebase or merge is used, is pull.REBASE.

Yes Junio, that’s why I suggested pull.mode instead, but that patch was never merged.

Raymond E. Pasco also agreed ff-only was the way to go.

pull.mode revamped

My first attempt to revive my old series recreated most of the same old behavior, but not all. It only received one comment for a tiny correction on the documentation.

Since my patches for pull.mode were ignored again, on my v2 I sent a bunch of improvements that were not 100% related to my approach, and instead of using my proposed pull.mode, I changed the semantics of --ff-only, which was not my preferred solution, and I clearly demarcated that particular patch with “tentative”. Elijah Newren provided a ton of valuable feedback, and I incorporated most of it, but there was one point where I did not agree:

If unsure, run “git pull --merge”.

I believe that escape hatch is useful, even if it undermines the value of the warning because 1) the warning must be temporary anyway 2) people are complaining about the warning already, 3) input from GitHub trainers already showed many people don’t even know what a rebase is, and 4) the merge is happening already anyway.

I explained my point of view to Elijah using an analogy. If a passenger in a plane is ignoring the safety demonstration he is doing something wrong, however, he should be completely free to do it, it’s his volition. The crew should allow him to ignore the demonstration. Similarly git users should be able to ignore the warning, and they can skip it by simply doing git pull --merge.

I argued that even if he didn’t agree on that particular line of that particular patch, the rest of the changes in the series don’t suddenly make the situation worse. He said he didn’t agree.

Junio used my analogy to liken users doing git pull --merge when they shouldn’t to be the same as passengers that should be removed from a plane:

A team member who runs “pull --[no-]rebase” pushes the result, which may be a new history in a wrong shape, back to the shared repository probably falls into a different category. … Or perhaps in the same public hazard category?

I don’t see how one line in a temporary warning can potentially create a “public hazard”. It’s users doing wrong merges that creates problems, and removing one line from the warning wouldn’t magically fix that. The warning–which if we take Stack Overflow and reddit as indication–isn’t even properly understood by many users, and doesn’t stop the code from creating a merge anyway. If these merges did indeed create a “public hazard”, shouldn’t git simply not do them by default? In other words: make ff-only the default.

Sadly I did not think of that argument at that time, I merely responded that what constitutes a public hazard is up to each individual project to determine, not git. Which is true.

On the other hand he said something that was completely untrue:

I hope “his comment” does not refer to what I said. Redefining the semantics of --ff-only (but not --ff=no and friends) to make it mutually incompatible with merge/rebase was what Felipe wanted to do, not me.

That’s not true. I wanted to introduce pull.mode=ff-only, not change the semantics of --ff-only. But as I had already explained before, --ff-only cannot be made the default without changing its semantics. The patch that introduced pull.mode was ignored (again), so I took a shot at making --ff-only work, and the only way is changing its semantics, which I did in a patch clearly labeled as “tentative” because I was not advocating for that.

And to make it crystal clear I repeated it yet again:

I don’t want to change the semantics of --ff-only; I want to introduce a new “pull.mode=ff-only”.

Moreover, I explained once again that this patch series was merely part 1. In part 2 I would introduce pull.mode. And there’s more parts after that.

For some reason Junio didn’t seem to understand what I was proposing. I explained that with pull.mode=ff-only I wanted to introduce a simple error message, but with pull.mode=default a warning to give users a heads up.

I quite don’t get it. They say the same thing. At that point the user needs to choose between merge and rebase to move forward and ff-only would not be a sensible choice to offer the user.

Other people like Jacob Keller did not have any issue understanding my proposal.

After multiple rounds eventually Junio gets my proposal, but now he doesn’t see the point of taking it slow instead of flipping the default to ff-only right away. Junio said that he assumed everyone has already configured pull.rebase therefore nobody would notice the change. I disagreed and pointed out why this whole thread started: because a user refused to configure git and expected git to do the right thing–choosing a useful default.

Junio responded as he did at the start:

Which is basically asking for impossible, and I do not think it is a good idea to focus our effort to satisfy such a request in general. There is no useful default that suits everybody in this particular case.

This is once again the Nirvana fallacy. Nobody asked for a perfect solution, merely a better default.

Additionally he argued the pull.mode=ff-only would only be useful for users that don’t use git “for real”:

But for anybody who uses git for real (read: produces his or her own history), it would be pretty much a useless default that forces the user to say rebase or merge every time ‘git pull’ is run.

I mean… didn’t he just argue that he wanted to make ff-only the default right away? Literally in his last reply. Very confusing.

I showed to him a real example of me using git pull --ff-only “for real”. Additionally I asked him in the most polite way I could a question that should be very valid at this point:

How often do you send pull requests?

He didn’t answer.

Since once again Junio stated that he didn’t see a good reason why certain classes of users would want to configure pull.mode=ff-only, I offered to do some mail archaeology which led to the creation of this blog post. It is obvious this is needed.

Once again Junio misinterpreted what I was proposing, and once again I explained in great detail. This time however Junio correctly identified that without pull.mode the user cannot explicitly ask for the default behavior, and that in my opinion is a crucial thing the user must be able to do in order to test this proposed new default. Moreover, I explained that the name is irrelevant, it could be pull.mode=ff-only, or pull.rebase=ff-only, or pull.ff=only, but one option should turn on the new behavior we are proposing to users, and currently no option does.

To cut to the chase I explained that we want:

  1. git pull to eventually fail in non-fast-forward cases
  2. A grace period before that switch is flipped

And so far no other proposal but mine achieved that.

Since Junio insisted pull.mode was not necessary (which I had already demonstrated multiple times it is), I asked him a pointed question to unequivocally demonstrate that:

So, after your hypothetical patch, there would be no difference between:

git -c pull.rebase=no -c pull.ff=only pull

and:

git -c advice.pullnonff=false pull

?

Junio answered that both would error out, which is correct. But then he has to concede my point, because:

  • git -c pull.ff=only pull (fail)
  • git -c pull.ff=only pull --merge (fail)

If both fail we cannot tell the user:

Not a fast-forward, either merge or rebase.

Because the user would do git pull --merge and that command would fail as well because it will try to do git merge --ff-only and it’s not a fast-forward. That’s it. pull.ff=only can’t be used for this. Period.

Junio did not reply.

Part 2 was completely ignored.

When Junio sent his regular mail with the status of all patch series he didn’t include any of my proposals. Since Junio himself was the one that asked for Alex and me to provide our competing proposals I replied to him asking for the status of the ff-only topic. He didn’t reply.

Alex’s approach

Junio had asked for both my approach and Alex’s. Alex sent his in two patches: one changing the warning to state that pull.ff=only will be the new default, and the other actually changing the default.

Junio did not think the approach was correct, and neither did I. When I explained why, Junio mis-read my response, but this time he himself caught the mistake, and then said the following:

If we instead introduced a separate command, say “git update”, that is “fetch followed by rebase” (just like “git pull” is “fetch followed by merge”), to rebase your work on top of updated upstream, there wouldn’t be a need for us to be having this discussion.

Yes, if my suggestion to add git update was heard we wouldn’t be having this discussion (yet again).

But to move forward I argued that there are three things to do:

  1. Improve the annoying warning
  2. Consider changing the semantics of --ff-only, or implement pull.mode=ff-only
  3. Consider a new “git update” command

Since my patches for pull.mode were not being reviewed, I suggested to focus only on #1 for now.

Jacob Keller chimed in and also agreed the order of the parents with git pull was a significant problem.

I also reminded Junio of the purpose of the patch, to which he replied:

Sorry, I know you keep repeating that “keep in mind”, but I do not quite see why anybody needs to keep that in mind. Has a concensus that the repurposed --ff-only should be the default been established?

This is the title of the patch from Alex we were supposed to be reviewing: “pull: default pull.ff to "only" when pull.rebase is not set either”. The whole point is to make pull.ff=only the default, and in that case git pull --merge would fail, which is not what we want. That only works if the semantics of pull.ff=only are changed, which is not what I proposed; my proposal (pull.mode=ff-only) is better because it doesn’t need that change.

He did not reply again.

Alex stated that he had trouble following the discussion and simply left it at that.

A fast-forward

In v3 I focused on fixing all the issues without committing to any particular approach. However, Junio objected to my use of the word “fast-forward”:

… I find the phrase “in a fast-forward way” a bit awkward. Perhaps use the ‘fast-forward’ as a verb

I argued that if fast-forward was a verb, we would have a git fast-forward command (which I actually think is a good idea), but currently we don’t have it, so fast-forward is a modifier. In particular if git merge is a verb, then --ff is an adverb.

Elijah disagreed:

A square is a special type of a rectangle, but that doesn’t make “square” an adjective; both square and rectangle are nouns.

Since I’m deeply interested in language, this is a discussion I could really weigh in on:

Words have multiple definitions. The word “square” is both a noun, and an adjective. It’s perfectly fine to say “square corner”.

Moreover, I found plenty of instances in the documentation which use fast-forward as a modifier. Even the git-merge man page uses “fast-forward merge”.

  • non-fast-forward update
  • fast-forward merges
  • fast-forward check
  • “fast-forward-ness”
  • fast-forward situation
  • fast-forward push
  • non-fast-forward pushes
  • non-fast-forward reference
  • fast-forward cases
  • “fast-forward” merge

This time I managed to convince him.

However, it didn’t escape me that the documentation already talks of “fast-forward update”, which suggests that the git update command could fill a hole already present in the documentation.

Everything at once

Since Junio had trouble figuring out my proposal in full, I decided to send the 19 patches together at once, and in addition I wrote a detailed explanation of what I was proposing to happen at every step of the way.

Is it more clear what is my proposal?

No reply.

Improve the default warning

Since I didn’t see much hope in fixing git pull’s default, for v5 I decided to drop everything except what fixes the most urgent issue: the warning on every single pull.

Junio objected to the name of a variable I used: default_mode. He said the word “mode” was overused, and he proposed a “more focused” word: he came up with rebase_unspecified. Additionally he didn’t want this variable to be global, even though there were already 38 global variables, so why did he feel that 39 was too many? I have no idea.

Moreover he made some comments on the test changes:

We are merely allowing fast-forward merges without unnecessary merge commits, but we are faced to merge c1 into c2, which is not ff. The command goes ahead and merges anyway, but why shouldn’t we be seeing the message? I am puzzled.

Now, for the most part I haven’t gone into code in this post, but I think it’s important you feel a little bit of my pain here.

git reset --hard c2 &&
test_config pull.ff true &&
git pull . c1 2>err &&
test_i18ngrep ! "Pulling without specifying how to reconcile" err

The name of the test gives a hint of what it’s trying to do: “pull.rebase not set and pull.ff=true”. As you can see there’s no pull.rebase configuration (or --rebase), but there is pull.ff=true. So clearly it has something to do with pull.ff.

What this test is doing is trying to pull c1 into c2, and since the history has diverged the merge cannot be a fast-forward, therefore the code should throw a warning…

except that only happens if the user has not specified any --ff* option (--ff, --no-ff, or --ff-only), either through the command line, or from the pull.ff configuration, and the test does set pull.ff=true. Therefore there’s no warning.

This is behavior Alex Henrie added and Junio accepted with no comment. In my opinion this behavior is wrong, but that’s what the code does at this point, and even Junio did not expect this, that’s why he was puzzled.

I explained to him that the code only checks for opt_ff, and I tried to fix that behavior in v1, v2, v3, and v4 of my patch series, but he did not comment on those patches.

So the tests used to do a fast-forward from c0 to c1, and that would normally throw a warning (as all pulls did), but because pull.ff=true is set, it doesn’t. This test written by Alex Henrie ensures that pull.ff=true doesn’t generate a warning. But after my patch doing a fast-forward from c0 to c1 would not generate a warning anyway, so the pull.ff=true doesn’t do anything, and therefore the test doesn’t do anything. That’s why I changed the test to start from c2, because then the merge is not fast-forward, and a warning is thrown unless pull.ff=true is set, which is what the test was supposed to be testing in the first place.

To show Junio that the tests are doing what they originally intended to do, I modified all the tests to remove the pull.ff configurations:

git reset --hard c2 &&
git pull . c1 2>err &&
test_i18ngrep ! "Pulling without specifying how to reconcile" err

Now the test fails.

Junio was not convinced. He stated that the original test was checking that the fast-forward would work fine, but that’s not true since from the test we can see that the only thing it’s doing is checking there’s no error message. The code could very well have done a real merge instead of a fast-forward, and the test would still pass. I explained to him that the tests that actually check for the fast-forward are in another file. This particular test is only about the warning message.

Junio claimed to know what the original author of the test intended to do and what he cared about, and that was to test that pulling c1 into c0 issued a message, except none of the tests actually did that, they checked the opposite: that the message is not there. So now that my patch skips the message, the tests pass, but for the wrong reason.

git reset --hard c0 &&
git pull . c1 2>err &&
test_i18ngrep ! "Pulling without specifying how to reconcile" err

If instead of starting from c2 we start from c0 as in the original code, then the pull is a fast-forward and there’s no message, so this test passes. But this test was supposed to be checking the output with pull.ff=true (that’s why it was called “pull.rebase not set and pull.ff=true”), and if you remove the configuration it still passes. So in the fast-forward case what is this test supposed to be doing if it cannot possibly fail now? Nothing.
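The whole disagreement about these tests boils down to one condition. Here is a small Python model of it, only as a sketch of the behavior described in this post and not the actual C code in git; the function and parameter names are invented for illustration.

```python
def warns_before_patch(rebase_set, ff_set):
    """Every pull warned unless pull.rebase/--rebase
    or any pull.ff/--ff* option was specified."""
    return not rebase_set and not ff_set


def warns_after_patch(rebase_set, ff_set, fast_forward):
    """With the patch, a pull that can fast-forward never warns."""
    return not fast_forward and warns_before_patch(rebase_set, ff_set)


# Original test (c0 -> c1, a fast-forward) before the patch:
# pull.ff=true is what silences the warning, so the test is meaningful.
original_meaningful = (
    warns_before_patch(False, ff_set=False)
    and not warns_before_patch(False, ff_set=True)
)

# Same test after the patch: no warning with or without pull.ff=true,
# so the test cannot fail anymore.
original_useless = (
    not warns_after_patch(False, ff_set=False, fast_forward=True)
    and not warns_after_patch(False, ff_set=True, fast_forward=True)
)

# Modified test (c2 -> c1, diverged) after the patch:
# pull.ff=true makes the difference again.
modified_meaningful = (
    warns_after_patch(False, ff_set=False, fast_forward=False)
    and not warns_after_patch(False, ff_set=True, fast_forward=False)
)
```

All three flags end up True: the original test only checked something before the patch, and starting from c2 restores what it was checking.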

I explained that, and in addition I said that I’m not in the habit of attempting to read minds; what the tests are trying to do is very clear from the code.

Junio replied.

You do not have to read mind. What is written in the test is clear enough: run “git pull --rebase . c1” into c0 and see what happens.

Yes, that was precisely my point: no need to read minds (although my reading of the code is different).

Do you feel my pain? I don’t mind going as many rounds as necessary to improve a patch series, even ten or more, but these rounds are improving nothing. The name of the variable doesn’t matter, because in a few patches later I remove it, and even these tests for the warning don’t matter because eventually we want to remove the warning completely.

And this is only one test, of one of the patches, of one of the simplest patch series.

But fine, if Junio wants to keep a bunch of tests that literally test nothing, I’ll oblige. For v6 I left the useless tests untouched and added the real ones separately.

Junio isn’t entirely happy

Junio sent v7 himself, stating he wasn’t happy with the way the --ff* options were handled. He suggested that these options should imply a merge, but I showed situations in which that would become tricky.

Additionally he wrote a patch to remove my global variable, and seized the opportunity to state:

There is no reason to make the code deliberately worse by ignoring improvement offered on the list.

Really? I did not ignore the comments, I responded to all of them, but I disagreed with some. And so if I disagree with any of Junio’s suggestions for improvement I’m making the code “deliberately worse”? If I have to do what he says, then those are not suggestions, those are orders. Not to mention the v7 he is sending contains the patches I sent in v6, which have the useless tests (a change he suggested), and even though I was 100% against it, I included it.

Anyway, since Junio didn’t want to discuss his proposal to imply merges any further, finally he merged this patch series.

After this patch the warning becomes much less annoying. So that’s progress.

What happened to the rest of the patches? Nothing.

Today

After this latest attempt to fix git pull, I distanced myself from the project for a few months, and then came back. I decided to resend some of my patches, but now separately, to minimize the possibility of them being rejected.

  1. Cleanups
  2. Documentation
  3. --merge

Of these only #1 was merged (though not yet in “master”). #2 generated yet another discussion of the semantics of “fast-forward” but nobody was against the changes themselves. #3 everyone is in favor, but no comment from Junio.

If you think any of these will eventually be merged you are more hopeful than I am. Also, there’s this comment from Junio (emphasis mine):

If some contributors get too annoying to be worth our time interacting with, it is OK to ignore them.

He seems to be referring to me.

So where does that leave us? The order of the parents is still wrong. And the default mode still isn’t fast-forward-only. Regardless of what you may think of my communication style, I’m the one that has actually sent patches to fix all these issues. And unlike Junio and many other git developers, I’m not being paid to work on this; I’m doing this altruistically (I don’t even use git pull myself). If I’m not the person to fix this, who is? I left the project for about 6 years and nobody fixed it.

Recently I challenged Alex Henrie to send a patch to try to fix the situation. He took me up on the challenge, and when I reviewed his patch he immediately realized that more work is needed (as I predicted he would find).

What to do now?

The final solution

While writing this post I realized just how many suggestions were proposed, but not implemented, so I decided to try once again to fix the situation, but now consider absolutely all the feedback in a holistic way.

Here’s the result.

git fast-forward

The documentation about what a fast-forward is, and how to attempt one, is scattered all over. Additionally the simplest way to attempt one is with git merge --ff-only which isn’t user-friendly, and some people don’t even consider a fast-forward to be a modifier of a merge (i.e. adverb).

By having a git fast-forward command we solve both problems, and fast-forward in the git context now becomes a verb: something independent of git merge.

Additionally there’s an advice for newcomers:

hint: Diverging branches can't be fast-forwarded, you need to either:
hint:
hint:   git merge
hint:
hint: or:
hint:
hint:   git rebase
hint:
hint: For more information check "git help fast-forward".
hint:
hint: Disable this message with "git config advice.diverging false"

Advanced users can disable this specific advice (or all advice).

As the advice says, anyone who wants to learn more about fast-forwards can simply do git help fast-forward.

git update

I converted my old tool from shell to C, and although it doesn’t yet have all the features, it does have plenty of functionality.

git update without arguments simply does git fetch + git fast-forward. If the fast-forward fails, we get the above advice. This solves the first problem: newcomers often do merges by mistake.
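What makes a fast-forward possible is simply that the current commit is an ancestor of the fetched one. A minimal Python sketch of that check follows; it is a model and not the C implementation, the helper names are invented, and the toy history assumes c1 and c2 were both created on top of c0, as in the tests discussed earlier.

```python
# Toy history: commit -> list of parents.
PARENTS = {"c0": [], "c1": ["c0"], "c2": ["c0"]}


def is_ancestor(parents, ancestor, commit):
    """Walk the history of `commit` looking for `ancestor`."""
    stack, seen = [commit], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


def fast_forward(parents, head, upstream):
    """Return the new head, or None if the branches have diverged."""
    if is_ancestor(parents, head, upstream):
        return upstream
    return None  # this is where the divergence advice would be shown
```

With this history fast_forward(PARENTS, "c0", "c1") moves the head to c1, while fast_forward(PARENTS, "c2", "c1") returns None because c2 and c1 have diverged.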

Also, it’s possible to do git update --merge (or configure update.mode=merge), and in those cases the order of the parents is reversed properly, as everyone wants (everyone that is not a maintainer).

There’s also git update --rebase, and also a per-branch configuration: branch.<name>.updateMode.

pull.mode

And of course all the other patches that improve git pull, including pull.mode=fast-forward, a better warning that shows that in the future git pull will do fast-forwards by default, along with a per-branch configuration branch.<name>.pullMode.

Also, thanks to git fast-forward we also get the same divergence advice, which will be permanent, but easy to turn off.

In total it’s 33 patches, and one additional to flip the default in the future (after users have had a chance to configure pull.mode to whatever they prefer).

The patches have been sent, but I wouldn’t hold my breath.

Sadly this is where the story ends: even though everyone has agreed on the path forward, “for some reason” the patches will not be merged, and new users will be forever doomed to keep making mistakes because git pull was not designed for them.

Wait a second…

Git is a distributed version control system. You are not forced to use Junio’s repository, you can use mine. Everything I’ve talked about is fixed there. If you are interested why not give it a try?

Note: Some people are thinking that the purpose of this article is to point fingers, but that could not be farther from the truth. The only reason names are named is to be able to follow the discussions (or at least attempt to). I am not seeking to assign any blame to any one person for any action or inaction. In any discussion disagreements are to be expected, and it doesn’t have to mean anything more than that.

Adventures with man color

As is usually the case with me, a simple fix sends me down an unending rabbit hole of complex issues. And this was no exception.

It all started when I tried to help the Git project to move towards asciidoctor, a program that generates documentation from text files using a markup language. The initial project was asciidoc, but it’s in a bit of a rot. The original asciidoc is written in Python (a language I detest), and the new asciidoctor in Ruby (a language I love), so I clearly saw an edge.

The problem starts with a feature asciidoctor has, but not asciidoc: generate man pages. Of course asciidoc can generate man pages, but it does so by first generating a docbook XML, and then docbook stylesheets can be used to convert those to man pages. The same can be done with asciidoctor, but additionally there’s an option to generate man pages directly.

Using docbook is very slow; generating man pages directly is way faster.

When I sent the initial patch (very trivial), a Git developer mentioned some “groff issues”. Apparently if your system uses groff (GNU troff), you are supposed to build git with GNU_ROFF=1:

Define GNU_ROFF if your target system uses GNU groff. This forces apostrophes to be ASCII so that cut&pasting examples to the shell will work.

It’s actually not “GNU groff”, but “GNU troff”, so this option is wrongly named, but regardless; virtually nobody is using it.

My Arch Linux system doesn’t build git with GNU_ROFF, it does use groff, and yet I don’t see any issues with apostrophes… So what’s going on?

The context

The original commit “Quote ‘ as \(aq in manpages” comes from 2009, and mentions the problem comes from “docbook/xmlto”, and apparently only affects groff. In the original thread “quote in help code example” you can see they mention the output of the git filter-branch command specifically, but when I look at that man page, it looks fine.

So I started to dig in, and for starters I didn’t even know what groff was.

I tried different versions of asciidoc; they worked fine. Then I tried different versions of docbook stylesheets; they worked fine. I even tried different versions of asciidoctor, to see when they fixed the issue; they all worked fine. Weird.

So I decided to compile groff myself and find the point where it started to fail. First I tried a version from two years ago, and it did not compile correctly, so I had to do some hacking to make it compile; after I did, everything worked fine. I continued going more and more into the past, fixing the compilation issues, and not finding any problem. I reached 2006, and still did not see any change.

This was strange. Clearly at some point in time, with some combination of tools there was a problem, but I couldn’t find either of those.

Then I decided to manually modify a man page, and put quotes directly… Bingo. I could see that ' was actually rendered as `, but what caused it?

I then moved forward in versions and to my surprise they all had this issue, even the most recent version of groff.

What on Earth is going on here?

While doing this I noticed something different from my compiled version of groff, and Arch Linux’s version. In the compiled version of groff I saw links rendered as blue. When I saw the generated man pages I saw \m[blue], but I assumed that was for some other kind of troff program or something totally unrelated. But no, here I was seeing blue, but not with all the groff binaries.

So I tried to build groff with the same options as Arch Linux… Still blue. After trying a few things I eventually found a difference: Arch Linux installs a file /usr/share/groff/site-tmac/man.local, if I remove that file the blue color returns. Inside the file there’s:

\" Shut off SGR by default (groff colors)
\" Require GROFF_SGR envvar defined to turn it on
.if '\V[GROFF_SGR]'' \
.  output x X tty: sgr 0

That’s it! If you export GROFF_SGR=1 on Arch Linux, you see man pages with colors, just like my compiled version. The reason my compiled version does this by default is that it doesn’t have Arch Linux’s man.local file.

GROFF_SGR

If you google GROFF_SGR you find that it’s not properly documented. Some distributions such as Debian and Arch Linux do disable groff’s colors, but they don’t document doing so. Debian “fixed it” by mentioning the variable in grotty’s man page. However, I don’t think most people are going to read grotty’s man page, not even a little bit, so that doesn’t help, even if you are running Debian, where it’s documented.

However, if you read the man page, you will find another variable: GROFF_NO_SGR. Unlike GROFF_SGR, this one is standard, and it’s respected in all distributions.

This reminded me of a trick I learned while reading Arch Linux’s installation guide:

man() {
    LESS_TERMCAP_md=$'\e[01;31m' \
    LESS_TERMCAP_me=$'\e[0m' \
    LESS_TERMCAP_so=$'\e[01;44;33m' \
    LESS_TERMCAP_se=$'\e[0m' \
    LESS_TERMCAP_us=$'\e[01;32m' \
    LESS_TERMCAP_ue=$'\e[0m' \
    command man "$@"
}

This code automatically converts parts of man pages to color (e.g. bold and underline), which looks much better than normal man pages, but it turns out it only works if you have groff’s SGR disabled, so… In other distributions you need to do GROFF_NO_SGR=1, for the above to work.

Cool. We found something.

Back to apostrophes

There’s something else in Arch Linux’s man.local:

.char \' \N'39'

This converts \' to ', instead of groff’s default: \(aa (acute accent: ´). This is the reason why I could not reproduce the problem: Arch Linux was hiding it. However, this is the wrong way of fixing it. There is a reason groff developers decided to pick \(aa; they know better. Distribution packagers should not be overriding this.

Why did they do this? The change came due to task FS#9643 – man PKGBUILD shows slanted single quotes. This happened in 2008, which suggests there was indeed an issue around that time, and pacman documentation was built with asciidoc too.

Arch Linux fixed the issue in the wrong way, though. Debian chose a different path. In their bug report #507673 Shouldn’t parse ‘ to \’ they discussed the issue at length, and they correctly identified that the issue was in docbook-xsl (not groff), and that if you are going to convert ' it should be to \(aq, but that would only work in groff. They also found that Pod::Man had a portable solution:

.ie \n(.g .ds Aq \(aq
.el .ds Aq '

This defines a string named Aq: it expands to \(aq when the program is groff, and to a plain ' in all other programs.

This is the correct solution in the correct layer, and generates the proper output everywhere.

But to check that it is the correct solution it would behoove us to understand what groff actually is. groff (or GNU troff) is a document formatting system; it receives text mixed with commands, and generates documents. A man page is just one of the many types of documents it can generate. It can for example generate a PDF.

So, let’s write a simple groff document:

.nf
single quotes: 'text'
single quoted quotes: \'text\'
apostrophe quote: \(aqtext\(aq
.fi

We can generate a PDF document using groff -T pdf test.groff > test.pdf. But this of course is not what we ultimately want, we want to generate a man page, in the same way as man does. To do that we need to specify the output device as utf8: groff -T utf8 test.groff > test.txt. This generates the following:

single quotes ': ’text’
single quoted quotes \': ´text´
apostrophe quote \(aq: 'text'

As you can see the output text is quite different from the input text; that’s what groff does. But this is only the case on a utf-8 system, if you specify the ascii output, then all the quotes above get translated to simply '.

The output with \(aq is correct in both utf-8 and ascii. And if we add the Aq alias:

apostrophe quote alias \*(Aq: 'text'

That is indeed correct. The Debian fix seems to work. To make sure we would need a non-GNU troff, like in Solaris, but alas, I don’t have access to something like that, so I’m just going to assume it works in other systems (as other people reported it did).
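To summarize the behavior found so far, here is a tiny Python table. It only encodes what is described in this post (it does not run groff), and the names RENDERED and aq_definition are made up for illustration.

```python
# How groff renders each input, per output device, as observed above.
RENDERED = {
    "utf8": {"'": "\u2019", "\\'": "\u00b4", "\\(aq": "'"},
    "ascii": {"'": "'", "\\'": "'", "\\(aq": "'"},
}


def aq_definition(is_groff):
    """What the portable Aq string expands to: \\(aq in groff,
    a plain apostrophe everywhere else."""
    return "\\(aq" if is_groff else "'"


# Inputs that come out as a plain apostrophe on every device.
safe_inputs = [
    source
    for source in RENDERED["utf8"]
    if all(RENDERED[device][source] == "'" for device in RENDERED)
]
```

The only entry left in safe_inputs is \(aq, which is exactly what the Aq string expands to under groff.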

This proper fix was eventually picked by docbook in 2010: Fixed bug #2412738 (apostrophe escaping) by applying the submitted patch. If you take a look at the code of git-filter-branch.1 you can see the fixed code in action:

git filter-branch --tree-filter \*(Aqrm filename\*(Aq HEAD

Therefore both Arch Linux’s and git’s workarounds are not necessary anymore. Yet they remain there ten years later.

Digging deeper

OK, so we found out what the issue was, and how it got fixed: in docbook, and also unnecessarily in git and Arch Linux (three different levels). But what caused it? Going back to groff from 2006 didn’t cause the issue, so what happened?

By looking back at docbook stylesheet’s history with git blame, I found out the commit that caused the issue: Reverted necessary escaping of backslash, dot, and dash. This commit happened in 2007, and it was made due to an internal limitation of docbook’s architecture.

So from 2007 to 2010 docbook stylesheets were generating wrong man pages, and different projects worked around the issue in different ways, but today (in 2021) these workarounds are not needed, and yet they remain in place.

Back to Git

After all this investigation I sent a patch to the Git project (doc: remove GNU troff workaround) to remove the GNU_ROFF option, which clearly had not been needed for at least ten years. But I also sent a comment about what I found regarding GROFF_SGR, and the trick to colorize man pages. In reply I received a suggestion to implement the LESS_TERMCAP trick into git help (which is basically an alias for man).

So I sent a patch (help: colorize man pages), and a big discussion cropped up (typical due to the bike-shed effect). In that thread it was mentioned: “why not let the user configure man to do this?”. The problem is that you have too many moving parts: groff, man, git, less, distribution configurations, environment variables, aliases, workarounds, docbook and asciidoc bugs… And of course the thing that started it all: asciidoctor.

But it made me think: what is indeed the best way to configure man to do this?

After several days of investigation, and several days of trying options, I arrived at what I think is the actual solution.

Solution

export MANPAGER="less -R --use-color -Dd+r -Du+b"
export MANROFFOPT="-c"

Unlike Arch Linux’s hack, the -D arguments to less are much more succinct, and they allow adding color (in addition to the style (e.g. underlined)), not removing information. So --color=d+r (long option for -D) converts d (Bold text) to r (red), and the + signifies add color (i.e. don’t remove the bold attribute). Moreover --use-color adds other colors to the less interface; the prompt is cyan, searches are in green, and warnings in yellow.
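To make the format of these flags explicit, here is a small Python sketch that assembles them. It is only an illustration: the helper names and the two lookup tables are made up, and the tables cover just the selectors and colors used in this post, assuming a less recent enough to support -D.

```python
# Subset of less's -D selector and color letters used in this post.
SELECTORS = {"bold": "d", "underline": "u"}
COLORS = {"red": "r", "blue": "b"}


def less_color_flags(mapping):
    """Build -D flags; the '+' keeps the original attribute
    (bold, underline) and adds the color on top."""
    return " ".join(
        f"-D{SELECTORS[attribute]}+{COLORS[color]}"
        for attribute, color in mapping.items()
    )


def manpager(mapping):
    """Full pager command, suitable as the value of MANPAGER."""
    return f"less -R --use-color {less_color_flags(mapping)}"
```

Calling manpager({"bold": "red", "underline": "blue"}) gives back the same string used for MANPAGER above.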

And instead of the ugly GROFF_SGR=1, we can tell man to pass -c to groff.

So the full command is:

groff -T utf8 -m man -c git-filter-branch.1 | less -R --use-color -Dd+r -Du+b

No man involved. This is way simpler… Why is nobody using this?

After I updated my patch and other people tested it, it became clear it didn’t always work. In particular older versions of less did not have the -D options (at least not for Linux). So I checked the history of less and I found out that they enabled -D for Linux systems in 2021.

No wonder everyone is still using the LESS_TERMCAP_* variables. Nobody knows of the new option, because it’s too new!

So the patch to remove Git’s GNU_ROFF option (an option that is totally unnecessary in 2021) is there. And so are the updated Arch Linux instructions to use the new -D flags of less.

If you want to properly colorize man pages, you do this:

export MANPAGER="less -R --use-color -Dd+r -Du+b"
export MANROFFOPT="-c"

If you want to colorize other similar documentation (like Ruby’s documentation):

export RI_PAGER="less -R --use-color -Dd+r -Du+b"

And so on. And if you want less to format everything:

export LESS='-RXF --use-color -Dd+r$Du+b'

That’s it. If you are running a recent enough version of less, everything works perfectly with a simple configuration.

Oh, also, I realized asciidoctor didn’t have the portable fix, so I sent them a patch that is now merged. I found an issue with less colors and searches that is fixed now. And I reported their unnecessary workaround to Arch Linux too.

The visual style of a programmer

Recently I heard a person say that we “geeks” don’t have a good sense of style, presumably because we typically wear very plain clothes (check pictures of Steve Jobs or Mark Zuckerberg). What I think many people don’t see is that we do have style, but where it matters: on our computer screens, not in our clothes.

Most of the time a programmer is staring at code, and the program that shows code and allows you to edit it properly is called a text editor.

This is vim, one of the most popular text editors for programmers with the default configuration.

By default it works, however, staring at these colors for hours gets tedious; I want better colors. Fortunately vim has the concept of “color schemes”, so you have several hundred themes to choose from.

After trying tons of those, I decided none were exactly what I wanted, so I decided to create my own.

Color theory

I have been choosing colors for websites for about 20 years, so I am familiar with the ways colors are programmed, but many people are not.

While sometimes you can tell a program “red” and it will use the right color, sometimes you need a slightly darker red, or something between orange and red. So in order to be perfectly specific, the most common system to tell a computer a color is called RGB (red, green, blue). In this system, red is 100%, 0%, 0% (100% of the red component), green would be 0%, 100%, 0%, and yellow (which is a combination of red and green), 100%, 100%, 0%.

But computers don’t naturally deal with percentages; they are digital, so they need concrete numbers, which is why 100% is translated to 255 (the maximum value of one byte), thus 50% would be 128. And they don’t even use the decimal system; they use binary, and the most convenient notation between decimal and binary is hexadecimal, in which 255 is “FF”. Just like in decimal 9 is the biggest digit (1 less than 10), in hexadecimal F is the biggest digit, representing 15 (1 less than 16).

So, red in hexadecimal RGB (the standard) is “FF0000”.
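The translation from percentages to hexadecimal is mechanical. A minimal Python sketch follows; the function names are made up for illustration, and halves are rounded up so that 50% becomes 128.

```python
def percent_to_byte(percent):
    """Scale 0-100% to 0-255, rounding halves up (50% -> 128)."""
    return int(percent * 255 / 100 + 0.5)


def rgb_percent_to_hex(red, green, blue):
    """Format three percentages as a six-digit hexadecimal color."""
    return "".join(f"{percent_to_byte(p):02X}" for p in (red, green, blue))
```

For example rgb_percent_to_hex(100, 0, 0) is "FF0000" (red) and rgb_percent_to_hex(100, 100, 0) is "FFFF00" (yellow).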

I can do the translation in my mind between many hexadecimals to their corresponding human colors, and do some alterations, like for example making an orange more red, or make a cyan darker, or less saturated.

This method of selecting colors has served me well for several years, and I have created aesthetically pleasing colors for many interfaces, but it’s always trial and error, and although the colors look OK, I could never be sure if they are precisely what I wanted.

For example if yellow is “FFFF00” (100% red and 100% green), I could figure out orange would be “FF8000” (50% green). But for more complicated colors, like a light red “FF8080”–where green is already halved–it’s not so clear how to combine it with a light yellow “FFFF80” where green is full, or how to make a purple that looks similar.

I wanted a programmatically precise method of generating the colors I wanted, and in order to do that I researched about color wheels and learned that in fact there’s many systems of colors, and many different color wheels.

What I wanted was a way to generate the RGB color wheel, but without using the RGB color model. It turns out there’s two alternate representations of the RGB model; HSL (hue, saturation, lightness) and HSV (hue, saturation, value). I was familiar with HSV, but it turns out HSL is the one that better serves my purposes.

In HSL red is 0°, 100%, 50%, yellow is 60°, 100%, 50%, orange is 30°, 100%, 50%; the saturation and lightness are not changing, only the hue. So now it’s clear how to generate the light orange, since light red is 0°, 100%, 75%, light yellow is 60°, 100%, 75%, so obviously light orange is 30°, 100%, 75%.

I can easily generate the full color wheel by simply changing the hue: red 0°, orange 30°, yellow 60°, chartreuse green 90°, green 120°, spring green 150°, cyan 180°, azure 210°, blue 240°, violet 270°, magenta 300°, rose 330°.
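This is easy to try out. Below is a short Python sketch using the standard colorsys module; the helper names are my own for illustration, and note that colorsys wants fractions between 0 and 1 in the order hue, lightness, saturation.

```python
import colorsys

WHEEL = [
    "red", "orange", "yellow", "chartreuse green", "green", "spring green",
    "cyan", "azure", "blue", "violet", "magenta", "rose",
]


def hsl_to_hex(hue_degrees, saturation_pct, lightness_pct):
    """Convert HSL (degrees, percent, percent) to hexadecimal RGB."""
    # colorsys uses the order (h, l, s), all as fractions of 1.
    channels = colorsys.hls_to_rgb(
        hue_degrees / 360, lightness_pct / 100, saturation_pct / 100
    )
    return "".join(f"{int(c * 255 + 0.5):02X}" for c in channels)


def color_wheel(saturation_pct=100, lightness_pct=50):
    """One color every 30 degrees; only the hue changes."""
    return {
        name: hsl_to_hex(30 * index, saturation_pct, lightness_pct)
        for index, name in enumerate(WHEEL)
    }
```

With this, hsl_to_hex(30, 100, 75) answers the earlier question about light orange: it comes out as "FFBF80", sitting between light red "FF8080" and light yellow "FFFF80".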

My color scheme

I have been using my own color scheme for about 10 years, but armed with my new-found knowledge, I updated the colors.

I cannot stress enough how incredibly different this looks to my eyes, especially after hours of programming.

These are the colors I ended up picking.

Is this not style?

If you are a programmer using vim, here’s my color scheme: felipec.

Font

But wait, there’s more. Colors are part of the equation, but not the whole. When reading so much text it’s important in what font that text is rendered.

Generally speaking there’s three kinds of typefaces, “serif”, “sans-serif”, and “monospace”. The kind virtually everyone uses for code is monospace, which looks like: this.

There’s tons of different monospace fonts, many created specifically to read code. In fact, there’s even sites that allow you to compare code in different programming languages with different fonts to see which one you like best, for example: Coding Fonts.

This is how I found my new favorite coding font: Input. Not only has the font been carefully designed, but it can be configured to accommodate different preferences, such as the shape of the letter “g”, which I decided to change. You can play with different preferences and preview how it looks in different languages (and in fact different vim color schemes).

This is what it looks like:

Probably most people don’t notice the difference between the DejaVu and Input fonts, but I do, and plenty of programmers do too, which is why these fonts were created in the first place.

There there

So there it is: just because most people don’t see it doesn’t mean there’s no there there.

Programmers do have style. It’s just that we care more about the color of a conditional than we do about the color of our shirt.

Why renaming Git’s master branch is a terrible idea

Back in May (in the inauspicious year of 2020) a thread titled “rename offensive terminology (master)” was started on the Git mailing list. It lasted for more than a month, and after hundreds of replies no clear ground was gained. The project took the path of least resistance (as you do), and the final patch to do the actual rename was sent today (November).

First things first. I’ve been a user of Git since 2005 (before 1.0) and a contributor since 2009, but I stopped being active and only recently started to follow the mailing list again, which is why I missed the big discussion. Just today I read the whole enchilada, and now I’m up-to-date.

The discussion revolved around five subjects:

  1. Adding a new configuration (init.defaultbranch)
  2. Should the name of the master branch be changed?
  3. Best alternative name for the master branch
  4. Culture war
  5. The impact to users

I already sent my objection, along with my rationale for why I think the most important point–the impact to users–was not discussed enough, and in fact was barely touched.

In my opinion the whole discussion was a mess of smoke screen after smoke screen, and it never touched the only really important point: users. I’m going to tackle each subject separately, leaving the most important one for the end, but first I would like to address the actual context and some of the most obvious fallacies people left on the table.

The context

It’s not a coincidence that nobody found the term problematic for 15 years, and suddenly at the height of wokeness–2020 (the year of George Floyd, BLM/ANTIFA uprising, and so on)–it magically becomes an issue. This is a solution looking for a problem, not an actual problem, and it appeared precisely at the same time the Masters Tournament received attention for its name. The Masters, being more renowned than Git, certainly got more attention from the press, and plenty of articles have been written explaining why it makes no sense to link the word “masters” to slavery in 2020 in this context, even though the tournament’s history does have some uncomfortable relationship with racism (No, the masters does not need renaming; Masters Name Offensive? Who Says That?; Will Masters Be Renamed Due to BLM Movement? Odds Favor “No” at -2500; Calls for The Masters to change its name over ‘slave’ connotations at Augusta). Few are betting on The Masters actually changing its name.

For more woke debates, take a look at the 2 + 2 = 5 debate (also in 2020).

The obvious fallacies

The most obvious fallacy is “others are doing it”. Does it have to be said? Just because all your friends are jumping off a cliff doesn’t mean you should too. Yes, other projects are doing it; that doesn’t mean they have good reasons for it. This is the bandwagon fallacy (argumentum ad populum).

The second one comes straight out of the title: “offensive terminology”. This is a rhetorical technique called loaded language; “what kind of person has to deny beating his wife?”, or “why do you object to the USA bringing democracy to Iraq?”. Before the debate even begins you have already poisoned the well (another fallacy), and now it’s an uphill battle for your opponents (if they don’t notice what you are doing). It’s trying to smuggle a premise into the argument without anyone noticing.

Most people in the thread started arguing why it’s not offensive, while the onus was on the other side to prove that it was offensive. They had the burden of proof, and they inconspicuously shifted it.

If somebody starts a debate accusing you of racism, you already lost, especially if you try to defend yourself.

Sorry progressives, the word “master” is not “offensive terminology”. That’s what you have to prove. “What kind of project defends offensive terminology?” is not an argument.

Adding a new configuration

This one is easy. There was no valid reason not to add a new configuration. In fact, people already had configurations that changed the default branch. Choice is good, this configuration was about making it easier to do what people were already doing.
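For illustration only, this is roughly what setting that configuration looks like when scripted from Python. The setting itself, `init.defaultBranch`, exists as of Git 2.28; the helper names below are hypothetical, and `set_default_branch` only does anything if `git` is installed.

```python
import subprocess


def default_branch_command(name, scope="--global"):
    """Build the git command that sets the default branch for new repositories."""
    # Git treats this key case-insensitively, so init.defaultbranch works too.
    return ["git", "config", scope, "init.defaultBranch", name]


def set_default_branch(name):
    """Run the command; older Git versions store the value but ignore it."""
    subprocess.run(default_branch_command(name), check=True)


if __name__ == "__main__":
    # Print the equivalent shell command without touching any configuration.
    print(" ".join(default_branch_command("master")))
```

Anyone who prefers another name passes it instead of "master"; the point is that the choice lives in the user’s own configuration.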

The curious thing is that the only times the configuration was brought up in the thread, it was as part of a diversion tactic called motte and bailey.

What they started with was a change of the default branch, a proposition that was hard to defend (bailey), and when opponents put on enough pressure they retreated to the more defensible one (motte): “why are you against a configuration?”

No, nobody was against adding a new configuration, what people were against was changing the default configuration.

Should the name of the master branch be changed?

This was the crux of the matter, so it makes sense that this is where most of the time debating was spent. Except it wasn’t.

People immediately jumped to the next point, which is what a good name for the default branch would be, but first it should have been determined that changing the default is desirable, and that was never established.

You don’t just start discussing with your partner what color of apartment to choose. First, your girlfriend (or whatever) has to agree to live together!

Virtually any decision has to be weighed in terms of pros and cons, and they never considered the cons, nor established any real pro.

Pro

If the word “master” is indeed offensive, then it would be something positive to change it. But this was never established to be the case, it was just assumed so. Some arguments were indeed presented, but they were never truly discussed.

The argument was that in the past (when slavery was a thing), masters were a bad thing, because they owned slaves, and the word still has that bad connotation.

That’s it. This is barely an argument.

Not only is it very tenuously relevant in the present moment, but it’s not actually necessarily true. Slavery was an institution, and masters simply played a role, they were not inherently good or bad. Just because George Washington was a slave owner, that doesn’t mean he was a monster, nor does it mean the word “master” had any negative connotation back then. It is an assumption we are making in the present, and even if it turns out to be true, it’s still an assumption.

This is called presentism. It’s really hard for us to imagine the past because we didn’t live it. When we judge it we usually judge it wrong because we have a modern bias. How good or bad masters were really viewed by their subjects is a matter for debate, but not in a software project.

Note: A lot of people misunderstood this point. To make it crystal clear: slavery was bad. The meaning of the word “master” back then is a different issue.

Even supposing that “master” was really a bad word in times of slavery (something that hasn’t been established), with no other meaning (which we know isn’t true), this has no bearing on the modern world.

Prescriptivism

Something many people misunderstand about language is the difference between the prescriptive and the descriptive approach. In prescriptivism words are dictated (how they ought to be used). In descriptivism words are simply described (how they are actually used). Dictionaries can be found in both camps, but they are mainly on the descriptive side (especially the good ones).

This misunderstanding is the reason why many people think (wrongly) that the word “literally” should not mean “virtually” (even though many people use it this way today). This is prescriptivism, and it doesn’t work. Words change meaning. For example, the word “cute” meant “sharp” in the past, but it slowly changed meaning, much to the dismay of prescriptivists. It does not matter how much prescriptivists kick and scream; the masses are the ones that dictate the meaning of words.

So it does not matter what you–or anyone–thinks, today the word “literally” means “virtually”. Good dictionaries simply describe the current use, they don’t fight it (i.e. prescribe against it).

You can choose how you use words (if you think literally should not mean virtually, you are free to not use it that way). But you cannot choose how others use language (others decide how they use it). In other words, you cannot prescribe language; it doesn’t matter how hard you try, you can’t fight everyone.

Language evolves on its own, and like democracy, it’s dictated by the masses.

So, what do the masses say about the word “master”? According to my favorite dictionary (Merriam-Webster):

  1. A male teacher
  2. A person holding an academic degree higher than a bachelor’s but lower than a doctor’s
  3. The degree itself (of above)
  4. A revered religious leader
  5. A worker or artisan qualified to teach apprentices
  6. An artist, performer, or player of consummate skill
  7. A great figure of the past whose work serves as a model or ideal
  8. One having authority over another
  9. One that conquers or masters
  10. One having control
  11. An owner especially of a slave or animal
  12. The employer especially of a servant
  13. A presiding officer in an institution or society
  14. Any of several officers of court appointed to assist a judge
  15. A master mechanism or device
  16. An original from which copies can be made

These are not all the meanings, just the noun meanings I found relevant to today, and the world in general.

Yes, there is one meaning which has a negative connotation, but so does the word “shit”, and being Mexican, I don’t get offended when somebody says “Mexico is the shit”.

So no, there’s nothing inherently bad about the word “master” in the present. Like all words: it depends on the context.

By following this rationale the word “get” can be offensive too; one of the definitions is “to leave immediately”. If you shout “get!” to a subordinate, that might be considered offensive (and with good reason)–especially if this person is a discriminated minority. Does that mean we should ban the word “get” completely? No, that would be absurd.

Also, there’s another close word that can be considered offensive: git.

Prescriptivists would not care how the word is actually used today, all they care about is to dictate how the word should be used (in their opinion).

But as we saw above, that’s not how language works.

People will decide how they want to use the word “master”. And thanks to the new configuration “init.defaultbranch”, they can decide how not to use that word.

If and when the masses of Git users decide (democratically) to shift away from the word “master”, that’s when the Git project should consider changing the default, not before, and certainly not in a prescriptive way.

Moreover, today the term is used in a variety of contexts that are unlikely to change any time soon (regardless of how much prescriptivists complain):

  1. An important room (master bedroom)
  2. An important key (master key)
  3. Recording (master record)
  4. An expert in a skill (a chess master)
  5. The process of becoming an expert (mastering German)
  6. An academic degree (Master of Economics)
  7. A largely useless thing (Master of Business Administration [MBA])
  8. Golf tournaments (Masters Tournament [The Masters])
  9. Famous classes by famous experts (MasterClass Online Classes)
  10. Online tournament (Intel Extreme Masters [IEM])
  11. US Navy rank (Master-at-Arms [MA])
  12. Senior member of a university (Master of Trinity College)
  13. Official host of a ceremony (master of ceremonies [MC])
  14. Popular characters (Jedi Master Yoda)
  15. A title in a popular game (Dungeon Master)
  16. An important order (Grand Master)
  17. Vague term (Zen master)
  18. Stephen Hawking (Master of the Universe)

And many, many more.

All these are current uses of the word, not to mention the popular BDSM context, where having a master is not a bad thing at all.

Subjectiveness

Even if we suppose that the word is “bad” (which it is not), changing it does not solve the problem, it merely shuffles it around. This notion is called language creep (also concept creep). First there’s the n-word (which I don’t feel comfortable repeating, for obvious reasons), then there was another variation (which ends in ‘o’, I can’t repeat either), then there was plain “black”, but even that was offensive, so they invented the bullshit term African-American (even for people that are neither African, nor American, like British blacks). It never ends.

This is very well exemplified in the show Orange Is The New Black where a guard corrects another guard for using the term “bitches”, since that term is derogatory towards women. The politically correct term now is “poochies”, he argues, and then proceeds to say: “these fucking poochies”.

Words are neither good nor bad; it’s how you use them that makes them so.

You can say “I love you bitches” in a positive way, and “these fucking women make me vomit” in a completely derogatory way.

George Carlin became famous in 1972 for simply stating seven words he was forbidden from using, and he did so in a completely positive way.

So no. Even if the word “master” was “bad”, that doesn’t mean it’s always bad.

But supposing it’s always bad, who are the victims of this language crime? Presumably it’s black people, possibly descended from slaves, who actually had masters. Do all black people find this word offensive? No.

I’m Mexican, do I get offended when somebody uses the word “beaner”? No. Being offended is a choice. Just like nobody can make you angry (you are the one who gets angry), nobody inflicts offense on other people; it’s the choice of the recipients. There’s people with all the reason in the world, who don’t get offended, and people that have no reason, and yet they get easily offended. It’s all subjective.

Steve Hughes has a great bit explaining why nothing happens when you get offended. So what? Be offended. Being offended is part of living in a society. Every time you go out the door you risk being offended, and if you can’t deal with that, then don’t interact with other people. It’s that simple.

Collective Munchausen by proxy

But fine, let’s say for the sake of argument that “master” is a bad word, even in modern times, in any context, and the people that get offended by it have all the justification in the world (none of which is true). How many of these concerned offended users participated in the discussion?

Zero.

That’s right. Not one single person of African descent (or whatever term you want to use) complained.

What we got instead were complainers by proxy; people who get offended on behalf of other (possibly non-existent) people.

Gad Saad coined the term Collective Munchausen by proxy, which explains the irrationality of modern times. He borrows from the established disorder called Munchausen Syndrome by Proxy.

So you see, Munchausen is when you feign illness to gain attention. Munchausen by proxy is when you feign the illness of somebody else to gain attention towards you. Collective Munchausen is when a group of people feign illness. And collective Munchausen by proxy is when a group of people feign the illness of another group of people.

If you check the mugshots of BLM activists arrested, most of them are actually white. Just like the people pushing for the rename (all white), they are being offended by proxy.

Black people did not ask for this master rename (and many probably don’t appreciate the destruction of their businesses in riots either).

Another example is the huge backlash J. K. Rowling received for some supposedly transphobic remarks, but the people that complained were not transgender, they were professional complainers that did so by proxy. What many people in the actual transgender community said–like Blaire White–is that this was not a real issue.

So why on Earth would a group of people complain about an issue that doesn’t affect them directly, but according to them it affects another group of people? Well, we know it has nothing to do with the supposed target victim: black people, and everything to do with themselves: they want to win progressive points, and are desperate to be “on the right side of history”.

It’s all about them.

The careful observer probably has already noticed this: there are no pros.

Cons

Let’s start with the obvious one: it’s a lot of work. This is the first thing proponents of the change noticed, but it wasn’t such a big issue since they themselves offered to do the work. However, I don’t think they gauged the magnitude of the task, since just changing the relevant line of code basically breaks all the tests.

The tests are done now, but all the documentation still needs to be updated. Not only the documentation of the project, but the online documentation too, and the Pro Git book, and plenty of documentation scattered around the web, etc. Sure, a lot of this doesn’t fall under the purview of Git developers, but it’s something that somebody has to do.

Then we have the people that are not subscribed to the mailing list and are completely unaware that this change is coming, and from one day to the next they update Git and they find out there’s no master branch when they create a new repository.

I call these the “silent majority”. The vast majority of Git users could not tell you the last Release Notes they read (probably because they haven’t read any). All they care about is that Git continues to work today as it did yesterday.

The silent majority doesn’t say anything when Git does what it’s supposed to do, but oh boy do they complain when it doesn’t.

This is precisely what happened in 2008, when Git 1.6.0 was released, and suddenly all the git-foo commands disappeared. Not only did end-users complain, but so did administrators in big companies, and distribution maintainers.

This is something any project committed to its user-base should try to avoid.

And this is a limited list, there’s a lot more that could go wrong, like broken scripts, broken automated testing in other projects, and many many more.

So, on one side of the balance we have a ton of problems, and on the other: zero benefits. Oh boy, such a tough choice.

Best alternative name for the master branch

Since people didn’t really discuss the previous subject, and went straight to the choice of name, this is where they spent a lot of the time, but this is also the part where I paid the least attention, since I don’t think it’s interesting.

Initially I thought “main” was a fine replacement for “master”. If you had to choose a new name, “main” makes more sense, since “master” has a lot of implications other than the most important branch.

But then I started to read the arguments about different names, and really think about it, and I changed my mind.

If you think in terms of a single repository, then “main” certainly makes sense; it’s just the principal branch. However, the point of Git is that it’s distributed, there’s always many repositories with multiple branches, and you can’t have multiple “main” branches.

In theory every repository is as important as another, but in practice that’s not what happens. Humans–like pretty much all social animals–organize themselves in hierarchies, and in hierarchies there’s always someone at the top. My repository is not as important as the one of Junio (the maintainer).

So what happens is that my master branch continuously keeps track of Junio’s master branch, and I’d venture to say the same happens for pretty much all developers.

The crucial thing is what happens at the start of the development; you clone a repository. If somebody made a clone of you, I doubt you would consider your clone just as important as you. No, you are the original, you are the reference, you are the master copy.

The specific meaning in this context is:

an original from which copies can be made

Merriam-Webster

In this context it has absolutely nothing to do with master/slaves. The opposite of a master branch is either a descendant (most branches), or an orphan (in rare cases).

The word “main” may describe correctly a special branch among a bunch of flat branches, but not the hierarchical nature of branches and distributed repositories of clones of clones.

The name “master” fits like a glove.

Culture war

This was the other topic where a lot of time was spent.

I don’t want to spend too much time on this topic myself–even though it’s the one I’m most familiar with–because I think it’s something in 2020 most people are faced with already in their own work, family, or even romantic relationships. So I’d venture to say most people are tired of it.

All I want to say is that in this war I see three clear factions. The progressives, who are in favor of ANTIFA, BLM, inclusive language, have he/him in bio, use terms like anti-racism, or intersectional feminism, and want to be “on the right side of history”. The anti-progressives, who are pretty much against the progressives in all shapes or forms, usually conservatives, but not necessarily so. But finally we have the vast majority of people who don’t care about these things.

The problem is that the progressives are trying to push society into really unhealthy directions, such as blasphemy laws, essentially destroying the most fundamental values of modern western society, like freedom of speech.

The vast majority of people remain silent, because they don’t want to deal with this obvious nonsense, but eventually they will have to speak up, because these dangerous ideologies are creeping up everywhere.

For more about the subject I can’t recommend enough Gad Saad’s new book: The Parasitic Mind: How Infectious Ideas Are Killing Common Sense.

It really is a parasitic mindset, and sensible people must put a stop to it.

Update: The topic has been so controversial that as a result of this post reddit’s r/git decided to ban the topic completely, and remove the post. Hacker News also banned this post.

The impact to users

I already touched on this in the cons of the name change, but what I didn’t address are the mitigation strategies that could be employed.

For any change there’s good and bad ways of going about it.

Even if the change from “master” to “main” was good and desirable (which it isn’t), simply jumping to it in the next version (Git 2.30) is the absolute worst way of doing it.

And this is precisely what the current patch is advancing.

I already briefly explained what happened in 2008 with the v1.6.0 release, but what I find most interesting is that, looking back at those threads, many of the arguments about how not to do a big change apply here in exactly the same way.

Back then what most people complained about was not the change itself (from git-foo to “git foo”, which they considered to be arbitrary), but mainly the manner in which the change was done.

The main thing is that there was no deprecation period, and no clear warning. This lesson was learned, and the jump to Git 2.0 was much smoother precisely because of the warnings and period of adjustment, along with clear communication from the development team about what to expect.

This is not what is being done for the master branch rename.

I also find what I told Linus Torvalds very relevant:

What other projects do is make very visible when something is deprecated, like a big, annoying, unbearable warning. Next time you deprecated a command it might be a good idea to add the warning each time the command is used, and obsolete it later on.

Also, if it’s a big change like this git- stuff, then do a major version bump.

If you had marked 1.6 as 2.0, and added warnings when you deprecated the git-foo stuff then the users would have no excuse. It would have been obvious and this huge thread would have been avoided.

I doubt anyone listened to my suggestion, but they did this for 2.0, and it worked.
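A minimal sketch of that kind of warning, written in Python purely for illustration (Git itself is mostly C, and this is not its actual code). The function name `resolve_command` and the `git-foo` command are hypothetical. The old dashed form keeps working during the deprecation period, but every use produces a warning, so the eventual removal surprises nobody.

```python
import sys


def resolve_command(argv):
    """Translate a deprecated 'git-foo' invocation into 'git foo'.

    Returns a (new_argv, warning) pair; warning is None when the
    invocation was not using the deprecated dashed form.
    """
    program, args = argv[0], list(argv[1:])
    prefix = "git-"
    if program.startswith(prefix) and len(program) > len(prefix):
        subcommand = program[len(prefix):]
        warning = (
            f"warning: '{program}' is deprecated and will be removed; "
            f"use 'git {subcommand}' instead"
        )
        return ["git", subcommand] + args, warning
    return [program] + args, None


if __name__ == "__main__":
    new_argv, warning = resolve_command(["git-foo", "--bar"])
    if warning:
        # Warn on every single use, not just the first one.
        print(warning, file=sys.stderr)
    print(" ".join(new_argv))
```

Only after a release or two of this would the dashed form actually be removed, ideally together with a major version bump.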

I like to refer to a panel Linus Torvalds participated in regarding the importance of users (educating Lennart Poettering). I consider this an explanation of the first principles of software: the main purpose of software is that it’s useful to users, and that it continues to be useful as it moves forward.

“Any time a program breaks the user experience, to me that is the absolute worst failure that a software project can make.”

Linus Torvalds

Now it’s the same mistake of not warning the users of the upcoming change, except this time it’s much worse, since there’s absolutely no good reason for the change.

The Git project is simply another victim of the parasitic mindset that is infecting our culture. It’s being held hostage by a tiny number of people pushing for a change nobody else wants, that would benefit no one, that would negatively affect everyone, and they want to do it in a way that maximizes the potential harm.

If I were a betting man, my money would be on the users complaining about this change when it hits them in the face with no previous warning.