Is ‘git pull’ broken? If so, what’s the fix?

Is ‘git pull’ really broken? I know what you are thinking; such a pervasive and basic command cannot possibly be broken. Unfortunately, it is.

This is not some marginal issue. Many experienced Git users avoid ‘git pull’ and urge newcomers to avoid it too, many sites encourage you not to use the command, and there have been a lot of threads on the mailing list about the issue (Pull is mostly evil, A failing attempt to use Git in a centralized environment). The maintainer, Junio C Hamano, has accepted there’s a big problem, and even Linus Torvalds agreed something needs to change.

In order to identify the problem we first need to define the two main ways ‘git pull’ is used.

Pull requests

One way ‘git pull’ is used is to integrate pull requests into the mainline. For example, in the Linux kernel, the DRM maintainer sends a pull request to Linus Torvalds, saying basically:

The following changes are available in the git repository at:

git://people.freedesktop.org/~airlied/linux drm-next

So Linus can just do:

git pull git://people.freedesktop.org/~airlied/linux drm-next

In this mode ‘git pull’ actually works fine, which is not too surprising, since it’s the main thing Linus Torvalds does.

However, this is not the way most people use ‘git pull’.

Update branch

What most people do is, for example, update their local ‘master’ branch to match the remote ‘origin/master’ branch, which essentially amounts to ‘git fetch origin’ followed by ‘git merge origin/master’.
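As a rough illustration of the simple case, the sketch below builds a throwaway sandbox and runs the two steps by hand when the local branch has no commits of its own. The repository names ‘upstream’ and ‘local’, the demo identity, and the commit messages are assumptions made up for this example.

```shell
set -e
work=$(mktemp -d) && cd "$work"
# Ignore personal Git configuration and use a throwaway identity (all hypothetical).
export HOME="$work" XDG_CONFIG_HOME="$work/.config" GIT_CONFIG_NOSYSTEM=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "upstream" plays the role of the remote; "local" is a clone of it.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream commit -q --allow-empty -m "initial"
git clone -q upstream local

# Only the remote side moves forward.
git -C upstream commit -q --allow-empty -m "remote change"

cd local
git fetch -q origin            # step 1: update origin/master
git merge -q origin/master     # step 2: no local commits, so this fast-forwards

local_tip=$(git rev-parse HEAD)
remote_tip=$(git rev-parse origin/master)
commit_count=$(git rev-list --count HEAD)
echo "master is at $local_tip, origin/master is at $remote_tip"
```

With nothing to reconcile, ‘master’ simply moves to the same commit as ‘origin/master’ and no merge commit is created.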

However, that’s not exactly what most people actually want to do.

If you don’t have any changes of your own in ‘master’, then yes, ‘git pull’ does what you want. But if you do have changes, and thus the branches have diverged, ‘git pull’ will create a new merge commit. This might or might not be what you want, but the majority of Git newbies do not want that, or rather, the team they contribute to doesn’t want those “evil merges”. Unfortunately these newbies don’t know what they are doing, and Git is not making it easier.

So you end up with something like this:

[Figure: git-pull]
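The diverged case can be reproduced in a sandbox. This is only a sketch: the repository names, identity and commit messages are hypothetical, and it spells out the fetch and merge steps instead of calling ‘git pull’, whose defaults depend on the Git version and configuration.

```shell
set -e
work=$(mktemp -d) && cd "$work"
# Ignore personal Git configuration and use a throwaway identity (all hypothetical).
export HOME="$work" XDG_CONFIG_HOME="$work/.config" GIT_CONFIG_NOSYSTEM=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream commit -q --allow-empty -m "initial"
git clone -q upstream local

# Both sides add a commit, so the two histories diverge.
git -C upstream commit -q --allow-empty -m "remote change"
git -C local commit -q --allow-empty -m "local change"

cd local
git fetch -q origin
git merge -q --no-edit origin/master   # creates a merge commit on master

# Parent order: the local work comes first, the remote branch second.
first_parent=$(git log -1 --format=%s HEAD^1)
second_parent=$(git log -1 --format=%s HEAD^2)
echo "first parent:  $first_parent"
echo "second parent: $second_parent"
```

The remote history ends up as the second parent, so it looks as if the mainline had been merged into the contributor's work.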

Most likely what the team wants is that the local changes are rebased on top of the remote ones. But if they do want a merge, they want it the other way around, that is: the local changes merged into the remote ones, as if a topic branch was merged.

[Figure: git-pull-fix]

A merge with this order of parents has many advantages, including a clearer history. However, it’s not possible to do that with ‘git pull’, so you have to do ‘git fetch’, create a new branch, switch to the master branch, merge the other branch, and finally remove the other branch. It’s not straightforward at all.
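One possible reading of those manual steps is sketched below in a sandbox. The temporary branch name ‘tmp’, the repository names and the commit messages are hypothetical, and the ‘git reset --hard’ step assumes a clean working tree.

```shell
set -e
work=$(mktemp -d) && cd "$work"
# Ignore personal Git configuration and use a throwaway identity (all hypothetical).
export HOME="$work" XDG_CONFIG_HOME="$work/.config" GIT_CONFIG_NOSYSTEM=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream commit -q --allow-empty -m "initial"
git clone -q upstream local
git -C upstream commit -q --allow-empty -m "remote change"
git -C local commit -q --allow-empty -m "local change"

cd local
git fetch -q origin
git branch tmp                       # park the local commits on a temporary branch
git reset -q --hard origin/master    # move master to the remote tip (discards uncommitted changes)
git merge -q --no-edit tmp           # merge the local work as if it were a topic branch
git branch -q -d tmp                 # remove the temporary branch

# Parent order is now reversed: remote history first, local work second.
first_parent=$(git log -1 --format=%s HEAD^1)
second_parent=$(git log -1 --format=%s HEAD^2)
echo "first parent:  $first_parent"
echo "second parent: $second_parent"
```

Four commands after the fetch, one of them destructive, is a lot to ask of someone who only wanted to update a branch.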

It is this mode that is broken, and that’s the reason many people try to avoid ‘git pull’; it rarely does what you want by default.

The solution

Many solutions have been proposed. However, there are a great many use-cases to consider, and a solution that takes them all into consideration for the future is not easy to find.

The best solution that seems to accommodate all present use-cases and future ones is the introduction of a new command: ‘git update’.

By default this command will complain if the branches have diverged, so you have to do either ‘git update --rebase’ or ‘git update --merge’. This ensures that newbies aren’t going to do “evil merges” by mistake.
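The ‘git update’ command itself is not part of stock Git, but the “complain on divergence” default can be loosely approximated with ‘git merge --ff-only’. The sketch below is only that approximation, not the proposed command; the sandbox names and messages are hypothetical.

```shell
set -e
work=$(mktemp -d) && cd "$work"
# Ignore personal Git configuration and use a throwaway identity (all hypothetical).
export HOME="$work" XDG_CONFIG_HOME="$work/.config" GIT_CONFIG_NOSYSTEM=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream commit -q --allow-empty -m "initial"
git clone -q upstream local
git -C upstream commit -q --allow-empty -m "remote change"
git -C local commit -q --allow-empty -m "local change"

cd local
git fetch -q origin
# --ff-only refuses to do anything unless master can simply be moved forward.
if git merge -q --ff-only origin/master 2>/dev/null; then
    result="fast-forwarded"
else
    result="diverged: choose a rebase or a merge explicitly"
fi
echo "$result"
head_subject=$(git log -1 --format=%s)
```

Because the branches have diverged here, nothing is merged and ‘master’ stays where it was, leaving the choice between a rebase and a merge to the user.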

Also, when you do a ‘git update --merge’, the order of the parents is reversed: it appears you are merging ‘master’ into ‘origin/master’, and not the other way around as happens with ‘git pull’. In other words, it appears as if you are merging a topic branch, which is what most people want.

[Figure: git-update]

There are many more advantages to this new command, but they are probably too subtle to mention in this post.

When will this be ready?

Probably never. I sent a summary of the issues and the solution to the mailing list, which addresses all the use-cases that were discussed. I have the required patches with tests and documentation on my personal branch, and I’ve been using this new command for a while now.

Why hasn’t this been picked up? Maybe it’s because none of the core developers experience these issues. Maybe because they don’t use ‘git pull’ in the second form. Who knows.

The fact is that there is no interest in getting this fixed, even though the issue has been acknowledged, so it’s not likely to be fixed any time soon.

So what can you do about it? The best thing you can do right now is simply avoid using ‘git pull’. Additionally, you might want to instruct your coworkers to avoid using it as well, especially the ones who are not very familiar with Git.

Also, you might want to use my fork, git-fc, which does have the ‘git update’ command. It works better than ‘git pull’ even when there’s no branch divergence, and when there is, ‘git update --merge’ is also superior, because the order of the parents is right.

12 thoughts on “Is ‘git pull’ broken? If so, what’s the fix?”

  1. Doesn’t git pull --rebase do exactly this? I know it’s more a workaround than a solution, and maybe I just miss some points (which may be because I hardly use git pull at all).

  2. @gergelypolonkai Yes, ‘git pull --rebase’ works fine, but sometimes you want a merge, not a rebase. Also, Git newbies don’t run that command, so they constantly keep making “evil merges”. Unless you think ‘git pull’ without arguments should do a rebase by default instead, I don’t see how that’s relevant.

  3. @Gergely Polonkai The problem with that is that a rebase is harder to understand, and indeed many newcomers don’t know about ‘git rebase’, how to resolve conflicts and continue, and so on. Many people on the mailing list objected to this possibility.

    Even if that were the default, sometimes you would want to do ‘git pull --merge’ anyway, and you still want the order of the parents reversed.

  4. It is not a good practice anyway for all of the numerous developers to push directly to master. Even if they are allowed to update master without prior acceptance, your “git update --merge” will help only sometimes. If, while the developer is running his merge, somebody else pushes something more to master, the developer will use the same command to merge again, and the result will be incorrect regardless of what the parent order is in those merges.

    If there is really an intention to give people the experience of a centralised VCS, there should be a command which merges the developer’s master into the server’s master, and updates it _on server_, and does it all in one transaction, otherwise doesn’t change anything.

  5. We could add a config setting for “git pull” behavior. Maybe even two – the branch order and the default behavior, including “ask” or “none”. git could change the default in a future release. The same user may be upstream in one project and downstream in another, so per-project settings should also be available.

  6. Hi Felipe,

    the way I see it the point is not about avoiding “git pull” altogether, or replacing it with another command, but it’s rather about educating to restrict its use; maybe that’s the reason why most (git) core developers don’t consider the issues you are pointing out as grave: updating _automatically_ a branch which diverged from the upstream will cause problems to someone anyway.

    For example, I think I never experienced those issues with “pull” because I just happen to have some discipline:

    – always work in a local temporary branch;

    – pull only into a _clean_ downstream branch and rebase and/or merge the local temporary branch into that just before pushing upstream.

    In other words by using “git pull” in a “monotonic” fashion the window of conflicts reduces basically to the same of concurrent pushes. (the comment by @max630 hints at that too).

    You know, the git way is full of “mantras”, e.g.:

    – Commits are cheap

    – Use topic branches

    – Never rebase a public branch

    Let’s add:

    – Never pull into a non-ff-able branch

    Your “update” command seems to enforce --ff-only by default which is good, but IMHO the automatic merge should be avoided in principle.

    BTW do you happen to know why the “git pull --ff-only as default” patch was never merged?

    Ciao,
    Antonio

  7. Isn’t
    git config --global merge.ff only
    roughly an equivalent of what we want?

    @max630 It’s not up to you to decide how Git should be used. Git accepts all use-cases, and people certainly can merge and push to master.

    @Pavel Roskin The problem is that no configuration suits everybody and all the time. I can see people wanting to use both behaviors, depending on whether they are integrating a remote branch, or updating a local one.

    @ao2 Yes, I also have that discipline, however, not all people do. Git supports many workflows, and “I don’t use that workflow” shouldn’t be an excuse not to fix the workflows of others.

    BTW do you happen to know why the “git pull --ff-only as default” patch was never merged?

    Probably because it was written by me. But either way I also gave up on that approach; we need a way to reverse the parents, and there’s no natural way of doing that with ‘git pull’.

    @matejcepltest

    Isn’t
    git config --global merge.ff only
    roughly an equivalent of what we want?

    No, first of all the error message you get with that is not friendly at all. Secondly, it doesn’t help the vast majority of the people where this problem is triggered: newbies. And thirdly, it doesn’t help the people that do want to make a merge, but with the parents reversed.

  9. Pingback: What’s missing in Git v2.0.0 | Felipe Contreras

  10. You really don’t understand the Git workflow.

    If you want newbies to not commit merges (3-way merges are not ‘evil’ just due to their existence, they just should not be done by someone who doesn’t know what they’re doing), then you do 2 things:

    1) Add a pre-receive hook on the master repo that doesn’t accept rebases or merges (except from certain allowed individuals). You should have this in place, anyways, to prevent doing something by accident. Regardless, this kind of work should be done in a branch, and that branch should be shared back to the master repo.

    2) Have everyone that uses Git do this: git config --global branch.autosetuprebase always . This will cause people to rebase on merge/pull, which is 99.9999% of the time what you want, and eliminates this hassle. If I would suggest anything to the Git team, it would be to make this the default setting. I find this akin to the push.default ‘simple’ issue, in that this should have been the default behavior, and was eventually added/changed to be so.

  11. @kenshaw

    You really don’t understand the Git workflow.

    Right, I’ve participated in all the discussions related to the problems with ‘git pull’ in the Git mailing list, which included many Git experts and developers, including Linus Torvalds. But I don’t understand “the Git workflow” (whatever that means).

    What you don’t seem to understand is that there is no “Git workflow”; there are many Git workflows, and what you think is best for most projects is not really relevant. Different projects will have different workflows.

    Add a pre-receive hook on the master repo that doesn’t accept rebases or merges (except from certain allowed individuals).

    You can’t prevent rebases.

    Have everyone that uses Git do this: git config --global branch.autosetuprebase always .

    As many people who participated in the discussions on the mailing list argued and showed, most newcomers don’t understand what a rebase is, and it’s best not to force them to learn it too early on.
