Subject: Re: [boost] [git] Mercurial?
From: Martin Geisler (mg_at_[hidden])
Date: 2012-03-22 05:31:59


Julien Nitard <julien.nitard_at_[hidden]> writes:

> Hi,
>
>> That's actually not true. DVCS tools are not magic and when there is
>> a genuine conflict in SVN (the same region was edited in parallel
>> by two developers) then you also get a conflict in Git and Mercurial.
>
> Could you please confirm this ? I am not an expert on that, but it
> seems that the diff algo in Git and Hg is completely different. For
> instance it is able to detect that lines were moved rather than some
> lines deleted and some (unrelated) lines inserted in the SVN way.

No, that's not true. It's part of a popular myth around Git: people say
that it tracks "content", not "files". This is being repeated and
repeated all over the web as a defining and amazing feature of Git. An
example is the bottom of this page:

  http://book.git-scm.com/3_normal_workflow.html

What it really means is that a particular file is stored in Git's
key-value store under a key derived from its content. The key is
independent of the file name. This means that a renamed file is stored
under the same key -- nothing more.

A *changed* file (the situation you have when you move a code block from
foo.c to bar.c) will still be stored under a different key, and to Git
it will look completely unrelated to the original.
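A minimal sketch of the key derivation, to make the point concrete. It
assumes Git's SHA-1 object format, where a blob's key is the hash of a
"blob <size>" header, a NUL byte, and the file's bytes. The function name
blob_key and the sample contents are made up for illustration:

```python
import hashlib


def blob_key(content: bytes) -> str:
    # Hash a "blob <size>" header, a NUL byte, and the raw file bytes.
    # The file name is not part of the input; names live in tree objects.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


original = b"int main() { return 0; }\n"
renamed = original                        # same bytes under a new file name
edited = b"int main() { return 1; }\n"    # one character changed

print(blob_key(original) == blob_key(renamed))  # renamed file: same key
print(blob_key(original) == blob_key(edited))   # changed file: different key
```

The two keys for the changed file share nothing useful: the store alone
cannot tell that one is a small edit of the other.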

It is true that 'git blame' has -M and -C options you can use to make it
look for moved blocks of code. But this is pure post-processing: Git is
comparing the versions it has stored and can detect moved code based on
that. Subversion could in principle also do this based on the data it
has stored.
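To illustrate why this is pure post-processing, here is a rough sketch
that finds moved lines using nothing but the stored old and new versions
of two files. It is not how 'git blame -M/-C' is implemented; the
function names and the sample files are hypothetical, and matching is
done line by line rather than on blocks:

```python
import difflib


def removed_lines(old, new):
    # Lines present in `old` that the diff does not carry over to `new`.
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    out = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            out.extend(old[i1:i2])
    return out


def added_lines(old, new):
    # Lines added going old -> new are lines removed going new -> old.
    return removed_lines(new, old)


def moved_lines(src_old, src_new, dst_old, dst_new):
    # A line counts as moved if it left the source and appeared in the target.
    added = set(added_lines(dst_old, dst_new))
    return [line for line in removed_lines(src_old, src_new) if line in added]


foo_old = ["int helper() {", "    return 42;", "}",
           "int main() {", "    return helper();", "}"]
foo_new = ["int main() {", "    return helper();", "}"]
bar_old = ["// bar"]
bar_new = ["// bar", "int helper() {", "    return 42;", "}"]

for line in moved_lines(foo_old, foo_new, bar_old, bar_new):
    print(line)
```

Nothing here depends on how the versions were stored, which is why any
system that keeps full history could run the same comparison.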

Based on the "track content, not files" myth, people have been trying to
make Git magically recognize that code was moved in one branch and
changed in another. This question is a good example:

  http://stackoverflow.com/q/8843891/110204

You can try it out yourself with these repositories:

  https://bitbucket.org/mg/git-move-edit/changesets
  https://bitbucket.org/mg/hg-move-edit/changesets

So, to recap: Mercurial, Git, and even Subversion use three-way merges
to resolve conflicts. A three-way merge is a simple algorithm that I
sketch in this answer:

  http://stackoverflow.com/a/9533927/110204

The three-way merge uses a common ancestor version (Subversion can
mostly track this after version 1.5, Git and Mercurial have it as a core
concept) and the two divergent versions. For each hunk the merge table
looks like this:

  ancestor  mine  yours  ->  merge
  old       old   old        old    (nobody changed the hunk)
  old       new   old        new    (I changed the hunk)
  old       old   new        new    (you changed the hunk)
  old       new   new        new    (hunk was cherry-picked onto both branches)
  old       foo   bar        <!>    (conflict, both changed hunk but differently)

Put simply: a three-way merge uses the ancestor to decide which hunk is
new and which hunk is still old. Change trumps, so new hunks are copied
to the merge result.
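The merge table above can be sketched as a per-hunk decision function.
This assumes the three versions have already been split into aligned
hunks, which is the part real tools do by diffing each side against the
ancestor; the names merge_hunk and CONFLICT are made up for this sketch:

```python
CONFLICT = object()  # sentinel standing in for the <!> row of the table


def merge_hunk(ancestor, mine, yours):
    if mine == yours:
        # Nobody changed the hunk, or both sides made the same change.
        return mine
    if mine == ancestor:
        # Only you changed the hunk: change trumps.
        return yours
    if yours == ancestor:
        # Only I changed the hunk: change trumps.
        return mine
    # Both sides changed the hunk, and differently.
    return CONFLICT


rows = [
    ("old", "old", "old"),
    ("old", "new", "old"),
    ("old", "old", "new"),
    ("old", "new", "new"),
    ("old", "foo", "bar"),
]
for ancestor, mine, yours in rows:
    result = merge_hunk(ancestor, mine, yours)
    print(ancestor, mine, yours, "->", "<!>" if result is CONFLICT else result)
```

Without the ancestor the second and third rows are indistinguishable
from a conflict, because all the tool sees is two hunks that differ.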

Finally, just to make sure nobody complains that I say Git and Mercurial
merge 100% identically: Git will create a virtual common ancestor if
there is more than one greatest common ancestor. That can help resolve
some criss-cross merges. Currently, Mercurial does not let you select
the ancestor, but I wrote a tiny extension for this:

  http://stackoverflow.com/a/9430810/110204

-- 
Martin Geisler
aragost Trifork
Professional Mercurial support
http://www.aragost.com/mercurial/
