Subject: Re: [boost] [git] Mercurial? easy merging in svn, how about git/hg?
From: Martin Geisler (mg_at_[hidden])
Date: 2012-03-29 16:18:33

Frank Birbacher <bloodymir.crap_at_[hidden]> writes:

> Hi!
> Thank you all for the thorough explanations. I really enjoy the
> feedback here.

As you can probably tell, I enjoy talking about this too :)

> On 29.03.12 10:12, Martin Geisler wrote:
>> It points out that the --reintegrate flag is critical for
>> reintegrating changes from a branch back into trunk. They are talking
>> about a scenario where you've continuously kept the branch up to date
>> with changes from trunk and now want to merge back:
> Correct.
>> trunk:  a --- b --- c --- d --- e
>>          \           \
>> branch:   r --- s --- t --- u
>> Because of the t revision, you cannot just replay the r:u changes on
>> top of e: the u revision already contains some of the changes that are
>> in e (the b and c changes).
> My approach with svn without --reintegrate is: merge t into the trunk
> and use --record-only. This way the files stay unchanged, but the
> metadata (mergeinfo) will be updated to reflect that trunk now
> contains t. This is somewhat awkward, but in the end it enables a
> merge of the branch into the trunk. svn will then by itself patch r,
> s, and u into the trunk. So no three-way-merge here, but (maybe) a
> series of diffs that will get applied. The --reintegrate option will
> instead employ a three-way merge.
> I see how a merge of all of trunk into the branch just before merging
> the branch into trunk will help to reduce conflicts. So I consider
> --reintegrate now. Maybe this is the point where git and hg have
> better handling of merges. What will happen in the above example with
> git or hg when merging the branch into trunk? Do you have to do a
> final merge from mainline into the branch?

When you merge u into e, you start a three-way merge with

  ancestor: c
  local: e
  remote: u

Files from these three snapshots are compared with the normal three-way
merge logic: a hunk that has only been changed in one way from c to e or
from c to u is copied to the result. If a hunk has been changed in
different ways it's a conflict -- you have to resolve this in your merge
tool of choice.
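
The per-hunk rule can be sketched in a few lines of Python. This is only an illustration of the logic described above, not Mercurial's actual merge code; the merge_hunk name and the string "hunks" are made up for the example.

```python
def merge_hunk(ancestor, local, remote):
    """Three-way merge of a single hunk; returns (result, is_conflict)."""
    if local == remote:
        # Untouched on both sides, or changed the same way on both sides.
        return local, False
    if local == ancestor:
        # Only the remote side changed the hunk: take the remote version.
        return remote, False
    if remote == ancestor:
        # Only the local side changed the hunk: keep the local version.
        return local, False
    # Changed in different ways on each side: a merge tool has to decide.
    return None, True


# Hunks as they might look in c (ancestor), e (local) and u (remote).
print(merge_hunk("old", "new", "old"))     # -> ('new', False)
print(merge_hunk("old", "old", "new"))     # -> ('new', False)
print(merge_hunk("old", "mine", "yours"))  # -> (None, True)
```

Note that the individual revisions between c and e (or c and u) never enter the picture: only the three snapshots do.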

> What will happen if you skip this step?

If you don't merge, then the branches remain diverged.

> I'm asking because I might want to cherry pick changes from either
> side and merge them into the other: some changes from trunk into the
> branch some other from the branch into trunk and at the end merge the
> whole branch into trunk. How is that supported?

When cherry-picking you're copying changes from one branch onto another.
Let's say we have two long-running branches like this:

  default: ... a --- b --- c --- d --- e
                    /           /
  stable: ... --- x --------- y

New features go onto the default branch and bugfixes go into the stable
branch. The stable branch is always a *subset* of the default branch
since the default branch has more features plus the bugfixes that we
continuously merge in from the stable branch.

If a bugfix ends up on the default branch by mistake, then we can cherry
pick it onto the stable branch. Let's say c is such a bugfix. We run

  hg update stable # checkout stable branch (y) in working copy
  hg transplant c # re-apply b to c delta on top of y

This gives us

  default: ... a --- b --- c --- d --- e
                    /           /
  stable: ... --- x --------- y --- z

The diff between y and z is like the diff between b and c.

We then merge stable into default again so that stable is once more a
subset of default:

  default: ... a --- b --- c --- d --- e --- f
                    /           /           /
  stable: ... --- x --------- y --------- z

This merge is a no-op since a three-way merge doesn't care if a change
has been copied into both branches. That is, the merge sees a hunk that
changed from 'old' to 'new' in both branches. The merge result is
naturally the 'new' hunk and there's no conflict.

Blocking changes is harder because we always use three-way merges
instead of re-playing patches. If you know that you don't need a
particular changeset again, then you can back it out. This is just a way
of applying the reverse patch from that changeset.

             +x    +y    -x
  a --- b --- c --- d --- f

Here you can think of "+x" as meaning insert a line with "x" and "-x" as
meaning remove the line. The buggy changeset c is backed out and this
just means that you apply "-x" on top of d. The "y" line is still part
of f. Since three-way merges only consider the final states of the
branches, this can be used to block a changeset.
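
Here is the same history as a small Python sketch, in case it helps. The apply_change and reverse helpers are hypothetical, and I assume the file is empty at b just to keep the example short.

```python
def apply_change(lines, change):
    """Apply '+x' (insert a line "x") or '-x' (remove the line "x")."""
    op, text = change[0], change[1:]
    if op == "+":
        return lines + [text]
    return [line for line in lines if line != text]


def reverse(change):
    """Turn a change into its reverse patch: '+x' becomes '-x'."""
    return ("-" if change[0] == "+" else "+") + change[1:]


changes = {"c": "+x", "d": "+y"}

state = []  # assumed content at b: an empty file
for name in ("c", "d"):
    state = apply_change(state, changes[name])

# Backing out c means applying its reverse patch on top of d, giving f.
state_f = apply_change(state, reverse(changes["c"]))
print(state_f)  # -> ['y']
```

A later merge only sees the state at f, where "x" is gone and "y" is still there.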

>> That gives you two independent working copies.
>> By making a local clone you avoid downloading anything again.
>> Furthermore, a local clone will make *hardlinks* between the files in
>> the .hg/ directory. This means that both clones share the disk space:
>> you only pay for creating a new working copy.
> Is that supported on Windows as well?

Yeah -- I was surprised too :-) NTFS supports hardlinks and has done so
for more than a decade. But people still come and ask me about this when
I give a Mercurial talk :)
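
For anyone who hasn't used hardlinks: a small Python sketch of why linked files cost no extra disk space. It only illustrates the filesystem feature, not how Mercurial lays out .hg/, and the file names are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "original.i")  # hypothetical file in clone 1
    linked = os.path.join(tmp, "clone.i")       # hypothetical file in clone 2

    with open(original, "wb") as f:
        f.write(b"revision data" * 1000)

    # Create a second directory entry for the same data; nothing is copied.
    os.link(original, linked)

    same = os.path.samefile(original, linked)
    link_count = os.stat(original).st_nlink
    print(same, link_count)
```

Both names refer to the same data on disk, so the second one only costs a directory entry.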

>> With SVN you would have to make a new 'svn checkout' -- or I guess
>> you can copy an existing checkout with 'cp' and then 'svn switch'?
>> That way you avoid downloading the files that aren't affected by the
>> switch.
> Correct. And you will have to pay for duplicate files. SVN will keep a
> pristine copy of all files in its hidden directory. So every working
> copy will have its own set of pristine files, no hardlinks. With the
> pristine files you can view the current changes (svn diff) or revert
> files without contact to the repo.

Indeed. I measured the space taken up by the OpenOffice Mercurial
repository: the working copy is 2.0 GB and the .hg/ folder is 2.3 GB.

This means that you pay a 15% overhead for storing all 270,000
changesets locally -- compared to storing just one pristine copy as
SVN does. The delta compression is amazingly efficient!
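
The arithmetic behind that figure, as a quick Python check. The per-changeset average is my own back-of-the-envelope addition and assumes 1 GB = 10**9 bytes.

```python
working_copy_gb = 2.0  # one pristine copy would be about this size
hg_store_gb = 2.3      # the .hg/ folder with the full history
changesets = 270_000

# Extra space for the full history relative to a single pristine copy.
overhead = hg_store_gb / working_copy_gb - 1
print(f"{overhead:.0%}")  # -> 15%

# Rough average store size per changeset (assumes 1 GB = 10**9 bytes).
avg_bytes = hg_store_gb * 10**9 / changesets
print(round(avg_bytes))
```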

>> Notice a fundamental difference in design here: Mercurial (and Git)
>> have branches. Subversion doesn't:
>> Instead, SVN has a cheap server-side copy mechanism and SVN allows
>> you to checkout a single subdirectory at a time. SVN also allows you
>> to merge changes made in a subdirectory into another subdirectory.
>> These features let you "emulate" branches and tags, but they are not
>> first-class citizens in the system.
> Yes, I always thought the emulation was an advantage of svn because
> you don't have to learn another concept. Just copying directories to
> create branches allows you to employ whatever organization of branches and
> tags you like: create /branches/releases to hold release branches if
> you like, create /users/myusername to supply everyone with their own
> sandbox, or create /proj1/trunk and /proj2/trunk in the same repo.

I think it's very clever that you can use a cheap server-side copy
mechanism for this -- it gives you some extra freedom.

Martin Geisler