Subject: Re: [boost] As a side note about source control
From: Eric Hopper (hopper_at_[hidden])
Date: 2010-09-01 00:52:49


Christoph Heindl wrote:
> > As a side note, and I'm sure I'm opening a can of worms by even saying
> > this... I find working on projects that do not use distributed source
> > control to be really irritating nowadays.
>
> Could you explain why? What kind of projects (organizational structure).

Well, the organizational structure of the project doesn't matter at all
in my answer to the question. I will describe to you what I did with
the Boost library in order to explain.

I wanted to generate a whole ton of small random numbers quickly and
found the smallint distribution. When I looked at the code, I was
certain I could make something a lot faster that was very specific to my
need, so I did, tested it and discovered that it was indeed much faster.
Then I realized that by carefully using templates I could make it work
for a much more general case.

Now I want to make something that's a library module. I morph my
special case into a more general case as a proof of concept. When that
works, I then consider how to integrate the library module into Boost
because that's where it belongs.

There are lots of different versions of Boost that have been released
and are installed on the various systems I work on. Not only that, but
regardless of whether or not Boost ever accepts my changes I want to
keep them locally so I will always be able to have a version of Boost
with my changes.

If I simply check out one of the tags for a version of Boost I have, I
can certainly make my changes there, but I can never check them in or
effectively track them against Boost itself as it moves forward. I
could check out trunk, but trunk does not correspond to any version of
Boost installed on my system and means I would be testing all kinds of
changes that may or may not be ready for use. I'm not interested in
being a Boost tester. And again, I really couldn't check my changes in
or effectively integrate them with later versions of Boost.

I eventually settle on doing an svn export of several successive
versions of Boost and checking each into Mercurial. This is complicated
by the need to undo Subversion's keyword expansion, which I do with a
small Perl script. Converting the whole Boost Subversion tree to
Mercurial isn't very feasible because Subversion's idea of branching is
so dissimilar. For example, there are a few cases of a tag being removed
and replaced with a fresh copy of trunk, and this is very hard to model
in any other source control system.
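
To illustrate the keyword step, here is a rough sketch in Python. It is
not the Perl script mentioned above, whose details the post does not
give. The function name collapse_keywords is made up for this example,
and the sketch assumes the files are text and that a keyword is written
in the usual expanded form, such as "$Id: ... $". It collapses each
expanded keyword back to its bare form so that successive exports differ
only where the code itself changed.

```python
import re

# Keyword names Subversion recognizes when svn:keywords is set on a file.
SVN_KEYWORDS = (
    "LastChangedDate", "Date",
    "LastChangedRevision", "Revision", "Rev",
    "LastChangedBy", "Author",
    "HeadURL", "URL",
    "Id", "Header",
)

# Matches an expanded keyword such as "$Id: foo.hpp 123 user $".
# A bare "$Id$" has no colon and is left untouched.
_EXPANDED = re.compile(r"\$(" + "|".join(SVN_KEYWORDS) + r"):[^$\n]*\$")


def collapse_keywords(text):
    """Return text with every expanded keyword reduced to "$Keyword$"."""
    return _EXPANDED.sub(r"$\1$", text)


if __name__ == "__main__":
    line = "// $Id: foo.hpp 123 2010-08-01 12:00:00Z user $"
    print(collapse_keywords(line))  # prints "// $Id$"
```

Fixed-width keywords ("$Rev:: 123   $") are also collapsed to the bare
form by this pattern, which is a simplification; a more careful script
would preserve their width.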

Once I've done this conversion, things are pretty smooth. I can check
in all my changes to my local Mercurial clone. I can play around with
things, do code cleanup and various other things while still being able
to track my source. I can also do development on my desktop and laptop
and move changes smoothly between them even though the changes are
checked in when the laptop may not be connected to the Internet.

Later I discover Boost 1.44 has been released, so I repeat the process I
used before to convert 1.44.0.beta1 and 1.44.0 to Mercurial revisions.
Then I use Mercurial's merge feature to merge my changes back in. Poof,
I have a version of Boost that has my changes, and I can publish a patch
against Boost 1.44 as opposed to 1.43 if need be.
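
The import-and-merge cycle can be written down as a short command plan.
This is only a sketch under assumptions the post does not state: it
supposes the pristine releases live on a named branch (called "vendor"
here) and the local changes on "default". The function name and branch
names are hypothetical. The hg subcommands themselves (update,
addremove, commit, merge) are standard Mercurial. The code only builds
the command lists and runs nothing.

```python
def vendor_merge_plan(version, vendor_branch="vendor", work_branch="default"):
    """Return hg commands (as argv lists) to import one release and merge it.

    Between the first and second command, the working directory is
    assumed to be replaced by hand with the freshly exported tree.
    """
    return [
        # Switch to the line of history that holds pristine releases.
        ["hg", "update", "--clean", vendor_branch],
        # (working files are swapped for the new export at this point)
        # Record files that appeared or vanished between releases.
        ["hg", "addremove"],
        ["hg", "commit", "-m", f"Import Boost {version}"],
        # Return to the local changes and merge the new release in.
        ["hg", "update", work_branch],
        ["hg", "merge", vendor_branch],
        ["hg", "commit", "-m", f"Merge Boost {version} into local changes"],
    ]


if __name__ == "__main__":
    for argv in vendor_merge_plan("1.44.0"):
        print(" ".join(argv))
```

Keeping the releases on their own line of history is what lets the merge
see exactly what changed upstream between two releases, separately from
the local edits.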

If Boost had used a distributed version control system from the very
beginning, the previous six paragraphs could have been condensed down to
a couple. And future merges with Boost would generally be pretty trivial
unless the random library changed significantly.

This is true of any project I want to make a change to.

I do not want to have to engage heavily with the community for a
one-off change to a project I'm not really interested in becoming a
regular contributor to. A centralized source system makes it hard to
make changes except on active branches, which frequently contain changes
I'm not interested in testing. It makes it hard to track my changes
locally regardless of the decisions of the community running the
project. And it makes it hard to contribute a change back to the
community when I don't want to become a full member, which in turn
raises a barrier to my being inexorably sucked into a community I
didn't think I was going to participate in.

All of these hold regardless of the organizational structure of a
project. They have everything to do with my ability to control and use
the source code I download.

Simply put, distributed source control is what fits the Open Source
development model. I do not consider this an opinion so much as a flat
statement of fact.

The ability to fork is absolutely central to Open Source. It's harder
to fork when you lose the history and the ability to track changes
against the original, because a project's history and the ability to
track your changes against it are as important as a dump of the source
code at a moment in time, if not more so. Distributed source control
avoids these problems and makes forking easier, almost trivial.
Distributed source control therefore fits the Open Source model better.

In my opinion, distributed source control is the best fit for almost any
development. But that's just my opinion, and I don't have nearly as
much in the way of hard logical argument to back that up. I just know
that I've used source control for nearly (and I'm rather embarrassed that
the word 'nearly' is there, but a 17 yr old in 1988 isn't likely to have
heard of source control) as long as I've been a professional programmer.
And distributed source control systems are the first source control
systems I've used where I didn't feel like I was spending a lot of time
fighting the source control system because it didn't work the way I
needed it to. Discipline does not always require pain.

-- 
A word is nothing more or less than the series of historical
connotations given to it. That's HOW we derive meaning, and to claim
that there is an arbitrary meaning of words above and beyond the way
people use them is a blatant misunderstanding of the nature of language.
-- Anonymous blogger
-- Eric Hopper (hopper_at_[hidden] http://www.omnifarious.org/~hopper)--


