Subject: Re: [boost] Losing history (Was: [git] Boost.Build location)
From: Dave Abrahams (dave_at_[hidden])
Date: 2012-12-27 16:10:35


on Thu Dec 27 2012, Vladimir Prus <ghost-AT-cs.msu.su> wrote:

> On 27.12.2012 04:28, Daniel Pfeifer wrote:
>> 2012/12/26 Rene Rivera <grafikrobot_at_[hidden]>:
>>> On 12/26/2012 11:48 AM, Vladimir Prus wrote:
>>>>
>>>> On 26.12.2012 21:30, Vladimir Prus wrote:
>>>>>
>
>>>>> On 26.12.2012 13:26, Daniel Pfeifer wrote:
>>>>>>
>>>>>> 2012/12/26 Vladimir Prus <ghost_at_[hidden]>:
>>>>>>>
>>>>>>>
>>>>>>> I am wondering what will be the location of Boost.Build after the git
>>>>>>> switch. It seems to be
>>>>>>> part of http://github.com/boost-lib/boost right now.
>>>>>>
>>>>>>
>>>>>> It has its own repository: https://github.com/boost-lib/build
>>>>>
>>>>>
>>>>> Oh, that's good. Will that eventually include full version history
>>>>> from SVN?
>>>>
>>>>
>>>> In fact, could this repository be adjusted to be content of current
>>>> tools/build/v2 ?
>>>>
>>>> There's very little practical reason to have history from Boost.Build V1
>>>> be present in git,
>>>> and the "v2" directory at the top level makes no sense.
>>>
>>>
>>> What I would suggest is that we convert from svn to git manually into our
>>> own repo (since the current bridge doesn't keep history). Then push the new,
>>> full history version, into the boost subrepo.
>>
>> The current bridge is preliminary, but the final conversion will not
>> convert history either.
>> All libraries will start with a fresh git repository without history
>> (History can be made accessible by grafting).
>
> Wait, what? As far as I'm concerned, conversion to git that loses
> history is simply unacceptable.

Ahem; I don't think Daniel was being very clear about how we're handling
history. I completely understand why you got alarmed given what was
written above.

Let me clarify. First of all, we are not going to "lose" history, by
any stretch.

A perfectly(**) accurate Git translation of Boost's history would be a
sequence of snapshots of each state of the SVN filesystem. We can
capture that at any time, for sure, if someone wants it. However,
because SVN doesn't have first-class branches or tags (you're expected
to use subdirectories as ad-hoc branches and tags), that probably
wouldn't be very useful, so maybe it would be better to simply preserve
the final SVN repository for that purpose.

A nearly perfectly accurate (but also useful) Git translation would
make all the ad-hoc SVN branches and tags into real Git branches and
tags. That is what we intend to capture in the boost-history
repository.
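
To illustrate what "ad-hoc" means here, the following is a minimal
sketch in Python of how a converter might decide which Git ref an SVN
path belongs to. It assumes the conventional trunk/branches/tags
top-level layout, which this message does not spell out, and the
function name and example paths are hypothetical. The actual conversion
rules may well be more involved.

```python
# Illustrative only: assumes a conventional trunk/branches/tags SVN layout.
# classify_svn_path is a hypothetical helper, not part of any conversion tool.

def classify_svn_path(path):
    """Map an SVN path to a (kind, ref_name, path_within_ref) triple."""
    parts = path.strip("/").split("/")
    if parts[0] == "trunk":
        # Everything under trunk/ belongs to a single main branch.
        return ("branch", "trunk", "/".join(parts[1:]))
    if parts[0] in ("branches", "tags") and len(parts) >= 2:
        # The first directory below branches/ or tags/ names the ref.
        kind = "branch" if parts[0] == "branches" else "tag"
        return (kind, parts[1], "/".join(parts[2:]))
    # Anything else has no obvious Git counterpart in this simple scheme.
    return ("unmapped", None, "/".join(parts))
```

In SVN a branch or tag is just another directory, so a mapping like this
has to be recovered from path conventions rather than read from the
repository itself.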

Now, storing that history takes ~233M on disk, which has some
implications. As noted in
https://svn.boost.org/trac/boost/wiki/ModCvtSvn2Git#History, it doesn't
make sense to reproduce that entire history in each modularized repo.
Therefore, when someone wants a continuous look at the past, they're
going to use the "git replace" command
(http://git-scm.com/2010/03/17/replace.html) to link up modularized
history with the monolithic history. With some standardized tag names
in each repository we can make this command a one-liner for any given
branch. I'm presently working on a detailed description of exactly how
this can work, and expect to post it shortly.
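
Until that description is available, here is a toy model in Python of
the lookup rule behind "git replace <object> <replacement>": whenever an
object is read, its registered replacement is read instead. This is only
a sketch of the idea, not how Git stores objects, and every commit id
below is hypothetical. The modular repository's root commit is replaced
by a copy whose parent is the tip of the monolithic history.

```python
# Toy model: commits maps a commit id to its list of parent ids.
# All ids are made up for illustration.

def walk_history(start, commits, replacements=None):
    """Return commit ids from start back to a root, following first parents."""
    replacements = replacements or {}
    history = []
    current = start
    while current is not None:
        history.append(current)
        # The original id is kept, but its content (here, just the parent
        # list) is read from the replacement object when one is registered.
        parents = commits[replacements.get(current, current)]
        current = parents[0] if parents else None
    return history

commits = {
    # Monolithic history converted from SVN.
    "mono-1": [],
    "mono-2": ["mono-1"],
    "mono-tip": ["mono-2"],
    # Fresh modular repository: its root commit has no parents.
    "mod-root": [],
    "mod-2": ["mod-root"],
    "mod-head": ["mod-2"],
    # A copy of mod-root whose parent is the monolithic tip.
    "mod-root-grafted": ["mono-tip"],
}

short = walk_history("mod-head", commits)
joined = walk_history("mod-head", commits, {"mod-root": "mod-root-grafted"})
```

Without the replacement, the walk stops at mod-root. With it, the same
walk continues through mono-tip into the monolithic history, and no
commit in the modular repository has to be rewritten.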

We *could* also attempt to "modularize the past" and so provide a
continuous view of each library's history in its own modularized repo
back to the library's first commit, by excluding all the other material
that's in SVN. I am wary of doing that, for two reasons. First, it
seems potentially error-prone. Second, it doesn't represent reality.
The past was monolithic, not modularized. If you rewind to a given
commit in one library's "modularized past," there's no way to know
against which commits of other libraries it was tested or was expected
to work. In fact, the second point is tied to the first point; it isn't
entirely clear what a "correct" modularization of the past should look
like.

However, the fact that I'm wary doesn't mean I've ruled it out
completely. It would certainly be possible to provide a best-effort
"modularization of the past," while adding the necessary tags so that
the "modularized past" can be quickly replaced with monolithic history
using git-replace. I'm open to discussion about this approach.

(**) Some Subversion concepts, such as properties, simply don't
     translate to Git, which is another reason to keep the original SVN
     repo archived somewhere.

> Version control is used for software development for a good
> reason. Nor do I find it acceptable to impose a history conversion task
> on every individual developer.

Fortunately, *that* was never in the plan. I hope the above makes that
clear.

Regards,

-- 
Dave Abrahams
BoostPro Computing                  Software Development        Training
http://www.boostpro.com             Clang/LLVM/EDG Compilers  C++  Boost
