Subject: Re: [boost] [git] Mercurial?
From: Topher Cooper (topher_at_[hidden])
Date: 2012-03-22 11:50:02


On 3/22/2012 3:57 AM, Thomas Heller wrote:
> No, I am saying that altering history is dangerous! Which you
> described as one
> of the advantages of "the git approach".

Altering *code* is dangerous. The issue is to aim at wise judgment with
regard to cost vs benefit in choosing to make changes either to code or
to history (or documentation, or specification, or policies, or ...).

Extraneous detail in change history detracts from its usefulness. If I
set a backup point in my local history so I can easily back out of a
small experiment if it fails, it may be very unlikely that this
"snapshot" (and it would be a snapshot -- not a unit of work) would
subsequently be of interest if the experiment succeeds (if it fails, it
is somewhat more likely that the history may be worth keeping since it
provides a record that the experiment was attempted, that it failed and
why it did so; however, a simple log entry -- an underutilized tool
since the advent of sophisticated VCSes -- might do as well, and the
experiment might be of such limited scope that this might not be of any
interest either). Extra detail obscures important detail, making it
more difficult to find what is relevant in the history.

I think that one of the most important areas where improvement could be
made in VCSes (at least in any that I know) is in better understanding
the contextual and hierarchical nature of history. Recent history
should be much finer grained than older history. What is of importance
to an individual developer during the coding of a "micro-task" is
frequently of no interest to a group, and most of the history within a
"sprint" (to use a term from agile methodology -- a unit of work
requiring between one and three weeks to perform, whether that is
performed by the group or by an individual within the group) is
irrelevant afterwards. Branching, tagging and local sandboxes are the
traditional tools for this. The DVCS approach adds local repositories.
This is certainly a step in the right direction as I see it.

Can a non-distributed VCS be used to effectively create a DVCS? Of
course; that has been done pretty much since the inception of
centralized VCSes, just as individual code management systems were
used as centralized VCSes from their inception. A tool designed with
that use in mind, however, has the potential to perform the necessary
functions more smoothly, conveniently, and accurately than something
cobbled together by conventions and ad hoc scripting. The question is
whether particular tools have succeeded in doing this without
sacrificing other important functionality. My limited experience with
git (I have none with Mercurial) leads me to believe that it has.

The statement has been made that (paraphrasing) "any change worth
committing is worth preserving". This sees version history as a blunt
instrument. Note the immediate and arguably harmful corollary (perhaps
not even a corollary but simply a rephrasing): "only changes worth
preserving (indefinitely) are worth committing". Personally, I try to
locally commit (with whatever tool I have for such things, including
simple directory copies) every few control-s'es (where few is a term
relative to context). This provides me an opportunity for backtracking
and to review recent changes. The chances that any but the last few of
these routine backups (again there is a hierarchical or
even fractal structure here, but let's keep it simple) are going to be of
use to me are small, and the chances that they will be comprehensible as
meaningful steps to anyone else without detailed study are virtually
nil. That history rapidly becomes not only useless but harmful, since
it obscures useful information.

The statement was made in response to a challenge as to whether the
claimant (I think it was Heller, but I'm not sure) recommended
preserving a record of every save from the editor or every keystroke
made during editing. It is a valid question which was answered based on
the bald assumption that interaction with a VCS represents a fundamental
difference in kind from other activities -- an assumption that I believe
to be completely false. The VCS is a tool whose implementation
*creates* a totally artificial boundary for workflow. There is only a
difference of mechanism, domain/specialization and degree, not of nature,
between VCS commits, branches, tags, editor saves, commenting out code,
todo comments, temporary flags, code comments about motivation and even
temporary monitoring and logging insertions and debugger breakpoints.

A VCS version serves a purpose (one or more of a finite list of possible
purposes). What is of use to one developer may not be of any use to
others. What serves a critical function now may become useless at some
time in the future. A belief that all and any detail conventionally and
conveniently captured by an SVN-like system is precisely and immutably
the correct level of detail to capture seems to me unlikely to be
correct and a poor assumption on which to base decisions. (The same is
absolutely true about git-like tools -- the issue is whether they tend to
steer choices closer to the ideal and/or make informed judgements easier
to carry out. The arguments of the anti-DVCSers on this list are
leading me to believe that this is so, where previously I thought it only
an interesting claim whose truth I was neutral on.)

Topher

