Subject: Re: [boost] [git] Mercurial?
From: Thomas Heller (thom.heller_at_[hidden])
Date: 2012-03-21 13:58:38


On 03/21/2012 04:56 PM, Julian Gonggrijp wrote:
> (NB: it took me some time to write this post and in the meanwhile
> some of the issues I'm addressing have been covered. Hopefully what I
> wrote is still useful by making the central things very explicit.)
>
> Rene Rivera wrote:
>
>> On 3/20/2012 10:15 PM, David Bergman wrote:
>>
>>> We evidently have different styles of formal solving; mine is a
>>> balance between an internal - or semi-internal - process and an
>>> "accountable" collaborative effort. I do not see the value of
>>> everybody seeing every single key stroke I make, as long as they see
>>> certain sync points; actually, quite analogously to the operational
>>> semantics of C++ - that certain points at the execution have to
>>> follow some rules...
>> Hm.. I must not be understanding something.. Are you arguing that not
>> all commits/check-ins you do to a local/private repository are
>> important enough to merit the benefits of collaboration? I ask because
>> my contention is that if it's important enough for you to put
>> something into a VCS history, it's important enough for your
>> collaborators to inspect it.. in perpetuity. And that the sooner that
>> inspection happens the better it is for everyone. Hence that deleting
>> such history is counter to collaboration.
> This already received several comments, but I think there is
> something very deep about this that deserves more attention. It
> revolves around the following fundamental question:
>
> What is the meaning of a commit?
>
> One possible interpretation is that a commit is a snapshot of your
> project. A snapshot is something that you store for future reference.
> Because in a sense it's a form of documentation, one will take care
> to submit well-crafted commits that include enough useful changes to
> license a new snapshot. In principle, every commit is assumed to
> introduce some form of progress compared to the previous. Making
> changes to such a history of snapshots is almost necessarily a form
> of fraud.
>
> This is the kind of mental model of a commit that is stimulated by
> svn. You can see it from the terminology: making a commit causes the
> repository to move to the next revision number.
>
> Another possible interpretation is that a commit represents a unit of
> work. This tends to favour many small commits over few big commits.
> Since anything you do before you're sure that it's the right thing is
> also work, shabby commits are part of the deal. The consequence is
> that it must be very cheap to isolate any messy state in temporary
> side tracks. Now the VCS is not only a collection of snapshots, but
> also a tool to manage your recent pieces of work before you finally
> commit* to some of them.
>
> This kind of mental model is stimulated by git. It explains why git
> users make a fuss about amending, rebasing and efficient branching
> and merging.
>
> There is no point in arguing that one mental model is superior to the
> other until you fully grasp both of them. I urge anyone who feels
> tempted to make agitated remarks to let this sink in for at least a
> few hours.
>
> That said, I'm confident enough to think that I can give two solid
> arguments why the units-of-work model is ultimately more productive.
>
> The first argument is provided by historical evidence, and nicely
> illustrated by Christof Donat's most recent post in this thread. The
> units-of-work model was first: local VCSs of the early generation
> invited developers to commit often. In the centralised VCSs of the
> next generation, committing became too expensive for such a workflow
> and developers adapted to the snapshot model instead. From that
> perspective the snapshot model was a workaround rather than a
> preferred solution. Distributed VCSs of the current generation
> specifically intend to address that problem by making commits cheap
> again. Developers are now using the opportunity to switch back to the
> units-of-work model.
>
> The second argument is more technical, and perhaps more convincing.
> It works even without branches or collaborators. All we need is a
> single developer who makes some changes to their working copy of a
> project.
>
> 1. If the developer applies the snapshot model, they'll implement all
> changes in one go and spend some time to verify that they seem to
> make sense. After that they'll probably make a commit.
> 2. If the developer applies the units-of-work model, they'll commit
> each change directly after implementing it. Let's say five commits
> are made in total.
>
> A little later, our developer finds that they want to undo one of the
> changes.
>
> 1. In the snapshot scenario, they look up the pieces of code that were
> affected by the faulty change and edit them again.
> 2. In the units-of-work scenario, they cut the faulty commit out of
> history and they're done.
>
> Result: the units-of-work developer spends less time getting the
> same thing done, with less opportunity for error.
>
> Note that the pieces of history that tend to get altered in a
> units-of-work model generally don't make it into version control in
> a snapshot model at all.
>
> -Julian
>
> ----
> *) Commit as in, make a commitment. Pun not entirely unintended.

Sure, this all makes sense. Except that failures often only materialize
_after_ you have made your changes public. As already discussed in this
thread, rewriting public history should be avoided. With Boost this is
even more critical. As mentioned in another thread, we want, and need,
to test on various platforms. In order to do that, we need to make
changes public.
So, to repeat, this all sounds nice and dandy, but after digging deeper,
it doesn't sound like it is generally applicable, unless you can test
_everything_ on your local machine, or you push to a volatile branch,
which opens a whole other can of worms (from what I understand).
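
As a rough illustration of the distinction (a toy model in Python, not how
git or any other VCS actually stores history; the names replay, drop_commit
and the "published" flag are made up for this sketch): cutting a commit out
is trivial while the history is still local, and it is exactly the step
that stops being acceptable once the commits have been made public.

```python
# Toy model only: a "commit" records which files it changed, and a history
# is an ordered list of commits. All names here are hypothetical.

def replay(history):
    """Apply the commits in order and return the resulting file contents."""
    tree = {}
    for commit in history:
        tree.update(commit["changes"])
    return tree


def drop_commit(history, commit_id):
    """Return a new history without the given commit.

    Refuses to do so if that commit has already been published, since
    other people (or test machines) may already depend on it.
    """
    for commit in history:
        if commit["id"] == commit_id and commit["published"]:
            raise ValueError("refusing to rewrite published history")
    return [c for c in history if c["id"] != commit_id]


# Units-of-work style: one small commit per change, all still local.
history = [
    {"id": 1, "changes": {"a.hpp": "change A"}, "published": False},
    {"id": 2, "changes": {"b.hpp": "change B (faulty)"}, "published": False},
    {"id": 3, "changes": {"c.hpp": "change C"}, "published": False},
]

# Local history: cutting the faulty commit out is a one-liner.
print(replay(drop_commit(history, 2)))

# The same history after it has been pushed for testing on other platforms.
published_history = [dict(c, published=True) for c in history]
try:
    drop_commit(published_history, 2)
except ValueError as err:
    print("error:", err)
```

In the published case the fix has to go on top as a new commit, which is
essentially the snapshot-style "edit it again" step.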

