Subject: [boost] [git] neglected aspects
From: Steven Samuel Cole (steven.samuel.cole_at_[hidden])
Date: 2012-02-08 01:30:05
First post here, so please bear with me.
A lot has been said in the last year on the subject of a possible
migration from svn to git. However, after reading pretty much every
message on the issue on this list (sic!), I have the impression that
some of the most important aspects did not get the consideration they
require. I would therefore like to add my outside perspective.
1. an svn --> git migration does cause some major, but one-time
disruption. However, svn right now causes minor, but continuous
disruption on a daily basis - which often goes unnoticed.
2. git's distributed concept is nothing less than one generation ahead
of centralized VCSs like svn. The benefits usually do not become
apparent until git is actually used in practice.
3. at the moment, boost is significantly missing out on fresh enthusiasm
that new contributors could bring into the project; svn and the policies
it mandates are partly to blame for that.
4. svn trunk (i.e. boost library) maintainers are too limited by the svn
concept in their decision which changes they merge at what point in time
and may be pressured into rushing suboptimal decisions; this can make
maintenance work painful and frustrating. git greatly alleviates that.
5. IMO, it's not 'if', it is 'when'. The longer a migration is delayed,
the longer boost development will be slowed down without even noticing.
I am aware that some of my points technically are mentioned on the boost
git wiki page at https://svn.boost.org/trac/boost/wiki/Git/WhyGit, but
they still seem somewhat neglected. This is probably not surprising as
many participants in the discussion
1. presumably are established boost contributors and thus share a
'from the inside' view of the subject and
2. are only marginally affected by svn's conceptual problems
This is what happened here in the last couple of days:
I needed a tool for a job in my current project and I couldn't find one,
so I thought about what components I need to build my own. A little web
research - Boost.Iostreams looks perfect for the purpose.
You have all been through what comes next, be it with boost or with
other open source software: Download the latest release, install the
libraries, start reading the docs, run some sample code, etc.
While reading, I spot a typo in an HTML doc, then another in an include
file comment, a third one in another file... the fourth one at the
latest makes me think - this is open source, I should sync the latest
sources, fix all those typos as I come across them while reading and
contribute them back to the project.
A quick read on submitting bugs on boost.org, svn co yadi yadi, fix the
typos locally. Of course I don't have write privileges to the repo, so
the best I can do for now is create a patch and attach that to a bug
report. Within minutes, someone takes care of the bug, comments and
takes ownership. Excellent! Smithers, money fight!
I go back to reading, find more typos. My patch has not been committed
to the trunk yet, so my local changes are still sitting there on my
local hard drive. If I do another svn diff > file_2.patch, the patch
will also contain the changes I have already attached to the bug report.
What are my options now? Revert my changes, base the new ones on the
head revision and send another patch? Create a new patch file and
manually remove the diffs of the old changes I already sent? How will
the upstream maintainer know which changes are based on what version?
Also, I notice a few other things that in my opinion could be done
better to facilitate adoption of the iostreams library. I would be
willing to do the work and I would be in the perfect position as I am
walking in the shoes of an adopter right now - but these are separate
from the typo fixes and they are larger issues; considering I can't
check my changes in, would I really want to have them sit on my local
hard drive, waiting possibly for months for someone upstream to review
and hopefully merge them?
A couple of hours later: I get an email from boost bugs/svn; the
upstream maintainer has committed my changes. However, I actually
receive not one, but two emails, because the upstream guy chose to split
my patch into changes on documentation and changes on source code (which
presumably required extra work on his part).
I do an svn up and now have to deal with conflicts between my local
changes and the new commits: I make a note of my second set of changes
and overwrite my local files with the new repo versions, then bring the
second set back in.
Uff! That's a lot of work just to fix some typos! And these are just
cosmetic issues in discrete chunks that do not require any testing;
contributing changes to source code would take even more effort and more
caution - build and test on several platforms, peer review, etc.
Also, the upstream maintainer jumped on the bug report right away and
integrated my changes within hours. This is the ideal situation (thanks
again Daniel, great job! :-) - usually, things don't happen so quickly -
especially around major release time.
HOW THIS IS DONE IN GIT:
1. google 'github iostreams'
--> click first search result: https://github.com/boost-lib/iostreams
(this can be even reduced to 'gh: iostreams' by integrating github
into chrome search engines, but that's another story...)
2. on github website, click 'fork', wait a few seconds
--> this auto-creates my personal online github iostreams repo
3. on my local machine, open a shell, enter 'git clone ', copy & paste
the repo .git url shown on the website in there and run e.g.
'git clone git_at_[hidden]:<username>/iostreams.git'
--> this creates a local fork of my personal online iostreams repo
+ the mailing list probably changes the line above to 'hidden'
+ there is github/desktop integration and a plethora of gui clients
4. i change local files and commit to the local repo. once a set of
changes is complete, i push them online into my personal online
repo; once my work has reached some maturity, I send a pull request
to the upstream maintainer of the official iostreams repo, aka
the keeper of the trunk.
disclaimer: I have omitted one-time steps to set up local git,
github.com account, ssh keys, lastpass integration, yadi yadi.
these steps are abundantly documented online, done in minutes and
required only once.
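
For readers who want to try steps 2 to 4 without a github account, here is
a minimal sketch that plays them through on one machine. Everything in it
is a stand-in I made up for illustration: the 'upstream' directory plays
the official iostreams repo, 'fork.git' plays the personal online repo that
clicking 'fork' would create, and the file name, identity and commit
messages are hypothetical. With a real fork, only the clone URL changes.

```shell
#!/bin/sh
set -e

# Hypothetical identity so the commits work on a machine without git config.
export GIT_AUTHOR_NAME="Example Contributor" GIT_AUTHOR_EMAIL="contributor@example.invalid"
export GIT_COMMITTER_NAME="Example Contributor" GIT_COMMITTER_EMAIL="contributor@example.invalid"

demo=$(mktemp -d)
cd "$demo"

# Stand-in for the official repository (assumption: one file, one commit).
git init -q upstream
cd upstream
echo "Boost.Iostreams documentaton" > README.txt
git add README.txt
git commit -q -m "Initial import"
cd ..

# Step 2: 'fork' the official repo into a personal online repo.
# On github this is a button click; locally a bare clone plays that role.
git clone -q --bare upstream fork.git

# Step 3: clone the personal repo to the local machine.
git clone -q fork.git iostreams
cd iostreams

# Step 4: change local files and commit to the local repo ...
echo "Boost.Iostreams documentation" > README.txt
git commit -q -a -m "Fix typo in README"

# ... then push the finished change set to the personal online repo.
git push -q origin HEAD
```

The pull request itself is a github feature rather than a git command, so
it is not part of the sketch; it is sent from the website once the push has
gone through.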
THE GIT ADVANTAGE:
I don't have to wait for the first set of changes to be merged into
the trunk before I resync, resolve conflicts and base the second
set on the new head revision; I simply create a new branch for
a new change set and send a new pull request once my work is complete.
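
As a rough sketch of what 'a new branch for a new change set' looks like in
practice: the second set of typo fixes below is started from the same base
as the first one, so neither has to wait for the other to be merged. The
repo, branch names and file names are hypothetical and only serve the
illustration.

```shell
#!/bin/sh
set -e

# Hypothetical identity so the commits work on a machine without git config.
export GIT_AUTHOR_NAME="Example Contributor" GIT_AUTHOR_EMAIL="contributor@example.invalid"
export GIT_COMMITTER_NAME="Example Contributor" GIT_COMMITTER_EMAIL="contributor@example.invalid"

demo=$(mktemp -d)
cd "$demo"

# Stand-in for a local clone of the project (assumption: two files).
git init -q iostreams
cd iostreams
echo "docs with a tpyo" > doc.html
echo "// header with a tpyo" > filter.hpp
git add doc.html filter.hpp
git commit -q -m "Initial import"

# Remember the name of the main line, whatever this git version calls it.
base=$(git symbolic-ref --short HEAD)

# First change set: its own branch, offered upstream when complete.
git checkout -q -b doc-typos "$base"
echo "docs with a typo" > doc.html
git commit -q -a -m "Fix typo in HTML docs"

# Second change set: branch off the same base again instead of waiting
# for the first set to be merged.
git checkout -q -b header-typos "$base"
echo "// header with a typo" > filter.hpp
git commit -q -a -m "Fix typo in header comment"
```

Each branch carries exactly one commit on top of the base, so each can go
out as its own pull request and be merged or declined independently.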
The upstream maintainer has no peer pressure to merge any changes in:
He/she is sitting on the head revision and is looking at a number of
pull requests for change sets in various personal forks like mine; these
are more like 'change offers' the maintainer is free to merge in if
he/she so chooses (cherry-picking) - or postpone if this is not the time
because for example a major release is due and they seem too risky.
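
To make the 'change offers' idea a bit more concrete, here is a small
sketch from the maintainer's side, again with made-up stand-ins: 'upstream'
is the official repo, 'fork' is a contributor's repo, and the branch name
'offers' and both file names are hypothetical. The maintainer fetches the
offered branch, takes the harmless typo fix now and leaves the larger
change for after the release.

```shell
#!/bin/sh
set -e

# Hypothetical identity so the commits work on a machine without git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.invalid"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.invalid"

demo=$(mktemp -d)
cd "$demo"

# Stand-in for the official repository.
git init -q upstream
cd upstream
echo "teh docs" > docs.txt
git add docs.txt
git commit -q -m "Initial import"
cd ..

# Contributor's fork with two separate commits on a topic branch.
git clone -q upstream fork
cd fork
git checkout -q -b offers
echo "the docs" > docs.txt
git commit -q -a -m "Fix typo in docs"
echo "experimental" > feature.txt
git add feature.txt
git commit -q -m "Add larger, riskier change"
cd ..

# Maintainer: fetch the offered branch into FETCH_HEAD without merging it.
cd upstream
git fetch -q ../fork offers

# Take only the typo fix (the commit before the tip); postpone the rest.
git cherry-pick FETCH_HEAD~1 >/dev/null
```

The postponed commit stays available in the contributor's fork and can be
fetched and merged at any later point.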
I as a contributor do not really care so much: Of course I am stoked if
my work actually makes it into the project eventually, but the point in
time when this happens technically does not matter to me - pending
changes will not get in my way if the merge takes a little longer.
This is the major difference between svn and git - and I cannot stress
this advantage enough: I am decoupled from, but still connected to the
official trunk. I can make arbitrary changes and group them in any way I
see fit. I offer them to upstream and move on to new development.
As a side effect, this also provides an overview impression of quality
and continuity of the work of a potential new contributor.
Of course it is technically correct that svn also supports the concepts
of forks and branches, but in terms of tooling and mentality, they are
considered much more heavy-weight than in git and might get created a
couple of times a year - while in git, they might come and go a couple
of times a day.
Finally, the 'series of tubes' metaphor IMO does not really hit it.
Subversion to me seems more like a freight train where the wagons move
at different speeds, so they constantly bump into the wagons behind and
in front of them; movement of the entire train is jerky.
Git on the other hand is a bunch of space ships linked by hyperelastic
tractor beams: Even if one travels out as far as the delta quadrant, the
upstream connection will always make sure the collective does benefit
from any new development it brings back from there.
[snipped several paragraphs about cultural and other aspects; maybe some