Subject: Re: [boost] RE process (prospective from a retired FreeBSD committer)...
From: Dave Abrahams (dave_at_[hidden])
Date: 2011-01-28 13:58:31


At Thu, 27 Jan 2011 11:24:30 -0800,
Sean Chittenden wrote:
>
> As a retired FreeBSD committer (and a boost-digest lurker since
> 1.33), I wanted to point out how the FreeBSD community deals with
> release engineering and see if there are a few nugget policies that
> could be cherry picked for boost. FreeBSD's process has been in
> place and refined for the last 15 years and comes with already
> written documentation as a starting point. It's not perfect, but
> there's a pound of prevention that comes via policy that no tool can
> replicate.

FreeBSD is very carefully done, consists of many diverse and
interdependent modules, and would make a good model for Boost. I, for
one, appreciate your perspective very much.

> head/trunk/tip is known as the -CURRENT branch. Released versions
> for production use (or in prep for being released) are known as
> -STABLE branches. When FreeBSD 8 was being developed, people
> committed to HEAD (aka -CURRENT). When the re@ team called for a
> freeze for release and -CURRENT was in sufficiently good condition,
> they branched HEAD in to RELENG_8_STABLE. Everything in RELENG_8_*
> is expected to be production quality (or nearing it). Upon release
> of 8.0, RELENG_8 was branched again to RELEASE_8_0. 8.1, 8.2, etc
> were all created from RELENG_8_STABLE. Fixes for things in stable
> go to -CURRENT first, then are back patched to the appropriate
> stable releases, colloquially known as MFC's or Merge From -CURRENT.
>
> -CURRENT is the wild west of OS development and repo trash. Use at
> your own risk. Stability and likelihood of all tests passing is
> marginal, at best. Towards the end of a given major version
> -CURRENT becomes more stable, but once -STABLE has been branched it
> quickly degenerates some as people implement the impetus to checkin
> new features.
>
> -STABLE should always be passing 100% of the regression tests
> present in the branch at all times.

This is all pretty standard procedure for many projects, IIUC.

However, my vision for Boost is a bit different:

* each library has its own Git repo

* each library maintainer tests and produces versioned releases of
  that library on his/her own timetable

* the latest release of a library is tagged "STABLE"

* When assembling a new Boost release, the release manager marks the
  "STABLE" revision from each library and begins boost-wide
  integration/interoperability testing on those revisions

* When a library release fails integration testing, the release
  manager has several options:

  * Give the library maintainer a time window in which to produce a
    new STABLE release that passes integration testing

  * Use an earlier STABLE release of the library

  * Withhold the library from the next Boost release (drastic)

  * If the failures appear to be due to breakage in another library,
    the release manager can apply these procedures to the culprit library

One idea behind this process is that it allows individual libraries to
"officially release" fixes and updates without forcing Boost's release
managers to coordinate a Boost-wide point release.
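
As a rough illustration of the process listed above, here is a minimal
Python sketch. It models only two of the release manager's options:
falling back to an earlier STABLE release, and withholding the library.
The names `Library`, `pick_release`, `assemble` and the `passes`
callback are hypothetical, and treating integration testing as a
per-library yes/no check is a simplifying assumption, since real
failures can depend on combinations of libraries.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Library:
    """A library with its own repo (hypothetical model).

    `stable_releases` lists the versions tagged STABLE, oldest first.
    """
    name: str
    stable_releases: List[str] = field(default_factory=list)


def pick_release(lib: Library,
                 passes: Callable[[str, str], bool]) -> Optional[str]:
    """Return the newest STABLE release that passes integration testing.

    Tries the latest STABLE release first, then earlier ones. Returns
    None when no release passes, i.e. the library would be withheld.
    """
    for version in reversed(lib.stable_releases):
        if passes(lib.name, version):
            return version
    return None


def assemble(libs: List[Library],
             passes: Callable[[str, str], bool]
             ) -> Tuple[Dict[str, str], List[str]]:
    """Choose one release per library; collect the libraries withheld."""
    chosen: Dict[str, str] = {}
    withheld: List[str] = []
    for lib in libs:
        version = pick_release(lib, passes)
        if version is None:
            withheld.append(lib.name)
        else:
            chosen[lib.name] = version
    return chosen, withheld
```

Giving a maintainer a time window to produce a new STABLE release
would amount to calling `assemble` again once that release has been
appended to `stable_releases`.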

> A quick doc on merging between branches: http://wiki.freebsd.org/SVN_Merging
>
> I don't know how widespread it is, and svk has gone in to
> maintenance mode, but the use of svk has merit (partial checkouts):
>
> http://svk.bestpractical.com/view/HomePage

No offense intended to anyone, but SVK is at best a poor-man's Git :-)

> A schedule and handoff of hats/responsibilities for each of the
> branches is pretty well documented and is now time driven (it used
> to be feature driven, which didn't work as well as they'd hoped):
>
> http://www.freebsd.org/releng/

Not sure how that page illustrates what you're saying above. Care to
illuminate?

> The way that FreeBSD handles its permissions and merging from one
> branch to the next is via the commit message, commit hooks and
> approvals from different people wearing different hats. A few
> examples:
>
> http://lists.freebsd.org/pipermail/svn-src-stable/2011-January/008657.html
>
> The above has an MFC line in there to denote the changes included.
> The commit handler verifies that the diffed files largely match (at
> least it used to back in the day when I was committing via CVS).
>
>
>
> http://lists.freebsd.org/pipermail/svn-src-stable/2011-January/008683.html
>
> A 'Reviewed by' commit header.

Git has a mechanism for signing off on changes with cryptographic
security, so that you can have reasonable assurance that the change
was actually approved by the person claimed.

> This gives greater latitude in terms of what can be included in the
> commit.

Could you be more specific?

> http://lists.freebsd.org/pipermail/svn-src-stable/2011-January/008808.html
>
> And the 'Approved by' commit header. Once a release is frozen, all
> commits need to have this tag otherwise the commit will fail. If
> someone commits with that line and didn't have permission to do so,
> as a policy, the commit is always reverted 100% of the time as a
> matter of principle. Frequently it's re-committed, but it's a big
> slap to the back of the hands to have to go through the process
> again. Needless to say, the commit mailing list is the most active
> and widely read list as a result.

Sorry, I don't understand. How does that procedure result in many
people reading the commit list?
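
For readers unfamiliar with commit hooks, the check described in the
quoted text might look roughly like the Python sketch below. It is not
FreeBSD's actual hook: the exact header spelling (`Approved by:`
followed by a name), the function names, and the `frozen` flag are
assumptions made for illustration. The hook only checks that the line
is present; whether the named approver really gave permission is left
to the revert policy described above.

```python
import re
from typing import Optional

# Assumed header format: a line starting with "Approved by:" and a name.
APPROVED_BY = re.compile(r"^Approved by:[ \t]*(\S.*)$", re.MULTILINE)


def approval_for(message: str) -> Optional[str]:
    """Return the approver named in the commit message, or None."""
    match = APPROVED_BY.search(message)
    return match.group(1).strip() if match else None


def hook_accepts(message: str, frozen: bool) -> bool:
    """Accept any commit on an open branch; on a frozen branch,
    require an 'Approved by:' line naming someone."""
    return (not frozen) or approval_for(message) is not None
```

For example, `hook_accepts("Fix typo", frozen=True)` is false, while
the same message with a line such as `Approved by: re (someone)` would
be accepted.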

> And tons of code gets reviewed with many eyes viewing it as a
> result. Unlike boost, FreeBSD uses an abridged commit message that
> doesn't include the actual diff itself

Do you mean that the message sent to the commit mailing list is
abridged?

> (if you're a committer, then you see the diffs to areas of the tree
> that you subscribe to).

Nifty; that keeps down the noise.

> I don't know how many people actually follow boost's commit log on a
> per commit basis, but given 100% of all commits go to a single
> mailing list and the diffs can sometimes be massive, it seems like
> the current list infrastructure makes that kind of review process
> unsustainable for Joe Reviewer (vs. the shorter abridged commit
> messages w/ a link to the actual diff). Links to each of the diffs
> is useful and lets people click through if reviewing the diff is of
> interest.

Makes sense.
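
To make the idea concrete, an abridged commit notice could be generated
along the lines of this Python sketch. The field layout, the function
name `abridged_notice`, the `max_paths` cutoff and the URL template are
all hypothetical; this is not how either FreeBSD's or Boost's commit
mailer actually works.

```python
from typing import List


def abridged_notice(revision: int, author: str, log: str,
                    paths: List[str], diff_url_template: str,
                    max_paths: int = 10) -> str:
    """Build a short commit mail: log message, changed paths, and a
    link to the full diff instead of the diff itself."""
    lines = [f"Author: {author}", f"Revision: {revision}", "",
             log.strip(), "", "Changed paths:"]
    lines += [f"  {path}" for path in paths[:max_paths]]
    if len(paths) > max_paths:
        # Summarize the remainder so the mail stays short.
        lines.append(f"  ... and {len(paths) - max_paths} more")
    lines += ["", "Diff: " + diff_url_template.format(revision=revision)]
    return "\n".join(lines)
```

The mail stays the same size no matter how large the diff is, and a
reviewer follows the link only when the change is of interest.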

> I can't stress the organizational difference and peer pressure a
> widely used commit mailing list brings about.

Specifically?

> There are other bits worth noting, acls apply to different parts of
> the tree (doc commit bits and a doc-re@ hat vs a src-re@ hat and
> even ports@),

Splitting Boost up into separate repositories should handle most
access-control issues. On top of that, if we needed it, something
like gitolite would allow us to control access to paths and branches
within a single repo.

> but the mailing list/commit reviews and branching/merging are the
> biggies that I wanted to point out. Branching trunk in to a
> boost_1_46_prerelease branch three weeks before the release (or some
> sane interval) so that people can fix up the code seems like a
> pretty painless adjustment, then snapping/releasing boost_1_46_0
> when the pre-release is ready.

Believe it or not, we used to have a process very much like that.
Beman has explained here why the current process is an improvement for
release managers, but I don't remember the rationale. Beman, maybe
you'd like to repeat it?

> Post-1.46 release fixes could go in to 1.46.0 and (heaven forbid),
> maybe even a micro version with specific fixes for the .46 minor
> version.

Sorry, I don't understand the above at all. How could post-1.46 stuff
go into 1.46.0? 1.46 == 1.46.0! And I don't even grok the "micro
version..." part enough to know what questions to ask you about it. Care
to try again?

> PostgreSQL ships with a contrib/ directory of "soon to be" or
> "possible candidates for being a core component" which serves the
> same purpose as FreeBSD's ports structure. This gets yet-to-be
> finalized modules out in the wild and helps garner interest.

Having a contrib/ directory is an interesting idea for Boost. That'd
be very different from the sandbox. But who's responsible for
maintaining that code?

> PostgreSQL's autovacuum went from being a fringe project to a
> contrib/ module and a core feature in 2 minor releases because of
> the huge interest in its use/adoption. Boost.Log or Atomic or any
> of the other "we all really want this but it's not quite finalized"
> modules seem like ideal candidates for inclusion in such a directory
> because it could generate additional interest/eyes. A contrib/ or
> proposed/ would go a long way towards keeping boost lean-ish, too.

How so? It sounds like it means adding more to each release.

> At present it seems like boost-*.tar.bz2 is on track to including
> boost/kitchen/sink.hpp and boost/bath/water.hpp and that's something
> that is a bit concerning to me on the long-term scale.

That seems like a semi-random fear that I don't see being addressed by
a contrib/ directory.

> Me personally, I keep running 'svn ls
> http://svn.boost.org/svn/boost/branches/' to see if a branch pops up
> for release but I haven't yet.

Because it isn't done in the standard way here (cf. Beman's explanation).

> The structure on the server is largely there, but the svn tree
> looks pretty disorganized with lots of legacy clutter so it doesn't
> look like it's being used well.
>
> And lastly re: VCSs, lots of people fork FreeBSD to do experimental
> work out of the tree via git and hg, but the monolithic and
> serialized commit/review process seems to be working quite well from
> my perspective. A little bureaucratic but very stable and
> democratic without reliance on any one person to push the release
> forward. Anyway, food for thought. Hopefully there's something
> there that you can pick out of value.

One of the differences between FreeBSD and Boost is (I think) that the
vast majority of the actual code in FreeBSD is in the kernel, and thus
fairly highly coupled. IMO in Boost, maintainers quite properly have
much less interest in what is happening in other libraries. That may
make a difference in choosing a suitable procedure.

-- 
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
