Subject: Re: [boost] RE process (prospective from a retired FreeBSD committer)...
From: Dave Abrahams (dave_at_[hidden])
Date: 2011-01-30 16:52:53


At Fri, 28 Jan 2011 13:30:53 -0800,
Sean Chittenden wrote:
>
> > This is all pretty standard procedure for many projects, IIUC.
> >
> > However, my vision for Boost is a bit different:
> >
> > * each library has its own Git repo
> >
> > * each library maintainer tests and produces versioned releases of
> > that library on his/her own timetable
> >
> > * the latest release of a library is tagged "STABLE"
> >
> > * When assembling a new Boost release, the release manager marks the
> > "STABLE" revision from each library and begins boost-wide
> > integration/interoperability testing on those revisions
> >
> > * When a library release fails integration testing, the release
> > manager has several options:
> >
> > * Give the library maintainer a time window in which to produce a
> > new STABLE release that passes integration testing
> >
> > * Use an earlier STABLE release of the library
> >
> > * Withhold the library from the next Boost release (drastic)
> >
> > * If the failures appear to be due to breakage in another library,
> > the release manager can apply these procedures to the culprit library
> >
> > One idea behind this process is that it allows individual libraries to
> > "officially release" fixes and updates without forcing Boost's release
> > managers to coordinate a Boost-wide point release.
>
> This sounds reasonable to me, though what happens to software that
> works and is now in maintenance mode with an author that has left
> the boost community?

Note that I didn't mention the author above. :-)
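
As a minimal sketch of the release-assembly procedure quoted above: the
release manager takes each library's newest STABLE release, falls back
to an earlier STABLE release if integration testing fails, and
withholds the library if none passes. The names `Library`,
`pick_release`, `assemble`, and the `passes_integration` callback are
hypothetical, invented for illustration; the "give the maintainer a
time window" option is a human decision and is not modelled here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Library:
    """Hypothetical record of one library and its maintainer-made releases."""
    name: str
    # Releases in order, oldest first; the last entry is the one tagged "STABLE".
    stable_releases: List[str] = field(default_factory=list)


def pick_release(lib: Library,
                 passes_integration: Callable[[str, str], bool]) -> Optional[str]:
    """Return the newest release that passes integration testing, else None."""
    for version in reversed(lib.stable_releases):
        if passes_integration(lib.name, version):
            return version
    return None  # nothing usable: the library is withheld (the drastic option)


def assemble(libs: List[Library],
             passes_integration: Callable[[str, str], bool]
             ) -> Tuple[Dict[str, str], List[str]]:
    """Split libraries into those shipped (with version) and those withheld."""
    shipped: Dict[str, str] = {}
    withheld: List[str] = []
    for lib in libs:
        version = pick_release(lib, passes_integration)
        if version is None:
            withheld.append(lib.name)
        else:
            shipped[lib.name] = version
    return shipped, withheld
```

In practice `passes_integration` would stand in for a Boost-wide test
run; here it is only a placeholder predicate.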

> As for modularizing things out of the base, this is what FreeBSD has
> done with many pieces of software via the ports tree. In fact, many
> other traditional bits of software that many feel are default parts
> of an OS have been moved to the ports tree (e.g. perl is not in the
> base OS install on FreeBSD, and from what I hear gcc will be moved
> out as well in favor of llvm).

Cool.

> >> A schedule and handoff of hats/responsibilities for each of the
> >> branches is pretty well documented and is now time driven (it used
> >> to be feature driven, which didn't work as well as they'd hoped):
> >>
> >> http://www.freebsd.org/releng/
> >
> > Not sure how that page illustrates what you're saying above. Care to
> > illuminate?
>
> There are people who maintain roles and responsibilities for various
> branches as each branch goes through its life cycle. Each branch
> roughly moves through the following life cycle:
>
> HEAD = Any developer w/ commit access can commit
>
> pre-release branch (e.g. BOOST_1_46_prerelease) = Any re@ can
> approve commits to the branch but developers are required to get re@
> permission (re@ = release engineer hat)
>
> released branch (e.g. BOOST_1_46_0) = re@ hands the branch off to
> security-officer@ for security related fixes. Sometimes re@ will
> push out a micro version release for critical bugs, but
> security-officer@ takes ownership of the 1_46_0 branch until the
> branch is EoL'ed (I don't think boost has a concept of an EoL'ed
> branch at present).
>
> So in that scenario, there are three hats that could be worn
> (sometimes/often the same people, but looking at a problem from different
> perspectives).

If we can push all or nearly all of Boost into modules, managing these
kinds of phases will become the responsibility of individual library
maintainers.
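
The branch life cycle quoted above can be sketched as a small table of
which "hat" has to approve a commit at each stage. This is an
illustration under assumptions, not FreeBSD's actual tooling: the
`Stage` enum, the `APPROVER` table, and `may_commit` are hypothetical
names, the occasional re@ micro release on a released branch is
simplified away, and treating an EoL'ed branch as closed to commits is
an assumption.

```python
from enum import Enum
from typing import Optional, Set


class Stage(Enum):
    HEAD = "head"              # any developer with commit access can commit
    PRERELEASE = "prerelease"  # e.g. BOOST_1_46_prerelease
    RELEASED = "released"      # e.g. BOOST_1_46_0
    EOL = "eol"                # end of life (assumed closed to commits)


# Hat whose approval is required at each stage; None means no approval needed.
APPROVER = {
    Stage.HEAD: None,
    Stage.PRERELEASE: "re@",
    Stage.RELEASED: "security-officer@",
}


def may_commit(stage: Stage, approved_by: Optional[Set[str]] = None) -> bool:
    """Return True if a commit carrying these approvals may land on the branch."""
    approvals = approved_by or set()
    if stage is Stage.EOL:
        return False
    required = APPROVER[stage]
    return required is None or required in approvals
```

For example, `may_commit(Stage.PRERELEASE, {"re@"})` is true, while the
same commit without re@ approval is rejected.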

> The above link is just the time table and indicator for developers
> for who has the ability to bless a commit to a particular branch.
> Once a branch is handed off to a particular hat (e.g. re@ or sec@),
> then you have to go find the right person.
>
> One update that I would like to see for boost would be for the hats
> for each branch to be solidified. Maintainers of the 1.46 branch
> might be different than 1.45 vs 1.44 and there's a certain amount of
> institutional knowledge that comes with shepherding a release that
> results in slightly better judgement calls (i.e. "oh! that release
> had a funky interoperability bit with XYZ so we did ABC to fix
> this").

I'd rather see that kind of knowledge pushed upstream in the form of
permanent changes to the individual libraries involved.

> Actually, a real world example of this would probably be what
> happened with boost::serialize a release or two back. When 1.50
> comes around and someone else is doing the re@ process, but there's
> a fix for 1.40, the 1.40 re@ team will almost certainly remember the
> oddity, but the 1.50 re@ team may not.

And I'd like the maintainer of Boost.Serialization to be the
responsible party for that.

> my point was in pushing MFC-like approvals in to the
> commit messages so that people can identify when a change was
> approved or why, or spark a discussion on a mailing list.

Makes sense.

> >> This gives greater latitude in terms of what can be included in
> >> the commit.
> >
> > Could you be more specific?
>
> An MFC's commit-hook requires that filenames match, that's it. When
> you use the 'Reviewed By' commit message header, you can include
> different files that weren't in the original commit to -CURRENT. I
> don't know that this level of control needs to be in place, but
> there are specific process definitions for each of the commit
> headers.

OK. In my proposal it's up to each library maintainer to certify
particular states as "releasable," and he or she can choose the
process used to get there.

> >> If someone commits with that line and didn't have permission to do so,
> >> as a policy, the commit is always reverted 100% of the time as a
> >> matter of principle. Frequently it's re-committed, but it's a big
> >> slap to the back of the hands to have to go through the process
> >> again. Needless to say, the commit mailing list is the most active
> >> and widely read list as a result.
> >
> > Sorry, I don't understand. How does that procedure result in many
> > people reading the commit list?
>
> People read the commit mailing lists because that's the easiest way
> to quickly digest the changes that are made to the tree. If
> something catches your eye, you can quickly go from the mailing list
> to a diff to review the change. Because enough people do this and
> the attention level has reached critical mass, the commit list is
> the focal point for most of the technical/mechanical discussions.

Yeah, OK... but what motivates people to actually do that? Why do
people want to "quickly digest the changes that are made to the tree?"
I think the answer to that question is the key to what makes this
process work for FreeBSD.

> >> And tons of code gets reviewed with many eyes viewing it as a
> >> result. Unlike boost, FreeBSD uses an abridged commit message that
> >> doesn't include the actual diff itself
> >
> > Do you mean that the message sent to the commit mailing list is
> > abridged?
>
> Yes.

Nice idea.

> >> I can't stress the organizational difference and peer pressure a
> >> widely used commit mailing list brings about.
> >
> > Specifically?
>
> If many eyes are watching a commit list and a discussion starts,
> relevant and interested parties are quick to respond to the
> technical merits/problems with a particular commit. Because of this
> self-policing and pressure to perform in front of your peers, the
> quality of commit messages is very high (as well as commits since
> reverts show up on the commit mailing list as well).

That makes a *lot* of sense, thanks.

I believe we *could* set this up for Boost even if it were split into
separate modules/repositories.

> My gripe with git and all DVCSs is this frequent lack of a central
> review process and core community of reviewers.

I don't see what that has to do with the tools.

> I forgot to mention, but in the case of the BSDs, if you have commit
> access, you are involuntarily subscribed to the commit mailing list.

Didn't you just tell me that you can filter out messages about the
parts of the tree you're not interested in?

> >> Post-1.46 release fixes could go in to 1.46.0 and (heaven forbid),
> >> maybe even a micro version with specific fixes for the .46 minor
> >> version.
> >
> > Sorry, I don't understand the above at all. How could post-1.46 stuff
> > go into 1.46.0? 1.46 == 1.46.0! And I don't even grok the "micro
> > version..." part enough to know what questions to ask you about it. Care
> > to try again?
>
> No worries, I was in a bit of a hurry after I double tapped send.
>
> Here's the significance of the tag/branch naming scheme in a boost
> hypothetical situation.
>
> BOOST_1_CURRENT = trunk for Boost 1.X development (plans/criteria
> for a 2.0?)

Let's not start a whole new bag of threads right now, OK? :-)

> BOOST_1_STABLE = the branch that will have the next release
> (e.g. 1.47.0). There isn't much in the way of API stability that
> needs to be maintained atm, but if there were, a BOOST_2_CURRENT
> with incompatible APIs from BOOST_1_CURRENT would probably need a
> head vs production branch.
>
> BOOST_1_46 = re@ has branched BOOST_1_CURRENT to create BOOST_1_46
> to get the tree in shape for a release. The only commits allowed to
> go in at this time are commits that pertain to stabilizing the
> release and getting tests to pass.
>
> BOOST_1_46_0 = re@ has shipped the 1.46.0 release. This branch is
> now managed by security-officer@ and to a lesser degree the re@
> team.
>
> BOOST_1_46_1 = A branch/tag that security-officer@ creates once
> there is a reason to release an updated release of 1.46.0. 1.46.0
> to 1.46.1 is required to be API compatible. In the case of the
> BSDs, all minor and micro versions are required to be ABI compatible
> (which is why gcc in the base system only gets updated in the BSDs
> along major release numbers).
>
> Is that a better explanation?

Sure, though if you hadn't confused me with the other part, probably
unnecessary; something like that is a pretty standard procedure across
many projects.
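
For readers less familiar with that convention, the hypothetical
naming scheme quoted above can be sketched as a small classifier. The
branch names come from the quoted message; the `classify` and
`must_be_api_compatible` helpers and the role strings are assumptions
made up for this illustration.

```python
import re
from typing import Tuple

# (pattern, role) pairs for the hypothetical branch/tag names in the quoted text.
_PATTERNS = [
    (re.compile(r"BOOST_(\d+)_CURRENT"), "development trunk"),
    (re.compile(r"BOOST_(\d+)_STABLE"), "next release"),
    (re.compile(r"BOOST_(\d+)_(\d+)"), "release stabilization (re@)"),
    (re.compile(r"BOOST_(\d+)_(\d+)_(\d+)"), "shipped release (security-officer@)"),
]


def classify(branch: str) -> Tuple[str, Tuple[int, ...]]:
    """Map a branch name to its role and its numeric version components."""
    for pattern, role in _PATTERNS:
        match = pattern.fullmatch(branch)
        if match:
            return role, tuple(int(part) for part in match.groups())
    raise ValueError(f"unrecognized branch name: {branch!r}")


def must_be_api_compatible(a: str, b: str) -> bool:
    """True if both names are shipped releases of the same major.minor version."""
    role_a, ver_a = classify(a)
    role_b, ver_b = classify(b)
    shipped = "shipped release (security-officer@)"
    return role_a == role_b == shipped and ver_a[:2] == ver_b[:2]
```

So `BOOST_1_46_0` and `BOOST_1_46_1` fall under the API-compatibility
requirement, while `BOOST_1_46_0` and `BOOST_1_47_0` do not.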

> > having a contrib/ directory is an interesting idea for Boost. That'd
> > be very different from the sandbox. But who's responsible for
> > maintaining that code?
>
> Whoever's the review manager and the review manager's mentee who
> originally wrote the module.

/me bites his nails

That idea gives me the willies.

> PostgreSQL's autovacuum went from being a fringe project to a
> contrib/ module and a core feature in 2 minor releases because of
> the huge interest in its use/adoption. Boost.Log or Atomic or any
> of the other "we all really want this but it's not quite finalized"
> modules seem like ideal candidates for inclusion in such a directory
> because it could generate additional interest/eyes.
>
> Like autovacuum, Boost.Atomic has the potential to be a fantastic
> piece of boost's infrastructure where many boost modules could make
> use of it.
>
> A contrib/ or proposed/ would go a long way towards keeping boost
> lean-ish, too. Here's how:
>
> Being able to modularize boost modules in to little packages that
> can be enabled/disabled independently of the main source tree/core
> will reduce clutter and make interdependencies explicit. Right now
> it takes ~10min and 1GB of RAM on my 2.8GHz Xeon to get bjam to
> complete its dependency checks before the first line of anything is
> compiled.

Most of that is not actually due to dependency analysis.

> Having smaller modules that can be enabled/disabled quickly because
> things are well contained in isolation should reduce the overhead
> for source installs. As time passes and the number of headers or
> modules grow inside of the main boost source tree, how long will
> that process take to complete in the future? Reducing the number of
> things that bjam has to keep track of in the source tree seems like
> the only way to mitigate this (for me personally, this has been a
> growing source of frustration and was a sore spot for me with boost
> since day #1 and why I initially was using cmake instead of bjam).

I'm all for modularization, but again, I don't see how *adding*
contrib/ to the boost distribution can help make boost leaner. I
think it's just a logical impossibility.

> >> At present it seems like boost-*.tar.bz2 is on track to include
> >> boost/kitchen/sink.hpp and boost/bath/water.hpp and that's something
> >> that is a bit concerning to me on the long-term scale.
> >
> > That seems like a semi-random fear that I don't see being addressed by
> > a contrib/ directory.
>
> See above re: bjam's install time. Patching in various boost
> libraries to be compiled and installed via bjam is laborious,
> especially when things are still under development and the compile
> bombs out.

Again, I don't see this being addressed by the presence of contrib/

> > One of the differences between FreeBSD and Boost is (I think) that the
> > vast majority of the actual code in FreeBSD is in the kernel, and thus
> > fairly highly coupled. IMO in Boost, maintainers quite properly have
> > much less interest in what is happening in other libraries. That may
> > make a difference in choosing a suitable procedure.
>
> I see boost as much closer to the FreeBSD port system, actually.

+1

> I think there is a boost "core" but things that are written using
> boost aren't explicitly core until something depends on them.

https://github.com/boost-lib/core

> Anyway, hopefully this has been useful (and a break from the string
> discussion, yikes!). -sc

:-)

-- 
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
