Subject: Re: [Boost-build] bjam 4.0.. in C++
From: Spencer E. Olson (olsonse_at_[hidden])
Date: 2010-05-23 02:00:01
My two, three, or four cents on the matter:
1. Issue with Boost dependence:
One of the main advantages of bjam currently is that it is written in ANSI C
and compiles very, very easily and very quickly for every platform I have to
work with. Currently, bjam is not a heavyweight piece of code that is
difficult to "carry" from machine to machine, let alone compile. By contrast,
think of the "other" main contender for Boost attention: CMake. CMake is both
gigantic and a pain in the neck to compile (yes, Kitware does provide prebuilt
versions for many platforms already--but I prefer the small footprint, easy-
If you have bjam rely on any part of boost that can't be repackaged and
shipped with bjam code, the portability of bjam quickly becomes a nightmare.
Boost does not build completely on all the platform-compiler combinations that
I work on (IBM AIX cluster machines, Cray XT[345...], ..., Pathscale, PGI,
ICC, GCC, XLC, ...). Complicating either the footprint or compilation of
bjam/Boost.Build will cause me, for one, to reexamine the build system choice.
I have absolutely nothing against C++, I certainly prefer it over C, but we've
found many difficulties compiling all of Boost already. If bjam went with C++
underneath, I would prefer it to be portable, allowing me to continue using
Boost.Build even where Boost is not yet tractable.
2. Issue with Boost.Spirit
A while ago, I had to write another parser. Plain old yacc and lex don't play
too well with C++, especially where exceptions and errors are important. I
really wanted to try a C++ option, so I tried out Boost.Spirit. It worked
quite well and wasn't terribly difficult, although I would say that the
documentation is not very good--especially considering that Boost.Spirit uses
C++ in unconventional ways.
Before worrying about portability issues, I had only one beef with Spirit: it
took forever to compile. I ended up trying to isolate the parser in its own
object code as much as possible (I had parsing happening all over the place)
because it took so long.
After spending a month testing several Boost components (Spirit, smart_ptr,
parts of math, test, and a few smaller pieces) that I wanted to use on the
various platform+compiler combinations, my grievance list grew. Many things
worked quite well: Boost.Test required a few small changes to compile
everywhere (except PGI 8, 9, 10 where I exposed a bug in their math code
generation), Boost.smart_ptr worked easily with one or two #ifdefs for
Pathscale, and Boost.Math had several problems--just not with the pieces I
needed. Boost.Spirit was quite a bit more problematic, and depended heavily
on the compiler. The real stopping point was when I tried compiling with
IBM's (on AIX) XLC [v 8, 9, 10]. Any really complicated tests failed to parse.
Simple tests parsed but never finished compiling--I waited for as long as the
sysadmin would let me (>15 minutes) without submitting the compilation to a
batch processing queue. Further testing showed that the IBM compilers had
severe problems implementing the standard with respect to default template
parameters. I hear that during the last BoostCon, Michael Wong claimed that
IBM's new 11.1 compiler can compile Boost 1.40, though I'll have to see it to
believe it--I'm working on getting a copy to play with.
At this time, I did a little more research into other C++ parser options. I
found that GNU Bison and GNU Flex actually play very well with C++. I rewrote
my grammar/lexer in Bison/Flex, including using their extensions to generate
C++ code. The result: a much easier to maintain parser. It compiles very
quickly, is very portable, and actually turned out more robust than the Spirit
version--I think it was easier to make robust. There is good documentation
for Yacc/Bison grammars.
I will say that I don't simply include the bison/flex code in my source code.
I have Boost.Build detect any installed bison/flex versions at build time. If
the version is new enough, I have it generate new C++ code. Otherwise, I use
pregenerated versions that I also package with my code. I also have to
package one C++ header file from Flex from the machine where the pregenerated
versions are created. This has been working very well and has greatly
improved compiling and portability issues.
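
The detect-or-fall-back decision described above could look roughly like the
sketch below. It is written in Python purely for illustration, not in
Boost.Build's own language, and it is not the author's actual build code. The
function names, the minimum version, and the "generate"/"pregenerated" labels
are all assumptions; the post does not say which version counts as "new
enough".

```python
import re
import shutil
import subprocess

# Hypothetical threshold: the original post does not state a minimum version.
MIN_BISON = (2, 4)


def parse_version(version_output):
    """Return the first dotted version number in the text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", version_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def tool_version(tool):
    """Return the installed version of `tool`, or None if missing or unparsable."""
    path = shutil.which(tool)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return parse_version(result.stdout)


def choose_parser_source(version, minimum=MIN_BISON):
    """Regenerate when the tool is new enough, else use the packaged sources."""
    if version is not None and version >= minimum:
        return "generate"
    return "pregenerated"
```

A build script would call something like choose_parser_source(tool_version("bison"))
once per tool and pick its inputs accordingly. A missing tool, an unreadable
version string, and a too-old version all lead to the same safe fallback: the
pregenerated sources shipped with the code.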
Sorry about the long rant...
On Saturday 22 May 2010 07:57:05 Rene Rivera wrote:
> Once upon a time I started thinking, investigating, sketching, and doing
> some minor coding for a rewrite of bjam in C++ to address the various
> problems. Mainly the horrible memory management and hence speed.
> Unfortunately at the time I had decided that it would not depend on
> Boost libraries in the hope of preventing a rather nasty boot-strapping
> problem. This presented problems as I couldn't find any good
> lexer/parser that played nice with C++ and wasn't a big PITA to use. It
> also loomed on me how much of Boost I would end up writing anyway. So
> I've just about decided to abandon my assertion that a new bjam would
> not depend on Boost. But I thought I would ask for feedback on this
> before continuing down this path. The new plan would be to:
> * Have bjam 4 depend on a *released* version of Boost.
> * Written with Spirit as the base lexer/parser.
> * Have a variety of syntax improvements and support for UTF8/Unicode.
> I'm cleaning up the branch where I started this work at the moment. But
> I'd like to hear ideas and comments on the new approach. And if you have
> feature requests for such a rewrite, now would be a good time to voice
> them, so we can add them to trac.