Boost-Build :

From: Ali Azarbayejani (ali_at_[hidden])
Date: 2003-03-18 12:27:41


Hi All,

I have arrived at an (almost) successful build of my test project using
Boost.Build. This message is a rather long, but organized, note
summarizing how Boost.Build compares to the existing GNU Make-based
build system, including a number of Boost.Build problems it exposed.
Please comment on the items in my Summary in part <3>.

<0> FYI, the existing GNU Make build system supports "modules"
(i.e. "projects" in Boost.Build terminology), build
configurations, multiple OSes, multiple compilers, etc., very much
like Boost.Build in spirit, in some ways less general, in some
ways more. Its main problems are that it is

<A> slow, because of required recursive make invocations,
<B> hacked, because make doesn't support recursive function calls
and some other useful language constructs,
<C> unreadable, because of make syntax exacerbated by <B>.

I can live with <B> and <C> because the user doesn't see these,
but <A> is unworkable and there's no good way around it.
Boost.Build reads in all the Jamfiles at once, so my hope was that
it would solve <A>. BB should also solve <B> because it has
better language constructs. Unfortunately, given my experience so
far, I have to say BB doesn't solve <C> in a meaningful
way...indeed, Jam files have better syntax, but the BB system is
so complex that it remains incomprehensible to a regular user as a
practical matter.

<1> The GOOD NEWS is that I was able to convert to Boost.Jam /
Boost.Build by adding a very thin layer (a single 236-line .jam
file) on top of BB, and translating all my project Makefiles to
very similar-looking Jamfiles (because these are mostly simple
variable declarations and conditional modifications of default
build configurations).

(However, note that it took me 2-3 weeks to figure out how to
write those 236 lines!...Jam/BB is *very* difficult to learn.)

Size of GNU Make Build System                   2739 lines
Size of Boost.Build (i.e. new/*.jam)           13124 lines

Size of my.jam (on top of Boost.Build
to replicate function of GNU Make Build
System)                                          236 lines

(So the total size of Boost.Build plus my layer (13124+236) is over
4 times that of the GNU Make system (2739), but the size of the part
of the Boost.Build implementation that *I* have to maintain (236) is
only about 1/10th of what I currently have to maintain (2739).)
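
As a quick check of the size comparison above, the ratios can be recomputed from the three line counts quoted in this message. This is only a sketch of the arithmetic; the variable names are illustrative.

```python
# Line counts quoted in the message above.
make_lines = 2739    # GNU Make build system
bb_lines = 13124     # Boost.Build (new/*.jam)
my_lines = 236       # my.jam layer on top of Boost.Build

# Total Boost.Build-based system relative to the Make system.
total_ratio = (bb_lines + my_lines) / make_lines

# Share of the Make system's size that still has to be maintained by hand.
maintained_share = my_lines / make_lines

print(f"total is {total_ratio:.2f}x the Make system")        # 4.88x
print(f"maintained part is {maintained_share:.1%} of 2739")  # 8.6%
```

So "over 4 times" and "about 1/10th" both hold for the quoted figures.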

<2> The BAD NEWS is that Boost.Build can't quite do the job yet due to
bugs (hopefully they are bugs and not fundamental problems). In
order of severity:

<A> SLOW, SLOW, SLOW! Worse than GNU Make in some cases. If this
can't be solved, BJam/Boost.Build is a dead end for me. My
project contains 18 "projects" (i.e. 18 Jamfiles), including 1
"exe" target and 17 "lib" targets (each in its own directory
with Jamfile). BB says it finds 3698 targets total when built
from the root project. It seems to read the 18 Jamfiles in
seconds, then sits there processing for 3 minutes, then
finally starts running commands.

On my 2.0 GHz, 1 GB machine running Red Hat Linux 7.2:

Bjam/Boost.Build
4:38 Total, including building 1 exe and 9 of the 17 libs
(the other ones can't be built as sub-projects because
of command-line overflow; see below)
2:57 Reading, binding, etc
1:41 Executing build actions

GNU Make/Make Build System
3:34 Building dependency graph (only needs to be done once)
2:08 Build actions, including reading and processing
sub-Makefiles

BB is slightly better on a CLEAN BUILD (4:38 vs 5:42), but it
ought to be a LOT better (around 1:45): once the files are read
in, my 2 GHz machine should be able to process dependencies in
seconds, not minutes. The problem for development is that every
BB REBUILD has a 2:57 wait period, even if you touch just one
file (completely unworkable). The following is a comparison:

        CLEAN BUILD   PARTIAL-REBUILD       FULL-REBUILD
----------------------------------------------------------
Boost   4:38          2:57 + (part) 1:41    4:38
Make    5:42          (part) 2:08           2:08
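
The totals in the comparison follow directly from the phase timings listed earlier. The sketch below re-derives them from the quoted "m:ss" values; the helper names are illustrative and not part of any build tool.

```python
def to_seconds(mmss):
    """Convert an 'm:ss' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def to_mmss(total_seconds):
    """Convert a number of seconds back to 'm:ss'."""
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"


# Boost.Build: reading/binding phase plus build actions.
bb_clean = to_seconds("2:57") + to_seconds("1:41")

# GNU Make: dependency graph (done once) plus build actions.
make_clean = to_seconds("3:34") + to_seconds("2:08")

# Fraction of the Boost.Build clean build spent before any action runs.
bb_overhead_share = to_seconds("2:57") / bb_clean

print(to_mmss(bb_clean))            # 4:38
print(to_mmss(make_clean))          # 5:42
print(f"{bb_overhead_share:.0%}")   # 64%
```

Roughly two thirds of the Boost.Build clean build is spent before the first command runs, and that part is paid again on every rebuild.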

<B> Command line overflow on link (Show-stopper!). Some reasons:

<i> Spurious arguments (e.g. unnecessary -rpath args on a static
link).
<ii> Duplicate libraries on the command line (all my libraries and
all system libraries appear twice on the command line).
<iii> Property-keyed binary directories
(bin/gcc/debug/link-static/...) are verbose.
<iv> Long absolute path names to libraries are used instead of
shorter relative path names.

<C> Command line overflow on compile (Show-stopper!). Some reasons:
<i> Super long path names
<a> Compiles are done relative to the root project directory
(rather than the directory of the sub-project),
resulting in extra path in front of every single file,
directory, and include path.
<b> Property-keyed binary directories
(bin/gcc/debug/blah/blah/...) add yet more extra path
in front of every file in the target directory.

<D> Command line overflow on archive/dll (Show-stopper!). Some
reasons:
<i> Super long path names
<a> Compiles are done relative to the root project directory
(rather than the directory of the sub-project),
resulting in extra path in front of every object file
going in the archive or dll.
<b> Property-keyed binary directories exacerbate <a>.

<E> Library order wrong in link (Show-stopper!). Libraries do not
appear on the command line in dependency order.

There are other minor bugs and design issues that I have run
across that are not show-stoppers and can be dealt with in time.

<3> In SUMMARY, here are the major problems, classified as what I
understand to be BUGs or DESIGN issues. These are all problems
identified on Linux/GCC (I haven't tried Windows yet).

<A> (BUG?) Performance after reading and before actions is
unacceptably slow. (I hope this is merely an implementation
inefficiency rather than a fundamental limitation!!) I have
very little idea how to go about diagnosing this.

<B> (BUG) For Linux/GCC, "-rpath" appears in static links. It
should only appear if BOTH <hardcode-dll-paths>true AND
<link>shared. My workaround for this is to add the conditional
"<link>shared:<hardcode-dll-paths>true" as a requirement
instead of simply "<hardcode-dll-paths>true" (the latter
should be sufficient, however).

<C> (BUG) For Linux/GCC, libraries are not in dependency order.
(Vladimir indicates this is a "bite-sized" problem to solve.
If so, let me at it.)

<D> (BUG? DESIGN?) For Linux/GCC, all libraries appear twice on
the link line. (This was clearly done deliberately, but it's
not clear whether it was done on a design principle or to fake
proper dependency order to work around the above bug by
putting each library before and after each other library. I
suspect the latter because we have, in principle, the proper
dependencies and should be able to construct a proper minimal
link line. The current implementation is not acceptable
because it has too much potential for command-line overflow,
especially when coupled with long directory and property
paths.)
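
To illustrate what a "proper minimal link line" for items <C> and <D> could look like, here is a small sketch that orders libraries so each one appears exactly once and before everything it depends on (the order GNU ld wants for static archives). It is a plain depth-first topological sort, not Boost.Build's actual algorithm, and the library names and the dependency table are hypothetical.

```python
def link_order(deps, roots):
    """Return each library once, with dependents before their dependencies.

    deps  -- dict mapping a library to the libraries it depends on
    roots -- libraries the executable links against directly
    """
    order = []
    state = {}  # library -> "visiting" or "done"

    def visit(lib):
        if state.get(lib) == "done":
            return
        if state.get(lib) == "visiting":
            raise ValueError(f"dependency cycle through {lib}")
        state[lib] = "visiting"
        for dep in deps.get(lib, []):
            visit(dep)
        state[lib] = "done"
        order.append(lib)  # post-order: dependencies are appended first

    for root in roots:
        visit(root)
    order.reverse()  # flip so dependents come first on the link line
    return order


# Hypothetical project: gui needs core and util, core needs util and m.
deps = {
    "gui": ["core", "util"],
    "core": ["util", "m"],
    "util": ["m"],
}
line = link_order(deps, ["gui"])
print(" ".join(f"-l{name}" for name in line))  # -lgui -lcore -lutil -lm
```

With the dependency graph already available, every library needs to be named only once, which halves the library portion of the link line compared with listing each one twice. Genuinely circular dependencies between static libraries would still need special handling; this sketch simply reports them.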

<E> (DESIGN) Consider adding this functionality...when computing a
path to use on the command line, instead of just computing the
absolute path, also compute the relative path; use the shorter
one (a new function path.minimal() would be useful for this
purpose). This could help alleviate command line overflow,
particularly when the project tree is buried deeply in the
file system. It's a bit sneaky, so the user should be able to
enable or disable this feature; I don't care which is default.
Suggest three possibilities: "absolute" (always use absolute),
"relative" (always use relative), "minimal" (use shorter).

<F> (DESIGN) "Remote" builds are performed in root rather than
remote directory. Consider building each target in the
directory of its own Jamfile (rather than in the directory of
the root Jamfile). This will alleviate command line overflow,
guarantee consistent run-time behavior for a project when used
on its own or as a sub-project, and help avoid internal
implementation errors having to do with translating paths in
many different contexts.
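
The combined effect of suggestions <E> and <F> can be sketched as follows. The function below is a hypothetical stand-in for the proposed path.minimal() with the three suggested modes; it is not an existing Boost.Build function, and the directory names are made up for illustration.

```python
import posixpath


def choose_path(target, start, mode="minimal"):
    """Pick the form of `target` to put on a command line run from `start`.

    mode is "absolute", "relative", or "minimal" (whichever is shorter).
    `start` must be an absolute POSIX path.
    """
    absolute = posixpath.normpath(posixpath.join(start, target))
    relative = posixpath.relpath(absolute, start)
    if mode == "absolute":
        return absolute
    if mode == "relative":
        return relative
    if mode == "minimal":
        # On a tie, min() keeps the first candidate, i.e. the absolute path.
        return min(absolute, relative, key=len)
    raise ValueError(f"unknown mode: {mode}")


root = "/home/ali/work/project"   # hypothetical root Jamfile directory
sub = root + "/libs/math"         # hypothetical sub-project directory
obj = sub + "/bin/gcc/debug/solver.o"

from_root = choose_path(obj, root)  # command run from the root project
from_sub = choose_path(obj, sub)    # command run from the sub-project
system_lib = choose_path("/usr/lib/libm.a", sub)

print(from_root)   # libs/math/bin/gcc/debug/solver.o
print(from_sub)    # bin/gcc/debug/solver.o
print(system_lib)  # /usr/lib/libm.a
```

Running the command from the sub-project directory shortens every object path by the sub-project prefix, while "minimal" still falls back to the absolute form for files outside the project tree, where the relative path would be longer.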

<G> (DESIGN) Archive and DLL member count is limited by length of
command line and is therefore, in the current implementation,
dependent on a number of run-time factors (because all of
these factors contribute to the excessive path prefix on each
object file): (i) location of project in file system, (ii)
whether being built directly or indirectly as a sub-project,
and (iii) which build variant is being built.

Implementing suggestions above for <E> and <F> will alleviate
this problem and use of the function builtin.variant() can
address (iii), but there may still be a library that overflows
the command line (Lapack, e.g., has something like 1600
files). The overflow case should be detected and handled in
the following way. For an Archive, object members can be added
incrementally (as many as will fit on the command line, a la
the "xargs" utility). For a DLL, make an Archive
incrementally as above and link to the Archive.
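
The xargs-style handling suggested for <G> could look roughly like the sketch below, which splits the object list into batches so that each archive command stays under a length limit. The limit of 100 characters, the file names, and the helper name are assumptions chosen for illustration; a real limit depends on the OS and also counts the environment. The commands are only built as strings here, not executed.

```python
def archive_commands(archive, members, max_len):
    """Build 'ar rc' commands that each fit within max_len characters.

    Each command adds as many members as fit; running the commands in
    order fills the archive incrementally.
    """
    prefix = f"ar rc {archive}"
    commands = []
    current = []
    length = len(prefix)
    for member in members:
        extra = 1 + len(member)  # separating space plus the file name
        if len(prefix) + extra > max_len:
            raise ValueError(f"{member} cannot fit on any command line")
        if current and length + extra > max_len:
            # Flush the full batch and start a new command.
            commands.append(prefix + " " + " ".join(current))
            current, length = [], len(prefix)
        current.append(member)
        length += extra
    if current:
        commands.append(prefix + " " + " ".join(current))
    return commands


# Hypothetical library with 1600 object files, as in the Lapack example.
members = [f"obj/file{i:04d}.o" for i in range(1600)]
commands = archive_commands("libdemo.a", members, max_len=100)

print(len(commands))   # 320
print(commands[0])
```

Because "ar r" adds or replaces members in an existing archive, the batches can be applied one after another. For a DLL, the same batches would first build an intermediate archive, which is then passed to the link step as a single argument.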

Please comment on any or all of the above. I am happy at this point
to devote my time to solving these issues, but I need guidance.

Thanks,
--Ali

 

