
Subject: [Boost-cmake] Analysis of the current CMake system
From: Brad King (brad.king_at_[hidden])
Date: 2009-01-14 11:52:21


Hi Folks,

I'm considering attending BoostCon 2009 to provide developer-level
CMake expertise, and I'm looking into proposing a session as Hartmut
requested. In preparation I've downloaded and tried the current
system and read back through some of the discussion on this list:

  http://thread.gmane.org/gmane.comp.lib.boost.cmake/4
  http://thread.gmane.org/gmane.comp.lib.boost.cmake/10

The current system feels very bulky compared to CMake-generated build
systems I've used for projects of comparable size. This is primarily
due to the use of top-level targets for the tests. IMO the efforts to
provide test-to-library dependencies and to avoid log-scraping have
steered the system off-course.

One of the goals of CMake is to let developers use their favorite
native tools. These include Visual Studio and Xcode along with Make
tools. In order to support these tools CMake's build model separates
high-level targets from low-level file dependencies. The
add_executable(), add_library(), and add_custom_target() commands
create high-level targets. Each target contains file-level rules to
build individual sources.

In generated VS and Xcode projects the high-level targets become
top-level items in the GUIs. These IDEs define file-level rules
inside each target. In generated Makefiles there is a two-level
system in which the outer level knows only about inter-dependencies
among high-level targets and the inner level loads the file-level
rules inside each target. This design yields fast builds because each
make process sees a relatively small set of rules, and it makes
automatic dependency scanning easy and reliable. It also keeps the
representation of build rules in the IDEs tractable. The key is that
there should be relatively few high-level targets compared to the
number of file-level rules.

Currently Boost's CMake system creates about two high-level targets
for each test. One is the add_executable() to build the test and the
other is the add_custom_target() to run the test. This results in a
very large number of high-level targets. The generated Xcode and VS
projects are simply too large for the IDEs to load (I waited 10
minutes for VS to try), which already defeats one purpose of using
CMake. This leaves the Makefile generators as the only option.

The large number of high-level targets places many rules in the outer
make level, which leads to very long startup times (look at
CMakeFiles/Makefile2, which make needs to parse many times). For
example, I run

  time make type_traits-rank_test VERBOSE=1

and get

  52.49s user 0.31s system 96% cpu 54.595 total

but only about 1s of that time was actually spent running the compiler
and linker.

Boost needs to build libraries, documentation, and other files to be
placed in the install tree. Rules to build these parts can fit in
relatively few high-level targets and should certainly use them. This
is currently done when not building the tests (BUILD_TESTING=OFF).
The problem lies in building the tests. These do not belong in
high-level targets for the reasons described above. I'd like to help
you create an alternative solution.

There are four kinds of tests:

  boost_test_run
  boost_test_run_fail
  boost_test_compile
  boost_test_compile_fail

Let's first consider the run and run_fail tests. In our projects we
typically link all tests for a given package into one (or a few)
executable(s). The executable's main() dispatches the individual
tests based on name. For example, one might manually run the test
mentioned above like this:

  bin/type_traits_tests rank_test
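
The dispatching main() described above might look roughly like the
following C++ sketch. It is only an illustration, not code from Boost
or CMake: the names TestEntry, test_table, find_test and test_driver,
and the extent_test entry, are assumptions made up for this example.

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical tests. Each returns 0 on success and non-zero on failure.
// In a real driver these would live in separate source files.
static int rank_test()   { return 0; }
static int extent_test() { return 0; }

// One table row per test: the name used on the command line and the
// function to call for it.
struct TestEntry {
    const char* name;
    int (*run)();
};

static const TestEntry test_table[] = {
    {"rank_test", rank_test},
    {"extent_test", extent_test},
};

// Find a test by name; returns nullptr when no test has that name.
const TestEntry* find_test(const char* name) {
    for (const TestEntry& entry : test_table) {
        if (std::strcmp(entry.name, name) == 0) return &entry;
    }
    return nullptr;
}

// Intended body of main(): expects exactly one argument, the test name.
// Returns 2 for a usage error or unknown name, else the test's result.
int test_driver(int argc, const char* const argv[]) {
    if (argc != 2) {
        std::fprintf(stderr, "usage: %s <test-name>\n",
                     argc > 0 ? argv[0] : "tests");
        return 2;
    }
    const TestEntry* entry = find_test(argv[1]);
    if (entry == nullptr) {
        std::fprintf(stderr, "unknown test: %s\n", argv[1]);
        return 2;
    }
    return entry->run();
}
```

In this sketch the executable's main(argc, argv) would simply return
test_driver(argc, argv), so the process exit code is that of the
selected test and a harness such as CTest could check it per test.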

Combining tests this way reduces the number of top-level targets in
the build to one per library to be tested. It also reduces total link
time and disk usage (especially for static linking and when many tests
share common template instantiations). However, there are some
disadvantages:

  (a) If one test fails to compile, none of the tests linked with it
      can run
  (b) A bad test may accidentally link due to symbols from another test

Problem (a) is not a big deal IMO. If a test doesn't compile the
library it tests has a serious problem and needs manual attention
anyway. Problem (b) may or may not be a big problem for Boost (it
isn't for us). However, there is an alternative tied to the
treatment of compile_fail tests.

Let's now consider the compile and compile_fail tests. The compile
tests could be built into a single executable along with the run and
run_fail tests above, but the compile_fail tests cannot. Boost's
current solution drives the test compilation by generating an actual
compiler command line. It bypasses CMake's knowledge of the native
build tools and tries to run its own command line which may not work
on all compilers. Furthermore, this is not always representative of
how users build programs against Boost (they may use CMake, or create
their own VS or Xcode project files).

CTest provides an explicit feature, its --build-and-test mode,
specifically for testing sample external projects built against the
library being tested. All four test types can be done using this
feature. Each test would consist of a complete build of a small
sample project just like a user might create. The tests could even be
run against an install tree if desired.

I gather from some earlier discussion on this list that one reason the
current solution performs tests at build-time is to permit tighter
integration and dependencies between test builds and library builds.
It allows one to ask for a test to be run without first building the
targets it needs in a separate step, and for one to avoid building
tests for a library that failed to compile. However, I encourage you
to reinstate CTest in your process. It will give pretty good
testing right now, and we can address specific issues or requirements
by improving CTest itself. Furthermore, CTest doesn't do log-scraping
to detect errors. Every test gets run individually and its output is
recorded separately. Boost's current system puts the tests inside the
build and then jumps through hoops to avoid log-scraping of the
results.

This brings us to the log-scraping issue in general. CMake permits
users to build projects using their native tools, including Visual
Studio, Xcode and Makefiles. In order to make sure builds with these
tools work, the test system must drive builds through them too.
Testing with just one type of native tool is insufficient. Since the
native tools do not support per-rule reporting, log-scraping is
necessary. However, it does not have to be at global granularity as
it is now.

AFAICT the following requirements were used to justify log-scraping
avoidance:

  (a) Errors from each test should be separated
  (b) Errors from each library's build should be separated
  (c) Errors from each object file should be separated and reported
      with the command line that failed

All of these apply only to automatic testing and reporting. A
developer or user that is working interactively on a local build tree
will have enough information from the native build tools to fix the
problem. (Otherwise, a better native tool is needed for development.)

Problem (a) is automatically handled by the testing solution I propose
above since test results are recorded and reported individually.

Problem (b) can be addressed by teaching CTest to build one high-level
target at a time and record each build separately.

Problem (c) is a lack of convenience when the build error is subtle
enough to require the compiler command line to diagnose it (which in
my experience is very rare). This problem cannot be addressed for VS
and Xcode builds. Boost's current solution for Makefiles is a bit of
a hack but can work. There are alternatives such as creating a new
native build tool that provides the desired granularity (like
low-level jam?) and adding a generator for the tool to CMake. However,
this new tool would only be for convenience of some developers and
testers. We still have to test use of the other native build tools,
which requires log-scraping to some degree.

In summary, I'd like to help you folks address these issues. Some of
the work will be in Boost's CMake code and some in CMake itself. The
work will benefit both projects. We can arrange to meet at BoostCon,
but we can probably get a lot of discussion done on this list before
then. BTW, can anyone suggest a preferred format for a BoostCon
session from the boost-cmake-devs' point of view?

Thanks,
-Brad

P.S. Boost's CMake code currently uses file(TO_NATIVE_PATH) but then
references the result variable in CMake code. This causes a build
tree with spaces in the path to create files in a directory with
escaped backslashes in its name. Also, will Python deal with
non-escaped backslashes in Windows paths inside the strings configured
into the scripts?

