Subject: Re: [Boost-cmake] Analysis of the current CMake system
From: David Abrahams (dave_at_[hidden])
Date: 2009-01-15 12:15:56


on Wed Jan 14 2009, Brad King <brad.king-AT-kitware.com> wrote:

> Hi Folks,
>
> I'm considering attending BoostCon 2009 to provide developer-level
> CMake expertise,

Yes, please!

> and I'm looking into proposing a session as Hartmut
> requested.

Also yes please.

> In preparation I've downloaded and tried the current
> system and read back through some of the discussion on this list:
>
> http://thread.gmane.org/gmane.comp.lib.boost.cmake/4
> http://thread.gmane.org/gmane.comp.lib.boost.cmake/10
>
> The current system feels very bulky compared to CMake-generated build
> systems I've used for projects of comparable size. This is primarily
> due to the use of top-level targets for the tests. IMO the efforts to
> provide test-to-library dependencies and to avoid log-scraping have
> steered the system off-course.

While I'm fully prepared to believe that the project could be better
structured, I'm still convinced that:

* logfile scraping is too hopelessly fragile to make for a good testing
  system, and there are better and possibly even easier alternatives.
  Frankly I'm not sure what logfile scraping has to do with the
  structural problems you've mentioned.

* Boost developers need the ability to change something in their
  libraries and then run a test that checks everything in Boost that
  could have been affected by that change without rebuilding and
  re-testing all of Boost (i.e. "incremental retesting").

> One of the goals of CMake is to let developers use their favorite
> native tools. These include Visual Studio and Xcode along with Make
> tools.

Hurrah (not "horrors," at least, not for me)!

> In order to support these tools CMake's build model separates
> high-level targets from low-level file dependencies.

It's just like Boost.Build in that way.

> The add_executable(), add_library(), and add_custom_target() commands
> create high-level targets. Each target contains file-level rules to
> build individual sources.
>
> In generated VS and Xcode projects the high-level targets become
> top-level items in the GUIs. These IDEs define file-level rules
> inside each target. In generated Makefiles there is a two-level
> system in which the outer level knows only about inter-dependencies
> among high-level targets and the inner level loads the file-level
> rules inside each target. This design yields fast builds because each
> make process sees a relatively small set of rules (and makes automatic
> dependency scanning easy and reliable). It also makes the
> representation of build rules in the IDEs tractable. The key is that
> there should be relatively few high-level targets compared to the
> number of file-level rules.

Reminder: much of what's in Boost lives in headers.

The modularization work
(https://svn.boost.org/trac/boost/wiki/CMakeModularizeLibrary) could
help us maintain the ability to do incremental retesting without
representing the dependencies of every individual header file. It would
be nice if module dependencies could be used as a first-level work
eliminator and the system could still avoid retesting things that really
weren't affected, though. There are a few large-scale, highly modular,
library designs in Boost that only appear as dependencies in bits and
pieces in other libraries.
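
To make "first-level work eliminator" concrete, here is a minimal
sketch, not anything in the current system. It assumes a hypothetical
table that maps each module to the modules it depends on, and it
computes which modules' tests would need to be rerun after one module
changes. The names DependencyMap and modules_to_retest are made up for
illustration.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Hypothetical module graph: each module maps to the modules it depends on.
using DependencyMap = std::map<std::string, std::vector<std::string>>;

// Return every module whose tests should be rerun after `changed` is
// modified: the changed module itself plus everything that depends on it,
// directly or transitively.
std::set<std::string> modules_to_retest(const DependencyMap& depends_on,
                                        const std::string& changed)
{
    assert(!changed.empty());  // a module name is required

    // Invert the graph so we can walk from a module to its dependents.
    std::map<std::string, std::vector<std::string>> dependents;
    for (const auto& entry : depends_on)
        for (const auto& dependency : entry.second)
            dependents[dependency].push_back(entry.first);

    // Breadth-first walk over the inverted graph.
    std::set<std::string> affected{changed};
    std::queue<std::string> pending;
    pending.push(changed);
    while (!pending.empty()) {
        const std::string current = pending.front();
        pending.pop();
        for (const auto& user : dependents[current])
            if (affected.insert(user).second)  // true only on first visit
                pending.push(user);
    }
    return affected;
}
```

This works at module granularity only, so it would still retest a
module that depends on an unaffected corner of a large library. That is
exactly the over-approximation described above.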

> Currently Boost's CMake system creates about two high-level targets
> for each test. One is the add_executable() to build the test and the
> other is the add_custom_target() to run the test. This results in a
> very large number of high-level targets. The generated Xcode and VS
> projects are simply too large for the IDEs to load (I waited 10
> minutes for VS to try), which already defeats one purpose of using
> CMake.

Big problem.

> This leaves the Makefile generators as the only option.
>
> The large number of high-level targets places many rules in the outer
> make level which leads to very long startup times (look at
> CMakeFiles/Makefile2, which make needs to parse many times).

Just curious: why does make need to parse the same file many times?

> For example, I run
>
> time make type_traits-rank_test VERBOSE=1
>
> and get
>
> 52.49s user 0.31s system 96% cpu 54.595 total
>
> but only about 1s of that time was actually spent running the compiler
> and linker.
>
> Boost needs to build libraries, documentation, and other files to be
> placed in the install tree. Rules to build these parts can fit in
> relatively few high-level targets and should certainly use them.

Sorry, what "should certainly use" what? Rules should use the targets?

> This
> is currently done when not building the tests (BUILD_TESTING=OFF).
> The problem lies in building the tests. These do not belong in
> high-level targets for the reasons described above. I'd like to help
> you create an alternative solution.

Yay

> There are four kinds of tests:
>
> boost_test_run
> boost_test_run_fail
> boost_test_compile
> boost_test_compile_fail
>
> Let's first consider the run and run_fail tests. In our projects we
> typically link all tests for a given package into one (or a few)
> executable(s). The executable's main() dispatches the individual
> tests based on name. For example, one might manually run the test
> mentioned above like this:
>
> bin/type_traits_tests rank_test
>
> This reduces the number of top-level targets in the build to one per
> library to be tested. It also reduces total link time and disk
> usage (especially for static linking and when many tests share common
> template instantiations).

Yes, those are big advantages. Boost is inconsistent about its test
structuring. The Boost.Test library is designed to make that sort of
test aggregation work well, but many projects have become wary of using
that library, and we don't have a viable alternative that makes it easy.
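
For readers who haven't seen the pattern, a hand-rolled dispatcher
along the lines Brad describes might look roughly like this. It is only
a sketch: the test bodies, the registry, and the dispatch() function
are hypothetical, and it does not use Boost.Test.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <map>
#include <string>

// Hypothetical test bodies; each returns 0 on success, nonzero on failure.
int rank_test()   { return 0; }
int extent_test() { return 0; }

// Registry mapping a test name to its entry point.
using TestRegistry = std::map<std::string, std::function<int()>>;

const TestRegistry& registry()
{
    static const TestRegistry tests{
        {"rank_test", rank_test},
        {"extent_test", extent_test}};
    return tests;
}

// Run the test named by argv[1]. Returns the test's own result, or 2 when
// the arguments are wrong or the name is unknown.
int dispatch(int argc, const char* const* argv)
{
    assert(argv != nullptr);
    if (argc != 2) {
        std::cerr << "usage: " << (argc > 0 ? argv[0] : "tests")
                  << " <test-name>\n";
        return 2;
    }
    const auto found = registry().find(argv[1]);
    if (found == registry().end()) {
        std::cerr << "unknown test: " << argv[1] << "\n";
        return 2;
    }
    return found->second();
}
```

The executable's main() would simply return dispatch(argc, argv), so
that "bin/type_traits_tests rank_test" runs one test. Note that every
new test has to be added to the registry by hand.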

> However, there are some disadvantages:
>
> (a) If one test fails to compile, none of the tests in that executable can run
> (b) A bad test may accidentally link due to symbols from another test

(c) Adding a feature to a library requires modifying existing test code.

> Problem (a) is not a big deal IMO. If a test doesn't compile the
> library it tests has a serious problem and needs manual attention
> anyway. Problem (b) may or may not be a big problem for Boost (it
> isn't for us).

I don't think it's important.

> However, there is an alternative tied to the treatment of
> compile_fail tests.
>
> Let's now consider the compile and compile_fail tests. The compile
> tests could be built into a single executable along with the run and
> run_fail tests above, but the compile_fail tests cannot. Boost's
> current solution drives the test compilation by generating an actual
> compiler command line. It bypasses CMake's knowledge of the native
> build tools and tries to run its own command line which may not work
> on all compilers. Furthermore, this is not always representative of
> how users build programs against boost (they may use CMake, or create
> their own VS or Xcode project files).
>
> CTest provides an explicit feature, its --build-and-test mode,
> specifically for testing sample external projects built against the
> library being tested. All four test types can be done using this
> feature. Each test would consist of a complete build of a small
> sample project just like a user might create. The tests could even be
> run against an install tree if desired.

Might be overkill, but it probably wouldn't hurt much.

> I gather from some earlier discussion on this list that one reason the
> current solution performs tests at build-time is to permit tighter
> integration and dependencies between test builds and library builds.

I'm not sure.

> It allows one to ask for a test to be run without first building the
> targets it needs in a separate step, and for one to avoid building
> tests for a library that failed to compile. However, I encourage you
> to re-instate CTest into your process.

I don't remember why we decided it wasn't going to work for us.

> It will give pretty good
> testing right now, and we can address specific issues or requirements
> by improving CTest itself. Furthermore, CTest doesn't do log-scraping
> to detect errors. Every test gets run individually and its output is
> recorded separately. Boost's current system puts the tests inside the
> build and then jumps through hoops to avoid log-scraping of the
> results.

What kind of hoops?

> This brings us to the log-scraping issue in general. CMake permits
> users to build projects using their native tools, including Visual
> Studio, Xcode and Makefiles. In order to make sure builds with these
> tools work, the test system must drive builds through them too.
> Testing with just one type of native tool is insufficient.

That's at least somewhat debatable. To get results for all the
different toolchains would require more testing resources, would it not?
It may not be practical for us to do that in the first place.

> Since the native tools do not support per-rule reporting, log-scraping
> is necessary.

Also somewhat debatable. If we can get Xcode to invoke
"boost-g++-wrapper" instead of "g++," we can still get per-rule
reporting, right?
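
To be clear about what I mean by a wrapper, here is a rough sketch of
the core of one. It assumes the wrapper receives the real compiler
command as its arguments, that no argument needs shell quoting, and
that a plain text log is good enough. The function names and the log
format are made up.

```cpp
#include <cassert>
#include <cstdlib>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Join the arguments into one command string. Assumption: no argument
// needs shell quoting; a real wrapper would have to escape them.
std::string join_command(const std::vector<std::string>& args)
{
    std::ostringstream command;
    for (std::size_t i = 0; i < args.size(); ++i)
        command << (i ? " " : "") << args[i];
    return command.str();
}

// Run the real tool and append one record (command line plus raw status)
// to `log`. The value returned by std::system is implementation-defined;
// zero conventionally means success.
int run_and_record(const std::vector<std::string>& args, std::ostream& log)
{
    assert(!args.empty());
    const std::string command = join_command(args);
    const int status = std::system(command.c_str());
    log << "command: " << command << "\n"
        << "status: " << status << "\n";
    return status;
}
```

A wrapper's main() would pass its arguments after argv[0] (the real
compiler and its flags) to run_and_record and exit nonzero on failure,
giving one record per rule without scraping the build log.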

> However, it does not have to be at global granularity as it is now.
>
> AFAICT the following requirements were used to justify log-scraping
> avoidance:
>
> (a) Errors from each test should be separated
> (b) Errors from each library's build should be separated
> (c) Errors from each object file should be separated and reported
> with the command line that failed
>
> All of these apply only to automatic testing and reporting. A
> developer or user that is working interactively on a local build tree
> will have enough information from the native build tools to fix the
> problem. (Otherwise, a better native tool is needed for development.)

Automatic testing/reporting and local testing/development are the two
important use cases.

> Problem (a) is automatically handled by the testing solution I propose
> above since test results are recorded and reported individually.

Sorry, what did you propose above?

> Problem (b) can be addressed by teaching CTest to build one high-level
> target at a time and record each build separately.
>
> Problem (c) is a lack of convenience when the build error is subtle
> enough to require the compiler command line to diagnose it (which in
> my experience is very rare).

Rare, but when you need it, you really need it.

> This problem cannot be addressed for VS and Xcode builds.

Hard to believe, given the ability one has to integrate external tools
into VS. On the other hand, it might be that treating "cl" (or some
wrapper) as an external tool nullifies major advantages of using VS (I
don't know).

> Boost's current solution for Makefiles is a bit of
> a hack but can work. There are alternatives such as creating a new
> native build tool that provides the desired granularity (like
> low-level jam?) and adding a generator for the tool to CMake.

Yoiks! Better to avoid that if possible.

> However this new tool would only be for convenience of some developers
> and testers. We still have to test use of the other native build
> tools, which requires log-scraping to some degree.
>
> In summary, I'd like to help you folks address these issues. Some of
> the work will be in Boost's CMake code and some in CMake itself. The
> work will benefit both projects. We can arrange to meet at BoostCon,
> but we can probably get a lot of discussion done on this list before
> then. BTW, can anyone suggest a preferred format for a BoostCon
> session from the boost-cmake-devs' point of view?

I don't have any brilliant ideas off-hand, but I would like to discuss it.

> P.S. Boost's CMake code currently uses file(TO_NATIVE_PATH) but then
> references the result variable in CMake code. This causes a build
> tree with spaces in the path to create files in a directory with
> escaped backslashes in its name. Also, will Python deal with
> non-escaped backslashes in Windows paths inside the strings configured
> into the scripts?

I can tell you lots about Python, but I don't understand your question.
Could you rephrase?

-- 
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
