Subject: Re: [Boost-cmake] Analysis of the current CMake system
From: David Abrahams (dave_at_[hidden])
Date: 2009-01-16 18:27:23


on Thu Jan 15 2009, Brad King <brad.king-AT-kitware.com> wrote:

> David Abrahams wrote:
>> * logfile scraping is too hopelessly fragile to make for a good testing
>> system, and there are better and possibly even easier alternatives.
>
> The question here is whether one wants to test with the same tools users
> might use to build the project. If one user's tool doesn't provide
> per-rule information then we need log-scraping to test it.

Except that I contest your premise that no intrinsic per-rule
information support implies log scraping. If there is support for the
use of replacement tools ("cl-wrapper" instead of "cl"), you can also
avoid log scraping.
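
A minimal sketch of what such a replacement tool could look like, assuming
it is invoked in place of the real compiler. The function name, the record
layout, and the log file are hypothetical, not part of any existing Boost
or CMake tool:

```python
import json
import subprocess
import sys


def run_wrapped(command, log_path):
    """Run `command` and append one structured record for it to `log_path`.

    Each rule gets its own record, so nothing has to be recovered later
    by scraping a combined build log.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    record = {
        "command": command,
        "returncode": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    # Pass the tool's output and exit status through unchanged, so the
    # native build tool sees the same behaviour as with the real compiler.
    sys.stdout.write(result.stdout)
    sys.stderr.write(result.stderr)
    return result.returncode
```

A "cl-wrapper" script would call run_wrapped with the real compiler command
line and exit with the returned status.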

> That doesn't mean we can't test some tools without log-scraping
> though.
>
>> Frankly I'm not sure what logfile scraping has to do with the
>> structural problems you've mentioned.
>
> I'm only referring to the test part of the anti-logscraping code. The
> python command wrappers are there to avoid log scraping,

Sorry, I'm not up on the details of the system, so I don't know what
"python command wrappers" refers to.

> but if the tests were run through CTest then no log scraping would be
> needed.

Now I'm really confused. On one hand, you say it's necessary to run
tests through the native toolchains, and that implies log scraping. On
the other, you suggest running tests through CTest and say that doesn't
imply log scraping. I must be misinterpreting something. Could you
please clarify?

>> * Boost developers need the ability to change something in their
>> libraries and then run a test that checks everything in Boost that
>> could have been affected by that change without rebuilding and
>> re-testing all of Boost (i.e. "incremental retesting").
>
> How does the current solution solve that problem (either Boost.Build or
> the current CMake system)?

Boost.Build does it by making test results into targets that depend on
successful runs of up-to-date test executables. Test executables are
targets that depend on boost library binaries and headers. From what I
gather of the current CMake system it is doing something similar.
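
To illustrate the idea only (this is not how Boost.Build is implemented),
here is a sketch that treats test results as targets and computes which of
them a change invalidates. The graph and file names are made up:

```python
from collections import defaultdict


def affected_targets(depends_on, changed):
    """Return every target that transitively depends on `changed`."""
    # Invert the "target -> dependencies" map into "dependency -> dependents".
    dependents = defaultdict(set)
    for target, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(target)

    stale, stack = set(), [changed]
    while stack:
        node = stack.pop()
        for target in dependents[node]:
            if target not in stale:
                stale.add(target)
                stack.append(target)
    return stale


# Hypothetical graph: test results depend on test executables, which
# depend on library binaries and headers.
depends_on = {
    "libfoo": ["foo.hpp", "foo.cpp"],
    "foo_test.exe": ["libfoo", "foo_test.cpp"],
    "foo_test.result": ["foo_test.exe"],
    "bar_test.exe": ["bar.hpp", "bar_test.cpp"],
    "bar_test.result": ["bar_test.exe"],
}

print(sorted(affected_targets(depends_on, "foo.hpp")))
```

Changing foo.hpp marks libfoo, foo_test.exe and foo_test.result as stale
and leaves the bar targets alone, which is the incremental retest.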

Of course I understand the downsides of this arrangement that you are
describing (too much / too complicated dependency info leads to slow
execution). On the other hand, it turns out that Boost.Build spends
relatively little time doing actual dependency analysis (its slowness at
incremental re-test time comes from elsewhere).

>>> The large number of high-level targets places many rules in the outer
>>> make level which leads to very long startup times (look at
>>> CMakeFiles/Makefile2, which make needs to parse many times).
>>
>> Just curious: why does make need to parse the same file many times?
>
> I haven't looked at it in detail recently, but on quick inspection I
> think it is now just twice. Each of the two times works with separate
> rules so they could probably be split into two files. However, the file
> has never been very big for any of our projects because we don't have a
> huge number of top-level targets (since VS doesn't work in that case).

Thanks for explaining.

>>> Boost needs to build libraries, documentation, and other files to be
>>> placed in the install tree. Rules to build these parts can fit in
>>> relatively few high-level targets and should certainly use them.
>>
>> Sorry, what "should certainly use" what? Rules should use the targets?
>
> Bad wording on my part. I meant that it is fine to use top-level
> targets to drive the build of libraries and documentation.
>
>>> However, there are some disadvantages:
>>>
>>> (a) If one test fails to compile none of its tests can run
>>> (b) A bad test may accidentally link due to symbols from another test
>>
>> c) Adding a feature to a library requires modifying existing test code.
>
> I don't understand what you mean here. Are you saying that to test a
> new feature, the test dispatcher needs to be updated to link in the new
> test?

I don't know what a test dispatcher is. If you want to maximally
isolate the tests for the new feature, you can put them in a new
translation unit, but something has to call into that translation unit
from main if the tests are going to run.

> FYI, CMake provides a command to generate the dispatcher for you
> (create_test_sourcelist).

Oh, nice; problem solved.
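
For anyone else unsure what a dispatcher amounts to, a rough sketch of the
idea, written in Python rather than as the C/C++ driver source that
create_test_sourcelist generates. The test names and functions are
hypothetical:

```python
def test_parsing():
    return 0  # 0 means success, as with a process exit status


def test_formatting():
    return 0


# The dispatcher is essentially this table plus a lookup. Testing a new
# feature means adding one entry here.
TESTS = {
    "parsing": test_parsing,
    "formatting": test_formatting,
}


def dispatch(argv):
    """Run the test named by argv[1] and return its exit status."""
    if len(argv) < 2 or argv[1] not in TESTS:
        print("available tests:", ", ".join(sorted(TESTS)))
        return 1
    return TESTS[argv[1]]()
```

Generating the table removes the need to edit existing test code by hand
when a test is added.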

>>> CTest doesn't do log-scraping to detect errors. Every test gets run
>>> individually and its output is recorded separately. Boost's current
>>> system puts the tests inside the build and then jumps through hoops
>>> to avoid log-scraping of the results.
>>
>> What kind of hoops?
>
> It runs all the tests through python command wrappers to capture
> individual output, and therefore has to generate its own compiler
> command line invocations instead of using CMake's knowledge of the
> native tools. Currently -c and -o options are hard-coded AFAICS.

Hm. It should be possible to use CMake's knowledge of native tools and
still inject a wrapper. If it isn't, maybe CMake should be extended to
allow it. In any case, you may be about to convince me that there's a
better way to reach the same goals, so I'm not insisting...

>>> This brings us to the log-scraping issue in general. CMake permits
>>> users to build projects using their native tools, including Visual
>>> Studio, Xcode and Makefiles. In order to make sure builds with
>>> these tools work, the test system must drive builds through them
>>> too. Testing with just one type of native tool is insufficient.
>>
>> That's at least somewhat debatable. To get results for all the
>> different toolchains would require more testing resources, would it
>> not?
>
> Yes. In our model we ask users to contribute testing resources for
> the toolchains they want supported which we don't have. If no one
> cares enough about a platform/compiler to submit tests, we don't need
> to support that platform.

It's a good model. If we made it easy enough for people to do, we'd
have a lot more people contributing the testing resources.

>>> Since the native tools do not support per-rule reporting log-scraping
>>> is necessary.
>>
>> Also somewhat debatable. If we can get xcode to invoke
>> "boost-g++-wrapper" instead of "g++," we can still get per-rule
>> reporting, right?
>
> If per-rule reporting is not available from a native tool we have to do
> log-scraping. What's debatable is whether we can work around a lack of
> explicit support in the native tools.

Okay, that's another way to say the same thing.

> We could make this a CMake feature by teaching the generators to wrap
> the compiler up with a tool we distribute with CMake. Then you won't
> have to hack the compilation rule variables for Boost or depend on python.

Sounds like a good plan.

>>> Problem (a) is automatically handled by the testing solution I propose
>>> above since test results are recorded and reported individually.
>>
>> Sorry, what did you propose above?
>
> Testing with ctest's --build-and-test feature. The entire build and
> execution of every test would be captured independently of other tests.

I don't see how that solves problem a). If one TU of a test executable
(corresponding to a feature) fails to compile, do you somehow build the
executable with all the remaining TUs?

>>> Problem (c) is a lack of convenience when the build error is subtle
>>> enough to require the compiler command line to diagnose it (which in
>>> my experience is very rare).
>>
>> Rare, but when you need it, you really need it.
>
> Well, one could always reproduce the build locally or get help from
> the person running the machine with the problem...hence "lack of
> convenience" :)
>
> However, I think our discussion above concludes that log-scraping
> avoidance is not the main problem. It can be made to work.

Sorry, just because I'm too literal-minded and want to be sure, do you
mean "log scraping can be made to work," or "log scraping *avoidance*
can be made to work?"

>>> P.S. Boost's CMake code currently uses file(TO_NATIVE_PATH) but then
>>> references the result variable in CMake code. This causes a build
>>> tree with spaces in the path to create files in a directory with
>>> escaped backslashes in its name. Also, will python deal with
>>> non-escaped backslashes in windows paths inside the strings configured
>>> into the scripts?
>>
>> I can tell you lots about python, but I don't understand your question.
>> Could you rephrase?
>
> The code in question tells CMake to generate a python script that looks
> like this (on Windows):
>
> sys.path.append("c:\path\with\backslashes\to\some\file.txt")
> # ^^ escape sequence?

Oh, that's easy enough. Either use forward slashes or precede the
string with r:

   sys.path.append(r"c:\path\with\backslashes\to\some\file.txt")
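
A small sketch of the difference, using a shortened, hypothetical path in
which every backslash sequence is one that Python recognizes as an escape:

```python
# "\t" is a tab and "\f" is a form feed in an ordinary string literal,
# so the path is silently mangled.
plain = "c:\to\file.txt"

# The r prefix keeps every backslash as a literal character.
raw = r"c:\to\file.txt"

# Forward slashes need no escaping at all.
forward = "c:/to/file.txt"

print(repr(plain))
print(repr(raw))
print(repr(forward))
```

Printing with repr makes the tab and form feed in the first string
visible.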

-- 
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
