Subject: Re: [Boost-build] Is there any way to prevent Boost.Build from recursively scanning header files for #include directives?
From: Lee Winter (lee.j.i.winter_at_[hidden])
Date: 2009-04-28 04:48:06


On Tue, Apr 28, 2009 at 3:16 AM, Johan Nilsson
<r.johan.nilsson_at_[hidden]> wrote:
> Lee Winter wrote:
>>
>> On Mon, Apr 27, 2009 at 7:06 AM, Johan Nilsson
>> <r.johan.nilsson_at_[hidden]> wrote:
>>>
>>> Lin Luo wrote:
>>>>
>>>> Hi there,
>>>>
>>>> We would like to know whether there is a way to limit the header files
>>>> that Boost.Build recursively scans for #include directives to a
>>>> particular directory or set of directories?
>>
>> [...]
>>
>>> I've mentioned something like this in the past. The general opinion
>>> seems to be that dependency scanning only takes an insignificant
>>> amount of time, which I really don't agree with (especially not for
>>> rebuilds after minor, local changes).
>>>
>>
>> Has the Pre-Scanned Headers (PSH) approach been considered? It is
>> similar in concept to pre-compiled headers (PCH), but it is much
>> simpler because it does not have all the contextual complexity of
>> PCH.
>>
>> Essentially a PSH file is a cache of the most recent scan. After a
>> source file has been scanned the list of included files is saved in
>> the respective PSH file.
>
> Sounds interesting, and is probably implementable in a non-intrusive way
> using Boost.Build.

I hope so. Since the PSH file is a kind of look-aside cache, it can be
tested quite thoroughly by requiring exactly the same results with it
on or off. Only the performance should change.

>> When a source file is a candidate for scanning the timestamp on its
>> respective PSH file is checked. If the PSH file is up to date the
>> scan is unnecessary because the results of the scan are already known
>> and available in the contents of the PSH file. If the PSH file is out
>> of date then it is deleted. If the PSH file does not exist (possibly
>> because it has been deleted) then the source file is scanned and a new
>> PSH file created.
>
> There is (was?) something called "header cache" support in bjam (originating
> from FTJam or Perforce Jam).
>
> IIRC, this simply stored the names of all header files in a file together
> with their timestamps and used this as an optimization. Don't remember the
> details and thus don't know if this is something conceptually similar to
> PSH. It might also not be supported any more.

Sounds reasonable except for the part about storing the timestamps.
Was this a kind of in-memory cache to avoid stat()'ing and scanning
the files multiple times? Or was it stored on disk?

>
> Someone else (Rene/Volodya) probably remembers the details.
>
>>
>> As a beneficial side effect the collection of PSH files can also be
>> useful for finding potential complexity reductions in the inclusion
>> tree. Lakos mentions this as a potential performance improvement in
>> his book about large scale software design.
>
> The Lakos book is one of my all-time favorites, but I can't remember ever
> reading about PSH.

It's not in there. The "this" I was referring to was the problem of
complicated, duplicative, and outright unnecessary header file
inclusion. Lakos describes how important it is to unsnarl the
inclusion tree in order to avoid processing unnecessary and
duplicative headers.
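
To make "duplicative" concrete, here is a rough sketch in Python (not
anything that exists in Boost.Build). It assumes the per-file include
lists are already available as a dictionary, which is the information a
set of PSH files would hold. The function names and the example file
names are made up for illustration. It flags direct includes that are
also pulled in through another direct include; these are only
candidates for review, not proof that an include can be removed.

```python
def transitive_includes(graph, start):
    """Return every header reachable from start by following includes."""
    seen = set()
    stack = list(graph.get(start, []))
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(graph.get(name, []))
    return seen


def duplicative_includes(graph, source):
    """Direct includes of source that another direct include also brings in."""
    direct = graph.get(source, [])
    result = []
    for header in direct:
        others = [h for h in direct if h != header]
        if any(header in transitive_includes(graph, h) for h in others):
            result.append(header)
    return result


# Hypothetical include lists, as they might be read back from PSH files.
example_graph = {
    "main.cpp": ["a.h", "b.h"],
    "a.h": ["b.h"],
    "b.h": [],
}
print(duplicative_includes(example_graph, "main.cpp"))
```

In the example, main.cpp includes b.h directly and again by way of a.h,
so b.h is the one reported.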

PSH is just one way to make it easier to fix the real problem, which
is bad header structure. It also reduces the build time by reducing
the amount of scanning. But if you pay for the latter you have the
info to accomplish the former as a cost-free side effect.
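
For what it's worth, the PSH check described earlier in this thread can
be sketched in a few lines of Python. This is only an illustration of
the idea, not bjam or Boost.Build code. The ".psh" suffix, the function
names, and the simple regular expression used for scanning are all
assumptions of mine. It also assumes that file timestamps are reliable;
a source file edited within the same timestamp tick as its PSH file
would be missed.

```python
import os
import re

# Simplified scanner: matches #include "x" and #include <x> at line start.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def scan_includes(source_path):
    """Scan one file for #include directives (the step PSH tries to avoid)."""
    with open(source_path, encoding="utf-8", errors="replace") as f:
        return INCLUDE_RE.findall(f.read())


def psh_path(source_path):
    # Hypothetical naming convention: foo.h -> foo.h.psh next to the source.
    return source_path + ".psh"


def includes_for(source_path, use_psh=True):
    """Return the names included by source_path, using the PSH file if current."""
    cache = psh_path(source_path)
    if use_psh and os.path.exists(cache):
        if os.path.getmtime(cache) >= os.path.getmtime(source_path):
            # PSH file is up to date: the scan result is already known.
            with open(cache, encoding="utf-8") as f:
                return f.read().splitlines()
        os.remove(cache)  # PSH file is out of date: delete it.
    includes = scan_includes(source_path)
    if use_psh:
        # No PSH file exists at this point, so record the fresh scan.
        with open(cache, "w", encoding="utf-8") as f:
            f.write("\n".join(includes))
    return includes
```

Calling includes_for(path, use_psh=False) bypasses the cache entirely,
which gives the on/off comparison mentioned above: both calls should
return the same list for every file.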

Lee Winter
NP Engineering
Nashua, New Hampshire

