Boost logo

Boost :

Subject: Re: [boost] [fusion] improving compile times
From: Doug Gregor (doug.gregor_at_[hidden])
Date: 2009-06-03 13:01:09


On Tue, Jun 2, 2009 at 10:09 AM, Eric Niebler <eric_at_[hidden]> wrote:
> Joel de Guzman wrote:
>>
>> Sebastian Redl wrote:
>>>
>>> Eric Niebler wrote:
>>>>
>>>> I confess that I'm not actually benchmarking compile speed; rather,
>>>> I'm benchmarking the number of template instantiations as reported by
>>>> Steven's template profiler. I'm profiling TMP-heavy code like some of
>>>> Proto's and xpressive's tests and cherry-picking the worst offenders.
>>>> The Fusion vector_n_chooser patch knocked off 100's of template
>>>> instantiations, for instance.
>>>
>>> That's not necessarily a good benchmark, especially if you replace it by
>>> preprocessor metaprogramming which leads to more non-template code. GCC
>>> is extremely slow at instantiating templates, but this is not
>>> necessarily true for other compilers - I believe, for example, that
>>> Clang will be faster at instantiating templates than parsing raw code.
>>> (No benchmarks - but I know the code.)
>
> Cool! I wonder how that's possible. I have it from Walter Bright (Zortech,
> Symantec, Digital Mars) that instantiating a template is inherently
> expensive, and certain features of the C++ language (ADL, partial
> specialization, etc.) force that to be the case.

Template instantiation is expensive, but you're probably seeing some
O(n^2) or worse effects because of a poor choice in data structures.
In GCC, for example, a surprising amount of time is wasted determining
whether the class template specialization X<T1, T2, ..., TN> refers to
an already-known template instantiation (or specialization), because
GCC stores all of the template instantiations in a linked list. Thus,
you pay each time you name X<T1, T2, ..., TN>, even if you don't
instantiate it. That's why we see quadratic (or worse) behavior for
template metaprograms with GCC. I suspect that other compilers have
the same problem.

> If Clang has found a way to
> solve these problems, that's good news indeed.

We're working on it. I did some simple benchmarking with the
ultra-boring Fibonacci template metaprogram last Friday, just to see
how Clang is doing, and posted the results here:

  http://lists.cs.uiuc.edu/pipermail/cfe-dev/attachments/20090529/a392b024/attachment-0001.pdf

The cost of template *instantiation* for the Fibonacci example is
quite small for both compilers, since we're just talking about
creating a class with its special member functions and a single static
data member. However, GCC is exhibiting quadratic behavior because
every time we name Fibonacci<I> for some value I, it's doing a linear
search to see if there's already a specialization for that value of I.
Clang is scaling much better here because our search for an
already-named specialization is constant time in the average case.

I can't promise that the improvements we see in Fibonacci will extend
to real template metaprograms, because I haven't tried it. Nor can I:
Clang lacks both member templates and class template partial
specialization [*], which means that we can't compile a serious
template metaprogram with Clang. Obviously, template metaprogramming
is important to me, personally, so we'll do our best to scale this
well for real template metaprograms.

> I read form the Wikipedia
> entry that Clang's C++ support is 2-3 years from being usable, though.

I can't comment on that, but I appreciate Dave's wager ;)

  - Doug

[*] And it lacks function templates, for you snarky folks trying to
fool compilers into handling template metaprograms better :)


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk