Boost logo

Boost :

Subject: Re: [boost] [fusion] improving compile times
From: Eric Niebler (eric_at_[hidden])
Date: 2009-06-03 16:46:12

Doug Gregor wrote:
> On Tue, Jun 2, 2009 at 10:09 AM, Eric Niebler <eric_at_[hidden]> wrote:
>> Cool! I wonder how that's possible. I have it from Walter Bright (Zortech,
>> Symantec, Digital Mars) that instantiating a template is inherently
>> expensive, and certain features of the C++ language (ADL, partial
>> specialization, etc.) force that to be the case.
> Template instantiation is expensive, but you're probably seeing some
> O(n^2) or worse effects because of a poor choice in data structures.

Indeed, I recall Walter saying it's an N^2 problem. I've convinced him
to write a blog entry about why template instantiation in C++ is
inherently slow, so soon we'll know why he thinks so. Maybe your work in
Clang can prove him wrong.

> In GCC, for example, a surprising amount of time is wasted determining
> whether the class template specialization X<T1, T2, ..., TN> refers to
> an already-known template instantiation (or specialization), because
> GCC stores all of the template instantiations in a linked list. Thus,
> you pay each time you name X<T1, T2, ..., TN>, even if you don't
> instantiate it.

That sucks.

> That's why we see quadratic (or worse) behavior for
> template metaprograms with GCC. I suspect that other compilers have
> the same problem.
>> If Clang has found a way to
>> solve these problems, that's good news indeed.
> We're working on it. I did some simple benchmarking with the
> ultra-boring Fibonacci template metaprogram last Friday, just to see
> how Clang is doing, and posted the results here:

Lookin' good!

> The cost of template *instantiation* for the Fibonacci example is
> quite small for both compilers, since we're just talking about
> creating a class with its special member functions and a single static
> data member.

Would it go faster if the compiler didn't have to create the special
member functions? Could we use the declared-but-not-defined trick to
suppress their generation and speed up template instantiations for

> However, GCC is exhibiting quadratic behavior because
> every time we name Fibonacci<I> for some value I, it's doing a linear
> search to see if there's already a specialization for that value of I.
> Clang is scaling much better here because our search for an
> already-named specialization is constant time in the average case.
> I can't promise that the improvements we see in Fibonacci will extend
> to real template metaprograms, because I haven't tried it. Nor can I:
> Clang lacks both member templates and class template partial
> specialization [*], which means that we can't compile a serious
> template metaprogram with Clang. Obviously, template metaprogramming
> is important to me, personally, so we'll do our best to scale this
> well for real template metaprograms.

Where can I read more and follow the team's progress?

Eric Niebler
BoostPro Computing

Boost list run by bdawes at, gregod at, cpdaniel at, john at