Subject: Re: [boost] [proto] Looong compile times and other issues
From: Joel de Guzman (joel_at_[hidden])
Date: 2011-09-12 02:57:21

On 9/12/2011 1:16 PM, Joel Falcou wrote:
> On 12/09/2011 05:42, Eric Niebler wrote:
>> On 9/11/2011 9:20 PM, Joel de Guzman wrote:
>>> On 9/12/2011 1:44 AM, Joel Falcou wrote:
>>>> - fusion is also a big hitter: lots of PP and lots of forced instantiation instead of
>>>> lazy specialization.
>>> PP: right, this can be fixed.
>>> lots of forced instantiation: I don't know what you mean.
>>> Can you please be more specific?
>> I can't speak for Joel F. here, but consider the templates instantiated
>> simply to access the Nth element of a fusion vector. From a cursory
>> inspection of sequence/intrinsic/at.hpp, the following call:
>> at_c<N>(v);
>> where v is a fusion vector instantiates:
>> lazy_disable_if
>> is_const
>> result_of::at_c
>> result_of::at
>> mpl::int_
>> detail::tag_of
>> extension::at_impl
>> extension::at_impl::apply
>> mpl::at
>> detail::ref_result
>> add_reference
>> I believe that it also must compute the return type of the const
>> overload of at_c in order to do overload resolution, so that many of the
>> above templates must be instantiated twice: once for a const vector and
>> once for non-const, and throw in an additional add_const. (And come to
>> think of it, Proto probably suffers from this problem too!)
>> That's a lot of templates for a simple element access. I didn't chase
>> the template breadcrumbs into mpl, so there may be more.
> ^ This
> and the fact that the _impl structs are all written like:
> template<class Tag>
> struct at_impl;
> template<>
> struct at_impl<some_tag>
> instead of a more CT friendly
> template<class Tag, class Dummy=void>
> struct at_impl;
> template<class Dummy>
> struct at_impl<some_tag,Dummy>
> I think Heller started playing with that and got a measurable CT performance increase.
> The C++11 rewrite is obviously a long-term project. MPL has to go this way too (my secret
> dream is to merge Fusion and MPL and have MPL be decltype over Fusion calls), and I think
> at some point we should start thinking of doing it. Fusion already has a 0x implementation
> in the SoC sandbox folder. I think it can be pushed a bit further, but that will require
> us to have access to a couple of strong C++11-enabled compilers.
> The CT performance of our infrastructure trifecta (Fusion/MPL/Proto) should become target
> #1 at some point.

The one you show above is also a very simple tweak. I welcome any CT
improvements we can do as long as the code is kept in a reasonably
comprehensible state. I applaud what you and Heller are doing.
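
For reference, here is a minimal sketch of the two styles quoted above, side
by side. It is not Fusion's actual code: my_seq, at_impl_full, at_impl_lazy,
at_c_full, at_c_lazy and demo are hypothetical names made up for illustration,
and the real at_impl takes different template arguments. The point it tries to
show is that a full specialization is an ordinary class, so its body is
compiled wherever the header is included, while a partial specialization on a
defaulted Dummy parameter stays a template and is only instantiated on use.
Whether that gives a measurable CT gain on a particular compiler is something
to measure, not assume.

```cpp
#include <cassert>
#include <cstddef>

struct some_tag {};  // hypothetical tag, same name as in the quoted snippet

// Hypothetical toy sequence that advertises its tag.
struct my_seq
{
    typedef some_tag tag;
    int data[3];
};

// Style A: primary template plus a full (explicit) specialization.
// at_impl_full<some_tag> is a plain class, processed as soon as it is seen.
template <class Tag>
struct at_impl_full;

template <>
struct at_impl_full<some_tag>
{
    template <class Seq, std::size_t N>
    struct apply
    {
        static int& call(Seq& s) { return s.data[N]; }
    };
};

// Style B: extra defaulted Dummy parameter, so the per-tag version is a
// partial specialization and remains a template until it is actually used.
template <class Tag, class Dummy = void>
struct at_impl_lazy;

template <class Dummy>
struct at_impl_lazy<some_tag, Dummy>
{
    template <class Seq, std::size_t N>
    struct apply
    {
        static int& call(Seq& s) { return s.data[N]; }
    };
};

// Tag-dispatching front ends; callers see no difference between the styles.
template <std::size_t N, class Seq>
int& at_c_full(Seq& s)
{
    return at_impl_full<typename Seq::tag>::template apply<Seq, N>::call(s);
}

template <std::size_t N, class Seq>
int& at_c_lazy(Seq& s)
{
    return at_impl_lazy<typename Seq::tag>::template apply<Seq, N>::call(s);
}

// Small usage check: both styles read and write the same elements.
int demo()
{
    my_seq s = {{10, 20, 30}};
    assert(at_c_full<1>(s) == 20);
    at_c_lazy<0>(s) = 7;
    return at_c_full<0>(s) + at_c_lazy<2>(s);
}
```

Call sites are identical in both styles, which is why the change counts as a
simple tweak: only the declaration of each _impl specialization moves.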

Let me make it clear, though, that it is an unfair characterization
to say that Fusion is the cause of the CT slowdown for Proto.
First, as Eric says, Proto avoids Fusion. Second, there's a
clear indication that a library without Proto still compiles faster,
regardless of the intense CT perf tweaks done thus far.

For example, here is the current CT status of Phoenix2 vs.
Phoenix3: the elapsed compile time for each library's version
of lambda_tests.cpp (**):

MSVC 10:
  Phoenix2: 00:04.5
  Phoenix3: 00:29.9

G++ 4.5:
  Phoenix2: 00:02.6
  Phoenix3: 00:04.7

You all know that Phoenix2 uses Fusion exclusively. Phoenix3
uses Proto, which according to Eric does not use Fusion,
although IIRC the core of Phoenix3 still uses some Fusion
(quick check: Thomas uses an optimized-PP version of
fusion::vector for Phoenix3).

Heller did a helluva lot of perf tweaks for Phx3 to get that number
for g++ (alas, not for MSVC). In fairness, I did absolutely no
CT perf tweaks for either Phoenix2 or Fusion.

(** I made sure both tests have exactly the same code, so I
removed the last test. I can post the exact code if need be)


Joel de Guzman
