Subject: Re: [boost] [proto] Looong compile times and other issues
From: Thomas Heller (thom.heller_at_[hidden])
Date: 2011-09-12 04:06:27
On Monday, September 12, 2011 02:57:21 PM Joel de Guzman wrote:
> On 9/12/2011 1:16 PM, Joel Falcou wrote:
> > Le 12/09/2011 05:42, Eric Niebler a écrit :
> >> On 9/11/2011 9:20 PM, Joel de Guzman wrote:
> >>> On 9/12/2011 1:44 AM, Joel Falcou wrote:
> >>>> - fusion is also a big hitter: lots of PP and lots of forced
> >>>> instantiation instead of lazy specialization.
> >>>
> >>> PP: right, this can be fixed.
> >>>
> >>> lots of forced instantiation: I don't know what you mean.
> >>> Can you please be more specific?
> >>
> >> I can't speak for Joel F. here, but consider the templates instantiated
> >> simply to access the Nth element of a fusion vector. From a cursory
> >> inspection of sequence/intrinsic/at.hpp, the following call:
> >> at_c<N>(v);
> >>
> >> where v is a fusion vector instantiates:
> >> lazy_disable_if
> >> is_const
> >> result_of::at_c
> >> result_of::at
> >> mpl::int_
> >> detail::tag_of
> >> extension::at_impl
> >> extension::at_impl::apply
> >> mpl::at
> >> detail::ref_result
> >> add_reference
> >>
> >> I believe that it also must compute the return type of the const
> >> overload of at_c in order to do overload resolution, so that many of the
> >> above templates must be instantiated twice: once for a const vector and
> >> once for non-const, and throw in an additional add_const. (And come to
> >> think of it, Proto probably suffers from this problem too!)
> >>
> >> That's a lot of templates for a simple element access. I didn't chase
> >> the template breadcrumbs into mpl, so there may be more.
> >
> > ^ This
> >
> > and the fact that the _impl structs are all written like:
> >
> > template<class Tag>
> > struct at_impl;
> >
> > template<>
> > struct at_impl<some_tag>
> >
> > instead of a more CT friendly
> >
> > template<class Tag, class Dummy=void>
> > struct at_impl;
> >
> > template<class Dummy>
> > struct at_impl<some_tag,Dummy>
> >
> > I think Heller started playing with that and got some measurable CT
> > performance increase.
> >
> > The C++11 rewrite is obviously a long-term project. MPL has to go this
> > way too (my secret dream is to merge Fusion and MPL and have MPL be
> > decltype over Fusion calls), and I think at some point we should start
> > thinking of doing it. Fusion already has a 0x implementation in the SoC
> > sandbox folder. I think it can be pushed a bit further, but that will
> > require us to have access to a couple of strong C++11-enabled
> > compilers.
> >
> > The CT performance of our infrastructure trifecta (Fusion/MPL/Proto)
> > should become target #1 at some point.
>
> The one you show above is also a very simple tweak. I welcome any CT
> improvements we can do as long as the code is kept in a reasonably
> comprehensible state. I applaud what you and Heller are doing.
I am trying to come up with a patch today. The changes Joel Falcou suggests
are really easy to make, and they already promise significant CT
improvements.
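
For reference, here is a minimal sketch of the kind of change being discussed.
The names toy_tag and toy_pair and this simplified at_c are made up for
illustration and are not Fusion's actual code. The point, as far as I
understand it, is that a partial specialization is still a template, so its
body is only instantiated on use, whereas a full specialization is an ordinary
class that is processed as soon as it is seen:

```cpp
#include <cassert>

// Hypothetical tag and sequence type, for illustration only.
struct toy_tag {};

struct toy_pair
{
    typedef toy_tag tag;
    int first;
    int second;
};

// Primary template. The defaulted Dummy parameter exists only so that
// per-tag specializations can remain partial specializations.
template<class Tag, class Dummy = void>
struct at_impl;

// Partial specialization for toy_tag. Because this is still a template,
// the compiler instantiates its body only when at_impl<toy_tag> is
// actually used.
template<class Dummy>
struct at_impl<toy_tag, Dummy>
{
    template<class Seq, int N>
    struct apply
    {
        typedef int& type;

        static type call(Seq& s)
        {
            // Both branches are int lvalues, so the result is an int&.
            return N == 0 ? s.first : s.second;
        }
    };
};

// Simplified element access that dispatches on the sequence's tag.
template<int N, class Seq>
typename at_impl<typename Seq::tag>::template apply<Seq, N>::type
at_c(Seq& s)
{
    return at_impl<typename Seq::tag>::template apply<Seq, N>::call(s);
}
```

With this in place, at_c<0>(p) on a toy_pair p returns a reference to p.first.
Whether the lazy form pays off in a real library has to be measured. The
sketch only shows the shape of the change.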
> Let me make it clear though that it is an unfair characterization
> to say that Fusion is the cause of CT slowdown for Proto.
> First, as Eric says, Proto avoids Fusion, and second, there's a
> clear indication that a library without Proto is still faster,
> regardless of the intense CT perf tweaks done thus far.
>
> For example, here is the current elapsed compile time for the
> Phoenix2 and Phoenix3 versions of lambda_tests.cpp (**):
>
> MSVC 10:
> Phoenix2: 00:04.5
> Phoenix3: 00:29.9
>
> G++ 4.5:
> Phoenix2: 00:02.6
> Phoenix3: 00:04.7
I wasn't aware that Phoenix3 was so bad under MSVC 10.
> You all know that Phoenix2 uses Fusion exclusively. Phoenix3
> uses Proto, which according to Eric does not use Fusion,
> although IIRC the core of Phoenix3 still uses some Fusion
> (quick check: Thomas uses an optimized-PP version of
> fusion::vector for Phoenix3).
>
> Heller did a helluva lot of perf tweaks for Phx3 to get that
> number for g++ (alas, not MSVC). In fairness, I did absolutely no
> CT perf tweaks for either Phoenix2 or Fusion.
>
> (** I made sure both tests have exactly the same code, so I
> removed the last test. I can post the exact code if need be)
FWIW, some Phoenix3 unit tests compile faster than their Phoenix2 counterparts
(with gcc). The current bad hit on compile times seems to occur only with let,
lambda and switch/case expressions.
> Regards.