
Subject: Re: [boost] [multiprecision] Some rvalue reference experiments (performance inhancements #1)
From: John Maddock (boost.regex_at_[hidden])
Date: 2012-08-05 04:56:02
> The problem is illustrated via something like
>
> a = move(a) + b;
>
> Of course, the above is better written as "a += b", but it's exemplary of
> more elaborate assignments, e.g., "a = c * move(a) + b". Indeed, an even
> simpler example is just "a = move(a)", given John's implementation of
> operator+ above and ignoring the arithmetic (which is not relevant). And
> the problem with "a = move(a)" in a generic context is that
> T::operator=(T&& x) may freely assume that this != &x [1], mostly for
> efficiency purposes (Dave, correct me if I'm wrong here).
>
> So, in other words, don't return an rvalue reference from a function which
> is a reference to one of its arguments unless this is explicitly intended
> and documented (e.g., move and forward).
>
>> Note that changing the return type from Number&& to Number cancels the
>> allocation gain when using a type like GMP that doesn't have an empty
>> state.
>>
>
> Huh, really? That's no good.
Don't panic, it's OK ;)
The current sandbox code does have these rvalue ref operator overloads,
does return by value for safety, and still manages to avoid the extra
allocations. So, for example, in my Horner test case, evaluating:
Real result = (((((a[6] * x + a[5]) * x + a[4]) * x + a[3]) * x + a[2]) * x
+ a[1]) * x + a[0];
results in just one allocation even when expression templates are turned
off: the first operator overload called generates a temporary, which then
gets moved and reused, eventually ending up in the result. Of course,
rvalue refs can't help in simple cases such as:
a = b * c;
For that you still need expression templates if you want to avoid
temporaries.
I've posted these results before, but here are the current Bessel function
evaluation tests using mpf_t:
With rvalue refs:
Testing Bessel Functions at 50 digits.....
Time for mpf_float_50 = 5.00486 seconds
Total allocations for mpf_float_50 = 2592797
Time for mpf_float_50 (no expression templates = 5.30183 seconds
Total allocations for mpf_float_50 (no expression templates = 4174980
And again with BOOST_NO_RVALUE_REFERENCES defined:
Time for mpf_float_50 = 4.93078 seconds
Total allocations for mpf_float_50 = 2594701
Time for mpf_float_50 (no expression templates = 5.68103 seconds
Total allocations for mpf_float_50 (no expression templates = 6498294
As expected, the biggest hit is in the no-ET code, where the number of
allocations rises dramatically (from 4174980 to 6498294).
Cheers, John.