Subject: Re: [boost] [multiprecision] Some rvalue reference experiments (performance enhancements #1)
From: John Maddock (boost.regex_at_[hidden])
Date: 2012-08-05 04:56:02

> The problem is illustrated via something like
> a = move(a) + b;
> Of course, the above is better written as "a += b", but it's exemplary of
> more elaborate assignments, e.g., "a = c * move(a) + b". Indeed, an even
> simpler example is just "a = move(a)", given John's implementation of
> operator+ above and ignoring the arithmetic (which is not relevant). And
> the problem with "a = move(a)" in a generic context is that
> T::operator=(T&& x) may freely assume that this != &x [1], mostly for
> efficiency purposes (Dave, correct me if I'm wrong here).
> So, in other words, don't return an rvalue reference from a function which
> is a reference to one of its arguments unless this is explicitly intended
> and documented (e.g., move and forward).
>> Note that changing the return type from Number&& to Number cancels the
>> allocation gain when using a type like GMP that doesn't have an empty
>> state.
> Huh, really? That's no good.

Don't panic, it's OK ;-)

The current sandbox code does have these rvalue ref operator overloads,
does return by value for safety, and still manages to avoid the extra
allocations - so for example in my Horner test case, evaluating:

Real result = (((((a[6] * x + a[5]) * x + a[4]) * x + a[3]) * x + a[2]) * x
+ a[1]) * x + a[0];

results in just one allocation even when expression templates are turned
off - the first operator overload called generates a temporary, which then
gets moved and reused, eventually ending up in the result. Of course
rvalue refs can't help in simple cases such as:

a = b * c;

For that you still need expression templates if you want to avoid the
temporary.
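
To make the mechanism concrete, here is a minimal sketch of the idea. It is
not the sandbox code: Num, g_allocs and horner_allocations are hypothetical
names invented for the illustration, and Num simply wraps one heap-allocated
double so that allocations can be counted. The operators taking an rvalue on
the left reuse its storage and return by value, so the result is move
constructed and never aliases an argument.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical allocation counter, incremented whenever a Num allocates.
static std::size_t g_allocs = 0;

// Hypothetical stand-in for a heap-backed number type (not the real backend).
struct Num {
    double* p;
    explicit Num(double v) : p(new double(v)) { ++g_allocs; }
    Num(const Num& o) : p(new double(*o.p)) { ++g_allocs; }
    Num(Num&& o) noexcept : p(o.p) { o.p = nullptr; }  // steals, no allocation
    Num& operator=(Num o) noexcept { std::swap(p, o.p); return *this; }
    ~Num() { delete p; }
    Num& operator+=(const Num& o) { *p += *o.p; return *this; }
    Num& operator*=(const Num& o) { *p *= *o.p; return *this; }
};

// Both operands are lvalues: nothing can be reused, so a temporary is allocated.
Num operator*(const Num& a, const Num& b) { Num r(a); r *= b; return r; }
Num operator+(const Num& a, const Num& b) { Num r(a); r += b; return r; }

// Left operand is an rvalue: reuse its storage, but return by value rather
// than Num&&, so the caller never receives a reference to an argument.
Num operator*(Num&& a, const Num& b) { a *= b; return std::move(a); }
Num operator+(Num&& a, const Num& b) { a += b; return std::move(a); }

// Evaluates a short Horner form and returns the number of allocations made
// while evaluating the expression; optionally reports the computed value.
std::size_t horner_allocations(double* value = nullptr) {
    Num a[3] = {Num(1.0), Num(2.0), Num(3.0)};
    Num x(2.0);
    g_allocs = 0;  // count only the expression below
    Num result = (a[2] * x + a[1]) * x + a[0];
    if (value) *value = *result.p;
    return g_allocs;
}
```

Calling horner_allocations() from main should report a single allocation:
a[2] * x creates the one temporary, and every later operator call receives
that temporary as an rvalue and moves it along until it ends up in result.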

I've posted these results before, but here are the current Bessel function
evaluation tests using mpf_t:

With rvalue refs:

Testing Bessel Functions at 50 digits.....
Time for mpf_float_50 = 5.00486 seconds
Total allocations for mpf_float_50 = 2592797
Time for mpf_float_50 (no expression templates) = 5.30183 seconds
Total allocations for mpf_float_50 (no expression templates) = 4174980

And again with BOOST_NO_RVALUE_REFERENCES defined:

Time for mpf_float_50 = 4.93078 seconds
Total allocations for mpf_float_50 = 2594701
Time for mpf_float_50 (no expression templates) = 5.68103 seconds
Total allocations for mpf_float_50 (no expression templates) = 6498294

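As expected, the biggest hit is in the no-ET code, where the number of
allocations rises from 4174980 to 6498294 (roughly 56% more) once rvalue
references are disabled. With expression templates enabled the allocation
count is essentially unchanged.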

Cheers, John.

