Subject: Re: [boost] several messages
From: John Maddock (boost.regex_at_[hidden])
Date: 2012-08-05 06:52:06
>> When ET's are turned on, this will always IMO be slower, consider an
>> operator returning an expression template - it's basically returning a
>> pair of references (and larger objects still for more complex
>> expressions), whereas say a wrapped integer is actually returning a
>> smaller cheaper to copy object in that case. So for a wrapped integer,
>> returning the result by value will always win out, and that's before you
>> even consider the cost of unpacking the expression template.
>
> I believe you are a bit pessimistic. Compilers are not bad at inlining
> functions and removing wrappers that do nothing, even for
> expression-templates. And yes, returning a pair of references is doing
> nothing if your function is inlined. Some analysis of what remains would
> be interesting. What usually happens is that:
> * expression template wrappers sometimes contain runtime checks (e.g. for
> aliasing between variables), some of which can be impossible to determine
> at compile-time;
> * some compiler optimization opportunities that would have happened early
> are now only exposed after a lot of inlining / simplification has taken
> place, which may be too late for some compilers (but then it is not too
> hard for compilers to make progress there, if it is pointed out to them).
Nod. Lots to investigate I guess ... of course if a returned ET can be
optimised away, so can the wrapped type that's returned directly (if it's
small enough).
>>>> Note that changing the return type from Number&& to Number cancels the
>>>> allocation gain when using a type like GMP that doesn't have an empty
>>>> state.
>>>>
>>>
>>> Huh, really? That's no good.
>>
>> Don't panic it's OK ;-)
>>
>> The current sandbox code does have these rvalue ref operator overloads,
>> does return by value for safety, and still manages to avoid the extra
>> allocations - so for example in my horner test case, evaluating:
>>
>> Real result = (((((a[6] * x + a[5]) * x + a[4]) * x + a[3]) * x + a[2]) *
>> x + a[1]) * x + a[0];
>>
>> Results in just one allocation even when expression templates are turned
>> off - the first operator overload called generates a temporary, which
>> then gets moved and reused, eventually ending up in the result.
>
> Uh?
> This is indeed what happens, but for GMP types, unless you added in your
> wrapper a special 0 state (which you then have to test in every
> operation), every constructor has to allocate, including the move
> constructor, since a moved-from object must still be in a valid state.
>
> Did you add an empty state then?
The move constructor doesn't allocate - it takes ownership of the GMP
variable, and sets the variable in the moved-from object to a null state.
The *destructor* then has an added check to ensure it doesn't try and clear
null GMP objects: that's basically the only change. IMO the cost of the
extra if statement in the destructor is worth it - and should be trivial
compared to calling the external library routine to clear the GMP variable.
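
To make that concrete, here is a rough sketch of the idea rather than the actual sandbox code. Everything in it is a stand-in I am assuming for illustration: Number wraps a plain heap-allocated long instead of a real GMP variable, g_allocations is a hypothetical counter, and new/delete take the place of the GMP init/clear calls. It shows a move constructor that takes ownership and leaves a null state, a destructor that skips null objects, and rvalue-ref operator overloads that return by value while reusing the temporary.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical counter: one increment per heap allocation (stand-in for a GMP init call).
static std::size_t g_allocations = 0;

class Number {
    long* data_;  // nullptr marks the moved-from "null state"
public:
    explicit Number(long v = 0) : data_(new long(v)) { ++g_allocations; }
    Number(const Number& o) : data_(new long(o.value())) { ++g_allocations; }

    // Move constructor: take ownership of the storage, allocate nothing.
    Number(Number&& o) noexcept : data_(o.data_) { o.data_ = nullptr; }

    Number& operator=(Number o) noexcept { std::swap(data_, o.data_); return *this; }

    // The only extra cost: skip the clear for null objects. (delete would
    // tolerate nullptr; the check mirrors what a GMP clear call would need.)
    ~Number() { if (data_ != nullptr) { delete data_; } }

    bool is_null() const { return data_ == nullptr; }
    long value() const { assert(data_ != nullptr); return *data_; }
    void set(long v) { assert(data_ != nullptr); *data_ = v; }
};

// Both operands are lvalues: a fresh result has to be allocated.
inline Number operator*(const Number& a, const Number& b) { return Number(a.value() * b.value()); }
inline Number operator+(const Number& a, const Number& b) { return Number(a.value() + b.value()); }

// Left operand is a temporary: reuse its storage and return by value via a move.
inline Number operator*(Number&& a, const Number& b) { a.set(a.value() * b.value()); return std::move(a); }
inline Number operator+(Number&& a, const Number& b) { a.set(a.value() + b.value()); return std::move(a); }

struct HornerReport { long value; std::size_t allocations; };

// Evaluates the Horner expression from the thread and reports how many
// allocations the expression itself caused (coefficients are set up first).
inline HornerReport run_horner() {
    const Number a[7] = { Number(1), Number(2), Number(3), Number(4),
                          Number(5), Number(6), Number(7) };
    const Number x(2);
    const std::size_t before = g_allocations;
    Number result = (((((a[6] * x + a[5]) * x + a[4]) * x + a[3]) * x + a[2]) *
                     x + a[1]) * x + a[0];
    return HornerReport{ result.value(), g_allocations - before };
}
```

In this toy version only a[6] * x picks the const-ref overload and allocates; every later operator sees a temporary on its left and just moves it along, so the whole expression costs one allocation.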
John.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk