From: David Abrahams (david.abrahams_at_[hidden])
Date: 2002-07-03 08:04:10


I've got no problem with accepting this patch, provided:

1. It passes all the existing tests, with BOOST_NO_NRVO in both states
2. Aleksey and Daryle agree
3. The macros are prefixed with BOOST_
4. The backslashes are lined up for readability (emacs has a key which does
this):

#define BOOST_BINARY_OPERATOR_NON_COMMUTATIVE( NAME, OP )               \
template <class T, class U, class B = ::boost::detail::empty_base>      \
struct NAME##2 : B                                                      \
{                                                                       \
  friend const T operator OP( const T& lhs, const U& rhs )              \
    { T nrv( lhs ); nrv OP##= rhs; return nrv; }                        \
};                                                                      \
                                                                        \
template <class T, class U, class B = ::boost::detail::empty_base>      \
struct NAME##2_left : B                                                 \
{                                                                       \
  friend const T operator OP( const U& lhs, const T& rhs )              \
    { T nrv( lhs ); nrv OP##= rhs; return nrv; }                        \
};                                                                      \
                                                                        \
template <class T, class B = ::boost::detail::empty_base>               \
struct NAME##1 : B                                                      \
{                                                                       \
  friend const T operator OP( const T& lhs, const T& rhs )              \
    { T nrv( lhs ); nrv OP##= rhs; return nrv; }                        \
};
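
For anyone who wants to see a macro of this shape expand, here is a minimal usage sketch. It is reduced to the single-type template, and 'empty_base' is redefined locally as a stand-in so it compiles without Boost. 'subtractable', 'Meters' and 'difference' are names made up for the illustration, not taken from the patch.

```cpp
#include <cassert>

// Stand-in for ::boost::detail::empty_base so the sketch builds without Boost.
namespace boost { namespace detail { struct empty_base {}; } }

// Reduced form of the macro: only the single-type template is generated.
#define BOOST_BINARY_OPERATOR_NON_COMMUTATIVE( NAME, OP )               \
template <class T, class B = ::boost::detail::empty_base>               \
struct NAME##1 : B                                                      \
{                                                                       \
  friend const T operator OP( const T& lhs, const T& rhs )              \
    { T nrv( lhs ); nrv OP##= rhs; return nrv; }                        \
};

// Expands to the class template 'subtractable1' providing operator-.
BOOST_BINARY_OPERATOR_NON_COMMUTATIVE( subtractable, - )

// Hypothetical user type: it supplies only operator-= itself.
struct Meters : subtractable1<Meters>
{
  int value;
  explicit Meters( int v ) : value( v ) {}
  Meters& operator-=( const Meters& rhs ) { value -= rhs.value; return *this; }
};

// operator- is found through the generated base class's friend function.
inline int difference( int a, int b )
{
  return ( Meters( a ) - Meters( b ) ).value;
}
```

Calling 'difference( 7, 2 )' exercises the generated operator-, which forwards to the user's operator-=.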

-Dave

----- Original Message -----
From: "Daniel Frey" <d.frey_at_[hidden]>
To: <boost_at_[hidden]>
Sent: Wednesday, July 03, 2002 3:34 AM
Subject: [boost] Proposed changes to operators.hpp

> Hi Boosters,
>
> I'd like to propose two changes to boost/operators.hpp:
>
> ----- Provide a safer return type -----
>
> The first proposal is the change of the return type of most operators
> from 'T' to 'const T'. Without the change, the following code would be
> legal:
>
> (a+b) = c;
> a++++;
>
> Most probably, this is not what the user would expect. The first one
> assigns 'c' to the temporary result of 'a+b', which is mostly
> useless. It usually happens by accident when you meant to write
> something like '(a+b) == c'. The second statement may look like the
> logical equivalent of the allowed and useful '++++a', but it doesn't
> increment 'a' twice, as the second ++ increments the temporary copy
> returned by the first ++. Both the result and the effect of the
> statement are not what you probably had in mind when writing it.
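
A small compile-time sketch of the difference between the two return types, assuming a C++11 compiler for <type_traits>. 'Plain', 'Guarded' and 'sum_is_assignable' are hypothetical names used only for this illustration.

```cpp
#include <type_traits>
#include <utility>

struct Plain {};
struct Guarded {};

// Returns a non-const temporary, so '(a + b) = c' compiles.
inline Plain operator+( const Plain&, const Plain& ) { return Plain(); }

// Returns a const temporary, so '(a + b) = c' is rejected.
inline const Guarded operator+( const Guarded&, const Guarded& ) { return Guarded(); }

// True when the result of 'T + T' can appear on the left of an assignment.
template <class T>
struct sum_is_assignable
  : std::is_assignable<decltype( std::declval<T>() + std::declval<T>() ),
                       const T&> {};

static_assert( sum_is_assignable<Plain>::value,
               "a non-const result accepts assignment" );
static_assert( !sum_is_assignable<Guarded>::value,
               "a const result rejects assignment" );
```

The same reasoning applies to 'a++++': a const result from the first postfix ++ cannot be incremented again.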
>
> See also Scott Meyers, "More Effective C++", Item 2.2
>
> ----- NRVO-friendly implementation -----
>
> The second change affects the way of implementing operators to allow
> compilers to apply the NRVO (named return value optimization). The
> current implementation has two problems: it doesn't allow the most
> efficient implementation and it is not symmetric. In theory,
> when defining an operator+ (or -, *, / etc.) you need exactly one new
> object to hold the result. It should be our goal to provide an
> implementation that allows (good) compilers to build code that doesn't
> yield any additional intermediate objects. Now, let's look at the
> current implementation style and see what's wrong with it:
>
> friend const T operator+( T lhs, const T& rhs )
> {
> return lhs += rhs;
> }
>
> This looks clean, fast and beautiful. But the parameter 'lhs' is
> copied, the operation is applied to the copy and the result
> is... copied! This is required, because the copy of the parameter is
> made by the caller of the operator, rather than the callee. Also, the
> copy constructor may have side effects, and the standard forbids
> optimizations that change the observable behavior of the code. The
> standard also names several optimizations that are explicitly allowed
> to violate this rule, but none of these exceptions applies here.
>
> When the above operator+ is used to calculate the expression 'a+b+c',
> there is an optimization made. The result of the first sub-expression
> 'a+b' is passed as the first parameter for the second call directly,
> without copying it. Thus, one intermediate object is optimized
> away. Still, 'a+b+c' needs to construct three objects instead of the
> theoretical minimum of two objects. If we call 'c+(a+b)', there are
> four objects, thus the current implementation is asymmetric.
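
The object counts above can be checked with an instrumented type. This is a sketch only: 'Tracked' and the two helper functions are hypothetical, and the exact numbers depend on which copies the compiler elides. With copy elision enabled I would expect three objects for 'a+b+c' and four for 'c+(a+b)'.

```cpp
#include <cassert>

// Hypothetical instrumented type: every constructor call bumps 'objects'.
struct Tracked
{
  static int objects;
  int value;

  explicit Tracked( int v ) : value( v ) { ++objects; }
  Tracked( const Tracked& other ) : value( other.value ) { ++objects; }
  Tracked& operator+=( const Tracked& rhs ) { value += rhs.value; return *this; }

  // The style under discussion: lhs taken by value, result copied out of it.
  friend const Tracked operator+( Tracked lhs, const Tracked& rhs )
  {
    return lhs += rhs;
  }
};

int Tracked::objects = 0;

// Number of objects constructed while evaluating 'a+b+c'.
inline int objects_left_chain()
{
  Tracked a( 1 ), b( 2 ), c( 3 );
  Tracked::objects = 0;            // count only what the expression creates
  const Tracked r = a + b + c;
  assert( r.value == 6 );
  return Tracked::objects;
}

// Number of objects constructed while evaluating 'c+(a+b)'.
inline int objects_right_chain()
{
  Tracked a( 1 ), b( 2 ), c( 3 );
  Tracked::objects = 0;
  const Tracked r = c + ( a + b );
  assert( r.value == 6 );
  return Tracked::objects;
}
```

Because 'return lhs += rhs;' returns through a reference, the copy into the return value cannot be elided, whatever the compiler does elsewhere.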
>
> What is the NRVO and how could it solve the problems? The NRVO is an
> optimization that the standard allows in order to remove an
> intermediate object even if there are observable side effects. An
> implementation of the operator that allows the NRVO to be applied
> looks like this:
>
> friend const T operator+( const T& lhs, const T& rhs )
> {
> T nrv( lhs );
> nrv += rhs;
> return nrv;
> }
>
> With this implementation, the compiler is allowed to construct and use
> the object 'nrv' directly in the function's return slot, thus no
> unnecessary object is involved here. Also, this function is symmetric,
> as both arguments are taken by reference.
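
The symmetric version can be given the same instrumented treatment. Again a sketch under assumptions: 'TrackedNrv' and the helpers are hypothetical names, and whether the NRVO is actually applied is up to the compiler. When it is, I would expect two objects for either ordering; without it, the count rises but stays the same for both.

```cpp
#include <cassert>

// Hypothetical instrumented type: every constructor call bumps 'objects'.
struct TrackedNrv
{
  static int objects;
  int value;

  explicit TrackedNrv( int v ) : value( v ) { ++objects; }
  TrackedNrv( const TrackedNrv& other ) : value( other.value ) { ++objects; }
  TrackedNrv& operator+=( const TrackedNrv& rhs ) { value += rhs.value; return *this; }

  // NRVO-friendly style: one named local that is also the returned object.
  friend const TrackedNrv operator+( const TrackedNrv& lhs, const TrackedNrv& rhs )
  {
    TrackedNrv nrv( lhs );
    nrv += rhs;
    return nrv;
  }
};

int TrackedNrv::objects = 0;

// Number of objects constructed while evaluating 'a+b+c'.
inline int nrv_objects_left_chain()
{
  TrackedNrv a( 1 ), b( 2 ), c( 3 );
  TrackedNrv::objects = 0;         // count only what the expression creates
  const TrackedNrv r = a + b + c;
  assert( r.value == 6 );
  return TrackedNrv::objects;
}

// Number of objects constructed while evaluating 'c+(a+b)'.
inline int nrv_objects_right_chain()
{
  TrackedNrv a( 1 ), b( 2 ), c( 3 );
  TrackedNrv::objects = 0;
  const TrackedNrv r = c + ( a + b );
  assert( r.value == 6 );
  return TrackedNrv::objects;
}
```

Both orderings go through the same function body twice, which is why the counts match regardless of how much the compiler elides.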
>
> See also Scott Meyers, "More Effective C++", Item 4.7. If you read the
> book, you might get the impression I misunderstood it completely, but
> please read Scott's errata for this item, available at:
>
> http://www.aristeia.com/BookErrata/mec++-errata_frames.html
>
> ----- The real world, part 1: Performance -----
>
> Some people doubt that this will ever make any difference, and they
> have a point, for several reasons. First of all, you need a compiler
> that actually implements the NRVO; otherwise, the new implementation
> will produce *more* objects (see the 'a+b+c' example mentioned above).
> Second, for small classes like 'complex', the compiler optimizes lots
> of things in the later compilation passes. Intermediate 'objects' are
> still there, but the assembler code which copies the values is
> removed in a later optimization pass. This hides the fact that, from
> C++'s point of view, there still is a superfluous object in between.
> So, if there is no effect, why bother?
>
> The real value shows when you apply operators.hpp to large classes,
> e.g. matrices, vectors etc. When the compiler can't optimize the code
> in the background, it really helps to remove intermediate objects as
> early as possible in the compilation process. To test this, I wrote a
> small example program (benchmark.cc) which allows you to compare the
> old and the new implementation. For me, the new version is 15% faster
> than the old version (using GCC 3.1), YMMV. Applications that work
> with large matrices etc. are typically very performance-hungry, thus
> it is a very important area for an operator library to keep in mind.
>
> ----- The real world, part 2: Compilers -----
>
> The NRVO is still not very common, AFAIK. GCC had no NRVO before
> version 3.1; GCC 3.1 implements it correctly and achieves good
> results, as reported above. Some compilers may implement the NRVO but
> still have bugs or don't follow the standard closely. An example of
> this is Intel C++ 6.0, which allows the NRVO to be applied only if
> the type of the local variable matches the return type of the function
> exactly. The standard only requires the cv-unqualified types to match,
> which is what we need because of the 'const' in the return type. (1)
>
> A one-size-fits-all approach is a neat idea, but no more than that.
> For a compiler without a (correct) NRVO, the old code is faster, but
> not symmetric. Most users will prefer it anyway, so I supplied both
> versions in operators_new.hpp; they are switched by
>
> BOOST_NO_NRVO
>
> For compilers without NRVO, we need to change the config-headers. As a
> bonus, the user might set
>
> BOOST_FORCE_SYMMETRIC_OPERATORS
>
> to force the use of the new, symmetric implementation even for those
> compilers. This may result in slower code for these compilers, but for
> newer compilers that have the NRVO, there will be no difference.
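
One way the two switches could be wired together, as a rough sketch. Only the names BOOST_NO_NRVO and BOOST_FORCE_SYMMETRIC_OPERATORS come from the text above. How they combine is my guess at the intent, not the contents of operators_new.hpp, and 'SKETCH_USE_BY_VALUE_OPERATORS', 'addable_sketch', 'Money' and 'add_cents' are hypothetical.

```cpp
#include <cassert>

// Assumed wiring: fall back to the old by-value style only when the compiler
// lacks the NRVO and the user has not asked for the symmetric form.
#if defined(BOOST_NO_NRVO) && !defined(BOOST_FORCE_SYMMETRIC_OPERATORS)
#  define SKETCH_USE_BY_VALUE_OPERATORS 1
#else
#  define SKETCH_USE_BY_VALUE_OPERATORS 0
#endif

template <class T>
struct addable_sketch
{
#if SKETCH_USE_BY_VALUE_OPERATORS
  // Old style: fewer objects without the NRVO, but asymmetric.
  friend const T operator+( T lhs, const T& rhs ) { return lhs += rhs; }
#else
  // New style: symmetric, and as cheap as possible when the NRVO is applied.
  friend const T operator+( const T& lhs, const T& rhs )
    { T nrv( lhs ); nrv += rhs; return nrv; }
#endif
};

// Hypothetical user type that supplies only operator+=.
struct Money : addable_sketch<Money>
{
  int cents;
  explicit Money( int c ) : cents( c ) {}
  Money& operator+=( const Money& rhs ) { cents += rhs.cents; return *this; }
};

inline int add_cents( int a, int b )
{
  return ( Money( a ) + Money( b ) ).cents;
}
```

Either branch gives the same result for 'add_cents'; only the number of intermediate objects differs.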
>
> ----- Education :) -----
>
> The NRVO is not widely known, thus I decided to call the local
> variable 'nrv' instead of 'tmp' or 'result' or something similar. The
> reason is that 'nrv' is a hint to the unaware. If it is called 'tmp',
> people will remove it and return to the old implementation style
> without noticing what they have done when writing their own
> operators. When the variable is called 'nrv', chances are that anyone
> who reads the code wonders about the name and hopefully will start to
> ask questions.
>
> ----- Fin -----
>
> Any comments, suggestions, improvements, ...?
>
> Regards, Daniel
>
>
> PS: Thanks to John Potter for explaining why certain optimizations are
> not allowed, see csc++ "Temporaries and optimizations".
>
>
> (1) If you want to use the Intel C++ AND you want to use the NRVO,
> consider using this code:
>
> friend const T operator+( const T& lhs, const T& rhs )
> {
> const T nrv( lhs );
> *const_cast< T* >( &nrv ) += rhs;
> return nrv;
> }
>
> Please note that I don't want to show that the Intel compiler is a bad
> compiler - in fact it creates faster code than the GCC 3.1 once the
> work-around is applied. I just don't have any other compilers
> installed. :)