From: Daniel Frey (d.frey_at_[hidden])
Date: 2002-10-13 04:06:06

On Sat, 12 Oct 2002 22:28:01 +0200, Terje Slettebø wrote:

> I've done some preliminary testing (only tested on one compiler, Intel
> C++ 7.0 beta), to test this hypothesis, to test the various ways of
> implementing operator+(). I made the following test program:

(Note that Intel C++ 6 had a bug in its NRVO: a named return value of
type T in a function which returns const T prevented the NRVO from being
applied. I have already told Intel and it's in their pipeline, but I have
not yet received a notification that the issue is solved. Maybe it's
already corrected in v7.)

> --- Start ---
> class Test
> {
> public:
> Test(int n) : num(n) {}
> Test &operator+=(const Test &other)
> {
> num+=other.num;
> return *this;
> }
> int num;
> int array[1024]; // Just so that copying shows up
> };

You might want to place some cout's in the ctor/dtor to see when objects
are created and destroyed. This is also important as it prevents
assembler-level optimizations and shows which *objects* are really
optimized away.
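
A minimal sketch of such an instrumented class, for illustration only. The
names TracedTest, copies, destructions and copies_for_one_addition are
hypothetical and not part of the quoted test program; the large array member
is left out because the counters already make each copy visible. The number
of copies reported depends on whether the compiler applies the NRVO, so no
particular count is claimed here.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical instrumented variant of the quoted Test class.
class TracedTest
{
public:
    static int copies;        // copy constructions so far
    static int destructions;  // destructor calls so far

    explicit TracedTest(int n) : num(n)
    {
        std::cout << "ctor(" << num << ")\n";
    }
    TracedTest(const TracedTest &other) : num(other.num)
    {
        ++copies;
        std::cout << "copy(" << num << ")\n";
    }
    ~TracedTest()
    {
        ++destructions;
        std::cout << "dtor(" << num << ")\n";
    }
    TracedTest &operator+=(const TracedTest &other)
    {
        num += other.num;
        return *this;
    }
    int num;
};

int TracedTest::copies = 0;
int TracedTest::destructions = 0;

// Second version from the quoted message: a named local is returned,
// which the compiler may construct directly in the caller's object.
inline TracedTest operator+(const TracedTest &t1, const TracedTest &t2)
{
    TracedTest nrv(t1);
    nrv += t2;
    return nrv;
}

// Performs one addition and returns how many copy constructions it caused.
int copies_for_one_addition()
{
    TracedTest a(1), b(2);
    const int before = TracedTest::copies;
    TracedTest sum = a + b;
    assert(sum.num == 3);
    return TracedTest::copies - before;
}
```

Calling copies_for_one_addition() from main() prints one line per
construction, copy and destruction, so the trace shows directly which
temporaries the compiler kept.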

> Test test_num1(1);
> Test test_num2(2);
> int num;
> int array;
> int main()
> {
> Test test_num=test_num1+test_num2;

It is also important to test expressions like x = a+b+c and x = a+(b+c) to
see the real difference of taking parameters by value :) Especially the
difference between taking the first and the second parameter by value...
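
A rough sketch of how such a comparison could be set up. Counted, add_ref,
add_val, Report and count_copies are hypothetical names; named functions
stand in for the two operator+ variants so that both can be applied to the
same type. The counts are compiler-dependent. What can be said for current
C++ is that copy elision on return does not apply to a function parameter,
so add_val has to copy t1 out on every call.

```cpp
#include <cassert>

// Hypothetical stand-in for the quoted Test class that counts its copies.
struct Counted
{
    static int copies;
    explicit Counted(int n) : num(n) {}
    Counted(const Counted &other) : num(other.num) { ++copies; }
    Counted &operator+=(const Counted &other)
    {
        num += other.num;
        return *this;
    }
    int num;
};

int Counted::copies = 0;

// Both parameters by const reference; the named local is an NRVO candidate.
inline Counted add_ref(const Counted &t1, const Counted &t2)
{
    Counted nrv(t1);
    nrv += t2;
    return nrv;
}

// First parameter by value, as in the third quoted version.
inline Counted add_val(Counted t1, const Counted &t2)
{
    t1 += t2;
    return t1;
}

struct Report
{
    int ref_left;   // add_ref used for a+b+c
    int ref_right;  // add_ref used for a+(b+c)
    int val_left;   // add_val used for a+b+c
    int val_right;  // add_val used for a+(b+c)
};

// Evaluates both groupings with both variants and records the copy counts.
Report count_copies()
{
    Counted a(1), b(2), c(3);
    Report r;

    int before = Counted::copies;
    { Counted x = add_ref(add_ref(a, b), c); assert(x.num == 6); }
    r.ref_left = Counted::copies - before;

    before = Counted::copies;
    { Counted x = add_ref(a, add_ref(b, c)); assert(x.num == 6); }
    r.ref_right = Counted::copies - before;

    before = Counted::copies;
    { Counted x = add_val(add_val(a, b), c); assert(x.num == 6); }
    r.val_left = Counted::copies - before;

    before = Counted::copies;
    { Counted x = add_val(a, add_val(b, c)); assert(x.num == 6); }
    r.val_right = Counted::copies - before;

    return r;
}
```

In a+b+c the temporary from a+b can initialize the by-value parameter
directly, while in a+(b+c) the temporary arrives as the second argument and
a still has to be copied, which is why the two groupings are worth
measuring separately.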

> num=test_num.num;
> array=test_num.array[0];
> }
> First operator+():
> Test operator+(const Test &t1,const Test &t2) {
> return Test(t1)+=t2;
> }
> [snip]
> Note the two "rep movs". This shows copying of the "array" member.
> Next version:
> inline Test operator+(const Test &t1,const Test &t2) {
> Test nrv(t1);
> nrv+=t2;
> return nrv;
> }
> [snip]
> Preliminary tests seem to confirm what Howard and Daniel said, that
> using a named temporary, rather than the constructor call with "+=", may
> make more optimised code. There's only one "rep movs" (for copying the
> array) in the code above, compared to two in the first one. The one copy
> is needed for the receiving variable, "test_num", so the above is in
> fact optimal code, with no unnecessary temporaries being created.
> Let's try the third alternative:
> inline Test operator+(Test t1,const Test &t2) {
> t1+=t2;
> return t1;
> }
> [snip]
> Hm. Back to having two copies, again (two "rep movs").
> Note, this is only tested on _one_ compiler, but it may give us
> something to go on. From these results, Daniel's suggestion (second
> version here) turned out to be the most optimised one.
> It seems that, at least for this compiler, Andrei's suggestion to pass
> by value if you need to make a copy, anyway, resulted in less optimised
> code. Considering that, in that case, it has to make a copy, to call the
> function, then it's already too late to use the NRVO in the function, as
> it's already a copy, so the above results makes sense.

The point IMHO is that taking the parameter by value may lead to equally
optimized code for *some* cases. For the general case, only the NRVO may
lead to optimized code for all cases. And a function which takes a const
T& and makes a copy of it is IMHO not lying. If it makes a copy, that is an
implementation detail. I have seen implementations of operator+ which don't
make a copy of the arguments, but why should all these details be
reflected in the function's signature?

> To quote again from above:
>> Taking const& T
>> as arguments in /any/ function when you actually *do* need a copy
>> chokes the compiler (and Zuto) and practically forbids them to make
>> important optimizations.
> At least for Intel C++, this turns out to be the other way around.
> Calling by value prevents the NRVO.

Yes. And it's not limited to Intel C++, as the standard itself
requires compilers to behave this way. A compiler is basically allowed to
remove temporaries only if it can figure out that this does not have any
observable side effects, or if there are special rules which allow it to
remove temporaries even if there are observable side effects. I have never
seen any compiler which is smart enough to do the former for objects like
the above 'Test' objects. This is the reason why I think it is important to
apply the NRVO, as it is such a special rule and permits an optimization
that the compiler cannot figure out by itself.
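
To make this concrete, here is a small sketch that contrasts the first and
second versions from the quoted message using a copy constructor with a
visible side effect. Noisy, add_temp, add_named and the two counting
functions are hypothetical names. The exact counts depend on the compiler
and on whether it applies the NRVO.

```cpp
#include <cassert>

// Hypothetical class whose copy constructor has an observable side effect.
struct Noisy
{
    static int copies;
    explicit Noisy(int n) : num(n) {}
    Noisy(const Noisy &other) : num(other.num) { ++copies; }
    Noisy &operator+=(const Noisy &other)
    {
        num += other.num;
        return *this;
    }
    int num;
};

int Noisy::copies = 0;

// First quoted version: operator+= yields a Noisy&, so the return value
// is copy-constructed from an lvalue and that copy cannot be elided.
inline Noisy add_temp(const Noisy &t1, const Noisy &t2)
{
    return Noisy(t1) += t2;
}

// Second quoted version: the named local may be built directly in the
// caller's object, even though the skipped copy has a side effect.
inline Noisy add_named(const Noisy &t1, const Noisy &t2)
{
    Noisy nrv(t1);
    nrv += t2;
    return nrv;
}

// Copies caused by one addition using the temporary-based version.
int copies_with_temp()
{
    Noisy a(1), b(2);
    const int before = Noisy::copies;
    Noisy x = add_temp(a, b);
    assert(x.num == 3);
    return Noisy::copies - before;
}

// Copies caused by one addition using the named-local version.
int copies_with_named()
{
    Noisy a(1), b(2);
    const int before = Noisy::copies;
    Noisy x = add_named(a, b);
    assert(x.num == 3);
    return Noisy::copies - before;
}
```

With a compiler that applies the NRVO, I would expect copies_with_named()
to report one copy fewer than copies_with_temp(), which matches the one
versus two "rep movs" in the quoted results.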

Regards, Daniel
