From: Andrei Alexandrescu (andrewalex_at_[hidden])
Date: 2002-10-21 20:31:49


"Daniel Frey" <d.frey_at_[hidden]> wrote in message
news:ap1v2g$ttq$1_at_main.gmane.org...
> Please apply or discuss as needed :)

Cool. I'd like to use this as a pretext for trying to understand the way RVO
in its different flavors is implemented in various compilers.

1. What all compilers are guaranteed to do

Consider a bona fide T type:

struct T
{
    T(const T&);
};

T Fun()
{
    return expression;
}

T obj(Fun());

In this setting, absent any optimization, the compiler copies the value of
expression into a temporary and then copies that temporary into obj. The
copy constructor is therefore invoked at most two times.
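
To make the "at most two" bound observable, here is a minimal sketch that counts copy-constructor calls. Counted, MakeCounted and CountCopiesOnReturn are hypothetical names used only for illustration; the exact count depends on the compiler and its settings, so the sketch checks the upper bound rather than a specific value.

```cpp
#include <cassert>

// Hypothetical instrumented type: counts copy-constructor calls.
struct Counted
{
    static int copies;
    Counted() {}
    Counted(const Counted&) { ++copies; }
};
int Counted::copies = 0;

Counted MakeCounted()
{
    Counted local;
    return local;   // this copy may or may not be elided
}

// Returns how many copy constructions the initialization cost.
int CountCopiesOnReturn()
{
    Counted::copies = 0;
    Counted obj(MakeCounted());
    (void)obj;      // silence unused-variable warnings
    assert(Counted::copies <= 2);
    return Counted::copies;
}
```

Calling CountCopiesOnReturn() on different compilers (or with elision disabled, where a compiler offers such a switch) shows how many of the two permitted copies actually happen.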

2. What most compilers actually do

When generating code, most compilers do something like this:

void Fun(void* __pResult)
{
    new(__pResult) T(expression);
}

char __buf[sizeof(T)]; // assume proper alignment
Fun(&__buf);
T& obj = *reinterpret_cast<T*>(&__buf);

As you see, one copy constructor call remains: the one that takes
expression as its source.
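
The lowering above can be written out by hand as ordinary C++. This is only a sketch of the idea, not what any particular compiler emits: Tracked, FunLowered and CopiesInLoweredCall are hypothetical names, and FunLowered returns the pointer produced by placement new (instead of void) so that the caller does not need a cast.

```cpp
#include <cassert>
#include <new>

// Hypothetical instrumented type: counts copy-constructor calls.
struct Tracked
{
    static int copies;
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
};
int Tracked::copies = 0;

// Hand-lowered form of "Tracked Fun() { return expression; }":
// the caller supplies raw storage for the result.
Tracked* FunLowered(void* pResult, const Tracked& expression)
{
    return new (pResult) Tracked(expression);   // the one remaining copy
}

int CopiesInLoweredCall()
{
    Tracked::copies = 0;
    Tracked source(42);
    alignas(Tracked) unsigned char buf[sizeof(Tracked)];  // properly aligned buffer
    Tracked* obj = FunLowered(buf, source);
    assert(obj->value == 42);
    obj->~Tracked();    // placement new requires an explicit destructor call
    return Tracked::copies;
}
```

In this hand-written form exactly one copy construction takes place, which matches the single copy described above.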

3. The URVO (Unnamed RVO)

This is the easiest optimization to perform. Whenever /expression/ is a
temporary of type T, the compiler fuses the temporary with the constructor
call. For example, if expression is T(a, b, c), then the compiler will
generate:

    new(__pResult) T(a, b, c);

instead of:

    new(__pResult) T(T(a, b, c));

So far, so good. However, when a named value is returned, such as 'result'
(result being a variable of type T), the copy constructor is still used.
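
A small probe for the unnamed case, as a sketch: Triple, MakeUnnamed and CopiesForUnnamedReturn are hypothetical names. One fact worth stating is that since C++17 this particular elision is mandatory, so the count must be zero there; under earlier standards it is only permitted, so the sketch checks the zero count only when compiled as C++17 or later.

```cpp
#include <cassert>

// Hypothetical instrumented type with a three-argument constructor.
struct Triple
{
    static int copies;
    int a, b, c;
    Triple(int a_, int b_, int c_) : a(a_), b(b_), c(c_) {}
    Triple(const Triple& o) : a(o.a), b(o.b), c(o.c) { ++copies; }
};
int Triple::copies = 0;

// The returned expression is an unnamed temporary of type Triple.
Triple MakeUnnamed(int a, int b, int c)
{
    return Triple(a, b, c);
}

int CopiesForUnnamedReturn()
{
    Triple::copies = 0;
    Triple obj(MakeUnnamed(1, 2, 3));
    assert(obj.a == 1 && obj.b == 2 && obj.c == 3);
#if __cplusplus >= 201703L
    assert(Triple::copies == 0);   // elision is guaranteed from C++17 on
#endif
    return Triple::copies;
}
```

On a pre-C++17 compiler the returned count tells you whether URVO was applied: zero means both permitted copies were elided.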

4. The NRVO (Named RVO)

This is a more advanced optimization. The compiler is smart enough to detect
patterns such as:

T Fun()
{
    T result(a, b, c);
    ....
    return result;
}

and generates code such as:

void Fun(void* __pResult)
{
    new(__pResult) T(a, b, c);
    ...
}

so it basically creates result at the address received from the caller.
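
One way to check whether a given compiler performs this transformation is to record where the named object was constructed and compare that with the address of the final object. This is a heuristic sketch only; Probe, MakeNamed, NrvoReport and ObserveNrvo are hypothetical names, and addresses are stored as integers so that no stale pointer is ever compared.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical probe: remembers the address at which the original
// (non-copied) object was constructed, and counts copies.
struct Probe
{
    static int copies;
    std::uintptr_t builtAt;
    Probe() : builtAt(reinterpret_cast<std::uintptr_t>(this)) {}
    Probe(const Probe& o) : builtAt(o.builtAt) { ++copies; }
};
int Probe::copies = 0;

Probe MakeNamed()
{
    Probe result;
    return result;      // candidate for NRVO
}

struct NrvoReport
{
    bool sameAddress;   // true if result was built where obj lives
    int copies;         // copy constructions observed
};

NrvoReport ObserveNrvo()
{
    Probe::copies = 0;
    Probe obj(MakeNamed());
    NrvoReport report;
    report.sameAddress =
        (obj.builtAt == reinterpret_cast<std::uintptr_t>(&obj));
    report.copies = Probe::copies;
    return report;
}
```

A matching address together with a zero copy count strongly suggests that result was created directly at the address received from the caller; a nonzero count means at least one copy survived.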

It is unclear to me whether (and which) compilers that do NRVO can still
apply it in the presence of multiple return statements.

----------------

An interesting tweak that simulates NRVO on not-so-smart compilers is:

struct T
{
    T(const T&);
    T(T&, bool move);
};

If the second constructor is called, a move construction is done.

On a compiler that doesn't know NRVO, you can say:

T Fun()
{
    T result;
    ...
    return T(result, true);
}

Now what happens is, a temporary will be created via a move constructor.
Then, because it's a temporary, the compiler will nicely apply URVO to it
and will bind it to the final result.
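
Here is a fuller sketch of the tweak for a type that owns a heap buffer, so that the difference between copying and moving is visible. Buffer, MakeBuffer and DeepCopiesWithMoveTrick are hypothetical names, and the ownership-stealing body of the two-argument constructor is my assumption about what "move construction" would do for such a type.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical buffer-owning type; counts deep copies.
struct Buffer
{
    static int deepCopies;
    int* data;
    std::size_t size;

    explicit Buffer(std::size_t n) : data(new int[n]()), size(n) {}

    // Ordinary copy constructor: allocates and copies every element.
    Buffer(const Buffer& o) : data(new int[o.size]), size(o.size)
    {
        ++deepCopies;
        for (std::size_t i = 0; i < size; ++i) data[i] = o.data[i];
    }

    // "Move" constructor from the post: steals the source's storage.
    Buffer(Buffer& o, bool /*move*/) : data(o.data), size(o.size)
    {
        o.data = nullptr;   // leave the source empty but destructible
        o.size = 0;
    }

    ~Buffer() { delete[] data; }

private:
    Buffer& operator=(const Buffer&);   // assignment not needed here
};
int Buffer::deepCopies = 0;

Buffer MakeBuffer(std::size_t n)
{
    Buffer result(n);
    result.data[0] = 7;
    return Buffer(result, true);    // unnamed temporary: URVO candidate
}

int DeepCopiesWithMoveTrick()
{
    Buffer::deepCopies = 0;
    Buffer obj(MakeBuffer(4));
    assert(obj.size == 4 && obj.data[0] == 7);
    return Buffer::deepCopies;
}
```

If the compiler applies URVO to the returned temporary, DeepCopiesWithMoveTrick() reports zero deep copies even though result is a named object; without URVO the count can be as high as two.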

I guess I am discussing things that have been hashed to death before. Are my
assessments correct, and which compilers implement which optimizations?

Thanks,

Andrei

--
All new!  THE C++ Seminar: Oct. 28-30 in Vancouver, WA.
http://www.thecppseminar.com/
