Subject: Re: [boost] [Hana] Formal review for Hana
From: Louis Dionne (ldionne.2_at_[hidden])
Date: 2015-06-20 17:17:45


Zach Laine <whatwasthataddress <at> gmail.com> writes:

>
> On Thu, Jun 18, 2015 at 4:27 PM, Louis Dionne <ldionne.2 <at> gmail.com> wrote:
> > [...]
> >
> > To get there, I'd like to make sure I understand exactly what operation
> > you're trying to avoid. Let's assume you wrote the following instead of
> > a transform_mutate equivalent:
> >
> > hana::tuple<T...> xs{...};
> > hana::tuple<U...> ys;
> > ys = hana::transform(xs, f);
> >
> > This will first apply f to each element of xs, and then store the
> > temporary values in a (temporary) tuple by moving them into place. This
> > will then move-assign each element from the temporary tuple into ys.
> >
> > __Is the first move what you are trying to avoid?__
> >
>
> No, I'm trying to get rid of the temporary altogether. We all know that
> copies of temporaries get RVO'd out of code like the above a lot of the
> time, *but not always*. I want a guarantee that I don't need to rely on
> RVO in a particular case, if efficiency is critical.

Sorry I'm being so slow, but do you mean get rid of the temporary tuple or
the temporary value? Regarding the temporary value, I think there just isn't
a way to get rid of it. When you write

    T y = f(x);

there is a temporary object created by f(x) and then moved into y, right?
Similarly, if you have

    T y;
    y = f(x);

there's a temporary created by f(x) that gets move-assigned to y. In all
cases, there's a temporary value created, and you're relying on the optimizer
to elide it. Am I misunderstanding something fundamental about C++, or just
being thick?
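
To make the two cases above concrete, here is a minimal sketch that counts
moves. The `Tracer` type, this particular `f`, and the two helper functions
are hypothetical names introduced only for illustration. Under C++17 rules
the initialization case involves no move at all, because `f(x)` initializes
`y` directly; before C++17 the elision is permitted but not required. In the
assignment case `y` already exists, so the result of `f(x)` has to be
move-assigned into it and that move cannot be elided.

```cpp
#include <cassert>

// Hypothetical tracer type (not part of Hana): counts how many times it is
// move-constructed and move-assigned.
struct Tracer {
    static int move_ctors;
    static int move_assigns;
    int value = 0;

    Tracer() = default;
    explicit Tracer(int v) : value(v) {}
    Tracer(Tracer const&) = default;
    Tracer(Tracer&& other) noexcept : value(other.value) { ++move_ctors; }
    Tracer& operator=(Tracer const&) = default;
    Tracer& operator=(Tracer&& other) noexcept {
        value = other.value;
        ++move_assigns;
        return *this;
    }
};
int Tracer::move_ctors = 0;
int Tracer::move_assigns = 0;

// Stand-in for f: returns its result as a prvalue.
Tracer f(Tracer const& x) { return Tracer{x.value + 1}; }

// Case 1: T y = f(x);  Returns the number of move constructions observed.
int moves_on_initialization() {
    Tracer x{1};
    Tracer::move_ctors = 0;
    Tracer y = f(x);
    assert(y.value == 2);
    return Tracer::move_ctors;
}

// Case 2: T y; y = f(x);  Returns the number of move assignments observed.
int moves_on_assignment() {
    Tracer x{1};
    Tracer y;
    Tracer::move_assigns = 0;
    y = f(x);
    assert(y.value == 2);
    return Tracer::move_assigns;
}
```

Calling `moves_on_assignment()` reports the single move assignment from the
temporary, which is the per-element cost a mutating algorithm would still
pay; what it would save is the intermediate tuple.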

I'll take it that you want to get rid of the temporary tuple. In this case, it
is true that using a mutating algorithm will avoid the creation of a temporary
tuple.

To achieve this, I see three main solutions. The first one is to provide
mutating algorithms. I don't like that, but it solves your problem.
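
As a rough sketch of what such a mutating algorithm could look like, the
following uses `std::tuple` as a stand-in for `hana::tuple`. The name
`transform_into` and its signature are assumptions made for this example,
not an existing or proposed Hana interface.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>

// Hypothetical helper, not a Hana algorithm: assigns f(xs[i]) to ys[i] for
// each index i, so no intermediate tuple of results is ever built.
template <typename Xs, typename Ys, typename F, std::size_t... I>
void transform_into_impl(Xs const& xs, Ys& ys, F& f, std::index_sequence<I...>) {
    // Expanding the pack inside a braced initializer list evaluates the
    // assignments from left to right.
    int expand[] = {0, ((void)(std::get<I>(ys) = f(std::get<I>(xs))), 0)...};
    (void)expand;
}

template <typename... T, typename... U, typename F>
void transform_into(std::tuple<T...> const& xs, std::tuple<U...>& ys, F f) {
    static_assert(sizeof...(T) == sizeof...(U), "tuples must have the same size");
    transform_into_impl(xs, ys, f, std::index_sequence_for<T...>{});
}
```

With this, `transform_into(xs, ys, f)` plays the role of
`ys = hana::transform(xs, f)`, at the cost of an algorithm that writes
through an output parameter.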

The second one is to provide lazy views a la Fusion that would compute the
results on the fly. When you assign a view to a sequence, each element would
be computed and then assigned directly, without creating a temporary tuple.
I like this better, but it might have a non-trivial impact on the design of
the library and it also represents a lot of work.
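
To illustrate the idea only, here is a minimal sketch of a lazy transform
view over `std::tuple`. The names `transform_view`, `lazy_transform` and
`assign_from_view` are hypothetical; neither Fusion's views nor anything in
Hana is claimed to look like this.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>

// Hypothetical lazy view: stores a reference to the source tuple plus the
// function, and computes f(xs[i]) only when element i is requested.
template <typename Xs, typename F>
struct transform_view {
    Xs const& xs;  // the source tuple must outlive the view
    F f;

    template <std::size_t I>
    decltype(auto) at() const { return f(std::get<I>(xs)); }
};

// Creating the view does not call f.
template <typename Xs, typename F>
transform_view<Xs, F> lazy_transform(Xs const& xs, F f) {
    return {xs, std::move(f)};
}

template <typename Ys, typename View, std::size_t... I>
void assign_from_view_impl(Ys& ys, View const& view, std::index_sequence<I...>) {
    // Each element is computed and assigned in turn; no tuple of results
    // exists at any point.
    int expand[] = {0, ((void)(std::get<I>(ys) = view.template at<I>()), 0)...};
    (void)expand;
}

template <typename... U, typename View>
void assign_from_view(std::tuple<U...>& ys, View const& view) {
    assign_from_view_impl(ys, view, std::index_sequence_for<U...>{});
}
```

In a real design the assignment would presumably be spelled `ys = view`
through the sequence's assignment operator; the free function here just keeps
the sketch short.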

The third one is to consider this a corner case, assume the optimizer
does its job properly most of the time, and let performance freaks write

    for_each(range(int_<0>, int_<n>), [&](auto i) {
        output[i] = f(input[i]);
    });

I'm not sure which one is the best resolution.

Regards,
Louis

