
Subject: Re: [proto] Holding children by copy or reference
From: Luc Danton (lucdanton_at_[hidden])
Date: 2013-10-15 01:51:12

On 2013-09-30 13:54, Mathias Gaunard wrote:
> Hi,
> A while ago, I recommended to set up domains so that Proto contains its
> children by value, except for terminals that should either be references
> or values depending on the lvalue-ness. This avoids dangling
> reference problems when storing expressions or using 'auto'.
> I also said there was no overhead to doing this in the case of Boost.SIMD.
> After having done more analyses with more complex code, it appears that
> there is indeed an overhead to doing this: it confuses the alias
> analysis of the compiler which becomes unable to perform some
> optimizations that it would otherwise normally perform.
> For example, an expression like this:
> r = a*b + a*b;
> will no longer get optimized to
> tmp = a*b;
> r = tmp + tmp;
> If terminals are held by reference, the compiler can also emit extra
> loads, which it doesn't do if the terminal is held by value or if
> all children are held by reference.
> It is a bit surprising that this affects compiler optimizations like
> this, but it is reproducible on both Clang and GCC, with all versions I
> have access to.
> Therefore, to avoid performance issues, I'm considering moving to always
> using references (with the default domain behaviour), and relying on
> BOOST_FORCEINLINE to make it work as expected.
> Of course this has the caveat that if the force inline is disabled (or
> doesn't work), then you'll get segmentation faults.


As a heads-up, I've made it a habit in C++11 to structure generic
'holders' or types as such:

     #include <utility> // std::forward

     template<typename Some, typename Parameters, typename Here>
     struct foo_type {
         // Encapsulation omitted for brevity

         foo_type(Some some, Parameters parameters, Here here)
             // Don't use std::move here
             : some(std::forward<Some>(some))
             , parameters(std::forward<Parameters>(parameters))
             , here(std::forward<Here>(here))
         {}

         Some some;
         Parameters parameters;
         Here here;

         /* How to use the data members: */

         /* example observer */
         Some& peek() { return some; }
         /* can be cv-qualified */
         Some const& peek() const { return some; }
         /* can be ref-qualified */
         Parameters fetch() &&
         { return std::forward<Parameters>(parameters); }

         /* meant to be called several times per lifetime */
         decltype(auto) bar()
         /* can be cv- and ref-qualified indifferently */
         {
             // don't forward, don't move
             return qux(some, parameters, here);
         }

         /* meant to be called at most once per lifetime */
         void zap()
         {
             // forwarding is a low-hanging optimization, e.g.:
             qux(std::forward<Some>(some)
                 , std::forward<Parameters>(parameters)
                 , std::forward<Here>(here));
         }
     };

     template<typename Some, typename Parameters, typename Here>
     foo_type<Some, Parameters, Here> foo(Some&& some
                                          , Parameters&& parameters
                                          , Here&& here)
     {
         return {
             std::forward<Some>(some)
             , std::forward<Parameters>(parameters)
             , std::forward<Here>(here)
         };
     }

Note that either auto f = foo(0, 'a', "c"); or auto&& f = foo(0, 'a',
"c"); is fine, with no dangling reference. Rvalue arguments to the foo
factory are stored as values, lvalue arguments as lvalue references. You
can still ask for rvalue reference members 'by hand' (e.g.
foo_type<int&&, int&&, int&&> f { std::move(i), std::move(i),
std::move(i) };), although I don't really use that functionality (save
with std::tuple, but that's another story).
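
A minimal sketch of the storage rule just described, assuming a cut-down two-member holder. The names holder, hold and check_storage are made up for illustration and are not part of the code above; the static_assert only spells out what template argument deduction is expected to produce.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical, cut-down variant of foo_type with two members.
template<typename Some, typename Here>
struct holder {
    holder(Some some, Here here)
        : some(std::forward<Some>(some))
        , here(std::forward<Here>(here))
    {}

    Some some;
    Here here;
};

// Factory: forwarding references decide value vs. reference storage.
template<typename Some, typename Here>
holder<Some, Here> hold(Some&& some, Here&& here)
{
    return { std::forward<Some>(some), std::forward<Here>(here) };
}

// Returns true when the holder stored its arguments as described.
inline bool check_storage()
{
    std::string name = "lvalue";
    auto h = hold(42, name); // rvalue int, lvalue std::string

    // Rvalue argument -> value member, lvalue argument -> lvalue reference.
    static_assert(
        std::is_same<decltype(h), holder<int, std::string&>>::value,
        "unexpected member types");

    name += "!"; // the change is visible through the reference member
    return h.some == 42 && &h.here == &name && h.here == "lvalue!";
}
```

Because the int is copied into the holder and the string is only referred to, h stays valid for as long as name does, which is the same lifetime rule that applies to any reference to name.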

For something like auto f = foo(1, 2, 3); auto g = foo(f, 4, 5); the
ints inside g are held as values, and f as a reference. Had
std::move(f) been used instead, f would have been moved into a copy
internal to g. In terms of an EDSL, both nodes and terminals can thus
be held as either references or values, depending on how they are
passed as arguments.
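
The nesting case can be sketched the same way, assuming a hypothetical single-child node type (node, make_node and check_nesting are illustrative names only): passing f as an lvalue makes the outer node refer to it, while passing std::move(f) makes the outer node own its own copy.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical single-child node, built the same way as foo_type.
template<typename Child>
struct node {
    node(Child child) : child(std::forward<Child>(child)) {}
    Child child;
};

template<typename Child>
node<Child> make_node(Child&& child)
{
    return { std::forward<Child>(child) };
}

// Returns true when lvalue children are referenced and rvalue ones owned.
inline bool check_nesting()
{
    auto f = make_node(1);            // node<int>
    auto g = make_node(f);            // f passed as an lvalue
    auto h = make_node(std::move(f)); // f passed as an rvalue

    static_assert(std::is_same<decltype(g), node<node<int>&>>::value,
                  "g should refer to f");
    static_assert(std::is_same<decltype(h), node<node<int>>>::value,
                  "h should own its child");

    f.child = 7; // seen through g, not through h's private copy
    return &g.child == &f && g.child.child == 7 && h.child.child == 1;
}
```

Note that g is only safe to use while f is alive, whereas h is self-contained.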

As I've said, I use this technique as a default and I do have a
run-of-the-mill lazy-eval EDSL where I put it to use. I cannot report
bad things happening (incl. with libstdc++ debug mode, and checking with
Valgrind). IME, when looking at the generated code, the compiler can see
through it most of the time. I have to warn though that I do not use the
technique for the sake of efficiency. I simply find it the most
convenient and elegant.
