
Boost : 
Subject: Re: [boost] [move] new rvalue reference emulation
From: Jeffrey Lee Hellrung, Jr. (jeffrey.hellrung_at_[hidden])
Date: 20110410 21:22:02
On Mon, Apr 4, 2011 at 10:36 AM, Jeffrey Lee Hellrung, Jr. <
jeffrey.hellrung_at_[hidden]> wrote:
[...snipping original description...]
I have a followup to my original message regarding a typeless emulated
rvalue reference. I was disappointed in the lack of feedback (other than
Dave) on the original message, which might mean that those who took the time
to parse the message didn't think it was worth pursuing; maybe C++03 is
already too outdated and we should forget about it? However, at the moment,
I prefer to think that everyone (except, again, Dave; thanks Dave!) was
simply too busy to delve into what I was trying to convey. So let me try
again, with a longer, more detailed, and more leisurely explanation, and
this time I additionally present a partial solution to extend the technique
to quasipolymorphic result types (I'll explain what I mean below).
If you're already plenty familiar with the difficulty of capturing arbitrary
rvalues in C++03, you can skip to "TYPELESS EMULATED RVALUE REFERENCES".
The "new" stuff dealing with quasipolymorphic result types is even further
down, at "QUASIPOLYMORPHIC RESULT TYPES".
[In the exposition below, whenever I say "lvaluetoconst" or
"lvaluetononconst", I usually mean "lvaluereferencetoconst" and
"lvaluereferencetononconst", respectively.]
First, the big picture: In C++03, it is notoriously difficult (and, until I
discovered this use of typeless emulated rvalue references, I thought it was
impossible) to capture *arbitrary* rvalues *and* distinguish between
captured rvalues and lvaluestoconst. To be specific, one generally uses a
"template< class T > void foo(T const &)" overload to capture arbitrary
rvalues, but unfortunately this overload also binds to lvaluestoconst.
This makes it impossible to distinguish between a captured rvalue and an
lvaluetoconst, hence the function body must go the safe route and assume
the argument is const, precluding any kind of move optimizations.
Now, if we restrict the problem to capturing *specific* rvalues instead of
*arbitrary* rvalues, then things look better. Boost.Move defines a class
template rv<T> and introduces the convention that rv<T>& represents an
(emulated) rvalue reference. A moveemulationenabled class X defines a
nonconst conversion operator to rv<X>&, which basically just amounts to a
static cast: "return static_cast< rv<X>& >(*this);" (see the Boost.Move
documentation, either in trunk or the sandbox; or past mailing list
messages; if this is unfamiliar). This means that overloads with argument
rv<X>& are candidates to capture rvalues of type X, and leads to defining
the following 3 overloads for foo:
template< class T >
void foo(T&); // 1
template< class T >
typename boost::disable_if< boost::is_convertible< T&, rv<X>& > >
void foo(T const &); // 2
void foo(rv<X>&); // 3
If one goes through all the various argument binding possibilities, one
finds that: the "foo(T&); // 1" overload captures lvaluestononconst, X
const & (due to the use of SFINAE via disable_if in the "foo(T const &); //
2" overload), and explicitly created emulated rvalue references *except*
rv<X>& (these are captured by the "foo(rv<X>&); // 3" overload); the "foo(T
const &); // 2" overload captures lvaluestoconst *except* X const & (these
are captured by the "foo(T&); // 1" overload due to disable_if) and rvalues
*except* Xrvalues (again, thanks to disable_if); finally, the "foo(rv<X>&);
// 3" overload captures Xrvalues and explicitly created
emulatedrvaluereferencestoX. To be absolutely clear:
template< class T > T make();
// ...
X x;
X const cx;
Y y; // Y != X
Y const cy;
foo(x); // 1, T = X
foo(cx); // 1, T = X const
foo(make<X>()); // 3 (*not* 2 !!!), as desired :)
foo(y); // 1, T = Y
foo(cy); // 2, T = Y
foo(make<Y>()); // 2, T = Y, since we didn't tell the foo overloads about Y
:(
// and if Y is moveemulationenabled...
foo(static_cast< rv<Y>& >(y)); // 3
Thus, Xrvalues are *accurately* captured, meaning that an Xrvalue argument
to foo is bound to an emulatedrvaluereferencetoX, rv<X>&, allowing the
argument to be moved rather than copied. This is better than using only the
first two foo overloads, especially if the class X plays a prominent role in
the specific application of the foo overload set. For example, the
push_back member function of vector<X> can accurately capture Xrvalues
using this technique, which is highly desirable.
Clearly, this technique can be extended to any finite, compiletimeknown
set of "special" moveemulationenabled types X0, ..., X[n1] such that all
Xirvalues (i = 0, ..., n1) are accurately captured. This would be
applicable to, say, variant< X0, ..., X[n1] > constructor and assignment
overloads.
The ability to accurately capture any rvalue of a type within a specific,
compiletimeknown set of "special" types is quite powerful, and is, indeed,
sufficient for many applications. However, many other applications don't
have such a set of special types; instead, it is desirable to accurately
capture arbitrary rvalues of a priori unknown types. At first, one might
think that the emulated rvalue reference rv<T>& might come to the rescue and
try to use a "template< class T > void foo2(rv<T>&)" overload. However,
this overload cannot bind to rvalues:
template< class T >
void foo2(rv<T>&);
// ...
foo2(make<X>()); // will not bind to foo2<T>(rv<T>&)! :(
Unfortunately, the "T" in the "rv<T>&" argument is a deduced template
parameter, so there's no way for the compiler to know what T should be when
looking for conversions from X to rv<T>&. It might look obvious that T
should be X, especially considering that X provides an implicit conversion
to rv<X>& and this fact is plainly available to the compiler, but, trust me,
things just don't (and can't) work that way. The problem appears to lie in
the template parameter deduction, and this motivates the introduction of a
TYPELESS EMULATED RVALUE REFERENCE
What we're going to try to do is capture arbitrary rvalues in a typeless
fashion, thereby obviating the need for any template parameter deduction and
allowing the compiler to locate a conversion operator from the rvalue to the
typeless emulated rvalue reference:
template< class T >
void foo3(genrv< ??? > const x)
{ /* ... */ }
genrv< ??? > is basically just a wrapper around a void pointer which points
to the typeerased object ("gen" being short for "generic"). Of course, we
have to be able to recover the original type of the object prior to
typeerasure, which we effect via dynamic dispatch through a function
reference. We use the function reference together with the visitor pattern
to be able to "do something" with caught object. A basic formulation of the
pieces looks as follows:
// New typeless emulated rvalue reference class.
template< class Visitor >
struct genrv
{
// Adding support for nonvoid result types is easy, but we'll use void
for now to simplify exposition.
typedef void result_type;
template< class T >
explicit genrv(T* const p)
// Set the function reference to a specific instantiation of the
static apply member function.
// This way, we can recover the type of the object pointed to by mp
later.
: m_apply(apply<T>),
mp(static_cast< void* >(p))
{ }
result_type operator()(Visitor visitor) const
{ return m_apply(visitor, mp); }
private:
template< class T >
static result_type apply(Visitor visitor, void* const p)
// Explicitly cast the object to an emulated rvalue reference to ensure
the right overload is selected.
{ return visitor(static_cast< rv<T>& >(*static_cast< T* >(p))); }
result_type (&m_apply)(Visitor, void* const);
void* const mp;
};
// ...
// Assume X is just a typical moveemulationenabled class...
struct X
{
// ...
// The usual conversion to rv<X>&, as introduced by Boost.Move.
operator rv<X>&() { return static_cast< rv<X>& >(*this); }
// New conversion operator to genrv<V>!
template< class V >
operator genrv<V>() { return genrv<V>(this); }
};
// ...
// Typical function object...
struct foo4
{
template< class T >
void operator()(T&) const; // 1
typedef genrv< foo4 > genrv_type;
template< class T >
typename boost::disable_if< boost::is_convertible< T&, genrv_type >
>::type
operator()(T const &) const; // 2
// The disable_if in "operator()(T const &) const; // 2" forces rvalues
of moveemulationenabled classes to bind here.
void operator()(genrv_type const x) const // 3
// Dispatch to overload "operator()(T&) const; // 1", where T = rv<
[whatever type x is wrapping] >.
{ return x(*this); }
};
Let's go through what's happening here. Suppose foo4 is passed an Xrvalue
as an argument. This rvalue cannot bind to the "operator()(T&) const; // 1"
overload; the fact that X has an implicit conversion to foo4::genrv_type
means that it cannot bind to "operator()(T const &) const; // 2" due to the
disable_if *and* that it *can* bind to "void operator()(genrv_type const x)
const // 3". So overload 3 is chosen, requiring the implicit conversion
from the Xrvalue to a genrv_type. The genrv_type is constructed such that
its m_apply member variable references genrv_type::apply<X>(Visitor, void*
const) and its mp member variable points to the Xrvalue. Then we *visit*
the genrv_type object with the foo4 function object (this is the "x(*this)"
expression), causing a runtime dispatch to genrv_type::apply<X>(Visitor,
void* const) via the m_apply function reference.
genrv_type::apply<X>(Visitor, void* const) "knows" the "real" type of the
object pointed to be the void pointer is X, so it can safely cast up to the
emulated rvalue reference rv<X>& and pass this to the visitor object.
Execution thus passes to the "void operator()(T&) const; // 1" overload,
with T = rv<X>, which is the desired outcome. The Xrvalue is (eventually)
bound to an emulated rvalue reference, without foo4 possessing any special
knowledge of X itself!
One can also verify that passing an lvalue or a nonmoveemulationenabled
rvalue as an argument to foo4 will bind to one of the first two overloads,
as desired.
That's the basic idea. Now, the drawbacks. First, of course, there's the
dynamic dispatch through the function reference to recover the original type
of the object. Unfortunately, it looks like MSVC9 and the version of GCC I
tried (a relatively recent version) were unable to eliminate the dispatch
(default release mode settings for MSVC9; only added O3 to GCC), implying
some runtime overhead. However, the cost of such an indirection would
likely be compensated many times over by through the avoidance of
unnecessary copying of movable rvalues. In other words, this is still
better than capturing by "T const &".
The second (and, in my opinion, more serious) drawback is a pretty severe
restriction on the result type. Fully polymorphic result types (that is,
result types that depend arbitrarily on the argument type(s)) are out of the
question. In other words, there's no way to do the equivalent of the
following C++0x code in C++03 without result_of::foo5 possessing some kind
of almost trivial "structure", like being a constant metafunction.
namespace result_of
{ template< class T > struct foo5 { typedef /* ... */ type; }; }
template< class T >
typename result_of::foo5<T>::type
foo5(T&& x);
One can support nonvoid monomorphic result types fairly easily by only a
slight adjustment of the genrv class template and the genrv conversion
operator:
// Adding a Result template parameter...not much else changes.
template< class Visitor, class Result = void >
struct genrv
{
typedef Result result_type;
template< class T >
explicit genrv(T* const p)
: m_apply(apply<T>),
mp(static_cast< void* >(p))
{ }
result_type operator()(Visitor visitor) const
{ return m_apply(visitor, mp); }
private:
template< class T >
static result_type apply(Visitor visitor, void* const p)
{ return visitor(static_cast< rv<T>& >(*static_cast< T* >(p))); }
result_type (&m_apply)(Visitor, void* const);
void* const mp;
};
// ...
struct X
{
// ...
operator rv<X>&() { return static_cast< rv<X>& >(*this); }
template< class V, class R >
operator genrv<V,R>() { return genrv<V,R>(this); }
};
// ...
struct foo6
{
typedef /*...*/ result_type;
template< class T >
result_type operator()(T&) const; // 1
// Just need to specify the Result template parameter in genrv...
typedef genrv< foo6, result_type > genrv_type;
template< class T >
typename boost::disable_if< boost::is_convertible< T&, genrv_type >,
result_type >::type
operator()(T const &) const; // 2
result_type operator()(genrv_type const x) const // 3
{ return x(*this); }
};
This was as far as I had gotten before, in the original message. Today, I
got one step further, figuring out how to support
QUASIPOLYMORPHIC RESULT TYPES
By a quasipolymorphic result type, I mean that the result type *can* depend
on the argument type, but only in a very limited way. Specifically, the
following technique allows one to support any finite, compiletimeknown set
of result types. The basic idea here is to be able to "tag" certain genrv
instantiations and limit the implicit conversions to these instantiations.
Let's use a concrete example with just two result types. Suppose we
partition our argument type space up into 2 sets, those which satisfy predA
and those which satisfy predB, where pred[A,B] are Boost.MPL metafunctions.
Thus, every argument type satisfies either predA or predB, but never both
simultaneously (so, informally, predB(T) = !predA(T)). Let resultA and
resultB be the result types after application on arguments types satisfying
predA and predB, respectively. In other words:
struct foo7
{
template<class> struct result;
// Boost.ResultOf protocol
template< class This, class T >
struct result< This ( T ) >
: boost::mpl::if_<
boost::mpl::apply1< predA, T >,
resultA,
resultB
>
{ };
// ...
};
To support this result structure for foo7, we add a couple extra template
parameters to genrv:
template<
class Visitor, class Result = void,
// Add a couple more template parameters to aid the suppression of
certain implicit conversions, as we'll shortly see...
class Cond = boost::mpl::always< boost::true_type >,
bool Enable = true
>
struct genrv
{
typedef Result result_type;
template< class T >
explicit genrv(T* const p)
: m_apply(apply<T>),
mp(static_cast< void* >(p))
// Make sure we didn't f*** up.
{ BOOST_MPL_ASSERT((boost::mpl::apply1< Cond, T >)); }
result_type operator()(Visitor visitor) const
{ return m_apply(visitor, mp); }
private:
template< class T >
static result_type apply(Visitor visitor, void* const p)
{ return visitor(static_cast< rv<T>& >(*static_cast< T* >(p))); }
result_type (&m_apply)(Visitor, void* const);
void* const mp;
};
We likewise supplement the genrv conversion operator in X:
// MSVC9 ICE's without this helper struct :(
template< class Cond, class T >
struct apply_cond
{ static const bool value = boost::mpl::apply1< Cond, T >::type::value; };
struct X
{
// ...
template< class V, class R, class C >
operator genrv<V,R,C,apply_cond<C,X>::value>()
{ return genrv<V,R,C,apply_cond<C,X>::value>(this); }
};
Now, when the compiler is looking to convert an X to a genrv<V,R,C,true>,
that conversion operator will only exist if X satisfies the Boost.MPL
predicate C. If X does *not* satisfy C, then X can only be converted to
genrv<V,R,C,false>. Now, our foo7 overloads with genrv arguments will
always use the default Enable parameter, which is true, thus ensuring that
anything that binds to genrv<V,R,C> really does satisfy C. Thus, the foo7
overloads look like the following:
struct foo7
{
template<class> struct result; // Boost.ResultOf protocol
template< class This, class T >
struct result< This ( T ) >
: boost::mpl::if_<
boost::mpl::apply1< predA, T >,
resultA,
resultB
>
{ };
// Add a specialization to transform rv<T>& to plain T.
template< class This, class T >
struct result< This ( rv<T>& ) >
: result< This ( T ) >
{ };
template< class T >
typename result< foo7 ( T& ) >::type
operator()(T&) const; // 1
typedef genrv< foo7, resultA, predA > genrvA_type;
typedef genrv< foo7, resultB, predB > genrvB_type;
template< class T >
typename boost::lazy_disable_if<
boost::mpl::or_<
boost::is_convertible< T&, genrvA_type >,
boost::is_convertible< T&, genrvB_type >
>,
result< foo7 ( T const & ) >
>::type
operator()(T const &) const; // 2
// Captures moveemulationenabled rvalues satisfying predA.
resultA operator()(genrvA_type const x) const // 3A
{ return x(*this); }
// Captures moveemulationenabled rvalues satisfying predB.
resultB operator()(genrvB_type const x) const // 3B
{ return x(*this); }
};
This is still pretty far from supporting fully polymorphic result types, but
it is still (maybe only slightly) better than monomorphic result types, and
likely the best we can do with this technique, given that none of the genrv
overloads can have deduced template parameters.
CONCLUSION
Despite the drawbacks of this technique (the probable runtime overhead,
limitations on the result type, and additional complexity), I think this
would be a worthwhile addition to the Boost.Move library. I look forward to
hearing what others think about it...
 Jeff
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk