Boost logo

Boost :

Subject: Re: [boost] [type_erasure] Review started (July 18-27, 2012)
From: Steven Watanabe (watanabesj_at_[hidden])
Date: 2012-08-05 19:38:00


AMDG

On 08/05/2012 12:22 PM, Chapman, Alec wrote:
> Lorenzo,
> Thank you for managing the review, and I apologize that this reply along with Steven's previous message have gone past the review period. I understand if you don't have the time to continue following the discussion.
>
> I'll just summarize that I still feel that some of the advanced features of this library are currently too experimental, and that I think that Steven's reluctance so far to acknowledge their limitations or consider alternatives makes me doubt that these features will reach a mature state unless this is a condition for their acceptance (although I still believe that the library would be very useful without these features).
>
> More detailed responses are below.
>
> On 08/04/2012 1:15 PM, Steven Watanabe wrote:
>> I think this is a bug in boost::shared_ptr.
> Okay. Can you file a bug report?
>

IIRC, this was already reported and fixed.

>> - Wrapping the any increases the cost of all any
>> operations. I'm not convinced that a 1-to-1
>> ratio of calls/conversions is realistic usage.
>
> I'm aware there is a cost and I think a reasonable debate can be had about which is the better default behavior.
>
> Is wrapping difficult to provide as an option?
>

I really don't like the idea of providing options
to control the implementation details of the
library. I'm certainly not going to consider it
until after all the easier optimizations are in place.

> The test I gave is realistic usage for me. Here's a real example:
>
> any_interval1 (implemented using my any_map) has lower_bound and upper_bound functions.
> any_interval2 contains additional functions.
> I have a vector of any_interval2 and want to use std::count_if with a separately compiled predicate (which to give one example might test whether the interval length is greater than some threshold). The predicate accepts const any_interval1&. Here the number of calls is very low, the predicate logic is simple, and the current conversion overhead really would be the dominant cost. There are many other interval calculations I need to do where this is the case.
>
> Keep in mind that with a 30% function call cost vs a 700% conversion cost, you would need 23 function calls to break even. Moreover, the user has to write a decent amount of code to choose the correct binding; the default behavior has 5600% overhead.
>
>> - This particular example can (and will)
>> be optimized to re-use the original vtable.
>
> Okay, I'll reevaluate when it is ready. In my test, just doing a map lookup added 1.1s (test 5 vs 4) or 150% overhead, so this would seem to be a lower bound.
>

The overhead of the conversion once I
implement this optimization will be zero.

> <snip>
> I don't think I understand. Are you saying that the only way to give the copy constructor access to the vtable is to also give every other function access as well? It's true that I have special logic for my copy constructor, but so does your library.

Not really. The only special handling of
constructors is that any knows how to
call them. There's nothing special in
the dispatching layer. Besides, it isn't
just constructors. Any function that
returns an any would need to know about
the vtable.

> I imagine your copy constructor at line 211 of any.hpp looking something like:
>
> any(const any& other)
> : table(other.table.copy()),
> data(::boost::type_erasure::call(constructible<T(const T&)>(), other))
> {}
>

It's just table(other.table). Copying the table
just copies pointers right now.

> Or if you want to avoid the extra virtual function call:
>
> any(const any& other)
> {
> other.table.clone(table, data);
> }
>
> I think you'll need to do something like this anyway for certain optimizations. See my comment near the end about dynamic vtables.
>
> Can you be more specific about which advanced features would be harder? I have implemented many of them and haven't run into any difficulties.
>

In particular, it's tricky to preserve the
binding interface unchanged. I'm not sure
how far any changes would ripple, since
binding is used everywhere.

Also, and more importantly, if anything in the
vtable has to know the whole vtable, it's impossible
to implement the optimization I mentioned earlier
of just offsetting the vtable pointer for conversions.

>> Also,
>> - It can't handle const references correctly
>
> You can do:
> const int x = 0;
> const any<...> y(x, as_const_reference);
> const any<...> y(const_cast<int&>(x), as_reference); // if you want to make it stand out more
> any<...> w(x, as_const_reference); // compiles but incorrect
>

Exactly. This is a major hole in the
type system, which I'm not willing to open.

> This would internally hold x in a (mutable) void*, but it would be cast back to const int* for the const member functions. Not declaring the any as const would be a problem, but you document a similar hole in your system:
>
> const any<incrementable<>, _self&> z(x);
> any<incrementable<>, _self&> w(z); // compiles but incorrect
>
> To me this seems harder to detect since you have to look back to the declaration of z to know there is a problem.
>

Where do I say that this is incorrect? There's
nothing wrong with this code.

>> - It doesn't work with type erased function
>> functions that need to return a reference
>
> My experience has been that in most cases you can return just any<...>&. Consider std::max, for example.

That only works for functions that return one
of their arguments.

> In the other cases I have used a proxy that holds an any and upon copying constructs the any to capture by reference. This behaves similarly to your approach.
>
> The most notable need I've had is for writing an any_iterator that would allow any_iterator<any<addable<>>> to be constructed from any_iterator<int>. I've tried two different approaches, both of which have drawbacks:
>
> 1) Make the iterator's reference type be any<addable<>, _self&> and value_type be any<addable<>>, which of course limits it to being an input iterator.
>
> 2) Have the iterator store an any<addable<>>, capturing the underlying value by reference with my technique. This avoids the problem with the reference type, but increases the iterator size and limits the referent lifetime to the lifetime of the iterator. This also limits it to an input iterator:
> https://svn.boost.org/trac/boost/ticket/2640
>
> I'd be interested to hear if you have better ideas on how to approach this.
>

I don't. I think generally anything that
needs to use proxy references needs to be
implemented directly instead of using a
function template.

>> I think such a feature is too limited for the costs it imposes.
>
> This is required for type erased functions that take references. There are plenty of these in Boost and my own code. Why do you believe this is a limited case while you point to a subset of functions returning a reference as a limitation of my approach? All I can say is that in my applications this really is needed.
>

I understand the problem, but I'm not willing to take
a half-way approach like this. As far as I am
concerned, references to const have to be supported
and making it possible to cast away const silently
is not an option. I'm less concerned about type-erased
functions that return references, but this case makes
proxy references necessary, so capturing references
in a regular any would have to be an additional feature
rather than a complete replacement for proxy references.

I've also considered something like this:

reference<any<C> > ref(x1);
reference<const any<C> > cref(x2);

ref.get(); // any<C>&
cref.get(); // const any<C>&

I haven't thought through all the consequences of
this yet, but it at least makes it possible
to handle const references correctly and also
imposes no cost when it isn't used.

> I'm having a hard time understanding why you believe certain features justify their costs while others do not. Perhaps this would be clearer to me if you gave some examples of how you see the more advanced features being used.
>
> As another example, I'm not terribly opposed to dynamic vtables, but this does come at a cost and has not been necessary for me.

Dynamic vtables aren't exactly a feature. They're an
implementation detail. They're the easiest way to
implement the interface that I wanted.

> In addition to making it more difficult for the user to do efficient conversions, it doubles the size of the any (I didn't see a reply to an earlier email about the binding size: http://permalink.gmane.org/gmane.comp.lib.boost.devel/233002).
>

I can reduce binding to two pointers instead
of three fairly easily.

> I can see that this cost could be optimized out by storing the reference count in the vtable itself, but this requires exactly the same coupling of the vtable to the constructor/destructor that you oppose for my reference technique.

No it doesn't. The vtable reference count can be
handled independently from anything else.

> Is this an optimization that you plan on doing?
>

I considered this, but getting it right is tricky.

- The reference count must be atomic.
- The reference count must support static initialization.
- Adding a reference count limits sharing of converted vtables.

In Christ,
Steven Watanabe


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk