Subject: Re: [boost] [type_erasure] Review started (July 18-27, 2012)
From: Lorenzo Caminiti (lorcaminiti_at_[hidden])
Date: 2012-08-06 15:51:02
On Sun, Aug 5, 2012 at 3:22 PM, Chapman, Alec <archapm_at_[hidden]> wrote:
> Thank you for managing the review, and I apologize that this reply along with Steven's previous message have gone past the review period. I understand if you don't have the time to continue following the discussion.
Thanks to you for submitting a review and participating in the discussions!
> I'll just summarize that I still feel that some of the advanced features of this library are currently too experimental, and that I think that Steven's reluctance so far to acknowledge their limitations or consider alternatives makes me doubt that these features will reach a mature state unless this is a condition for their acceptance (although I still believe that the library would be very useful without these features).
I understand your "yes" was conditional and I understand the topics
you have been discussing with Steven. I will of course consider your
review together with all other reviews in my report.
> More detailed responses are below.
> On 08/04/2012 1:15 PM, Steven Watanabe wrote:
>> I think this is a bug in boost::shared_ptr.
> Okay. Can you file a bug report?
>> - Wrapping the any increases the cost of all any
>> operations. I'm not convinced that a 1-to-1
>> ratio of calls/conversions is realistic usage.
> I'm aware there is a cost and I think a reasonable debate can be had about which is the better default behavior.
> Is wrapping difficult to provide as an option?
> The test I gave is realistic usage for me. Here's a real example:
> any_interval1 (implemented using my any_map) has lower_bound and upper_bound functions.
> any_interval2 contains additional functions.
> I have a vector of any_interval2 and want to use std::count_if with a separately compiled predicate (which to give one example might test whether the interval length is greater than some threshold). The predicate accepts const any_interval1&. Here the number of calls is very low, the predicate logic is simple, and the current conversion overhead really would be the dominant cost. There are many other interval calculations I need to do where this is the case.
> Keep in mind that with a 30% function call cost vs a 700% conversion cost, you would need 23 function calls to break even. Moreover, the user has to write a decent amount of code to choose the correct binding; the default behavior has 5600% overhead.
>> - This particular example can (and will)
>> be optimized to re-use the original vtable.
> Okay, I'll reevaluate when it is ready. In my test, just doing a map lookup added 1.1s (test 5 vs 4) or 150% overhead, so this would seem to be a lower bound.
>> >>> I propose an overloaded constructor that signals that the underlying
>> >>> object
>> >> should be held by reference but copied when the any object is copied:
>> >>> int i = 0;
>> >>> any<requirements> a(i, capture_reference); any<requirements> b(a);
>> >>> ++a; // i == 1
>> >>> ++b; // i == 1 still, any_cast<int>(b) == 2
>> >>> This has worked well in my own code and should be easy to implement.
>> >>> I think
>> >> this would also avoid the bug mentioned at the bottom of the
>> >> references page of the documentation.
>> >> I don't think this is a good idea, as it would require an extra flag
>> >> whether it's needed or not.
>> > Steven, do you have any thoughts on my previous reply to this? I have
>> implemented this without any cost.
>> It can't be implemented without cost. There are really two options:
>> 1) Add a flag
>> 2) Manage the vtable /inside/ the dispatching layer.
>> - Implies that every function knows about
>> all the other functions. This extra coupling
>> makes the more advanced features of the library
>> hard and also increases code bloat.
>> I'm assuming that you had already taken option (2), so it didn't impose any
>> additional cost.
> I don't think I understand. Are you saying that the only way to give the copy constructor access to the vtable is to also give every other function access as well? It's true that I have special logic for my copy constructor, but so does your library. I imagine your copy constructor at line 211 of any.hpp looking something like:
> any(const any& other)
> : table(other.table.copy()),
> data(::boost::type_erasure::call(constructible<T(const T&)>(), other))
> Or if you want to avoid the extra virtual function call:
> any(const any& other)
> other.table.clone(table, data);
> I think you'll need to do something like this anyway for certain optimizations. See my comment near the end about dynamic vtables.
> Can you be more specific about which advanced features would be harder? I have implemented many of them and haven't run into any difficulties.
>> - It can't handle const references correctly
> You can do:
> const int x = 0;
> const any<...> y(x, as_const_reference);
> const any<...> y(const_cast<int&>(x), as_reference); // if you want to make it stand out more
> any<...> w(x, as_const_reference); // compiles but incorrect
> This would internally hold x in a (mutable) void*, but it would be cast back to const int* for the const member functions. Not declaring the any as const would be a problem, but you document a similar hole in your system:
> const any<incrementable<>, _self&> z(x);
> any<incrementable<>, _self&> w(z); // compiles but incorrect
> To me this seems harder to detect since you have to look back to the declaration of z to know there is a problem.
>> - It doesn't work with type erased function
>> functions that need to return a reference
> My experience has been that in most cases you can return just any<...>&. Consider std::max, for example. In the other cases I have used a proxy that holds an any and upon copying constructs the any to capture by reference. This behaves similarly to your approach.
> The most notable need I've had is for writing an any_iterator that would allow any_iterator<any<addable<>>> to be constructed from any_iterator<int>. I've tried two different approaches, both of which have drawbacks:
> 1) Make the iterator's reference type be any<addable<>, _self&> and value_type be any<addable<>>, which of course limits it to being an input iterator.
> 2) Have the iterator store an any<addable<>>, capturing the underlying value by reference with my technique. This avoids the problem with the reference type, but increases the iterator size and limits the referent lifetime to the lifetime of the iterator. This also limits it to an input iterator:
> I'd be interested to hear if you have better ideas on how to approach this.
>> I think such a feature is too limited for the costs it imposes.
> This is required for type erased functions that take references. There are plenty of these in Boost and my own code. Why do you believe this is a limited case while you point to a subset of functions returning a reference as a limitation of my approach? All I can say is that in my applications this really is needed.
> I'm having a hard time understanding why you believe certain features justify their costs while others do not. Perhaps this would be clearer to me if you gave some examples of how you see the more advanced features being used.
> As another example, I'm not terribly opposed to dynamic vtables, but this does come at a cost and has not been necessary for me. In addition to making it more difficult for the user to do efficient conversions, it doubles the size of the any (I didn't see a reply to an earlier email about the binding size: http://permalink.gmane.org/gmane.comp.lib.boost.devel/233002).
> I can see that this cost could be optimized out by storing the reference count in the vtable itself, but this requires exactly the same coupling of the vtable to the constructor/destructor that you oppose for my reference technique. Is this an optimization that you plan on doing?
>> > I still feel that the features I've commented on are at too much of an
>> experimental stage and that better solutions exist. My experience has been
>> that erased types need to behave as similarly as possible to their underlying
>> types, especially if they are to be used in generic code. I don't see the
>> advantages of forcing the user to deal with what is essentially a parallel
>> version of the type system for references when there is IMHO a more natural
>> approach that carries no cost and is more generally useable. See my comments
>> above as well. Conversions become less useful if they can't be used naturally
>> like other types.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk