Boost logo

Boost :

Subject: Re: [boost] [review] Review of PolyCollection starts today (May 3rd)
From: Joaquin M López Muñoz (joaquinlopezmunoz_at_[hidden])
Date: 2017-05-11 07:53:07


El 10/05/2017 a las 22:33, Thorsten Ottosen via Boost escribió:
> Hi Joaquín (and Ion),
>
> [...]
>
> A. OO programming and copyability of classes is not something that
> everybody is going to like. The normal thing is to derive from
> boost::noncopyable to get rid of a bunch of undesired behaviors. Would
> it be possible for the library to pick up a private or protected
> copy-constructor when needed, possible via a friend declaration? I
> think many uses of this library will also store pointers to the objects
> in other collections, e.g. vector<const base*>, and as members so
> being able to have your hierarchy non-copyable while allowing
> copyability within the container is important IMO.

Technically, this is doable and, at first blush, simple, since all
element copying and
assignent is centralized in a single wrapper class:

https://github.com/joaquintides/poly_collection/blob/master/include/boost/poly_collection/detail/value_holder.hpp

Another possibility is to use a copying traits class to let users
customize this point,
along the custom RTTI thing proposed by Ion. This looks to me even
cleaner and
more powerful.

I have no strong objection against including ths feature in whatever
form, but don't
particularly like it either --in the end, elements in a base_collection
must be copyable
(strictly speaking, moveable) and this will force users to change their
code by
writing the necessary ctors, so requiring an extra layer of boilerplate
(declaring
friends etc.) is just more hassle.

> B. I miss range versions of the various iteration patterns so I can
> use them directly with a range-based for. E.g. instead of
>
> for( auto i = c.begin<warrior>(); i != c.end<warrior>(); ++ i )
>
> I should be able to say
>
> for( auto& : c.range<warrior>() );

Yes, this is trivial and looks useful. At the risk of bikeshedding,
c.segment<warrior>()
might be a better name. Unfortunately there's no std naming practice here to
follow on, unless someone proves me wrong.

> C. I think we talked about an make_overload() function to devirtualize
> uses of base_collection. Do we have such a beast in boost already? If
> not, it might be a worth providing it with this library.

hanna::overload is already here to the rescue:

https://lists.boost.org/Archives/boost/2017/03/233001.php

> D. Inside the segments you store std::vector. I could easily imagine
> the need to store something else, say a flat_set or a container
> suitable for non-movable, non-conpyable types (say a variant of deque).
> Therefore I would love to have a second defaulted argument set to
> std::vector. Now this is a template, template parameter which
> complicate things ... so in a sense a policy with a nested template
> alias could do the trick.

Of all your proposals this is the one I have most concerns about:
element contiguity is such
a key property that basically the whole library implicitly depends on
it, code- and
designwise. Policying this aspect away looks to me, in the absence of
cogent use cases,
an exercise in overdesign and a lot of work.

> [...]
>
>> - Do you think the library should be accepted as a Boost library?
>>
>
> Yes.

Thank you!

> *Specific comments*
>
> A. why is subaddress() returning void* instead of T* ?

Because only void* is needed: subaddress() is used in
poly_collection/detail/segment.hpp
as a client of the type-erased virtual interface defined in

https://github.com/joaquintides/poly_collection/blob/master/include/boost/poly_collection/detail/segment_backend.hpp

> B. is the void* stuff like this
>
> virtual range emplace( const_base_iterator p,void
> (*emplf)(void*,void*),void* arg)
> {
> return range_from(s.emplace(iterator_from(p),emplf,arg));
> }
>
> needed ? That is, is there no way to use real ans specific types?

This is connected to A, and in fact gives me the opportunity to comment
on one of
the more beautiful (IMHO) pieces of the lib :-)

The implementation of a typical emplace(args...) function just forwards the
args to the corresponding allocator construction call. The problem with
Boost.PolyCollection is that this construction happens inside a class
implementing
a virtual interface (the one mentioned before), and this interface can't
have
a member function template able to accept(args...). We have a type
erasure wall,
if you wish. So the workaround is to define a type-erased callback (the
emplf bit)
that is invoked from the virtual class implementation and provided by the
frontend of poly_collection, where the types of (args...) haven't been
erased.
Take a look at

https://github.com/joaquintides/poly_collection/blob/master/include/boost/poly_collection/detail/poly_collection.hpp#L1015-L1034

> C. Does value_holder<T>::data() need to return void*

It could return T*, but that's not needed, as data() is used for
placement new.

> D. Implementation of equality: do we need to be set like semantics
> (two-pass)? Why don't you just delegate to the segments after checking
> the size? I don't think the documentation specifies what the
> implementation does.

We can't blindly delegate to the segments (if I'm getting your question
right) because
the order and associated types of the segments in x and y can differ: we
need to first
match them up, hence the two passes.

> E. should the test check (perhaps via static_assert) the conditions
> under which move-construction and move-assignment is noexcept? (sorry
> if this is already done). Specifically, if std::unordered_map has
> noexcept for these, then you can guarantee the same for
> base_collection etc.

I can't guarantee noexcept because whether an exception is thrown or not
depends
on the nature of the actual types in the container, which is runtime info.

> F. I understand the layout to be roughly
> std::unordered_map<type_index,std::unique_ptr<segment>> ... this does
> not quite match diagram on page 2 of documentation, that is, you are
> missing an indirection.

You're right, but the diagram is merely explanatory and I wouldn't want
to complicate
it to no avail. In fact, the segments are pointed to from what looks
like a small
vector of three elements (light blue), which is also not a correct
depiction of
a std::unordered_map internal structure. In the end, all these
indirections don't
affect performance in any significant way (workload is in the segments
proper).

> G. tutorial: consider using override where appropriate

Noted, will do.

>
> H. local iterator and undefined behavior?: does the code have an
> assertion at least? Can we not get rid of undefined behavior of
>
> c.insert(c.begin(typeid(warrior)),juggernaut{11});
>
> ? That would surely be nice.

There's an assertion: take a look at

https://github.com/joaquintides/poly_collection/blob/master/include/boost/poly_collection/detail/poly_collection.hpp#L645-L658

> I. why emplace_pos/hint but not insert_hint/pos ?

This is insert(it,x). As for the naming, we have

emplace(args...)
emplace_hint(it,args...)
emplace_pos(local_it,args...)
insert(x)
insert(it,x)
insert(local_it,x)

which follows std conventions except for "emplace_pos". I had to made
that name up
because, in the standard, fixed position emplace only happens in
sequence containers
where the name is simply "emplace", but I can't use "emplace" as
emplace(it,args...)
would collide with emplace(args...) (which sequence containers don't have).

> J. why ==,!= but not <,>,>=, <=

Followed the interface of std::unordered_set. As segment order is
unspecified,
less-than comparison seems to make no sense.

> K. clarify that
>
> boost::poly_collection::for_each
> <warrior,juggernaut,goblin,std::string,window>( //restituted types
> c.begin(),c.end(),[&](const auto& x)
>
> loops over all elements or only the specific.

Loops over all elements, doing type restitution on those marked. Noted,
will clarify.

> L. It could be good to have a performance test of erase in the
> middle/front.

I can do that. Interpretation is going to be tricky, as the compared
data structures
differ wildly wrt erasure.

> M. using the identifier "concept" may require change in the near future?

Good observation! But, is "concept" not going to be a context-dependent
keyword
à la "final"?

> O. I wouldn't mind seeing separate graphs for 0-300 elements.

I can extend the graphs to cover that part, but I don't see the point of
using
this library for a handful of elements,don't you think?

> P. clarify test are without use of reserve

Noted, will do.

> Q. is there no normal container size()/empty() ? docs say e.g.
>
> cc.size(index)
> cc.size<U>()
>
> but then later when we see the full base_collection interface they
> are mentioned.

There are normal size() and empty() member functions, they're described
in the
reference.

> R. dumb question: is there any thing in your code that requires
> std::is_polymorphic_v<Base> for base_collection ? If not, then perhaps
> one should be able to avoid that and work completely with type
> restitution ? In some sense, you could create a hierarchy where
> classes a memcopyable and without a single virtual function.

Actually, std::is_polymorphic_v is enforced on the reference but not
anywhere in the code.
I guess I preferred to be conservative here. As for working with type
restitution alone,
I think this scenario would be best served by a potential
variant_collection, which was
discussed at

https://lists.boost.org/Archives/boost/2017/03/233001.php

> S. The test are testing best possible scenario; a more realistic test
> would be to copy pointers to an std::vector<base*>, shuffle it and run
> update on that.

Sorry if I am not getting your question, but is this not the line
labeled "shuffled_ptr_vector"
in the plot?

> That's it. Great work :-).

What a thorough review! Thank you,

Joaquín M López Muñoz


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk