Boost logo

Boost :

Subject: Re: [boost] [review] Review of PolyCollection starts today (May 3rd)
From: Thorsten Ottosen (tottosen_at_[hidden])
Date: 2017-05-10 20:33:32


Hi Joaquín (and Ion),

I took a look at this a few months ago, and my view is positive today as
it was then. The code looks clean, the documentation good, and the tests
look thorough.

Summary: The library provides containers that offer great data locality
and which stores static type information. The net result is that
increased performance can be had, sometimes significantly. To achieve
this, objects are stored in per-type segments. Small buffer
optimizations may be advantages for vector<any> or
vector<function<...>>, but this library goes beyond that. The price you
pay is that iterators are forward only (so no sorting) and in-the-middle
erase could be slower (in some cases). I would mostly be using
base_collection in my work, but I can definitely see the need for the
other two collections. To get the optimization offered by this library
you would have to do something like

vector<Derived1>
vector<Derived2>
...
vector<DerivedN>

and then manually loop over each container each time you wanted to
process each element. This quickly becomes tedious. With this library I
can have a single container in my program and get approximately the same
speed.

So overall my impression is very positive. This is great work. I do have
a few things that I think needs careful considerations.

A. OO programming and copyability of classes is not something that
everybody is going to like. The normal thing is to derive from
boost::noncopyable to get rid of a bunch of undesired behaviors. Would
it be possible for the library to pick up a private or protected
copy-constructor when needed, possible via a friend declaration? I think
many uses of this library will also store pointers to the objects
in other collections, e.g. vector<const base*>, and as members so being
able to have your hierarchy non-copyable while allowing copyability
within the container is important IMO.

B. I miss range versions of the various iteration patterns so I can use
them directly with a range-based for. E.g. instead of

   for( auto i = c.begin<warrior>(); i != c.end<warrior>(); ++ i )

I should be able to say

   for( auto& : c.range<warrior>() );

C. I think we talked about an make_overload() function to devirtualize
uses of base_collection. Do we have such a beast in boost already? If
not, it might be a worth providing it with this library.

D. Inside the segments you store std::vector. I could easily imagine the
need to store something else, say a flat_set or a container suitable for
non-movable, non-conpyable types (say a variant of deque).
Therefore I would love to have a second defaulted argument set to
std::vector. Now this is a template, template parameter which complicate
things ... so in a sense a policy with a nested template alias could do
the trick.

I also have other minor comments, but those given below.

>
> - What is your evaluation of the design?
>

It's excellent.

> - What is your evaluation of the implementation?

Very high quality, as usual.

> - What is your evaluation of the documentation?
>

Also high quality.

> - What is your evaluation of the potential usefulness of the library?

Very high.

> - Did you try to use the library? With what compiler? Did you have any
> problems?

No/No/No.

> - How much effort did you put into your evaluation? A glance? A quick
> reading? In-depth study?

About half a day.

> - Are you knowledgeable about the problem domain?
>

Somewhat.

> And most importantly:
>
> - Do you think the library should be accepted as a Boost library?
>

Yes.

*Specific comments*

A. why is subaddress() returning void* instead of T* ?

B. is the void* stuff like this

virtual range emplace( const_base_iterator p,void
(*emplf)(void*,void*),void* arg)
   {
     return range_from(s.emplace(iterator_from(p),emplf,arg));
}

  needed ? That is, is there no way to use real ans specific types?

C. Does value_holder<T>::data() need to return void*

D. Implementation of equality: do we need to be set like semantics
(two-pass)? Why don't you just delegate to the segments after checking
the size? I don't think the documentation specifies what the
implementation does.

E. should the test check (perhaps via static_assert) the conditions
under which move-construction and move-assignment is noexcept? (sorry if
this is already done). Specifically, if std::unordered_map has noexcept
for these, then you can guarantee the same for base_collection etc.

F. I understand the layout to be roughly
std::unordered_map<type_index,std::unique_ptr<segment>> ... this does
not quite match diagram on page 2 of documentation, that is, you are
missing an indirection.

G. tutorial: consider using override where appropriate

H. local iterator and undefined behavior?: does the code have an
assertion at least? Can we not get rid of undefined behavior of

   c.insert(c.begin(typeid(warrior)),juggernaut{11});

? That would surely be nice.

I. why emplace_pos/hint but not insert_hint/pos ?

J. why ==,!= but not <,>,>=, <=

K. clarify that

boost::poly_collection::for_each
   <warrior,juggernaut,goblin,std::string,window>( //restituted types
   c.begin(),c.end(),[&](const auto& x)

loops over all elements or only the specific.

L. It could be good to have a performance test of erase in the middle/front.

M. using the identifier "concept" may require change in the near future?

O. I wouldn't mind seeing separate graphs for 0-300 elements.

P. clarify test are without use of reserve

Q. is there no normal container size()/empty() ? docs say e.g.

   cc.size(index)
   cc.size<U>()

   but then later when we see the full base_collection interface they
are mentioned.

R. dumb question: is there any thing in your code that requires
std::is_polymorphic_v<Base> for base_collection ? If not, then perhaps
one should be able to avoid that and work completely with type
restitution ? In some sense, you could create a hierarchy where classes
a memcopyable and without a single virtual function.

S. The test are testing best possible scenario; a more realistic test
would be to copy pointers to an std::vector<base*>, shuffle it and run
update on that.

That's it. Great work :-).

kind regards

Thorsten

-- 
Best regards,
Thorsten Jørgen Ottosen, Ph.D.
Director of Research
+45 23308797
Dezide (www.dezide.com)

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk