
Boost : 
Subject: Re: [boost] [poly_collection] Memory allocation and allocator propagation
From: Joaquin M López Muñoz (joaquinlopezmunoz_at_[hidden])
Date: 2017-05-16 18:28:16
On 16/05/2017 at 13:35, Ion Gaztañaga wrote:
> On 16/05/2017 9:23, Joaquin M López Muñoz wrote:
>>> where s is the concrete segment type that can be obtained via Model
>>> (the address of the aligned storage inside "class storage" is cast
>>> to Model::get_segment_type<T>::type*) because the ConcreteType we are
>>> going to construct is known at compile time in emplace_back_impl.
>>>
>>> I also wonder if insert<T> and other functions should avoid slicing
>>> detection so that they can also avoid the virtual call overhead. If T
>>> is assumed to be the same type as the final ConcreteType, you can use
>>> typeid(T) instead of typeid(x) (which can require a trip through the
>>> vptr) when searching in the unordered map, and obtain a pointer to the
>>> concrete segment type by just casting. Insertion times could be improved;
>>> especially range insertions could shine (as the vector could be resized
>>> just once). Or am I missing something?
>
> For the emplace function, as the ConcreteType is always statically
> known (type T is the type to be constructed, whereas in "insert" T is
> the type used to construct the concrete type), I think no
> lambda is needed; it could be directly translated to a simple call on
> the statically known segment type:
>
> s.emplace_back(forward<Args>(args)...);
Correct.
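A minimal sketch of the idea (with hypothetical names standing in for the real packed_segment machinery): when T is known at compile time, one hash lookup suffices, the concrete segment is reached with a plain static cast, and emplace_back is a direct, non-virtual call:

```cpp
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the library's internals.
struct segment_base { virtual ~segment_base() = default; };

template<typename T>
struct concrete_segment : segment_base {
    std::vector<T> store;  // plays the role of packed_segment's vector
};

struct collection_sketch {
    std::unordered_map<std::type_index, segment_base*> segments;

    // T is statically known: one map lookup, then a static cast and a
    // direct emplace_back -- no virtual dispatch per element.
    template<typename T, typename... Args>
    void emplace(Args&&... args) {
        auto& s = *static_cast<concrete_segment<T>*>(segments.at(typeid(T)));
        s.store.emplace_back(std::forward<Args>(args)...);
    }
};
```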
>
>> Ion, I think you're on to something here. As it happens, poly_collection
>> can statically determine when T is acceptable and dynamically detect the
>> cases where x's subtype is actually T, in which case virtual calls can be
>> omitted altogether. A reference to the internal
>> packed_segment/split_segment<Model,T,Allocator>
>> can be statically obtained from segment<Model,Allocator>.
>
> Ok, I see it. If typeid(x) == typeid(T), the static path (let's call
> it statically_known_insert(x)) is taken. Otherwise the virtual
> function is used.
>
> In any case, the runtime slicing detection will slow down ranges, as
> you'll need to check the typeid of each *it reference in the range.
Not in every case; see for instance the two insert overloads at:
where you can notice that when the type is terminal we don't repeat
the typeid check.
But in general we do need to check typeid individually, and I don't see
this as something we can dispense with: otherwise there'd be no way to do
type-erased insertions as simple as:
boost::ptr_vector<Base> v;
...
c.insert(v.begin(), v.end());
> If all the range is assumed to be the same runtime type (or slicing is
> assumed) then the operation is a simple:
>
> s.insert(s.end(), first, last);
>
> which will use distance(first, last) for forward iterators, allocate
> all the storage once and avoid a lot of object moving/copying. Maybe a
> new overload for insert taking a range that the user does not care to
> slice could be a good idea for performance reasons:
>
> base_collection.insert<warrior>(warriors_first, warriors_last);
We sort of already have that:
bc.insert(bc.end<warrior>(), warriors_first, warriors_last);
See:
This is efficient typeid-wise, not so with respect to preallocating
based on distance.
If we go ahead with your proposal of staticfying the interface when the
concrete type is known, we can actually do much better, as fast as possible.
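Under the assumption that the whole range targets one concrete segment, the win can be sketched with a plain vector standing in for the segment: the length is computed once and a single reservation covers the entire range:

```cpp
#include <iterator>
#include <vector>

// Sketch: for forward iterators, reserve distance(first, last) up front so
// the whole range is inserted with at most one reallocation.
template<typename T, typename FwdIt>
void insert_range(std::vector<T>& seg, FwdIt first, FwdIt last) {
    seg.reserve(seg.size() + std::distance(first, last));
    seg.insert(seg.end(), first, last);
}
```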
> For absolute zero overhead (no repeated typeid, no repeated hash
> search), a segment
> handle utility comes to my mind:
>
> [...]
I think this is not needed in light of my previous comment.
> 1) I also noticed that bc.size() and bc.empty() are not constant-time.
> It seems a bit surprising, it might be noticeable when the number of
> segments grows. This avoids common data in poly_collection but I
> imagine most users assume these functions are free.
Strictly speaking, they're constant time with respect to the number of
elements, assuming (correctly, I'd say) that the number of registered types
is bounded and doesn't increase linearly with size(). I'm very much against
caching size, for consistency reasons and because we'd be adding a little
fat to every operation of the container.
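In other words (a sketch, with a plain table of per-segment sizes standing in for the real segments): size() sums one term per registered type, so its cost grows with the number of segments, not with the number of elements:

```cpp
#include <cstddef>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

// Hypothetical per-segment size table standing in for the real segments.
using segment_sizes = std::unordered_map<std::type_index, std::size_t>;

// O(number of segments), independent of the total element count.
std::size_t total_size(const segment_sizes& segs) {
    std::size_t n = 0;
    for (const auto& kv : segs) n += kv.second;  // one term per segment
    return n;
}
```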
> 2) The number of additional operations that could take advantage of
> the "I know your static type" optimization is important, although the
> speedup will depend on the cost of each operation (eg. size()/
> reserve() vs. a virtual function call):
>
> template<typename T> void reserve(size_type n);
> template<typename T> void shrink_to_fit();
> template<typename T> size_type capacity() const;
> template<typename T> size_type max_size() const;
> template<typename T> size_type size() const;
> template<typename T> bool empty() const;
> template<typename T> void clear();
If/when we staticfy the interface of course we'll go all the way and improve
as much as can be improved. The functions you list above, though, are not
expected to be a performance problem in any sensible program.
Joaquín M López Muñoz
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk