Boost logo

Boost :

Subject: Re: [boost] [poly_collection] Memory allocation and allocator propagation
From: Ion Gaztañaga (igaztanaga_at_[hidden])
Date: 2017-05-16 11:35:07


On 16/05/2017 9:23, Joaquin M López Muñoz wrote:
>> where s is the concrete segment type that can be obtained via Model
>> (the address of the aligned storage inside "class storage" is casted
>> to Mode::get_segment_type<T>::type*) because the ConcreteType we are
>> going to construct is known at compile time in emplace_back_impl.
>>
>> I also wonder if insert<T> and other functions should avoid slicing
>> detection so that they can also avoid the virtual call overhead. If T
>> is assumed to be the same type of the final ConcreteType, you can use
>> typeid(T) instead of typeid (x) (which can require to trip through the
>> vptr) when searching in the unordered map and obtain a pointer to the
>> concrete segment type just casting. Insertion times could be improved,
>> specially range insertions could shine (as vector could be resized
>> just once). Or am I missing something?

For the emplace function, as the ConcreteType it's always statically
known (type T is the type to be constructed, whereas in "insert" T is
the type used to construct the concrete type), then I think no and
lambda is needed, it could be directly translated to a simple call to
the statically known segment type:

s.emplace_back(forward<Args>(args...));

> Ion, I think you're on to something here. As it happens, poly_collection
> can
> statically determine when T is acceptable and dynamically detect the cases
> where x's subtype is actually T, in which case virtual calls can be omitted
> altogether --A reference to the internal
> packed_segment/split_segment<Model,T,Allocator>
> can be statically obtained from segment<Model,Allocator>.

Ok, I see it. If typeid(x) == typeid(T), the static path (let's call it
statically_known_insert(x)) is taken. Otherwise the virtual function is
used.

In any case, the runtime slicing detection will slow down ranges, as
you'll need to check the typeid of each *it reference in the range. If
all the range is assumed to be the same runtime type (or slicing is
assumed) then the operation is a simple:

s.insert(s.end(), first, last);

which will use distance(first, last) for forward iterators, allocate all
the storage once and avoid a lot of objet moving/copying. Maybe a new
overload for insert taking a range that the user does not care to slice
could be a good idea for performance reasons:

base_collection.insert<warrior>(warriors_first, warriors_last);

For absolute zero overhead (no repeteated typeid, no repeated hash
search), a segment handle utility comes to my mind:

typedef base_collection<sprite> bc_t;
bc_t bc;

//register type
bc.register_types<warrior>();

//Obtain segment handle (maybe just a struct holding the pointer to
//the concrete segment)
bc_t::segment_handle<warrior> sh = bc.segment_handle<warrior>();

//Insert directly in the segment (just vector insertion)
//and updating bc's member data if any.
bc.insert(sh, pos, warrior);

Additional comments:

1) I also noticed that bc.size() and bc.empty() are not constant-time.
It seems a bit surprising, it might be noticeable when the number of
segments grows. This avoids common data in poly_collection but I imagine
most users assume these functions are free.

2) The number of additional operations that could take advantage of the
"I know your static type" optimization is important, although the
speedup will depend on the cost of each operation (eg. size()/ reserve()
vs. a virtual function call):

template<typename T> void reserve(size_type n);
template<typename T> void shrink_to_fit();
template<typename T> size_type capacity() const;
template<typename T> size_type max_size() const;
template<typename T> size_type size() const;
template<typename T> bool empty() const;
template<typename T> void clear();

Additional possible "zero cost" calls using the segment handle:

template<typename SegmentHandle>
   void reserve(SegmentHandle sh, size_type n);

template<typename SegmentHandle>
   void shrink_to_fit(SegmentHandle sh);

template<typename SegmentHandle>
   size_type capacity(SegmentHandle sh) const;

template<typename SegmentHandle>
   size_type max_size(SegmentHandle sh) const;

template<typename SegmentHandle>
   size_type size(SegmentHandle sh) const;

template<typename SegmentHandle>
   bool empty(SegmentHandle sh) const;

template<typename SegmentHandle>
   void clear(SegmentHandle sh);

Best,

Ion


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk