Boost logo

Boost :

Subject: [boost] [poly_collection] User-defined polymorphism model
From: Ion Gaztañaga (igaztanaga_at_[hidden])
Date: 2017-05-13 23:32:26


Hi Joaquín and all reviewers,

Before we reach the end of the review period, I wanted to comment an
(IMHO) interesting feature used in the internal design of
PolyCollection. The library offers three of polymorphism models:
base-derived, boost::any and signature-sharing callable types.

Under the hood, the library abstracts those polymorphic models into a
generic class in the detail namespace:

   template<typename Model,typename Allocator>
   class poly_collection;

and each concrete container is just implemented in terms of poly_collection:

   template<typename Base,typename Allocator>
   class base_collection
     : public detail::poly_collection<detail::base_model<Base>,Allocator>

The first template parameter (Model) is the abstraction ("concept") that
implements the non-generic part of each polymorphism model. A model
implements the following types and operations (comments on each
attribute are my interpretation, they can be completely wrong):

template<class ModelArg>
class Model
{
   //
   //Value-related attributes
   //

   //The type we want to expose as value_type of the container
   using value_type= ...;

   //Is T a subtype of the general type (acceptable as value)?
   template<class T> using is_subtype= ...

   //Is T the "most refined type"?
   template<class T> using is_terminal=...

   //Id of a value
   template<class T> static const std::type_info& subtypeid(const T& x);

   //Obtain a type-erased pointer to the value
   template<class T> static void* subaddress(T& x);
   template<typename T> static const void* subaddress(const T& x);

   //
   //Segment-related
   //

   //Segment iterators that point to type_erased value_types
   using base_iterator=...
   using const_base_iterator=...

   //const<->non-const conversions
   static base_iterator nonconst_iterator(const_base_iterator it);
   template<typename T>
   static iterator<T> nonconst_iterator(const_iterator<T> it);

   //Concrete iterators for Derived
   template<typename Derived> using iterator=...;
   template<typename Derived> using const_iterator=...

   //Generic type, comparable with base_iterator.
   //base_iterator == the last returned base_sentinel
   //means that base_iterator points one-past the last
   //element of a segment.
   using base_sentinel=...;
   using const_base_sentinel=...

   //Construct a segment for type T using the given allocator and
   //return it type-erased.
   template<typename T,typename Allocator>
   static detail::segment_backend<Model>* make(const Allocator& al);
};

I have several questions on this polymorphism model.

1) Wouldn't be a good idea to make this model-based polymorphism (or a
refined one) public? Maybe the current model needs more work to support
additional polymorphism models. Of course requirements for each
operation in a model should be detailed. A correct abstraction to define
user-provided polymorphism models is hard but this seems really a great
feature that makes this library more valuable for users.

2) If understand this model abstraction correctly, each Model chooses
the segment type to store values (it could be any sequence type, I
guess). The segment_backend abstraction does not only abstract
ConcreteTypes from value_types (using void pointers that will be casted
to real types in the implementation of the virtual function), but also
the segment type (the sequence holding the concrete values).

Current implementation uses two segment types: split_segment (stores
Concrete types and generic value_type's in two separate vectors) and
packed_segment (stores ConcreteTypes which already contain value_type's
as a subobject). function_collection and any_collection use the first
segment, base_collection uses the second.

Why are different segment types needed? Can't we store value_types and
ConcreteTypes in a struct and use only one type of segment? If this is
possible, I guess iteration performance would be improved for
function_collection and any_collection. Model definition could be
simplified.

On the other hand, if the sequence type can be left out of the model
definition, the underlying sequence type (change vector with deque,
small vector or any other) could be made configurable. Would this be
useful for users?

3) I understand that caching a base_sentinel in each segment is an
optimization to avoid a virtual function call when iterating with
base_iterators. Is this assumption correct? Why is a different
base_sentinel type needed to mark the end of each segment? Wouldn't a
"base_iterator" pointing to that on-past-the end position suffice? Is an
optimization (maybe base_sentinel is cheaper) or a must have?

Thanks,

Ion

PD: I have more design questions related to other topics but I will post
a different thread for each one.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk