From: Robert Ramey (ramey_at_[hidden])
Date: 2005-10-10 11:48:17


Matthias Troyer wrote:
> On Oct 9, 2005, at 11:15 PM, Robert Ramey wrote:
>
>> Attached is a sketch of what I have in mind. It does compile without
>> error on VC 7.1
>>
>> With this approach you would make one fast_oarchive adaptor class
>> and one small and trivial *.hpp file for each archive it is adapted
>> to.
>
>
>
> ---- SUMMARY ---------
>
> If I may summarize this solution as follows:
>
> template<class Base>
> class fast_oarchive_impl :
> public Base
> {
> public:
> ...
> // custom specializations
> void save_override(const std::vector<int> & t, int){
> save_binary(&t[0], sizeof(int) * t.size());
> }
>
> // here's a way to do it for all vectors in one shot
> template<class T>
> void save_override(const std::vector<T> & t, int){
> save_binary(&t[0], sizeof(T) * t.size());
> // this version not certified for more complex types !!!
> BOOST_STATIC_ASSERT(boost::is_primitive<T>::value);
> // or pointers either !!!
> BOOST_STATIC_ASSERT(! boost::is_pointer<T>::value);
> }
>
> ...
> };

A fair characterization. But my point isn't to suggest or promote a
specific override. Rather my point is to show that the library can be
extended without altering the internals of the library itself.

All the included archive classes have a similar structure

/////////////////////////////////////////////////////////////////////////
// class basic_text_iarchive - read serialized objects from an input
// text stream
template<class Archive>
class basic_text_iarchive :
    public detail::common_iarchive<Archive>
{
    ...
    // intermediate level to support override of operators
    // for templates in the absence of partial function
    // template ordering
    template<class T>
    void load_override(T & t, BOOST_PFTO int){
        archive::load(* this->This(), t);
    }
    ...
};

where archive::load is declared and defined in the file iserializer.hpp.

This latter file includes all the "basic" functionality required for the
primitive types supported by C++. I have made huge efforts not to couple
the code in iserializer.hpp to any other types (nvp might be an exception).

Within iserializer.hpp the function archive::load dispatches to a
different implementation depending on traits of the type being serialized.
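
As a rough illustration of this kind of trait-based dispatch (this is not
the code in iserializer.hpp), here is a minimal sketch assuming C++11. The
names toy_iarchive, toy::load and toy::load_impl are made up for the
example, and std::is_arithmetic stands in for whatever traits the library
actually consults.

```cpp
#include <string>
#include <type_traits>

// Hypothetical stand-in for an archive: it only records which
// implementation handled each value.
struct toy_iarchive {
    std::string log;
};

namespace toy {

// Implementation selected for arithmetic (primitive) types.
template<class Archive, class T>
void load_impl(Archive & ar, T &, std::true_type) {
    ar.log += "primitive;";
}

// Implementation selected for every other type.
template<class Archive, class T>
void load_impl(Archive & ar, T &, std::false_type) {
    ar.log += "object;";
}

// Single entry point: picks an implementation from a trait of T.
template<class Archive, class T>
void load(Archive & ar, T & t) {
    load_impl(ar, t, typename std::is_arithmetic<T>::type());
}

} // namespace toy
```

Calling toy::load with an int takes the first path and calling it with a
std::string takes the second; the caller never names either one.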

Now if one wants to handle a particular type in a special way (e.g.
vector<T> where T is not a pointer), then one could augment
iserializer.hpp. But one could just as well do

/////////////////////////////////////////////////////////////////////////
// class basic_text_iarchive - read serialized objects from an input
// text stream
template<class Archive>
class basic_text_iarchive :
    public detail::common_iarchive<Archive>
{
    ...
    // intermediate level to support override of operators
    // for templates in the absence of partial function
    // template ordering
    template<class T>
    void load_override(T & t, BOOST_PFTO int){
        // your own dispatch code here for particular cases.
        // fall through to default/universal implementation
        archive::load(* this->This(), t);
    }
    ...
};

There is no need to alter the default/universal/basic serialization
implementation.

Of course one doesn't have to do the above. Since the code uses the CRTP
to call load_override in the most derived class, the class above can be
left unchanged and the following can be included in the most derived class.

    template<class T>
    void load_override(T & t, BOOST_PFTO int){

        // your own dispatch code here for particular cases.
        // fall through to default/universal implementation
        basic_text_iarchive<Archive>::load_override(t, 0);
    }

Adding your own dispatch code in the indicated place will be exactly the
same as incorporating your code into iserializer.hpp, except that your
special dispatch code will only be included when requested and won't have
to be bypassed conditionally with a new type trait.
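
To make the mechanism concrete, here is a minimal, self-contained sketch of
a most derived class intercepting one type and falling through for the
rest. It assumes C++11 and uses made-up names (toy_basic_iarchive,
my_iarchive) in place of the real archive classes; the bodies only log
which path ran.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for basic_text_iarchive: every load is routed
// through This()->load_override, so the most derived class sees it first.
template<class Archive>
class toy_basic_iarchive {
public:
    std::string log;

    Archive * This() { return static_cast<Archive *>(this); }

    // default/universal implementation
    template<class T>
    void load_override(T &, int) {
        log += "default;";
    }

    template<class T>
    Archive & operator>>(T & t) {
        This()->load_override(t, 0);
        return *This();
    }
};

class my_iarchive : public toy_basic_iarchive<my_iarchive> {
    typedef toy_basic_iarchive<my_iarchive> base;
public:
    // anything without a special case falls through to the base class
    template<class T>
    void load_override(T & t, int) {
        base::load_override(t, 0);
    }

    // special case, present only in this archive
    void load_override(std::vector<int> &, int) {
        log += "vector<int>;";
    }
};
```

The base class is untouched; the special case exists only for code that
chooses to use my_iarchive.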

The problem with the above is that it applies only to one specific archive
class. So my proposal was to make an "archive adaptor" to permit your
overrides to be added to any functioning archive class.
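
A rough sketch of what such an adaptor could look like, assuming C++11 and
two toy archives invented for the example (toy_text_oarchive,
toy_binary_oarchive). The adaptor name toy_fast_oarchive is also
hypothetical; this is not the attached sketch, only the same idea reduced
to something self-contained.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Two hypothetical, unrelated archives sharing the same small interface.
struct toy_text_oarchive {
    std::string out;
    std::size_t bytes;
    toy_text_oarchive() : bytes(0) {}
    template<class T>
    void save_override(const T &, int) { out += "[item]"; }
    void save_binary(const void *, std::size_t n) {
        out += "[text-blob]";
        bytes += n;
    }
};

struct toy_binary_oarchive {
    std::string out;
    std::size_t bytes;
    toy_binary_oarchive() : bytes(0) {}
    template<class T>
    void save_override(const T &, int) { out += "[item]"; }
    void save_binary(const void *, std::size_t n) {
        out += "[bin-blob]";
        bytes += n;
    }
};

// The adaptor: written once, layered over any archive with that interface.
template<class Base>
class toy_fast_oarchive : public Base {
public:
    // default: forward to the adapted archive
    template<class T>
    void save_override(const T & t, int) {
        Base::save_override(t, 0);
    }

    // override: write the vector's storage as one block
    void save_override(const std::vector<int> & t, int) {
        this->save_binary(t.data(), sizeof(int) * t.size());
    }
};
```

Instantiating toy_fast_oarchive<toy_text_oarchive> and
toy_fast_oarchive<toy_binary_oarchive> gives both archives the same
override without editing either of them.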

>
> then I see several major disadvantages of this approach:
>
> 1.) it fixes the value types for which fast array serialization can
> be done

> 2.) for M types to be serialized and N archives there is an MxN
> problem in this approach.

> 3.) it leads to a tight coupling between archives and all classes
> that can profit from fast array serialization (called "array-like
> classes" below), and makes the archive depend on implementation
> details of the array-like classes

> 4.) it is not easily extensible to new array-like classes

I believe that the above points really refer to the specific override I used
in my example.

I have no issue at all with your particular overrides. In fact, I'm pleased
that people are finding that the library can be extended to handle more
specific cases. I just want to keep these add-ins as exactly that -
optional additions. Your override can be as elaborate as you want,
including your own trait - is_contiguous or whatever.

We're writing the same code - it's just placed in different source modules.
Yours places it in parts of the library that everyone uses, mine places it
in separate header modules.

The point is that we would have two orthogonal components to maintain.

> Let me elaborate on these points below and provide a possible
> solution to each of them. The simplest solution, as I see it will be
> to
>
> - provide an additional traits class has_fast_array_serialization
> - archives offering (the optional) fast array serialization provide a
> save_array member function in addition to save and save_binary
> - The dispatch to either save() or save_array() is the responsibility
> of the serialization code of the class, and not the responsibility of
> the archive

That sounds very good to me. Maybe I spoke too soon.
I don't see how this would require any changes at all to the
serialization library.
I don't see why has_fast_array_serialization has to be part of the
serialization library.
Maybe all the code can be included in boost/serialization/fast_array.hpp ?
This header would be included by all the classes that use it and no others.

> These are minor extensions to the serialization library, that do not
> break any existing code, that do not make it harder to write a new
> archive or a new serialize function, but they allow new types of
> archives and can give huge speedups for large data sets.

It's clear I'm missing something here. I'll have to look more deeply into
this when I get a couple of other monkeys off my back.

>
> ------- DETAILS -----------
>
> Now the details
>
> ad 1.:
...

no problem here.

> ad 2.:
    ...
> a) a potential portable binary archive might need to do byte
> reordering:
    ... no problem

>
> b) an XDR archive, using XDR streams needs to make a call to an XDR
> function,
   ... no problem
>
> c) .. and again, I need type information and cannot just call save_binary.
   .. no problem
>

> d)
   .. no problem

> For this reason my proposed solution is to dispatch to a save_array
> function for those types and archives supporting it:
>
> template<class Base>
> class fast_oarchive_impl :
> public Base
> {
> public:
>
> // here's a way to do it for all vectors in one shot
> template<class T>
> void save_override
> (
> const std::vector<T> & t, int,
> typename
> boost::enable_if<has_fast_array_serialization<Base,T> >::type *=0
> )
> {
> save_binary(&(t[0]),t.size());
> }
>
> ...
> };

I'm quite satisfied with this. My point is that none of this has to be
part of the serialization library itself. It can be a separate module like
serialization/variant.hpp is.

> where all archive classes provide a function like
>
> void Archive::save_array(Type const *, std::size_t)
>
> for all types for which the traits
> has_fast_array_serialization<Archive,Type> is true.

Now here is where we're going to part company.

> That way a
> single overload suffices for all the N=5 archive types presented
> above, and the MxN problem is solved. Note also, that archives not
> supporting this fast array serialization do not need to implement
> anything, as the default for
> has_fast_array_serialization<Archive,Type> is false.

But maybe not. If has_fast_array_serialization<Archive,Type> is defined in
boost/serialization/fast_array.hpp I'm still OK with it.
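
For illustration only, here is how a trait that defaults to false and an
opt-in save_array could fit together in a separate header. The sketch
assumes C++11; the two toy archives, the namespace toy_detail and the
function save_vector are invented for the example, and only the names
has_fast_array_serialization and save_array come from the proposal quoted
above.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Default: no archive claims fast array support for any type.
template<class Archive, class T>
struct has_fast_array_serialization : std::false_type {};

// Hypothetical archive with element-wise save only.
struct toy_plain_oarchive {
    std::string out;
    template<class T> void save(const T &) { out += "s"; }
};

// Hypothetical archive that also offers save_array.
struct toy_fast_oarchive {
    std::string out;
    template<class T> void save(const T &) { out += "s"; }
    template<class T> void save_array(const T *, std::size_t) { out += "A"; }
};

// The fast archive opts in, here for double only.
template<>
struct has_fast_array_serialization<toy_fast_oarchive, double>
    : std::true_type {};

namespace toy_detail {

// Chosen when the trait is true: one call for the whole block.
template<class Archive, class T>
void save_range(Archive & ar, const T * p, std::size_t n, std::true_type) {
    ar.save_array(p, n);
}

// Chosen when the trait is false: one call per element.
template<class Archive, class T>
void save_range(Archive & ar, const T * p, std::size_t n, std::false_type) {
    for (std::size_t i = 0; i < n; ++i)
        ar.save(p[i]);
}

} // namespace toy_detail

// Serialization code for the container decides which path to take.
template<class Archive, class T>
void save_vector(Archive & ar, const std::vector<T> & v) {
    ar.save(v.size());
    toy_detail::save_range(
        ar, v.data(), v.size(),
        typename has_fast_array_serialization<Archive, T>::type());
}
```

toy_plain_oarchive needs no save_array member and no specialization, since
the true_type overload is never instantiated for it.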

> ad 3.:

> This introduces implementation details of the mtl_dense_matrix class
> into the archive, breaks orthogonality, and leads to a tight
> coupling. Change in these implementation details of the
> mtl_dense_matrix might require changes to the archive classes.

We certainly want to avoid that!!!
>
> The solution is easy:
>
> - some archives provide fast array serialization through the
> save_array member function
> - let the MTL be responsible for serialization of its own classes, and
> use save_array where appropriate

Just great !!!

> ad 4.:
 ...

I'll agree with that also

> To summarize, with three minor extensions to the serialization
> library, none of which breaks any existing code, we can get 10x
> speedups for serialization of large data sets, enable new types of
> archives such as MPI archives, and all of that without introducing
> any of the four problems discussed here.

The only thing I'm missing here is why the serialization library itself
has to be modified to support all this. It seems that all this could
easily be encapsulated in one (or more) separate optional headers. This
would be the best of all possible worlds.

Robert Ramey

