Boost logo

Boost :

From: Robert Ramey (ramey_at_[hidden])
Date: 2005-10-11 11:45:05


Matthias Troyer wrote:
> On Oct 10, 2005, at 6:48 PM, Robert Ramey wrote:
>>
>> That sounds very good to me. Maybe I spoke too soon.
>> I don't see how this would require and changes at all to the
>> serialization
>> library.
>> I don't see why has_fast_array_serialization has to be part of the
>> serialization library.
>> Maybe all the code can be included in boost/serialization/
>> fast_arrary.hpp ?
>> This header would be included by all the classes that use it and no
>> others.
>>
>> [snip]
>
>
>> If has_fast_array_serialization<Archive,Type> is defined
>> boost/serialization/fast_arrary.hpp I'm still OK with it.
>
> That's where it is defined in my proposal.
>
>> The only thing I'm missing here is why the serialization library
>> itself has
>> to be
>> modified to support all this. It seems that all this could easily be
>> encapulated in
>> one (or more) separate optional headers. This would be the best of
>> all
>> possible
>> worlds.
>
> There are actually only very few modifications:
>
> 1. boost/archive/detail/oserializer.hpp and iserializer.hpp require
> modifications for the serialization of C-arrays of fixed length. In
> my version, the class save_array_type is modified to dispatch to
> save_array when fast_array_serialization is possible. The underlying
> problem here is that oserializer.hpp implements the serialization of
> a type here (the C array!). The optimal solution to this problem
> would be to move the array serialization to a separate header boost/
> serialization/array.hpp, as is done for all C++ classes.

My intention was to include all types "built-in" to the C++ language in
?serializer.hpp
so that's why it includes C++ arrays. Separating this into another header
would
break this "concept" and would result in a small additional *.hpp file that
would
have to be included explicitly. So I would be against it in this case. On
the
other hand, I do appreciate ideas that remove pieces from the "core" of the
library
and turn them into more independent modules so don't get discouraged here.

> 2. boost/serialization/vector.hpp is also modified to dispatch to
> save_array and load_array where possible. I don't think that this is
> a problem?

That would mean that users are including save/load_array even if they
don't want them or want to use their own versions. Oh - then documentation
has to be enhanced to explain all this internal behavior. I would prefer
something like the following:

class my_class {
    stl::vector<int> m_vi;
    ...
};
template<class Archive>
my_class::serialize(Archive &ar, const unsigned int version){
    // standard way
    ar & m_vi;
    // or fast way which defaults to standard way in appropriate cases.
    save_array(ar, m_vi);
}

This
    a) keeps the stl portion of the library smaller
    b) leave the user in control of what's going on
    c) permits development of save/load array to be on an independent
parallel track
    with everyting else.

If we eventually discover that that everyone is always using save/load array
then
we can study whether we want to just enhance stl::vector, etc to include it
as
default functionality.

> 3. I had to introduce a new strong typedef in basic_archive.hpp:
>
> BOOST_STRONG_TYPEDEF(std::size_t, container_size_type)
> BOOST_CLASS_IMPLEMENTATION(boost::archive::container_size_type,
> primitive_type)
>
...
>
> I don't think that you will object to this.
...
I looked at this and right away I noticed that, as written this would make
all existing binary archives unreadable. The real fix for is:

a) bump up the library version from 3 to 4
b) alter save/load collections to use unsigned int or size_t depending
upon he library version.

It has to be done this way since stl collections are marked as
non-versioned.

I was sort of reluctant to do this as it seemed to me that it would be yet
a while before anyone starts to save more than 2,000,000,000 objects
with the serialization library and it seemed to me wasteful for the
native binary archive to included an extra 4 bytes containing 0's for every
collection serialzed. But maybe now (I mean 1.34) is the time to bite
the bullet on this. I don't have a strong prefererence for either changing
it or leaving it the same. If you need it, I would say fine - we'll do it.

> 4. boost/archive/basic_binary_[io]archive.hpp serialize
> container_size_type as an unsigned int as done till now. It might be
> better to bump the file version and serialize them as std::size_t.

agreed - no point in using size_t in the collection code just to throw it
away.

> All the other changes were to modify the binary archives and the
> polymorphic archive to support fast array serialization. In contrast
> to the above points this is optional. Instead we could provide
> fast_binary_[io]archive and fast_polymporphic_[io]archive, that
> differ from their normal versions just by supporting fast array
> serialization. I could live with this as well, although it makes more
> sense in my opinion to just addd the save_array/load_array features
> to the existing archives.

sounds like you can go along with me on this.

> Of all the points above, I believe that you will not have anything
> against points 3 and 4 since you proposed something similar already
> in the past if I remember correctly.

I don't see a problem here.

> Issue 2 should also be noncontroversial, and the main discussion
> should thus be on issue 1, on how one can improve the design of [io]
> serializer to move the the implementation of array serialization into
> a separate header.

looks like we'll have to arm wrestle a little more here. In a way you've
got
me over a barrel. I continually advocate keeping the "core" of the library
small
by factoring out everything that can be factor out. Now you've got me on
C++ array. sort of. I'll re-iterate my view that the "core" library address
serialization
of all "built-in" language types - and no others.

The other part of your proposal really is an enhancement of the stl
serializations
which are not part of the "core" part of the library. I'm confident that
your
enhancement will be quite useful to many people and I'm very happy to see
that
you've done it. That's not the same as forcing it on all users of the
library. So
I prefer to see your enhancement as an option explicitly invoked by the
user.
This has a number of advantages:

a) the user can see what he is doing. No hidden complex behavior.
b) some users might want just minimal code since thier collections are small
c) your enhancement will require significant documentation. This will be
much
easier if it is an optional add-on to the serialization library.
d) parallel or layered developement is facilitated.

So to summarize the issues

a) should C++ array serialization be in a separate header? I say no, you say
yes.
b) should save/load array be incorporated into stl collection serialization
to make
its usage oblicatory? I say no, you say yes.

Regardless of how these questions are answered, its clear to me that your
enhancements to stl containers will be available to users. Assuming that
this
is packaged as an optional add-on as I would hope, the only questions
remaining will
be:

a) Should this be a totally separate library with its own
documentation/tests/directory tree etc?
It should be separate but not totally so:

Maybe a files or maybe directory within serialization for save/load array
and within archive
for fast...archive adaptor.

A group of its own tests - just like we have tests for all other
combinations of serializations
and archives - I can hear the howling already. We'll have to see what to do
about this.

A separate documenation section in the documenation of the serialization
library. Similar to the
miscelleneas. But miscellaneas holds things that are really separate so
we'll find a good place for it. Maybe a section titled something like
"Special Considerations When Serializing Collections" (but shorter).

Note that your fast...archive will really be a new creature - an "archive
adaptor". This merits
an new section in the documentation in any case. This is a cool and useful
technique
and will encourage future great ideas which enhance the serialization
library without
making the core bigger.

b) Should such an optional enhancement be subject to some sort of review?
I'm agonostic
on this. I would be happy to just be the gate keeper and accept it or make
you alter it to my taste.
Of course, no guarentee that anyone else would be happy with this. I'm
willing to go along
with whatever the consensus is regarding if and/or how this thing should be
reviewed.

===============

A final note to others wishing to participate in the serialization library.

Welcome and good luck.

Having said that, I will give a little advice on how to be most successful
in getting
my cooperation. (encouragement you get for free)

a) Notice that I make huge efforts to keep the "core" library from growing.
Anything
that adds to it makes my job bigger - I have to make this job smaller.

b) I am out of the business of writing serializations for specific classes.
Its up to
those who need it to get it done. Of course I'm always full of advice -
sorry about that.

c) I am out of the business of writing new archive types. Instead, I want
to improve
the documentation to make this easier.

d) My efforts will be focused on implemenation aspects of the "core"
library. This
includes:

i) pending issue regarding dynamic loading/unloading of code that contains
serialization
for specific types. This includes testing of support for plug-ins.

ii) making a good performance test and correcting any performance
bottlenecks in the
core library.

iii) making testing more efficient.

iv) fixing bugs

I am getting a little "burned out" on the serialization library. Its only
my obsessive nature that
makes me continue to the bitter end which I'm hoping I am approaching - at
least
asymptotically.

I am extremely gratified by your efforts and those of others to enhance the
library and
will do everything (subject to the above reservations) I can to encourage
and support them.
Nothing would make me happier to see people spinning off their own improved
serializations
and archive classes and making the package the default choice for C++ object
serializaition.

Robert Ramey


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk