
From: Matthias Troyer (troyer_at_[hidden])
Date: 2002-11-25 04:41:17

Next message: Peter Dimov: "Re: [boost] intrusive_ptr ?"
Previous message: Matthias Troyer: "Re: [boost] Serialization Library Review"
In reply to: Robert Ramey: "RE:[boost] Serialization library review"
Next in thread: Peter Petrov: "Re:[boost] Serialization library review"

>
>> * A serialization of bool is missing - easy to fix
>
> I don't understand what you mean. basic_[i|o]archive contain:

Sorry, I missed that because it is separate from the other virtual
functions and not implemented in the b[io]archive class on which I
based my XDR implementation.

>> * The code will not compile on platforms where long is 64-bit:
>>
>> virtual basic_oarchive & operator<<(long _Val) = 0;
>> virtual basic_oarchive & operator<<(int64_t _Val) = 0;
>
> the current code contains the following.
>
> #ifndef BOOST_NO_INT64_T
> virtual basic_oarchive & operator<<(int64_t _Val) = 0;
> virtual basic_oarchive & operator<<(uint64_t _Val) = 0;
> #endif
>
> I guess this should be changed to:
> #ifdef BOOST_HAS_MS_INT64
> virtual basic_iarchive & operator>>(int64_t & _Val) = 0;
> virtual basic_iarchive & operator>>(uint64_t & _Val) = 0;
> #endif
> #ifdef BOOST_HAS_LONG_LONG
> virtual basic_iarchive & operator>>(long long & _Val) = 0;
> #endif

This sounds better. Thanks.
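
For the record, a minimal sketch of why the original pair of declarations fails. It assumes, as on the mainstream platforms I know of, that int64_t is a typedef for either long or long long; the names sketch_oarchive, int64_is_long and int64_is_long_long are made up for illustration and are not part of the library.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Which built-in type does std::int64_t alias on this platform?
constexpr bool int64_is_long =
    std::is_same<std::int64_t, long>::value;
constexpr bool int64_is_long_long =
    std::is_same<std::int64_t, long long>::value;

// Hypothetical archive interface that overloads on the built-in types only.
struct sketch_oarchive {
    virtual ~sketch_oarchive() {}
    virtual sketch_oarchive& operator<<(long value) = 0;
    virtual sketch_oarchive& operator<<(long long value) = 0;
    // Adding
    //     virtual sketch_oarchive& operator<<(std::int64_t value) = 0;
    // here would redeclare one of the two members above wherever
    // std::int64_t is a typedef for long or long long, and a member
    // function cannot be redeclared inside its class.
};
```

Overloading on the built-in types (guarded by the configuration macros as you suggest) avoids the clash because each built-in type is distinct from the others.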

>> As mentioned in previous posts, additional functions e.g. load_array
>> and save_array need to be added to allow efficient serialization of
>> large data sets.
>> The default version could just use operator<< or operator>> as in:
>
>> virtual void save_array(const int* p, std::size_t n) {
>> for (std::size_t i=0;i<n;++i)
>> *this << p[i];
>> }
>
>> and this would thus not incur any extra coding work for people not
>> interested.
>> Serialization of containers such as std::vector, or ublas or mtl
>> vectors
>> and matrices can make use of this extra function transparent to the
>> user so that
>> the interface would also not become harder to understand for the
>> library user.
>
> why can't this be handled using
>
> basic_oarchive::write_binary(void *p, size_t count)

This does not allow type-specific transformations (e.g. a change of byte
order) to be performed. Thus we need one such function for each
primitive type.
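
A minimal sketch of what I have in mind, shown for a single primitive type. The classes sketch_oarchive and big_endian_oarchive are hypothetical and only illustrate the idea; they are not the interface of the library under review.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an output archive base class.
class sketch_oarchive {
public:
    virtual ~sketch_oarchive() {}
    virtual sketch_oarchive& operator<<(std::int32_t value) = 0;

    // Default: fall back to element-wise operator<<, so archive authors
    // who do not care about bulk output have nothing extra to write.
    virtual void save_array(const std::int32_t* p, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            *this << p[i];
    }
};

// Archive that stores integers in big-endian byte order. Because
// save_array knows the element type, it can reserve space once and
// still apply the byte-order transformation to every element.
class big_endian_oarchive : public sketch_oarchive {
public:
    sketch_oarchive& operator<<(std::int32_t value) override {
        put(value);
        return *this;
    }

    void save_array(const std::int32_t* p, std::size_t n) override {
        bytes_.reserve(bytes_.size() + 4 * n);
        for (std::size_t i = 0; i < n; ++i)
            put(p[i]);
    }

    const std::vector<unsigned char>& bytes() const { return bytes_; }

private:
    // Append the four bytes of value, most significant byte first.
    void put(std::int32_t value) {
        const std::uint32_t u = static_cast<std::uint32_t>(value);
        bytes_.push_back(static_cast<unsigned char>((u >> 24) & 0xFFu));
        bytes_.push_back(static_cast<unsigned char>((u >> 16) & 0xFFu));
        bytes_.push_back(static_cast<unsigned char>((u >> 8) & 0xFFu));
        bytes_.push_back(static_cast<unsigned char>(u & 0xFFu));
    }

    std::vector<unsigned char> bytes_;
};
```

A write_binary(void*, size_t) call cannot do this, since it has no idea how wide the elements are or where their boundaries lie.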

> 4c. Interface design: binary archives
> =====================================
> Your more elaborate definition of a family of binary archives is
> totally in keeping
> with the manner that the library is intended to be used. I would call
> these
> definitions examples of how to use the library rather than part of the
> library
> itself. So I would be disinclined to make native binary archives any
> more
> elaborate than they are now.

Once the interface has been fixed I will contribute these binary
archives, as well as the XDR archive. Why don't you put the library
into boost-sandbox?

>> 4d. Interface design: small objects
>> ===================================
>
>> I have mentioned this in a previous post. Instead of requiring the
>> user
>> to reimplement the serialization of standard containers for all small
>> object types for which the versioning and pointer system should be
>> bypassed, a
>> traits class can be added and the optimized serialization of all
>> containers of small
>> objects implemented in the library. Note that the traits class needs
>> to be
>> specialized only for those objects for which the user wants to
>> optimize
>> serialization, while no effort is required at all if the standard
>> serialization method is to
>> be used.
>
> I don't see why this would be necessary - I will have to investigate

Let us discuss that in private mails if you have
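
In the meantime, a rough sketch of the traits idea. Everything here is hypothetical: is_small_object, log_archive and the save functions are illustration only, and the archive merely counts what a real one would write.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical trait: false by default, so nothing changes for users
// who do not specialize it.
template <class T>
struct is_small_object : std::false_type {};

// Toy archive that only counts what would be written.
struct log_archive {
    std::size_t version_records = 0;  // per-object bookkeeping entries
    std::size_t values = 0;           // objects written
};

// Standard path: every element goes through the versioning machinery.
template <class T>
void save_element(log_archive& ar, const T&, std::false_type) {
    ++ar.version_records;
    ++ar.values;
}

// Optimized path: small objects bypass the bookkeeping.
template <class T>
void save_element(log_archive& ar, const T&, std::true_type) {
    ++ar.values;
}

// Written once in the library; dispatches on the trait for any T.
template <class T>
void save(log_archive& ar, const std::vector<T>& v) {
    for (const T& x : v)
        save_element(ar, x, is_small_object<T>());
}

// User code: opt in for one small type, do nothing for the other.
struct vec3 { double x, y, z; };
template <>
struct is_small_object<vec3> : std::true_type {};

struct record { int id; };
```

The user specializes one trait per small type and never has to reimplement the container serialization.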
>
>> The current library is however not consistent since
>
>> * serialization of normal classes goes via specialization of the
>> serialization<T>
>> class
>
>> * serialization of template classes goes via overloading of the free
>> function
>> serialization_detail::save_template(), ...
>
>> This is unacceptable and a consistent method should be found.
>
> I believe that what you are referring to is an artifact of a workaround
> for compilers that fail to support partial template specialization.
> This will probably be addressed for conforming compilers, but
> others will have to live with this or something like it.

According to the documentation, this seems to be the way we have to
implement it.
Are you telling me that if I just want to support conforming
compilers, then I can simply specialize the serialization class for my
template classes? If so, then only a change to the documentation seems
needed.
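
To be concrete, this is the kind of thing I would like to write on a conforming compiler. The primary template below is a hypothetical stand-in for the library's serialization<T> class (its real members may differ), and point<T> and to_text are made-up names.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the library's serialization<T> class.
template <class T>
struct serialization {
    static void save(std::ostream& os, const T& t) { os << t << ' '; }
};

// A user-defined template class.
template <class T>
struct point {
    T x;
    T y;
};

// Partial specialization: one definition covers point<T> for every T,
// with the same mechanism as for a non-template class.
template <class T>
struct serialization<point<T> > {
    static void save(std::ostream& os, const point<T>& p) {
        serialization<T>::save(os, p.x);
        serialization<T>::save(os, p.y);
    }
};

// Helper that serializes any object into a string.
template <class T>
std::string to_text(const T& t) {
    std::ostringstream os;
    serialization<T>::save(os, t);
    return os.str();
}
```

With this, normal classes and template classes would both go through specialization of the same class.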

> 5.b) Allow overriding of preamble, etc.
> ---------------------------------------
>
>> I would like to have more control over some aspects that are currently
>> hardcoded into the library:
>
>> * writing/reading the preamble
>
> I believe the preamble will be overridable

Great! Thanks.

>
>> * obtaining the version number of a class
>> * starting/finishing writing an object
>> * a new type is encountered
>
> hmmm - new type is encountered? I don't know what that means.

Well, I assume (not having checked your implementation in great
detail) that you have to write the version number of each class
serialized. Since I hope that you do not write it every time you
serialize an object of this class, I guess that you write it only the
first time an object of that class is serialized. That is what I mean.

>
>> The motivation is very simple: We have hundreds of gigabytes of data
>> lying around
>> in tens of thousands of files that could easily be read by the
>> serialization archive
>> if it were not for three small differences:
>
>> i) I wrote a different preamble
>> ii) I only wrote one version number for all classes in the archive
>> instead of separate
>> version numbers for each class
>> iii) no information was written when a new class was encountered
>
>> Since otherwise the interface is nearly identical (many classes
>> contain a load
>> and a save function, albeit with a different name for the archive
>> classes), changing
>> all my codes over to a boost::serialization library would be easy if
>> it
>> weren't for the three issues above.
>
> I believe you are wrong here. The interfaces might seem similar but
> there
> is no reason to believe that the file formats have very much in common.
> I don't believe there is any way enough flexibility could be added to
> deserialize a file serialized by another system.

Oh, it should be very simple since there are only a few functions
(skipping pointer serialization) that can be different:

a) preamble of the archive - you agreed to make that flexible and
overridable by the archive class
b) serialization of primitive types - the archive class allows full
flexibility here
c) preamble/postamble of class serialization - that's what I would like
to be possible to override as well.
d) version number - if I could just provide a default version number
instead of reading each version number from the archive I could easily
read all my legacy file formats.

Note that by providing hooks to this functionality (with a warning 'use
at your own risk and only if you understand what you are doing') you
will allow power users to make much better use of the library. I, for
example, could then just replace my library with yours with very little
effort and still be able to read my legacy files.
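
Here is a minimal sketch of such hooks, covering points a) and d). All names (sketch_iarchive, legacy_iarchive, read_preamble, read_version, particle) are hypothetical, and the text format is invented for the example; your library's actual interface will certainly look different.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical input archive exposing two overridable hooks.
class sketch_iarchive {
public:
    explicit sketch_iarchive(std::istream& is) : is_(is) {}
    virtual ~sketch_iarchive() {}

    // Hook a): consume the archive preamble (here a single word).
    virtual void read_preamble() {
        std::string signature;
        is_ >> signature;
    }

    // Hook d): obtain the version number of the class being loaded.
    // The default reads it from the stream.
    virtual unsigned read_version() {
        unsigned version = 0;
        is_ >> version;
        return version;
    }

    sketch_iarchive& operator>>(int& value) {
        is_ >> value;
        return *this;
    }

protected:
    std::istream& is_;
};

// Legacy format: a different preamble and no stored class versions.
class legacy_iarchive : public sketch_iarchive {
public:
    explicit legacy_iarchive(std::istream& is) : sketch_iarchive(is) {}

    void read_preamble() override {
        std::string magic;
        is_ >> magic;  // the old files start with a different header word
    }

    unsigned read_version() override { return 0; }  // supply a default
};

struct particle {
    int id;
    int charge;
};

// One load function serves both the new and the legacy format.
void load(sketch_iarchive& ar, particle& p) {
    const unsigned version = ar.read_version();
    ar >> p.id;
    if (version >= 1)
        ar >> p.charge;
    else
        p.charge = 0;  // field did not exist in version 0
}
```

The class code stays untouched; only the archive class knows that it is reading an old file.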

>
> Note: converting legacy files to a new serialization system is very
> easy:
>
> load the file into memory using the old system
>
> save the data into a new file using the new system.
>
> forget about the old system.

It would be that easy if I could just define an input archive class to
read the old format.
Otherwise I have to support both serialization libraries at once in all
my codes, and walk through tens of application programs and hundreds of
thousands of files to just convert. It would be much better if I could
just read the old files with your library, by providing my own 'legacy'
archive class.

>
>> Since these are major changes I would like to see a new review after
>> they are implemented and thus vote NO for now. However I am willing
>> to help
>> Robert with implementing the changes, improving the library, and am
>> willing to
>> discuss further.
>
> I very much appreciate your interest in making a portable
> implementation of
> an XDR binary archive and understand you have made
> great progress in this
> in a very short time. Please let me know if there is anything
> else you need from me. Many users feel that this is necessary and it
> would demonstrate the ease of use of the library.

It took me just two hours to convert my XDR archive class from our old
serialization library (on http://www.comp-phys.org) to your library.
Now it compiles well under UNIX, but we still have to sort out the
problem you encountered under Windows.

If only I could override the two functions mentioned above (preamble of
an object and preamble of the archive) I guess that I could immediately
read all my old files with

> I know you have spent a lot of time studying and working with the
> library
> and I much appreciate your efforts.

As you can see on our above-mentioned web page, we explicitly state
our hope that our serialization library will soon be
replaced by a nicer Boost one, and I wish to thank YOU for YOUR efforts.

Matthias
