From: Robert Ramey (ramey_at_[hidden])
Date: 2006-02-13 01:42:44

Next message: Jim Douglas: "Re: [boost] [testing] QCC causing huge numbers of failures"
Previous message: Aleksey Gurtovoy: "Re: [boost] vc6/7 for 1.34?"
In reply to: Kim Barrett: "Re: [boost] [serialization] use of unsigned int instead of size_type"
Next in thread: Matthias Troyer: "Re: [boost] [serialization] use of unsigned int instead of size_type"
Reply: Matthias Troyer: "Re: [boost] [serialization] use of unsigned int instead of size_type"

Kim Barrett wrote:
> At 9:14 AM -0800 2/12/06, Robert Ramey wrote:
>> >> David Abrahams wrote:
>> >> A strong typedef should work, if all archives implement its
>> >> serialization.

> This is the objection I expected to hear from Robert much earlier in
> this discussion. A strong typedef for this purpose effectively widens
> the archive concept by adding a new thing that needs to be supported.
> That conceptual widening occurs even if the default archive behavior
> for a strong typedef is to just serialize the underlying type. It
> still needs to be documented as part of the archive concept, and
> anyone defining a new kind of archive ought to consider whether that
> default is the correct behavior for this new archive type, or whether
> something else is needed.

Even if a strong type is used, it is neither necessary nor
desirable to add it to every archive.

The procedure would be:

Create a header boost/collection_size.hpp which would contain
something like:

namespace boost {
// now we have a distinct collection size type
BOOST_STRONG_TYPEDEF(std::size_t, collection_size_t)
} // namespace boost

// no versioning, for efficiency reasons
BOOST_CLASS_IMPLEMENTATION(boost::collection_size_t, boost::serialization::object_serializable)

namespace boost {
namespace serialization {
template<class Archive>
void serialize(Archive &ar, collection_size_t &t, const unsigned int /* version */){
    // forward to the underlying type; a plain "ar & t" here would
    // only call this same function again
    ar & static_cast<std::size_t &>(t);
}
} // namespace serialization
} // namespace boost
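To illustrate why the strong type does not have to be added to every archive, here is a minimal sketch that does not use Boost at all. The names count_t, default_oarchive and portable_oarchive are hypothetical and exist only for this example; count_t is a hand-rolled stand-in for what a strong typedef macro would generate. An archive that does not care about counts forwards to the underlying type, while an archive that does care supplies its own overload.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a strong typedef over std::size_t.
struct count_t {
    std::size_t value;
    explicit count_t(std::size_t v = 0) : value(v) {}
};

// An archive with no special interest in counts: it simply
// forwards to the handling of the underlying type.
struct default_oarchive {
    std::string log;  // records which overload was chosen
    void save(std::size_t) { log += "size_t;"; }
    void save(const count_t &c) { save(c.value); }
};

// An archive that wants to treat counts differently provides
// its own overload for the distinct type.
struct portable_oarchive {
    std::string log;
    void save(std::size_t) { log += "size_t;"; }
    void save(const count_t &) { log += "count;"; }
};
```

Because count_t is a distinct type, the choice between the two behaviours is made by ordinary overload resolution, and only the archive that needs the distinction has to know about it.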

> And I think he would have a pretty good rationale for feeling that
> way. Keeping the archive interface narrow and minimizing the coupling
> between serialization and archives minimal is, I think, one of the
> strengths of the serialization library's design.

Hallelujah!!!

> I would be in full agreement with Robert here, except that all of the
> alternatives I can think of seem worse to me.
>
> 1. std::size_t
>
> This causes major problems for portable binary archives. I'm aware
> that portable binary archives are tricky (and perhaps not truly
> possible in the most general sense of "portable"). In particular,
> they require that users of such archives be very careful about the
> types that they include in such archives, avoiding all explicit use
> of primitive types with implementation-defined representations in
> favor of types with a fixed representation. So no int's or long's,
> only int32_t and the like. Floating point types add their own
> complexity.

A portable binary archive comes down to serializing primitives in
a portable way. This is what the example included with the
serialization library does. The example isn't complete but it
does illustrate this point.
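As a rough sketch of what "serializing primitives in a portable way" can mean, the following writes a 64-bit value one byte at a time in a fixed order, so the bytes in the archive do not depend on the host's endianness. This is not taken from the library's example; save_u64_le and load_u64_le are hypothetical helper names, and the choice of little-endian order and a fixed 8-byte width is an assumption made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Append v to the buffer, least significant byte first.
inline void save_u64_le(std::string &out, std::uint64_t v) {
    for (int i = 0; i < 8; ++i) {
        out.push_back(static_cast<char>((v >> (8 * i)) & 0xFF));
    }
}

// Read 8 bytes starting at pos and rebuild the value.
// The caller must ensure that pos + 8 <= in.size().
inline std::uint64_t load_u64_le(const std::string &in, std::size_t pos = 0) {
    std::uint64_t v = 0;
    for (int i = 0; i < 8; ++i) {
        const unsigned char byte = static_cast<unsigned char>(in[pos + i]);
        v |= static_cast<std::uint64_t>(byte) << (8 * i);
    }
    return v;
}
```

The shifts operate on values, not on memory, so the same bytes are produced on big-endian and little-endian machines.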

> Some (potential) users of the serialization library (such as us) are
> already doing that, and have been working under such restrictions for
> a long time (long before we ever heard of boost.serialization),
> because cross-platform serialization is important to us.

Hmm - well, maybe you want to just finish the example in the package
by adding floats and doubles - and you're done!

> The problem for portable binary archives caused by using std::size_t
> as the container count is that it is buried inside the container
> serializer, where the library client has no control over it. All the
> archive gets out of the serializer is the underlying primitive type,
> with which it does whatever it does on the given platform. The
> semantic information that this is a container count is lost by the
> time the archive sees the value, so there's nothing a "portable"
> archive can do to address this loss of information. And this occurs
> no matter how careful clients are in their own adherence to use of
> fixed representation value types.

This is all true. But I'm not convinced that it's necessary to know
where the primitive came from in order to handle it. But I don't really
need to be convinced. I would be happy to go along with it
if someone who does think this is necessary is willing to address
all the minor little things that will add up to kind of a pain. This
includes:

a) Selecting a type that will please everyone.
b) Carefully setting up the appropriate serialization traits for such a type.
c) Tweaking the collection serialization to use the new type.
d) Making sure that existing archives can still be read. This
entails having a little bit of conditional code in the collection
loading functions.
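The "conditional code" in item d) might look roughly like the sketch below. Everything in it is hypothetical: toy_binary_iarchive is a toy reader, not a library class, and the version number at which the wider count is introduced (first_version_with_wide_count) and the 4-byte and 8-byte widths are assumptions chosen only to show the shape of the branch.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Assumed for illustration: archives written by library versions
// below this one stored the count as a 4-byte unsigned int.
const unsigned int first_version_with_wide_count = 4;

// Toy input archive: raw bytes, a read position, and the version
// of the library that wrote the archive.
struct toy_binary_iarchive {
    std::string bytes;
    std::size_t pos;
    unsigned int library_version;
};

// Read `width` bytes, least significant byte first.
// std::string::at throws std::out_of_range on a truncated archive.
inline std::uint64_t read_le(toy_binary_iarchive &ar, int width) {
    std::uint64_t v = 0;
    for (int i = 0; i < width; ++i) {
        const unsigned char byte =
            static_cast<unsigned char>(ar.bytes.at(ar.pos++));
        v |= static_cast<std::uint64_t>(byte) << (8 * i);
    }
    return v;
}

// The conditional part: old archives carry a narrow count,
// newer ones a wide count.
inline std::uint64_t load_collection_count(toy_binary_iarchive &ar) {
    const int width =
        ar.library_version < first_version_with_wide_count ? 4 : 8;
    return read_le(ar, width);
}
```

Only the loading side needs the branch; saving would always use the new representation.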

I believe that BOOST_STRONG_TYPEDEF is a very good
candidate for this. But that would suggest it might be a good
idea to take a critical look at BOOST_STRONG_TYPEDEF.

So, if it's done correctly, it's more than a trivial "bug fix".

> This leaves a client needing a portable binary archive with several
> unappealing options (in no particular order)
>
> - Modify the library to use one of the other options.
>
> - Override all of the needed container serializers specifically for
> the portable archive.
>
> - Don't attempt to serialize any standard container types.

As I said - I don't agree at all here. To illustrate my point, I
point to the demo_portable_binary example in the documentation
and code.

> 2. standard fixed-size type
>
> We already know that uint32_t is inadequate; there are users with
> very large containers. Maybe uint64_t is big enough, though of course
> predictions of that sort have a history of proving false, sometimes
> in a surprisingly short time. And uint64_t isn't truly portable
> anyway, since an implementation might not have that type at all.
> Also, some might object to always requiring 8 bytes of count
> information, even when the actual size will never be anything like
> that large. This was my preferred approach before I learned about the
> strong typedef approach, in spite of the stated problems.
>
> 3. some self-contained variable-size type
>
> This might be possible, but the additional complexity is
> questionable. Also, all of the ideas I've thought of along this line
> make the various text-based archives less human readable, which some
> might reasonably find objectionable.

Text archives present no problem. Numbers coded as strings of
decimal characters have no fixed limit on the values they can represent.

"portable binary" archives must also have some sort of way to code
numbers in a variable length format.
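One common way to code numbers in a variable-length format is a base-128 scheme: seven bits of the value per byte, with the high bit of each byte saying whether more bytes follow. The sketch below is only an illustration of that idea and is not claimed to be what any existing archive does; save_varint and load_varint are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Append v using 7 bits per byte, low-order group first.
// The high bit is set on every byte except the last.
inline void save_varint(std::string &out, std::uint64_t v) {
    while (v >= 0x80) {
        out.push_back(static_cast<char>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out.push_back(static_cast<char>(v));
}

// Decode one value starting at pos and advance pos past it.
// Stops at the end of the buffer or after 10 bytes (64 bits).
inline std::uint64_t load_varint(const std::string &in, std::size_t &pos) {
    std::uint64_t v = 0;
    int shift = 0;
    while (pos < in.size() && shift < 64) {
        const unsigned char byte = static_cast<unsigned char>(in[pos++]);
        v |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            break;  // last byte of this value
        }
        shift += 7;
    }
    return v;
}
```

Small counts take one byte and the largest 64-bit value takes ten, so the format does not force 8 bytes on every container while still leaving room for very large ones.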

The only problem arises with the native_binary archive - and it is
explicitly exempt from any portability requirement.

> So it appears to me that all of the available options have downsides.
> While my initial reaction to the strong typedef approach was rather
> ambivalent because of the associated expansion of the archive
> concept, it seems to me to be the best of the available options.

Noooooo - and you were on a roll.

Robert Ramey

