Boost Users :

Subject: Re: [Boost-users] [Serialization] Speeding up client-server communication
From: Robert Ramey (ramey_at_[hidden])
Date: 2010-02-05 12:09:55


Juraj Ivancic wrote:
> Ruediger Berlich wrote:
>
>> Do you have further suggestions for ways of influencing the
>> Boost.Serialization library ?
>
> I have a very similar scenario where version tracking does not play an
> important role. In this case it is possible to create a single archive
> and stream per, say, connection as opposed to creating them for every
> object (de)serialized. This was a huge performance boost in my case.
>
> What I do is I create a Serializer class for every connection. All the
> (de)serialization is done through it. Note that for this to work all
> objects (de)serialized should have tracking turned off.

In my view this is the correct approach when high performance is the
priority. Tracking is important in the most general case of saving state,
but it conflicts with using serialization in some applications. I've been
considering ways to make serialization more useful in these kinds
of scenarios.

> This could be improved further:
>
> 1) By replacing stringstreams with something more lightweight.

In small experiments, I've found this to make a big difference. And
it's not that hard, as one needs to support only a subset of the whole
streambuf functionality.
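
As a rough illustration of how small such a subset can be, here is a
sketch using only the standard library. VectorOutBuf and SpanInBuf are
hypothetical names, not part of Boost. The binary archives also have
constructors that take a std::streambuf&, so buffers like these could be
handed to them directly, though that is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <streambuf>
#include <vector>

// Output side: appends everything written to a std::vector<char>.
// Only the two put hooks are overridden; no locale or seeking support.
class VectorOutBuf : public std::streambuf {
public:
    const std::vector<char>& data() const { return data_; }
    void clear() { data_.clear(); }

protected:
    // Bulk writes (sputn) land here.
    std::streamsize xsputn(const char* s, std::streamsize n) override {
        data_.insert(data_.end(), s, s + n);
        return n;
    }

    // Single-character writes (sputc) land here because no put area is set.
    int_type overflow(int_type ch) override {
        if (!traits_type::eq_int_type(ch, traits_type::eof()))
            data_.push_back(traits_type::to_char_type(ch));
        return traits_type::not_eof(ch);
    }

private:
    std::vector<char> data_;
};

// Input side: reads from an existing byte range without copying it.
// The whole range is the get area, so the default underflow() reports
// end of file once it is exhausted.
class SpanInBuf : public std::streambuf {
public:
    SpanInBuf(const char* begin, std::size_t size) {
        assert(begin != nullptr || size == 0);
        char* p = const_cast<char*>(begin);  // the get area is only read
        setg(p, p, p + size);
    }
};

// Usage: write a few bytes, then read them back from the same storage.
inline void example_usage() {
    VectorOutBuf out;
    out.sputn("abc", 3);
    out.sputc('d');

    SpanInBuf in(out.data().data(), out.data().size());
    char buf[4];
    std::streamsize got = in.sgetn(buf, 4);
    assert(got == 4);
    assert(std::memcmp(buf, "abcd", 4) == 0);
}
```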

> 2) Ideally these serialize methods should have some kind of compile
> time assertion that the object has serialization tracking turned off.

Note that the default tracking attribute is "selective", which means
that tracking is only on if an object is serialized through a pointer
somewhere in the program. So if you never serialize through
a pointer, the default should be just fine. Anyone who serializes
through a pointer, especially one to a virtual base class, must
realize that this requires a lot more processing to work properly
and is fundamentally incompatible with performance optimization.

Note that "implementation level" is also important. The default
is that the class id is looked up in a table to check whether
versioning must be supported. Lowering the "implementation level"
to "object serialization" (hmm, I don't remember the exact name - better
double check) means that this class information is not checked. This
speeds things up, but it means that loading old archives
could be a problem. For MPI type applications this shouldn't
be a problem, so it should also be considered.

> I'm not quite sure whether this approach is fully supported by the
> boost::serialization interface and whether it could be broken by
> future versions. OTOH it has been working well with the last 6-7 Boost
> releases (1.41 is the last one I tested).

It seems to me that you've been using the system as I intended it
to be used.

I'm considering enhancement of the library to address situations
like this. This takes me quite a while for a number of reasons.

a) it's way too easy to make a change which ripples through
in such a way that it complicates the library beyond usability.
b) it's way too easy to make changes which break old archives.
c) it's very helpful to get feedback from real users such as yourself
with real problems, to test ideas for extensions as "thought experiments"
and see if such ideas really would help without violating the
considerations above.

By being "conservative" in this way the library has steadily improved
to be thread-safe, to handle dynamic loading/unloading of classes
and related serialization code, and to have more "concept checking"
to help detect misuse of the library. This is all due to getting
complaints about particular use cases. The improvements have
mostly been introduced without breaking old code or archives.

Robert Ramey

>
> HTH


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net