From: Christoph Ludwig (cludwig_at_[hidden])
Date: 2004-11-12 04:45:27


On Thu, Nov 11, 2004 at 10:35:45PM -0800, Robert Ramey wrote:
[...]
> User program which use serialization library generally compile quite fast.
> No one has reported that serialization adds a disproportionate amount to
> compile time. I did get one reporte from one user with 80 classes that he
> was starting to get ICEs from VC 7.1 . But that's it.
>
> I believe that the size of library has lead some to have reservations
> regarding performance of the serialization library - compile time, link
> time, memory size, or execution speed - or its ease of usage. I think that
> such reservations are unfounded. No one who has actually used the library
> has voiced such reservations to me.

Here is some user experience:

"Fast" is not exactly the word I'd choose. I use Boost.Serialization in
a library that is templated on arbitrary precision integer and
floating point types. I think the library qualifies as medium
sized. (I don't have the LOCs, but the source code is about 6 MByte.)

Since most classes are serialized through (shared) pointers to base
types, I need to register my specializations with
BOOST_CLASS_EXPORT. Earlier versions of the library made the
compiler allocate up to 800 MByte of RAM, which caused my system to
thrash. After some refactoring of the code (and the installation of
additional memory) I have arrived at the following situation, which I
can live with:

For each (integer, fpa) pair I have a separate
translation unit that does nothing but register the respective
specializations. Note that the class templates are explicitly
specialized in extra TUs; here I only register the classes (which also
means instantiation of the respective serialize member templates).
Each of those translation units takes about 90 seconds to compile,
and the compiler (gcc 3.4.2) occupies up to 400 MByte of RAM. This is
on a notebook with a 1.7 GHz Pentium 4 mobile CPU and 1 GByte of RAM.
I am sure this is mostly due to suboptimal handling of template
instantiation in the compiler.
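
To illustrate why a translation unit that "only registers" a class still
instantiates its serialize member template, here is a minimal Boost-free
sketch. It is not how Boost.Serialization is implemented; Lattice, BigInt,
BigFloat, TextOut, Registrar and save_by_name are hypothetical names made
up for this example. The point is that storing a pointer to a
type-erased save function forces the compiler to instantiate serialize
for that specialization in the registering TU. (With the real library
the registration line would use BOOST_CLASS_EXPORT; a typedef is commonly
needed there as well, because a comma in the template argument list
would otherwise split the macro argument.)

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-ins for the arbitrary precision integer and fpa types.
struct BigInt   { long v; };
struct BigFloat { double v; };

// Hypothetical class template with a serialize member template.
template <class Int, class Fpa>
struct Lattice {
    Int n;
    Fpa x;
    template <class Archive>
    void serialize(Archive& ar, unsigned /*version*/) {
        ar & n.v;
        ar & x.v;
    }
};

// Minimal output "archive": writes each value followed by a space.
struct TextOut {
    std::ostringstream os;
    template <class T>
    TextOut& operator&(const T& t) { os << t << ' '; return *this; }
};

// Registry mapping a class name to a type-erased save function.
using SaveFn = void (*)(TextOut&, void*);
inline std::map<std::string, SaveFn>& registry() {
    static std::map<std::string, SaveFn> r;  // constructed on first use
    return r;
}

// Calls T::serialize through a void*; taking the address of this
// function template is what forces instantiation of T::serialize.
template <class T>
void save_erased(TextOut& ar, void* p) {
    static_cast<T*>(p)->serialize(ar, 0u);
}

// Static object whose constructor performs the registration.
template <class T>
struct Registrar {
    explicit Registrar(const char* key) {
        assert(key != nullptr);
        registry()[key] = &save_erased<T>;
    }
};

// What one registration TU for a single (integer, fpa) pair boils down to.
typedef Lattice<BigInt, BigFloat> LatticeBigIntBigFloat;
static Registrar<LatticeBigIntBigFloat> reg_lattice("LatticeBigIntBigFloat");

// Saves an object knowing only its registered name.
std::string save_by_name(const std::string& key, void* obj) {
    TextOut ar;
    registry().at(key)(ar, obj);
    return ar.os.str();
}
```

Every additional (integer, fpa) pair adds another set of such
instantiations, so in this sketch the cost of a registration TU grows
with the number of registered specializations and with the number of
archive types they are instantiated for.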

This is certainly no problem if you do not need to register your
classes. And I guess that the serialization code of non-template
classes will compile reasonably fast. But if I were to develop an even
larger template library than my current project, I would hesitate
to use Boost.Serialization unless I were convinced that my compiler
performs well as the number of template instantiations grows.

Besides the problems with the resource usage of the compiler there's
another point that should IMHO be addressed before the next release:
The current solution for serializing shared pointers is quite fragile.

For example, I abandoned my attempts to access my library from Python
through Boost.Python when the requirements of the two libraries
conflicted: boost/serialization/shared_ptr.hpp has to be included
before boost/shared_ptr.hpp, and boost/serialization/shared_ptr.hpp
somehow (via config.hpp?) pulls in system headers. On the other hand,
the Python headers need to be included before any system headers. But
the Boost.Python headers include boost/shared_ptr.hpp... (I think there
is an inclusion sequence of system and Boost headers that satisfies the
requirements of all libraries involved, but I did not have the time to
look into it back then.)

I don't recall the exact reasons, but I also had to manually register
the specializations of boost::detail::sp_counted_base_impl. That's an
implementation detail of shared_ptr that I (as a mere user) don't want
to know about.

Christoph

-- 
http://www.informatik.tu-darmstadt.de/TI/Mitarbeiter/cludwig.html
LiDIA: http://www.informatik.tu-darmstadt.de/TI/LiDIA/Welcome.html
