Boost logo

Boost :

Subject: Re: [boost] [Serialization] Improving performance
From: Robert Ramey (ramey_at_[hidden])
Date: 2009-01-20 15:19:53


David Abrahams wrote:
> on Tue Jan 20 2009, "Hartmut Kaiser" <hartmut.kaiser-AT-gmail.com>
> wrote:
>
>> Hi all,
>>
>> in one of my projects I have a lot of types (>1000) to be serialized
>> using a pointer to a single base class. At some point we found the
>> serialization/deserialzation time to be O(N*M), where N is the
>> number of types and M the number of classes in the derivation
>> hierarchy.
>>
>> Wondering why this is so significant I started digging and
>> measuring. I found the type information registry used for the
>> void_upcast() to be the culprit. It's a plain std::vector<const
>> void_caster *>
>> ($BOOST_ROOT_1_37/libs/serialization/src/void_cast.cpp:37) which is
>> searched sequentially a lot (once for each derived/base pair for
>> each serialization call). Moreover, this vector isn't even kept
>> sorted.
>>
>> it = std::find_if(
>> s.begin(),
>> s.end(),
>> void_cast_detail::match(& ca)
>> );
>>
>> ($BOOST_ROOT_1_37/libs/serialization/src/void_cast.cpp:180).
>> Changing this to be a std::set improves the picture significantly!
>>
>> What's the reasoning behind using a std::vector<> instead of a
>> std::set<> or a similar indexed structure?

LOL - the reason is that in my wildest imagination I never concieved that
one would ever have more than a few derived types, much less that the
serialization library would ever have to deal with something like this.

I'm sure it's not tooooo hard to address this if you want to post a fix I
would
be happy to look at it.

And while your at it. Perhaps you might want to take a look into the
directory
boost/libs/serialization/performance/...

This directory includes all the infrastructure to generate profiling tables
with
the gcc compiler. It works with bjam, simple shell script and maybe
library_test
in order to implement this. I've tested with cygwin under windows XP.
It includes the a set of performance tests which generate
profiles. The tests aren't very good - I just copied over a few from the
../test
directory to test the performance infrastructure. After getting this far I
got
distracted by other stuff so there it stands.

If anyone were to add a performance_test for void_cast and review what I've
got I would be most greatful.

If someone want's a reasonable size Summer of Code 09 project I would also
very much like to see this happen.

Robert Ramey

> I have a highly optimized component in Boost.Python that I *believe*
> is doing the same job. Perhaps we should factor it out and share?


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk