Subject: [boost] [Serialization] Improving performance
From: Hartmut Kaiser (hartmut.kaiser_at_[hidden])
Date: 2009-01-20 14:05:08


Hi all,

in one of my projects I have a lot of types (>1000) to be serialized using a
pointer to a single base class. At some point we found the
serialization/deserialization time to be O(N*M), where N is the number of
types and M the number of classes in the derivation hierarchy.

Wondering why this is so significant, I started digging and measuring. I
found the type information registry used for the void_upcast() to be the
culprit. It's a plain std::vector<const void_caster *>
($BOOST_ROOT_1_37/libs/serialization/src/void_cast.cpp:37) which is searched
sequentially a lot (once for each derived/base pair for each serialization
call). Moreover, this vector isn't even kept sorted.

    it = std::find_if(
        s.begin(),
        s.end(),
        void_cast_detail::match(& ca)
    );

($BOOST_ROOT_1_37/libs/serialization/src/void_cast.cpp:180). Changing this
to be a std::set improves the picture significantly!
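
To illustrate the difference, here is a minimal sketch. It is not the actual
void_cast.cpp code: the `caster` struct, the `caster_less` ordering and both
lookup functions are hypothetical stand-ins, assuming a registry entry can be
identified by a (derived, base) pair of integer type ids.

```cpp
#include <algorithm>
#include <set>
#include <vector>

// Hypothetical stand-in for a registry entry: one derived/base pair.
struct caster {
    int derived;  // assumed integer id of the derived type
    int base;     // assumed integer id of the base type
};

// Strict weak ordering on (derived, base), so entries can live in a std::set.
struct caster_less {
    bool operator()(const caster* a, const caster* b) const {
        if (a->derived != b->derived) return a->derived < b->derived;
        return a->base < b->base;
    }
};

// Sequential search over an unsorted vector: O(N) comparisons per lookup.
inline const caster* find_linear(const std::vector<const caster*>& registry,
                                 const caster& key) {
    auto it = std::find_if(registry.begin(), registry.end(),
                           [&key](const caster* c) {
                               return c->derived == key.derived &&
                                      c->base == key.base;
                           });
    return it == registry.end() ? nullptr : *it;
}

// Ordered lookup in a std::set: O(log N) comparisons per lookup.
inline const caster* find_indexed(
        const std::set<const caster*, caster_less>& registry,
        const caster& key) {
    auto it = registry.find(&key);
    return it == registry.end() ? nullptr : *it;
}
```

Both functions return the same entry for the same key; only the cost per
lookup differs, which matters once the registry holds more than a thousand
entries and is queried once per derived/base pair.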

What's the reasoning behind using a std::vector<> instead of a std::set<> or
a similar indexed structure?

Regards, Hartmut

