Subject: [Boost-users] [Serialization] Segfault while serializing derived pointers using multi DLLs
From: François Mauger (mauger_at_[hidden])
Date: 2011-03-17 09:29:10


Hi all (and particularly Boost/(De-)Serializer),

I use Boost 1.44, gcc 4.4.1, Linux.

The problem:

I have two home-made libraries compiled as shared libraries (DLLs) under Linux:
- 'datatools' provides the 'libdatatools.so' DLL
- 'brio' provides the 'libbrio.so' DLL
I use Boost serialization features for derived pointers.

Below are the details:

STEP 1:

'datatools' is the base library. It defines:
- its own namespace: 'datatools'
- an abstract class (interface) named 'i_serializable'
  from which all other serializable classes should inherit
  in order to benefit from the (de)serialization mechanism through
  pointers to this base class
  [see http://www.boost.org/doc/libs/1_46_1/libs/serialization/doc/serialization.html#derivedpointers]
- some concrete classes (A, B, C) that inherit from the 'i_serializable'
  interface and register themselves using
  the export key features described in
  http://www.boost.org/doc/libs/1_46_1/libs/serialization/doc/special.html#export

A typical inheritance diagram looks like:
<pre>
                datatools::i_serializable
                            |
             +--------------+--------------+
             |              |              |
       datatools::A   datatools::B   datatools::C
             |
       datatools::A'
</pre>

Here is the typical model of the 'A.hpp' header file for the A class:
<pre>
...
#include <datatools/serialization/archives_list.hpp>  // Boost/Serialization text/XML/binary archives
#include <datatools/serialization/i_serializable.hpp> // the abstract mother interface class

...

namespace datatools {

  class A: public i_serializable
  {
    blah-blah..

    // no inline code (from
    // http://www.boost.org/doc/libs/1_46_1/libs/serialization/doc/special.html#dlls)
    template<class Archive>
    void serialize (Archive & ar,
                    const unsigned int version);
  };
}

// register the class with a specific GUID:
BOOST_CLASS_EXPORT_KEY2 (datatools::A, "datatools::A");
</pre>
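
To make the idea of registering a class under a string GUID more concrete, here is a conceptual sketch in plain C++. It is not how Boost.Serialization implements its export mechanism; the names `factories`, `register_class`, `create` and `make_a` are hypothetical and exist only for this illustration. It shows only the general principle: a string key is associated with a concrete type so that an object of that type can be recreated later through a pointer to the base class.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Simplified stand-in for the abstract base class (hypothetical).
struct i_serializable {
  virtual ~i_serializable() {}
  virtual std::string guid() const = 0;
};

typedef std::function<std::unique_ptr<i_serializable>()> factory;

// Table mapping a GUID string to a factory. A function-local static is
// used so the table exists before any registration runs.
std::map<std::string, factory>& factories() {
  static std::map<std::string, factory> table;
  return table;
}

// Returns false if the GUID is already taken.
bool register_class(const std::string& guid, factory f) {
  return factories().emplace(guid, std::move(f)).second;
}

// Recreates an object from its GUID, or returns an empty pointer
// when the GUID is unknown.
std::unique_ptr<i_serializable> create(const std::string& guid) {
  std::map<std::string, factory>::iterator it = factories().find(guid);
  if (it == factories().end()) return nullptr;
  return it->second();
}

struct A : i_serializable {
  std::string guid() const override { return "datatools::A"; }
};

std::unique_ptr<i_serializable> make_a() {
  return std::unique_ptr<i_serializable>(new A);
}

// Registration happens during static initialization, which is loosely
// analogous to what an export macro arranges.
const bool a_registered = register_class("datatools::A", make_a);
```

With this in place, `create("datatools::A")` returns a base-class pointer to a fresh `A`, while an unregistered key such as "brio::D" yields an empty pointer.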

Here is the model of the 'A.cpp' implementation file:
<pre>
namespace datatools {
...
template<class Archive>
void A::serialize (Archive & ar,
                   const unsigned int version)
{
  ar & boost::serialization::make_nvp(
         "datatools__serialization__i_serializable",
         boost::serialization::base_object<datatools::serialization::i_serializable>(*this)
       );
  ar & more data (with NVP stuff)...;
}

} // end of namespace datatools

BOOST_CLASS_EXPORT_IMPLEMENT(datatools::A)

// explicit instantiation for all kinds of known archives:
#include <datatools/serialization/archives_list.hpp> // the known text/XML/binary archives
template void datatools::A::serialize(boost::archive::text_oarchive & ar, const unsigned int version);
template void datatools::A::serialize(boost::archive::text_iarchive & ar, const unsigned int version);
template void datatools::A::serialize(boost::archive::xml_oarchive & ar, const unsigned int version);
... more...
</pre>
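
The "declare in the header, define and explicitly instantiate in the implementation file" pattern used above can be tried in isolation. The following single-file sketch assumes two hypothetical archive-like types, `text_out` and `text_in`, which are NOT Boost classes; they only mimic the `ar & x` syntax so that the example compiles with the standard library alone.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-ins for output/input archives (not Boost types).
struct text_out {
  std::ostringstream os;
  template <class T>
  text_out& operator&(const T& v) { os << v << ' '; return *this; }
};

struct text_in {
  std::istringstream is;
  explicit text_in(const std::string& s) : is(s) {}
  template <class T>
  text_in& operator&(T& v) { is >> v; return *this; }
};

// --- what would live in the header: declaration only, no inline body ---
class A {
public:
  int id = 0;
  double value = 0.0;

  template <class Archive>
  void serialize(Archive& ar, const unsigned int version);
};

// --- what would live in the .cpp file: the definition ... ---
template <class Archive>
void A::serialize(Archive& ar, const unsigned int /*version*/) {
  ar & id;
  ar & value;
}

// --- ... followed by one explicit instantiation per archive type ---
template void A::serialize(text_out&, const unsigned int);
template void A::serialize(text_in&, const unsigned int);

// Small helpers that exercise both instantiations.
std::string save(A& a) {
  text_out ar;
  a.serialize(ar, 0);
  return ar.os.str();
}

A load(const std::string& s) {
  text_in ar(s);
  A a;
  a.serialize(ar, 0);
  assert(!ar.is.fail());  // input must contain both fields
  return a;
}
```

In a real multi-file build, code that uses an archive type with no explicit instantiation in the implementation file would fail at link time with an undefined symbol, which is why every known archive has to be listed.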

Finally, I can compile all this with gcc and build the
'libdatatools.so' library, whose directory I prepend to my
LD_LIBRARY_PATH. Everything looks fine.

A test program 'prg1.cpp' that links only against 'libdatatools.so'
and 'libboost_serialization.so' works perfectly, serializing and
deserializing any collection of pointers to A, B, or C classes without
problem. A must! Thanks to Robert for that magic!

At this point, everything looks (is?) fine. Note that I have followed
(in principle) all the guidelines provided by Robert.

STEP 2:

Now let's consider the actual problem! As said before, my
'datatools' library is the base of some modular project with some
other libraries that depend on 'datatools' (and Boost/Serialization).

The 'brio' library is such a beast:

<pre>
Boost/Serialization
|
datatools
|
brio
</pre>

It has its own namespace, 'brio', and provides a few other dedicated
classes that inherit from the 'datatools::i_serializable' abstract
class and are serializable via Boost.

Let's consider the serializable 'brio::D' class, designed on the model
of 'datatools::A' and using the same implementation recommendations.
I have followed the guidelines used for the 'datatools::A' class to
write both the 'D.hpp' and 'D.cpp' files.

Now the inheritance scheme is:

<pre>
                datatools::i_serializable               :
                            |                           :
             +--------------+--------------+------------+-----+- - - -
             |              |              |            :     |
       datatools::A   datatools::B   datatools::C       :  brio::D
             |                                          :
       datatools::A'                                    :
                 libdatatools.so scope                  : libbrio.so scope
                                                        :
</pre>

I can compile the 'libbrio.so' DLL without any problem.

Now I want to run a sample program 'prg2.cpp' that performs some
(de)serialization operations on a collection of pointers to
'datatools::A', 'datatools::B', 'datatools::C' AND 'brio::D'
instances. This program is linked against the following libraries
(among others):
- libbrio.so
- libdatatools.so
- libboost_serialization.so

Well, it compiles perfectly. Note that this program links against
third-party libraries too, some of which explicitly use
'dlopen' and 'dlclose' to provide internal and critical features that
are out of my scope. I have no idea whether this can have side effects.

However, when I run it, I observe the following behaviour:
- all (de)serialization operations are done properly
  and I get files with embedded (text/XML...) portable archives that can
  be reloaded without problem.
- at the END of the program, while some cleanup code is invoked (some
  kind of deeply buried code beyond my skills and understanding), I get
  a segmentation fault.

Here is a dump of the GDB backtrace:

<pre>
Program received signal SIGSEGV, Segmentation fault.
0x02647c78 in boost::serialization::typeid_system::extended_type_info_typeid_0::is_less_than(boost::serialization::extended_type_info const&) const ()
   from /scratch/sw/boost/install-1_44_0-Linux-i686-gcc44/lib/libboost_serialization.so.1.44.0
(gdb) bt
#0 0x02647c78 in boost::serialization::typeid_system::extended_type_info_typeid_0::is_less_than(boost::serialization::extended_type_info const&) const ()
   from /scratch/sw/boost/install-1_44_0-Linux-i686-gcc44/lib/libboost_serialization.so.1.44.0
#1 0x0264737b in boost::serialization::extended_type_info::operator<(boost::serialization::extended_type_info const&) const ()
   from /scratch/sw/boost/install-1_44_0-Linux-i686-gcc44/lib/libboost_serialization.so.1.44.0
#2 0x0264dcac in boost::serialization::void_cast_detail::void_caster::operator<(boost::serialization::void_cast_detail::void_caster const&) const ()
   from /scratch/sw/boost/install-1_44_0-Linux-i686-gcc44/lib/libboost_serialization.so.1.44.0
#3 0x0264e56d in boost::serialization::void_cast_detail::void_caster::recursive_unregister() const ()
   from /scratch/sw/boost/install-1_44_0-Linux-i686-gcc44/lib/libboost_serialization.so.1.44.0
#4 0x0264ed8d in boost::serialization::void_cast_detail::void_caster_shortcut::~void_caster_shortcut() ()
   from /scratch/sw/boost/install-1_44_0-Linux-i686-gcc44/lib/libboost_serialization.so.1.44.0
#5 0x0264e5ee in boost::serialization::void_cast_detail::void_caster::recursive_unregister() const ()
   from /scratch/sw/boost/install-1_44_0-Linux-i686-gcc44/lib/libboost_serialization.so.1.44.0
#6 0x024a6da7 in boost::serialization::void_cast_detail::void_caster_primitive<datatools::test::more_data_t, datatools::test::data_t>::~void_caster_primitive() ()
   from /home/mauger/Private/Work/lpc_nemo_svn/sw/datatools/datatools_trunk/Linux-i686/lib/libdatatools.so
#7 0x024a6f04 in boost::serialization::detail::singleton_wrapper<boost::serialization::void_cast_detail::void_caster_primitive<datatools::test::more_data_t, datatools::test::data_t> >::~singleton_wrapper() ()
   from /home/mauger/Private/Work/lpc_nemo_svn/sw/datatools/datatools_trunk/Linux-i686/lib/libdatatools.so
#8 0x0298c428 in __cxa_finalize (d=0x2609830) at cxa_finalize.c:56
#9 0x02426f04 in __do_global_dtors_aux ()
   from /home/mauger/Private/Work/lpc_nemo_svn/sw/datatools/datatools_trunk/Linux-i686/lib/libdatatools.so
#10 0x02534100 in _fini ()
   from /home/mauger/Private/Work/lpc_nemo_svn/sw/datatools/datatools_trunk/Linux-i686/lib/libdatatools.so
#11 0x0011dee6 in _dl_fini () at dl-fini.c:248
#12 0x0298c05f in __run_exit_handlers (status=0, listp=0x2a9e304, run_list_atexit=true) at exit.c:78
#13 0x0298c0cf in *__GI_exit (status=0) at exit.c:100
#14 0x02973b5e in __libc_start_main (main=0x8059495 <main>, argc=1, ubp_av=0xbfffcfd4, init=0x8061c40 <__libc_csu_init>, fini=0x8061c30 <__libc_csu_fini>, rtld_fini=0x11dcc0 <_dl_fini>, stack_end=0xbfffcfcc) at libc-start.c:252
#15 0x080592c1 in _start () at ../sysdeps/i386/elf/start.S:119
</pre>

If one ignores the nasty details of this stack (local paths and
names), one observes that the problem seems to be related to the
unregistration of some Boost/Serialization material. It occurs while
the executable is trying to destroy a singleton_wrapper instance
holding the void_caster_primitive that links two serializable classes
from the 'datatools' library:
- class 'datatools::test::more_data_t' (call it A')
- and its mother class 'datatools::test::data_t' (call it A), inherited
  from 'datatools::i_serializable'.

I expect such a singleton is a static instance living in some DLL. Am
I wrong? If not, which DLL is concerned: 'libdatatools.so' or
'libbrio.so'? My feeling is that I have a problem with some
arbitrary order of library unloading and the messy unregistration that
comes with it. Unless there is a specific order in which to aggregate
the object modules within a DLL (A.o B.o A'.o...). Unfortunately, my
skills are too limited to form a better idea and find a solution.
There are some comments by Robert concerning such possible problems,
but I'm not sure they apply in my case.
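
To make the suspected hazard concrete, here is a minimal sketch in plain C++ of an object that unregisters itself from a shared registry on destruction, and of what happens when the registry is destroyed first. This is only an illustration of the general teardown-order problem, not a reproduction of the Boost internals seen in the backtrace. The `registry` and `registrant` types are hypothetical, and the `std::weak_ptr` guard is a technique chosen for this sketch, not something taken from Boost.

```cpp
#include <cstddef>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Hypothetical registry of type names, standing in for a library-wide table.
struct registry {
  std::set<std::string> names;
};

// Hypothetical registrant: inserts its name on construction and erases it
// on destruction, but only if the registry still exists.
class registrant {
public:
  registrant(const std::shared_ptr<registry>& reg, std::string name)
      : reg_(reg), name_(std::move(name)) {
    reg->names.insert(name_);
  }

  ~registrant() {
    // lock() yields an empty pointer once the registry has been destroyed,
    // so a dead registry is never touched.
    if (std::shared_ptr<registry> alive = reg_.lock()) {
      alive->names.erase(name_);
    }
  }

  registrant(const registrant&) = delete;
  registrant& operator=(const registrant&) = delete;

  bool registry_alive() const { return !reg_.expired(); }

private:
  std::weak_ptr<registry> reg_;
  std::string name_;
};

// Favourable order: registrants die before the registry, so every name
// is erased and the registry ends up empty.
std::size_t names_after_normal_teardown() {
  std::shared_ptr<registry> reg = std::make_shared<registry>();
  {
    registrant a(reg, "datatools::A");
    registrant d(reg, "brio::D");
  }
  return reg->names.size();
}
```

If `registrant` kept a raw pointer or reference to the registry instead, destroying the registry first would leave the destructor operating on a dead object, which is undefined behaviour and may show up as a crash at program exit.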

So I would really appreciate it if someone could advise me and possibly
give me some hints.

Thanks a lot for your attention and help.
Apologies for this rather long and technical message.

Regards

frc

-- 
François Mauger 
  Groupe "Interactions Fondamentales et Nature du Neutrino"
  NEMO-3/SuperNEMO Collaboration
LPC Caen-CNRS/IN2P3-UCBN-ENSICAEN
Département de Physique -- Université de Caen Basse-Normandie
Adresse/address:
  Laboratoire de Physique Corpusculaire de Caen (UMR 6534)
  ENSICAEN 
  6, Boulevard du Marechal Juin
  14050 CAEN Cedex
  FRANCE
Courriel/e-mail: mauger_at_[hidden] 
Tél./phone:      02 31 45 25 12 / (+33) 2 31 45 25 12
Fax:             02 31 45 25 49 / (+33) 2 31 45 25 49
