Boost Users :

Subject: Re: [Boost-users] Serialization: BOOST_CLASS_EXPORT changes between 1.38 and 1.52
From: Robert Ramey (ramey_at_[hidden])
Date: 2013-02-07 11:36:35


Nathan Whitehorn wrote:

> The reason turns out to be changes in handling of the class GUID. This
> is associated to the pointer serializer by BOOST_CLASS_EXPORT -- but
> also now apparently by the first instantiated code [de]serializing the
> class through that archive directly based on the presence of a
> template specialization of boost::serialization::guid<T>() currently
> in scope, which is NULL by default. Once NULL is set as the GUID, it
> stays that way, regardless of a proper BOOST_CLASS_EXPORT later. This
> occurred in our case as a result of def_pickle() calls in
> boost.python bindings that used the same classes and archive type.
>
> This can be fixed by BOOST_CLASS_EXPORT_KEY in all appropriate header
> files. Adding it is a sizeable amount of work (hundreds of classes)
> but I'm more concerned the mechanism is very fragile in the case of
> templates and I'm not sure what to do there. For example, we have a
> generic serializable container. Any specialization of it needs to be
> registered with BOOST_CLASS_EXPORT, which is fine, but anyone using
> that specialization needs to also include a header file
> BOOST_CLASS_EXPORT_KEY() in it for that particular specialization as
> well. In other words, you can't just use container<Foo> and expect it
> to work reliably and, moreover, if you use it *once* without having
> seen the right in-scope macro, it will break serialization of that
> type globally if you happen to have linked against a library that
> made this mistake. Probably. Depending on initialization order.

It's been a while since I did this, but I'll try to respond to the best of
my recollection.

BOOST_CLASS_EXPORT in its original form created a lot of problems.
This is/was especially true in the case of DLLs, where instances of the
"guid record" were created every time a class was referred to. This
created a bunch of "dangling" guid records. The fix was to make clear
the distinction between declaring a key and instantiating the guid record.
This sounds simple and obvious as I explain it here, but in practice it
took a while to figure out exactly what to do. On top of that, there is
the issue of getting the right stuff instantiated, which required a bunch
of weird TMP to implement. Much of this code was contributed by our
own TMP guru - David Abrahams.
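
The distinction between declaring a key and instantiating the guid record
can be pictured with a toy model. This is only a sketch built on the standard
library, not Boost's actual implementation; the names export_key,
guid_record, registry and record_count are all made up for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for "declaring a key": a trait that only names the class.
// Specializing it creates no objects, so it can live in a header.
template <class T>
struct export_key {
    static const char* name() { return nullptr; }  // default: no key declared
};

// Toy stand-in for the global table of guid records. The function-local
// static is constructed on first use, which sidesteps initialization-order
// problems between translation units.
inline std::map<std::string, int>& registry() {
    static std::map<std::string, int> table;
    return table;
}

// Toy stand-in for "instantiating the guid record": constructing one of
// these adds an entry to the table. It belongs in exactly one .cpp file.
template <class T>
struct guid_record {
    guid_record() {
        const char* key = export_key<T>::name();
        assert(key != nullptr && "declare the key before instantiating the record");
        ++registry()[key];  // counts how many records exist for this key
    }
};

struct Foo {};

// "Header" half: declare the key. Cheap and safe to see in every file.
template <>
struct export_key<Foo> {
    static const char* name() { return "Foo"; }
};

// ".cpp" half: the single record, constructed before main() runs.
static guid_record<Foo> foo_record;

// Number of records registered under a key; 1 is the healthy state.
int record_count(const std::string& key) {
    auto it = registry().find(key);
    return it == registry().end() ? 0 : it->second;
}
```

Merely reading export_key<Foo>::name() creates no record in this model; only
the one guid_record<Foo> object does. The specialization also has to be
visible before the record is instantiated.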

The current situation is "more correct". So though I appreciate what a
pain it is to change all the BOOST_CLASS_EXPORT to
BOOST_CLASS_EXPORT_KEY - I think this is the best solution as
it will make your system better and less dependent on the "quirky"
behavior of BOOST_CLASS_EXPORT.

> So my questions:
> 1. Is it possible for things that would return GUIDs of NULL to try
> harder and look in a global registry instead of silently breaking
> things? This kind of global lookup was how 1.38 always worked and it
> seems considerably less fragile.

The old method did instantiation by default, so it worked then in
some cases where the current one won't. So it seems "less fragile".
But I think that's sort of an illusion. It does so at the cost of gratuitous
instantiations, which are often harmless - though non-optimal. But
the real problem is that it left this out of the hands of the programmer.
This could lead to silent and surprising behavior. Now we have the
situation where this behavior can't happen - we have to explicitly
plan for it. I believe that this leads to less surprising programs - albeit
at the cost of some surprising behavior at build time.

> 2. Is there a way to handle BOOST_CLASS_EXPORT_KEY() sanely in the
> case of templates without the risk of silent serialization failures
> -- in all instances of that class -- that depend on global
> initialization order?

I believe that the best way to do this is to just do an explicit
instantiation in a cpp file which includes the header containing
BOOST_CLASS_EXPORT_KEY() and invokes BOOST_CLASS_EXPORT_IMPLEMENT().
Once compiled, this can be added to a library or DLL. This will result in
one and only one instance of the class serialization existing in the
program rather than multiple ones (in the case of DLLs). Less code and,
better yet, this eliminates the possibility that the mainline module and
the DLL have different versions of the code, which would be agony of the
worst type to track down.

> 3. Is it possible to change the GUID set in the extended type info
> object of a pointer_[i/o]serializer at runtime after the class has
> been added to the export registry?

I have never considered this. I don't see what this would be used for.
The singleton class table is never modified after it is constructed
(before main is called). This is necessary for the serialization library
to be thread-safe.

> 4. Are there any suggested mechanisms for local hacks, given that we
> control the archive implementation, to implement 1-3 without changes
> to boost?

To re-summarize my suggestion above:

a) change all the headers to use BOOST_CLASS_EXPORT_KEY()
b) make a small *.cpp file for each header which includes the header
and invokes BOOST_CLASS_EXPORT_IMPLEMENT().
c) add your small *.cpp file to your library - either static library or DLL.
d) while you're at it, you might want to consider adding the serialize,
save, and load functions for the class to the *.cpp file and not making
them inline. This will eliminate any code bloat generated by the
serialization library. If your DLLs are dynamically loadable, they will
only occupy memory when the classes they refer to are actually being
used at runtime.
(just don't load/unload the DLLs while multi-threading - use a mutex!)

It seems you've touched upon the issue regarding serialization of
template classes. This was also touched upon in a previous email.
Currently we have to explicitly instantiate any templates we want
to serialize. Automatic instantiation of template-generated classes
using some combination of enable_if, partial specialization and who
knows what else is interesting to consider, but likely much trickier
than first meets the eye. Also, our "guid" is a string which can only be
processed at runtime. Replacing this with a "guid" generated at
compile time from the class name might make some things possible
which weren't before. This is sort of irrelevant to your current
situation, but I like to keep the pot boiling.
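
To make the compile-time "guid" idea concrete, here is one possible shape
it could take. This is purely an illustration and not anything Boost
provides: static_guid, guid_of and DECLARE_STATIC_GUID are invented names,
and the FNV-1a hash is just one convenient choice. A 32-bit hash can
collide, so a real design would need a collision check. Requires C++14.

```cpp
#include <cstdint>

// FNV-1a, 32-bit: a simple, well-known string hash that can be evaluated
// at compile time.
constexpr std::uint32_t fnv1a(const char* s) {
    std::uint32_t h = 2166136261u;              // FNV offset basis
    while (*s != '\0') {
        h ^= static_cast<unsigned char>(*s++);  // mix in the next byte
        h *= 16777619u;                         // FNV prime
    }
    return h;
}

// Hypothetical key trait. It is left undefined on purpose, so using a type
// that was never registered is a compile error rather than a silent NULL.
template <class T>
struct static_guid;

// Hypothetical registration macro: derives the key from the class name.
#define DECLARE_STATIC_GUID(T)                               \
    template <>                                              \
    struct static_guid<T> {                                  \
        static constexpr std::uint32_t value = fnv1a(#T);    \
    };

// Convenience accessor that returns the key by value.
template <class T>
constexpr std::uint32_t guid_of() {
    return static_guid<T>::value;
}

struct Foo {};
struct Bar {};
DECLARE_STATIC_GUID(Foo)
DECLARE_STATIC_GUID(Bar)

// Both checks are evaluated by the compiler, not at runtime.
static_assert(guid_of<Foo>() != guid_of<Bar>(), "keys must differ");
static_assert(guid_of<Foo>() == fnv1a("Foo"), "key comes from the class name");
```

Because the key is a constant expression, it can be used in static_assert,
as a template argument, or in a switch, none of which is possible with a
runtime string.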

Robert Ramey

> -Nathan

