Subject: Re: [Boost-users] [serialization] Deserialize by pointer without calling new/constructor
From: averigin (averigin_at_[hidden])
Date: 2010-10-07 17:24:13


> It's not the most common scenario. It can be addressed, but requires
> some more complete understanding of the library. The library "just works"
> for the common cases.
>
> Think about why you have a base class pointer in your class which will
> "never change". That is, if you recover it without re-creating it, you're
> assuming that in the meantime the following statement has not been
> executed.
>
> my_base_ptr = some_derived_ptr;
>
> That is, you're writing code on an assumption which you can't enforce.
> If someday, in some other code module, far, far, away .. someone
> executes the above, you'll have some unrecoverable mess to deal with
> with no clear idea of what the source of the problem is.
>
> Sooo - my suggestion is to replace
>
> struct my_thing {
> Base *m_ptr;
> ...
> };
>
> with
>
> struct my_thing {
> Base &m_b;
> ...
> };
>
> This will "just work" as far as serialization is concerned.
>
> Now if making the above change in your program is hard to do, you should
> think about it a little bit more before you insert a booby trap into your
> program.
> Usually, this will entail some refactoring to get things back together
> again.
> But the result will be something more robust, less likely to contain bugs,
> easier to test and maintain.

Here's an attempt to further explain and (hopefully) clarify what's
happening and why. Four pieces of background to start, then I'll try to
tie it all together.

1. In OMNeT, there are two main components to a simulation: Simple
Modules, which have associated functionality (i.e., code), and Compound
Modules, which are containers for simple modules and have no associated
code. Note, for reference, the module types are termed cSimpleModule and
cCompoundModule, both derived from cModule. All simple modules are derived
from cSimpleModule and all compound modules are derived from
cCompoundModule.

2. In a simulation it's possible to look up another module (M_i) from the
active module (M_a) and invoke methods on M_i. The lookup functions return
a cModule pointer to M_i, which must then be dynamically cast to M_i's
most-derived type to invoke its methods. However, that cast requires
including M_i's header file in the code for M_a, which can result in ugly
dependency trees.
Thus, serializing through a looked-up cModule (base-class) pointer keeps
things 'nice'.

3. When modules are generated dynamically, they are created by a module
factory that configures some internal state information which is not
publicly accessible.
Also, note that the factory method returns a cModule pointer.

4. When I run simulations, serialization and deserialization never occur
during the same simulation; either the simulation starts fully configured
and eventually serializes its state immediately before terminating, or it
starts with a minimal skeleton and constructs the rest of the modules
based upon a previously stored state.

Now to put all the pieces together.

When saving a simulation state, it's necessary to serialize a number of
compound modules. So far, I've done this by adding a simple "archiver"
module to them. I look up the "archiver" module using the lookup functions
and try to serialize the resulting cModule pointer. This causes the
"archiver" module to save whatever information is publicly accessible
(i.e., there is some internal state information it cannot access).

When loading a saved state, the minimal skeleton includes one of the above
mentioned "archiver" modules. It creates the input archive and then tries
to create the other modules in the simulation.

This is where I run into problems.

What I wanted to do was have the "archiver" module create the other
modules using their associated factories and then deserialize the rest of
their state via the cModule pointers returned by the factories. This way
the factory configures the module's private internal state information,
everything else is loaded from the archive, and file dependencies (i.e.,
included header files for derived types) are minimized. (I realize
this isn't true (de)serialization, so I refer to it as
partial-serialization.)
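
To make the intended flow concrete, here is a minimal sketch of the
partial-serialization idea using only the standard library. It is not
OMNeT or Boost code: Module, QueueModule, createModule(), loadState() and
restoreModule() are hypothetical stand-ins for cModule, a derived simple
module, the module factory, and the loading step. The factory sets the
private state, and the remaining state is then read into the object that
already exists, through the base-class pointer only.

```cpp
#include <cassert>
#include <istream>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical stand-in for a framework base class such as cModule.
class Module {
public:
    virtual ~Module() = default;
    // Restore the publicly accessible state of an already constructed object.
    virtual void loadState(std::istream& in) = 0;
    bool configured() const { return configured_; }
private:
    // Only the factory may touch the internal state.
    friend std::unique_ptr<Module> createModule(const std::string& type);
    bool configured_ = false;
};

// Hypothetical derived module; callers of restoreModule() never name it.
class QueueModule : public Module {
public:
    void loadState(std::istream& in) override { in >> length_; }
    int length() const { return length_; }
private:
    int length_ = 0;
};

// Hypothetical factory: builds the object and configures its private state.
std::unique_ptr<Module> createModule(const std::string& type) {
    std::unique_ptr<Module> m;
    if (type == "queue") {
        m.reset(new QueueModule);
        m->configured_ = true;
    }
    return m;
}

// Read a type name, let the factory build the module, then load the rest
// of the state into that same object via the base-class interface.
std::unique_ptr<Module> restoreModule(std::istream& in) {
    std::string type;
    in >> type;
    std::unique_ptr<Module> m = createModule(type);
    assert(m != nullptr);  // sketch only: no handling of unknown types
    m->loadState(in);      // no second allocation, no constructor call
    return m;
}
```

With a std::istringstream holding "queue 7", restoreModule() returns a
factory-configured module whose length was read from the stream. The open
question in this thread is how to get the same effect when the loading
step is an archive's pointer deserialization rather than a hand-written
virtual function.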

However, this conflicts with the normal method for pointer
deserialization, which would allocate memory for a new module, initialize
it and then deserialize the information from the archive into that new
object, which (in my case) is then promptly deleted or leaked.

The only alternative I can see would be to dynamically cast the cModule
pointers and call their serialize() or save()/load() methods directly.
However, this seems like an ugly hack since it requires making the
serialization methods public and including derived-class headers all
over the place.

Thus, I would like to stick with my partial-serialization method. I'm just
not sure if it's possible to "easily" implement what I need for the
loading portion.
>
> Of course if you're working in an environment whose controlling principle
> is "just make it work right now", you're out of luck. You'll just have to
> bury the bomb in your program and hope you're not in the immediate
> vicinity when it blows up.
>

While I don't quite see it as a bomb, I know that probably just indicates
short-sightedness on my part. Regardless, I am in the "just make it work
right now" boat, so any help you can offer would be much appreciated.

Regards,
Adam Verigin

P.S.: I did manage a temporary hack of
pointer_iserializer::load_object_ptr() in archive/detail/iserializer.hpp
that (badly) assumed that if the passed pointer is not NULL, then it
points to an already allocated object, so there's no need to create a new
one. Coupled with a do-nothing load_construct_data() method, this worked
great for partial-deserialization, but it causes memory corruption for
normal deserialization since the assumption is not always true (nor is it
a good assumption to try to force on other users).
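
For reference, the logic of that hack can be sketched as follows. This is
not the actual Boost code, only an illustration of the assumption; Record
and loadThroughPointer() are hypothetical names.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical object with one factory-set field and one archived field.
struct Record {
    int value = 0;
    bool presetByFactory = false;
};

// Mimics the patched behaviour: a non-null pointer is assumed to refer to
// a live object and is loaded in place; a null pointer gets a new object.
// The assumption fails for a pointer that is non-null but uninitialized or
// dangling, in which case the read below writes through garbage.
void loadThroughPointer(std::istream& in, Record*& p) {
    if (p == nullptr) {
        p = new Record;  // normal path: allocate; the caller owns the result
    }
    in >> p->value;      // only the archived part of the state is replaced
    assert(!in.fail());  // sketch only: no real error handling
}
```

Loading through a pointer to an existing Record keeps its address and its
presetByFactory flag, while loading through a null pointer allocates as
usual. Nothing in the function can tell a valid non-null pointer from an
invalid one, which is where the memory corruption comes from.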

