Subject: Re: [Boost-users] crash in boost serialization (1.44)
From: Guy Prémont (guy.premont_at_[hidden])
Date: 2010-12-10 12:05:40


>
> Jeff Flinn wrote:
> > Guy Prémont wrote:
> >> I had the same problem. I could pinpoint it to the fact that when
> >> serialization is in a DLL, the i/oserializer and
> >> pointer_i/oserializer are created for each DLL that uses
> >> serialization of a class (and in the main exe if there is code there
> >> that uses serialization for this class). The singletons are always
> >> dll exported, never imported. To make a long story short, in the
> >> serialization libs, the method
> >> basic_iarchive_impl::register_type(const basic_iserializer & bis) is
> >> called several times and may overwrite the bpis_ptr that was set
> >> earlier. To fix this, I had to modify basic_iarchive.cpp and
> >> rebuild boost.serialization.
> >> I changed the line:
> >> coid.bpis_ptr = bis.get_bpis_ptr();
> >> to
> >> if (coid.bpis_ptr == 0) coid.bpis_ptr = bis.get_bpis_ptr();
> >>
> >> But, there are a few issues with serialization methods when they are
> >> implemented inside a dll and called from outside.
> >>
> >> Guy
> >>
> >> --
> >> Guy Prémont, D.Sc.
> >> Architecte logiciel senior / Senior software architect CM Labs
> >> Simulations Inc. http://www.cm-labs.com/ Tel. 514-287-1166 ext. 237
> >>> -----Original Message-----
> >>> From: boost-users-bounces_at_[hidden] [mailto:boost-users-
> >>> bounces_at_[hidden]] On Behalf Of rico.cadetg_at_[hidden]
> >>> Sent: Wednesday, December 08, 2010 11:20 AM
> >>> To: boost-users_at_[hidden]
> >>> Subject: [Boost-users] crash in boost serialization (1.44)
> >>>
> >>> Hello,
> >>>
> >>> I've noticed a crash (accessing a null pointer) inside the boost
> >>> serialization (version 1.44). I use visual studio 2005, the code to
> >>> serialize is in a Dll. I have a small visual studio solution that
> >>> reproduces the crash, I guess it is a bug and should cause a crash
> >>> in other environments as well.
> >>>
> >>> I have the following classes
> >>>
> >>> INode (Interface)
> >>> Node (Abstract, derived from INode)
> >>> LeafNode(derived from Node)
> >>> TableNode(derived from Node)
> >>>
> >>> The TableNode has a list of INode*, the children. A TableNode itself
> >>> can be a child of a TableNode.
> >>>
> >>> All classes are inside a Dll and correctly exported (I have no
> >>> unregistered class exceptions).
> >>>
> >>> If I create a TableNode with some entries including another
> >>> TableNode and serialize it as an object, I get the crash when trying
> >>> to deserialize later. If I do the same, but serialize the parent
> >>> TableNode as a pointer, it works.
> >>>
> >>> I guess the serialization framework does not recognize that a
> >>> TableNode gets serialized through a pointer and therefore the object
> >>> tracking does not work correctly.
> >>>
> >>> I can send the example solution if needed.
> >>>
> >>> The crash is in the basic_iarchive.cpp, line 456 (bpis_ptr is null):
> >>> if(! tracking){
> >>>     bpis_ptr->load_object_ptr(ar, t, co.file_version);
> >>> }
> >
> > Have you passed this on to Robert? Any downside to doing this?
>
> First of all, I'm impressed with your understanding of the subtleties
> of the library implementation. This is not easy to achieve.
>
> I noticed this message and in fact marked it as important (an
> exceedingly rare occurrence for me). I wanted to think about it some.
> My current thought is that this is a bad idea. The problem occurs
> when the same code is found in more than one execution module (dll or
> main). I feel that, besides being wasteful, there is the potential
> that the code gets out of sync - e.g. when a newer version of the
> program invokes an older version of the DLL. There is code in the
> library which traps this condition, but I had to comment it out because
> avoiding this condition required more re-organization of user code than
> users could handle. So now we have the case where one includes a
> potentially very difficult to find error in one's program in order to
> avoid organizing one's code so that the situation can't happen. I'm
> sure that this can occur in other scenarios which use template code in
> multiple execution modules, but it seems that it's come up more
> frequently with the serialization library.
>
> The problem with the above fix is that it leaves the source of the
> problem in the code, with the potential that it could arise somewhere
> else in a way that neither I nor, I think, anyone else can foresee. So
> I still recommend that users organize their code so that this situation
> cannot occur.
>
> However, whenever I do this, someone will always say "What's wrong with
> doing it this way? What could go wrong?" The answer is "I don't know -
> and will never know". The answer to the answer is "well, then it must
> be OK". Which doesn't follow as far as I'm concerned.
>
> So, what I plan to do is to re-enable the trap which detects violation
> of the ODR, and permit an explicit override on the part of the user.
> The idea is that I'll be able to avoid being responsible for
> practices that I can't recommend.
>
> Robert Ramey
>

In my opinion, the actual reason for the problem is the duplication (in
DLLs) of i/oserializer and pointer_i/oserializer for the same class. Those
should be exported and instantiated only in the DLL that contains the actual
serialization code. In the current implementation, the various serializers
are instantiated, through a singleton, at the point of use. If two classes
from two different DLLs have a member of a certain type, both DLLs will
instantiate singletons for the serializers of that type.
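
To make the overwrite concrete, here is a minimal single-file model of it.
None of these types are the real Boost.Serialization classes: PointerLoader,
Serializer, ClassEntry and Registry are hypothetical stand-ins, and I am
assuming that the second module's copy of the serializer has no pointer
serializer attached (a null bpis), which is one way the earlier value could
be lost. The guarded branch mirrors the one-line change quoted above.

```cpp
#include <cassert>
#include <map>
#include <string>

// All names below are hypothetical stand-ins, not the Boost.Serialization API.
struct PointerLoader {};                 // models a pointer_iserializer

struct Serializer {                      // models one module's iserializer singleton
    std::string class_name;
    const PointerLoader* loader;         // assumed null when that module never loads T by pointer
    const PointerLoader* get_bpis_ptr() const { return loader; }
};

struct ClassEntry {                      // models the per-class record ("coid")
    const PointerLoader* bpis_ptr = nullptr;
};

class Registry {                         // models the archive's class table
public:
    explicit Registry(bool guarded) : guarded_(guarded) {}

    void register_type(const Serializer& bis) {
        assert(!bis.class_name.empty());
        ClassEntry& coid = table_[bis.class_name];
        // Unguarded: every call overwrites. Guarded: only the first non-null value sticks.
        if (!guarded_ || coid.bpis_ptr == nullptr) {
            coid.bpis_ptr = bis.get_bpis_ptr();
        }
    }

    bool can_load_through_pointer(const std::string& class_name) const {
        std::map<std::string, ClassEntry>::const_iterator it = table_.find(class_name);
        return it != table_.end() && it->second.bpis_ptr != nullptr;
    }

private:
    bool guarded_;
    std::map<std::string, ClassEntry> table_;
};

// Registers the same class from two "modules" and reports whether the
// pointer loader set by the first registration is still there afterwards.
bool pointer_loader_survives(bool guarded) {
    static const PointerLoader loader_in_dll_a{};
    const Serializer from_dll_a{"TableNode", &loader_in_dll_a};
    const Serializer from_dll_b{"TableNode", nullptr};   // second copy of the singleton

    Registry registry(guarded);
    registry.register_type(from_dll_a);
    registry.register_type(from_dll_b);
    return registry.can_load_through_pointer("TableNode");
}
```

In this model pointer_loader_survives(false) loses the pointer and
pointer_loader_survives(true) keeps it. It says nothing about whether the
guard is safe in the real library, which is exactly the point under
discussion.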

When you say that the same code is found in more than one execution module,
you are not talking about the T::serialize(Archive&, int) for each class, are
you? Because that code is indeed in only one DLL. The code that goes from
  ar & t; // DLL A
to
  t->serialize(ar,version); // DLL B
is all generated by templates. In this case, it would be in DLL A. Any DLL,
or application, that serializes a type T will contain that code. I think the
problem lies in that murky area.
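
A rough sketch of what I mean by "generated by templates", using a toy
archive rather than the real one. TextOut, Point and save_point are
hypothetical names, and the real library has many more layers between the
operator and the call to serialize. The point is only that operator& and
save() are templates, so they are instantiated in whichever module writes
"ar & t", while Point::serialize is an ordinary member function that can
live in a single DLL.

```cpp
#include <sstream>
#include <string>

// Hypothetical toy archive; not the Boost.Serialization archive classes.
class TextOut {
public:
    // Template glue: instantiated in every module that writes "ar & t".
    template <class T>
    TextOut& operator&(T& t) {
        save(t, 0u);
        return *this;
    }

    void write(int value) { out_ << value << ' '; }
    std::string str() const { return out_.str(); }

private:
    // Also template glue; forwards to the class's own serialize function.
    template <class T>
    void save(T& t, unsigned version) {
        t.serialize(*this, version);
    }

    std::ostringstream out_;
};

// Ordinary class; its serialize body could be compiled into one DLL only.
struct Point {
    int x;
    int y;
    void serialize(TextOut& ar, unsigned /*version*/) {
        ar.write(x);
        ar.write(y);
    }
};

// Caller side: this is where the template glue for Point gets instantiated.
std::string save_point(int x, int y) {
    Point p = {x, y};
    TextOut ar;
    ar & p;
    return ar.str();
}
```

Here save_point(3, 4) returns "3 4 ". If save_point were compiled into two
different modules, each would carry its own instantiation of the glue for
Point, even though Point::serialize exists once.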

I'm switching to boost 1.45 now (was using 1.40). Maybe the change in
implementation will alleviate a few of these problems.

Thanks
Guy

