Boost Users :

Subject: Re: [Boost-users] crash in boost serialization (1.44)
From: Robert Ramey (ramey_at_[hidden])
Date: 2010-12-10 12:35:49


Guy Prémont wrote:
>> Jeff Flinn wrote:
>>> Guy Prémont wrote:
>>>> I had the same problem. I could pinpoint it to the fact that when
>>>> serialization is in a DLL, the i/oserializer and
>>>> pointer_i/oserializer are created for each DLL that uses
>>>> serialization of a class (and in the main exe if there is code
>>>> there that uses serialization for this class). The singletons are
>>>> always DLL-exported, never imported. To make a long story short, in
>>>> the serialization libs, the method
>>>> basic_iarchive_impl::register_type(const basic_iserializer & bis)
>>>> is called several times and may overwrite the bpis_ptr that was set
>>>> earlier. To fix this, I had to rebuild boost.serialization and
>>>> modify basic_iarchive.cpp.
>>>> I changed the line:
>>>> coid.bpis_ptr = bis.get_bpis_ptr();
>>>> to
>>>> if (coid.bpis_ptr == 0) coid.bpis_ptr = bis.get_bpis_ptr();
>>>>
>>>> But, there are a few issues with serialization methods when they
>>>> are implemented inside a dll and called from outside.
>>>>
>>>> Guy
>>>>
>>>> --
>>>> Guy Prémont, D.Sc.
>>>> Architecte logiciel senior / Senior software architect CM Labs
>>>> Simulations Inc. http://www.cm-labs.com/ Tel. 514-287-1166 ext. 237
>>>>> -----Original Message-----
>>>>> From: boost-users-bounces_at_[hidden] [mailto:boost-users-
>>>>> bounces_at_[hidden]] On Behalf Of rico.cadetg_at_[hidden]
>>>>> Sent: Wednesday, December 08, 2010 11:20 AM
>>>>> To: boost-users_at_[hidden]
>>>>> Subject: [Boost-users] crash in boost serialization (1.44)
>>>>>
>>>>> Hello,
>>>>>
>>>>> I've noticed a crash (accessing a null pointer) inside the boost
>>>>> serialization (version 1.44). I use visual studio 2005, the code
>>>>> to serialize is in a Dll. I have a small visual studio solution
>>>>> that reproduces the crash, I guess it is a bug and should cause a
>>>>> crash in other environments as well.
>>>>>
>>>>> I have the following classes
>>>>>
>>>>> INode (Interface)
>>>>> Node (Abstract, derived from INode)
>>>>> LeafNode(derived from Node)
>>>>> TableNode(derived from Node)
>>>>>
>>>>> The TableNode has a list of INode*, the children. A TableNode
>>>>> itself can be a child of a TableNode.
>>>>>
>>>>> All classes are inside a Dll and correctly exported (I have no
>>>>> unregistered class exceptions).
>>>>>
>>>>> If I create a TableNode with some entries including another
>>>>> TableNode and serialize it as an object, I get the crash when
>>>>> trying to deserialize later. If I do the same, but serialize the
>>>>> parent TableNode as a pointer, it works.
>>>>>
>>>>> I guess the serialization framework does not recognize that a
>>>>> TableNode gets serialized through a pointer and therefore the
>>>>> object tracking does not work correctly.
>>>>>
>>>>> I can send the example solution if needed.
>>>>>
>>>>> The crash is in basic_iarchive.cpp, line 456 (bpis_ptr is null):
>>>>>     if(! tracking){
>>>>>         bpis_ptr->load_object_ptr(ar, t, co.file_version);
>>>>>     }
>>>
>>> Have you passed this on to Robert? Any downside to doing this?
>>
>> First of all, I'm impressed with your understanding of the
>> subtleties of the library implementation. This is not easy to
>> achieve.
>>
>> I noticed this message and in fact marked it as important (an
>> exceedingly rare occurrence for me). I wanted to think about it some.
>> My current thought is that this is a bad idea. The problem occurs
>> when the same code is found in more than one execution module (dll or
>> main). I feel that, besides being wasteful, there is the potential
>> that the code gets out of sync - e.g. when a newer version of the
>> program invokes an older version of a DLL. There is code in the
>> library which traps this condition, but I had to comment it out
>> because avoiding this condition required more re-organization of user
>> code than users could handle. So now we have the case where one
>> includes a potentially very difficult to find error in one's program
>> in order to avoid organizing one's code so that the situation can't
>> happen. I'm sure that this can occur in other scenarios which use
>> template code in multiple execution modules, but it seems that it's
>> come up more frequently with the serialization library.
>>
>> The problem with the above fix is that it leaves the source of the
>> problem in the code, with the potential that it could arise somewhere
>> else in a way that neither I nor, I think, anyone else can foresee.
>> So I still recommend that users organize their code so that this
>> situation cannot occur.
>>
>> However, whenever I do this, someone will always say "What's wrong
>> with doing it this way? What could go wrong?" The answer is "I don't
>> know - and will never know". The answer to the answer is "well,
>> then it must be OK". Which doesn't follow as far as I'm concerned.
>>
>> So, what I plan to do is to re-enable the trap which detects
>> violation of the ODR, and permit an explicit override on the part of
>> the user. The idea is that I'll be able to avoid being responsible
>> for practices that I can't recommend.
>>
>> Robert Ramey
>>
>
> In my opinion, the actual reason for the problem is the duplication
> (in DLLs) of i/oserializer and pointer_i/oserializer for the same
> class.

agreed.

> Those should be exported and instantiated only in the DLL that
> contains the actual serialization code.

As far as I know - and I spent a lot of time on this - there is no way
to do this with current compilers.

> In the current implementation,
> the various serializers are instantiated, through a singleton, at the
> point of use. If two classes from two different DLLs have a member of
> a certain type, both DLLs will instantiate singletons for serializers.

This is the behavior of all current compiler/linker combinations. It
is not addressable from within a library or application.
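
To make the failure mode concrete, here is a minimal sketch of the overwrite that the one-line guard quoted earlier in this thread is meant to prevent. It is not the library's code: every type and function name below is a hypothetical stand-in, and it assumes that the later registration (from a module that never serialized the class through a pointer) carries a null pointer, which is what the reported null bpis_ptr crash suggests.

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these are the library's real types.
struct pointer_serializer { int module_id; };

struct serializer {
    // Null when the owning module never serialized the class via a pointer.
    const pointer_serializer* bpis;
    const pointer_serializer* get_bpis_ptr() const { return bpis; }
};

// One entry per registered class, loosely modelled on the "coid" in the thread.
struct class_entry {
    const pointer_serializer* bpis_ptr = nullptr;
};

// Unconditional assignment: the last registration wins, even if it is null.
void register_overwrite(class_entry& e, const serializer& s) {
    e.bpis_ptr = s.get_bpis_ptr();
}

// Guarded assignment: once a value has been stored, later calls leave it alone.
void register_guarded(class_entry& e, const serializer& s) {
    if (e.bpis_ptr == nullptr) e.bpis_ptr = s.get_bpis_ptr();
}

// Registers two per-module serializer instances for the same class and
// reports whether a usable pointer serializer is left afterwards.
bool pointer_survives(bool guarded) {
    static const pointer_serializer ps{1};
    const serializer from_module_a{&ps};      // module A knows the pointer serializer
    const serializer from_module_b{nullptr};  // module B (assumed) does not
    assert(from_module_a.get_bpis_ptr() != nullptr);  // sanity check on the setup

    class_entry e;
    void (*reg)(class_entry&, const serializer&) =
        guarded ? &register_guarded : &register_overwrite;
    reg(e, from_module_a);
    reg(e, from_module_b);
    return e.bpis_ptr != nullptr;
}
```

With the guard the entry keeps the pointer from the first module; without it the second registration resets the entry to null. The sketch only illustrates the symptom. It does nothing about the duplicated singletons themselves, which is the objection raised above.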

> When you say that the same code is found in more than one execution
> module, you are not talking about the T::serialize(Archive&, int) for
> each class, are you? Because that code is indeed in only one DLL. The
> code that goes from ar & t; // DLL A
> to
> t->serialize(ar,version); // DLL B
> is all generated by templates. In this case, it would be in DLL A.
> Any DLL, or application, that serializes a type T will contain that
> code. I think the problem lies in that murky area.

Any time you use ar << t in more than one runtime module, you'll get
multiple implementations generated. The only way to avoid this is to use
a different idiom:

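in the header:
class mytype {
    ...
    template<class Archive>
    void serialize(Archive &ar, const unsigned int version);
    ...
};

in the dll:

// define the template once, in a source file compiled into the dll
template<class Archive>
void mytype::serialize(Archive &ar, const unsigned int version){
    ...
    ar & ...
    ...
}

// explicitly instantiate it for each archive type that will be used
template void mytype::serialize<boost::archive::text_iarchive>(
    boost::archive::text_iarchive &ar, const unsigned int version);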

and even that might not be enough since one has to watch the classes
from which mytype is derived.
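
As an illustration of the idiom above, here is a small self-contained sketch. It deliberately does not use Boost: toy_oarchive is a hypothetical stand-in for a concrete archive class, the members of mytype and the helper save_to_string are invented for the example, and the "header" and "dll" parts are shown in one file only so that it compiles on its own.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical minimal archive, standing in for a concrete archive type.
class toy_oarchive {
public:
    template<class T>
    toy_oarchive& operator&(const T& value) {
        os_ << value << ' ';
        return *this;
    }
    std::string str() const { return os_.str(); }
private:
    std::ostringstream os_;
};

// --- what would go in the header: declaration only, no body ---
class mytype {
public:
    mytype(int id, double weight) : id_(id), weight_(weight) {}
    template<class Archive>
    void serialize(Archive& ar, const unsigned int version);
private:
    int id_;
    double weight_;
};

// --- what would go in one source file inside the dll ---
template<class Archive>
void mytype::serialize(Archive& ar, const unsigned int /*version*/) {
    ar & id_ & weight_;
}

// Explicit instantiation: the only place this code is generated.
template void mytype::serialize<toy_oarchive>(toy_oarchive&, const unsigned int);

// Caller side: sees only the declaration and relies on the instantiation above.
std::string save_to_string(mytype& value) {
    toy_oarchive ar;
    value.serialize(ar, 0);
    assert(!ar.str().empty());  // sanity check: something was written
    return ar.str();
}
```

Because clients see only the declaration, they cannot generate their own copy of serialize; they have to link against the one instantiated in the dll. In a real Windows build the class or the instantiation would presumably also need to be exported from the dll, which this sketch leaves out.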

Robert Ramey

> I'm switching to boost 1.45 now (was using 1.40). Maybe the change in
> implementation will alleviate a few of these problems.

I doubt it.

Robert Ramey

> Thanks
> Guy

