Boost logo

Boost Users :

Subject: Re: [Boost-users] boost serialize - stack overflow for large problems
From: Robert Ramey (ramey_at_[hidden])
Date: 2010-09-29 12:26:56


Jorg F. Unger wrote:
> Robert Ramey <ramey <at> rrsd.com> writes:
>
>>
>> Jorg F. Unger wrote:
>>> Robert Ramey <ramey <at> rrsd.com> writes:
>>>
>>>>
>>>> "Jörg F. Unger" wrote:
>>>>> I have implemented the serialization routine for a rather complex
>>>>> structure with many nested classes. The serialization seems to
>>>>> work for small examples, however, for bigger data sets, I get a
>>>>> stack overflow error. If I increase the stack size from 8MB to
>>>>> 20MB, the examples seems to work. However, I'd like to optimize
>>>>> the code the way that I do not need to increase the stack size
>>>>> (especially, since not all users can do that themselves)
>>>>
>>>> Are you referring to compile time or runtime?
>>>>
>>>>
>>>> If the source is that there is VERY deep nesting of data
>>>> structures, you'll just have to refactor the data. This isn't a
>>>> hardship since you would likely have other issues generated by
>>>> this besides serialization.
>>>>
>>>> Robert Ramey
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Boost-users mailing list
>>>> Boost-users <at> lists.boost.org
>>>> http://lists.boost.org/mailman/listinfo.cgi/boost-users
>>>
>>> The problem occurs at runtime. I 've figured out that the problem is
>>> due to the nested data structure. I have kind of a graph, where
>>> every node stores a pointer to the neighboring node. This means,
>>> once I start to serialize the first node, all the neighboring nodes
>>> have to be serialized as well, which finally ends up with at many
>>> recursive function calls as I have nodes. This is probably similar
>>> to a linked list, where once the first entry is serialized, a
>>> pointer to the next element is serialized and so on, which finally
>>> ends up with as many recursive function calls as there are elements
>>> in the list. Is there a clever way to restructure the serialization
>>> (e.g. first serializing all the nodes, and in a second step, when
>>> the nodes have been created, serialize there connectivity without
>>> introducing two separate serialize routines?)
>>
>> Generally the library "tracks" those things serialized by pointers
>> so that there is not problem with cycles etc. If you can list all
>> your nodes ahead of time you can serialize them. Then when you
>> serialize your graph, there will be no recursion.
>>
>> Robert Ramey
>>
>>
>>
>> _______________________________________________
>> Boost-users mailing list
>> Boost-users <at> lists.boost.org
>> http://lists.boost.org/mailman/listinfo.cgi/boost-users
>
> The problem is that the data structure is as follows
>
> class container
> {
> std::vector<object*> myObjects;
> };
>
> class object
> {
> std::vector<object*> neighborObjects;
> }
>
> The problem is, how to split the node information from the graph
> information. I've implemented a serialize routine for the container,
> which then calls the serialize routine for the object, which
> recursively calls the serialize routine of the neighboring objects.
> That means, the first function call to the serialize routine in the
> object class is only finished, if all the neighbors have been
> serialized. Similarly, the serialize routine of the first neighbor
> requires all other neighbors to be serialized. Although the serialize
> routine can track the pointers, this leads to a stack problem, since
> the number of recursive serialize routines to be called (which are
> all stored on the stack) is equivalent to the number of objects.

I totally understand the problem. But messing with the serialization
is the wrong way to go about it.

question:How do you delete the graph? You have to delete all the
elements. If you do it cursively, you'll have the same problem
of deep nesting. I doubt that the graph "own's" the objects pointed to.
So you must have a way to list all the nodes without traveling through
the graph. Assumnig you do, serialize all the nodes this way - but
DON'T include the pointers to adjacent nodes in your serialize function.

This should resolve your problem.

Robert Ramey

> Is
> there a clever why to serialize first all the objects with its data
> (these have been omitted in the class definition for simplicity) and
> then serialize the neighbors information without changing the data
> structure?
>
> _______________________________________________
> Boost-users mailing list
> Boost-users_at_[hidden]
> http://lists.boost.org/mailman/listinfo.cgi/boost-users


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net