|
Boost Users : |
Subject: Re: [Boost-users] boost serialize - stack overflow for large problems
From: Jorg F. Unger (j-unger_at_[hidden])
Date: 2010-09-29 00:39:34
Robert Ramey <ramey <at> rrsd.com> writes:
>
> Jorg F. Unger wrote:
> > Robert Ramey <ramey <at> rrsd.com> writes:
> >
> >>
> >> "Jörg F. Unger" wrote:
> >>> I have implemented the serialization routine for a rather complex
> >>> structure with many nested classes. The serialization seems to work
> >>> for small examples, however, for bigger data sets, I get a stack
> >>> overflow error. If I increase the stack size from 8MB to 20MB, the
> >>> examples seems to work. However, I'd like to optimize the code the
> >>> way that I do not need to increase the stack size (especially, since
> >>> not all users can do that themselves)
> >>
> >> Are you referring to compile time or runtime?
> >>
> >>
> >> If the source is that there is VERY deep nesting of data structures,
> >> you'll just have to refactor the data. This isn't a hardship since
> >> you would likely have other issues generated by this besides
> >> serialization.
> >>
> >> Robert Ramey
> >>
> >>
> >>
> >> _______________________________________________
> >> Boost-users mailing list
> >> Boost-users <at> lists.boost.org
> >> http://lists.boost.org/mailman/listinfo.cgi/boost-users
> >
> > The problem occurs at runtime. I 've figured out that the problem is
> > due to the nested data structure. I have kind of a graph, where every
> > node stores a pointer to the neighboring node. This means, once I
> > start to serialize the first node, all the neighboring nodes have to
> > be serialized as well, which finally ends up with at many recursive
> > function calls as I have nodes. This is probably similar to a linked
> > list, where once the first entry is serialized, a pointer to the next
> > element is serialized and so on, which finally ends up with as many
> > recursive function calls as there are elements in the list. Is there a
> > clever way to restructure the serialization (e.g. first serializing
> > all the nodes, and in a second step, when the nodes have been created,
> > serialize there connectivity without introducing two separate
> > serialize routines?)
>
> Generally the library "tracks" those things serialized by pointers
> so that there is not problem with cycles etc. If you can list all your
> nodes ahead of time you can serialize them. Then when you
> serialize your graph, there will be no recursion.
>
> Robert Ramey
>
>
>
> _______________________________________________
> Boost-users mailing list
> Boost-users <at> lists.boost.org
> http://lists.boost.org/mailman/listinfo.cgi/boost-users
The problem is that the data structure is as follows
class container
{
std::vector<object*> myObjects;
};
class object
{
std::vector<object*> neighborObjects;
}
The problem is, how to split the node information from the graph information.
I've implemented a serialize routine for the container, which then calls the
serialize routine for the object, which recursively calls the serialize routine
of the neighboring objects. That means, the first function call to the serialize
routine in the object class is only finished, if all the neighbors have been
serialized. Similarly, the serialize routine of the first neighbor requires all
other neighbors to be serialized. Although the serialize routine can track the
pointers, this leads to a stack problem, since the number of recursive serialize
routines to be called (which are all stored on the stack) is equivalent to the
number of objects. Is there a clever why to serialize first all the objects with
its data (these have been omitted in the class definition for simplicity) and
then serialize the neighbors information without changing the data structure?
Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net