From: Dan Notestein (dan_at_[hidden])
Date: 2006-08-18 09:59:28

Robert Ramey wrote:
> For me a one pass solution was a requirement - even though
> I didn't state it explicitly. I just presumed that users would
> object to the extra cost of a second pass. So I never really
> considered it.

I don't believe a second pass as described would have a significant
impact on serialization performance. Similar operations are being
performed during the one-pass serialization now, and the additional
traversal itself should not be expensive compared to the time it takes
to perform the operations on each element (even when we're not
serializing to a slow medium, such as a hard disk). Also note that
during the first pass, no data is actually being serialized to a medium;
the pass just performs an analysis to determine which objects are
contained within other objects.

For many situations, I think the proposed algorithm would actually be
faster. For example, when serializing a vector of trackable objects,
the current implementation must add each element of the vector to the
object-id lookup tables in case there is a pointer to the element
somewhere. In the proposed implementation, the elements don't need to be
added to the object-id table, since we know the boundaries of the
containing object and we would be using an offset into the containing
object for any pointer references to these elements. So for any
data structure that has arrays or vectors with a large number of
trackable elements, the object lookup tables would be drastically
smaller, which should yield better speed as well.
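
To make the offset idea concrete, here is a minimal C++ sketch. It is not
the serialization library's code and not a finished design; RangeTable,
PointerRef, add and resolve are hypothetical names invented for this
illustration. The analysis pass registers one address range per containing
object, and a pointer to any element is then expressed as a container id
plus a byte offset instead of a per-element table entry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical types for illustration only; not part of Boost.Serialization.
struct PointerRef {
    int container_id;    // id assigned to the containing object
    std::size_t offset;  // byte offset of the pointee inside that object
};

class RangeTable {
public:
    // Analysis pass: record ONE entry for the whole containing object,
    // no matter how many trackable elements it holds.
    void add(const void* begin, std::size_t bytes, int id) {
        assert(bytes > 0);
        std::uintptr_t b = reinterpret_cast<std::uintptr_t>(begin);
        Entry e = { b + bytes, id };
        ranges_[b] = e;
    }

    // Translate a raw pointer into (container id, offset).
    // Returns false if the address lies in no registered range.
    bool resolve(const void* p, PointerRef& out) const {
        std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
        // upper_bound finds the first range that starts after addr,
        // so the candidate range is the one just before it.
        std::map<std::uintptr_t, Entry>::const_iterator it =
            ranges_.upper_bound(addr);
        if (it == ranges_.begin()) return false;
        --it;
        if (addr >= it->second.end) return false;
        out.container_id = it->second.id;
        out.offset = static_cast<std::size_t>(addr - it->first);
        return true;
    }

    std::size_t size() const { return ranges_.size(); }

private:
    struct Entry {
        std::uintptr_t end;  // one past the last byte of the object
        int id;
    };
    std::map<std::uintptr_t, Entry> ranges_;  // keyed by start address
};
```

With a std::vector<int> of 1000 elements registered once via
add(v.data(), v.size() * sizeof(int), id), the table holds a single entry,
and resolve(&v[10], ref) yields that id with an offset of 10 * sizeof(int).
On loading, the pointer would presumably be rebuilt as the new base address
of the container plus the stored offset.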

The ability to use more contained data is a big memory and speed
advantage of its own, and should outweigh any possible
disadvantage of the analysis pass. Being able to use high performance
data structures is one of the main motivations for this proposal.

I think the ability to serialize many common data structures that are
not currently serializable is another reason to consider such
a change. I've tried to use the serialization library in its current
form on two different data structures and both times I've encountered
problems because of limitations in the current implementation. The first
attempt was to serialize an abstract syntax tree for a Verilog
compiler, and I eventually had to give up on this because of the sheer
number of changes that would be required to the data structure. The
second data structure was designed with the serialization library in
mind, but it's still difficult to get around the previously mentioned
limitations as the number of object relationships increases.

best regards,

Dan Notestein
