
Boost Users :

From: Sean Cavanaugh (worksonmymachine_at_[hidden])
Date: 2007-07-08 18:28:36


On 7/8/07, Robert Ramey <ramey_at_[hidden]> wrote:
>
> Sean Cavanaugh wrote:
> > Ok I've spent a good chunk of my day making a set of custom archive
> > class, based on the portable_binary_iarchive and
> > portable_binary_oarchive.
> >
> >
> > I have a non-virtual class hierarchy named Asset, in memory this is a
> > cyclic graph, but my requirements are that this structure cannot be
> > serialized all at once, it has to be broken up at boundaries. So
> > the archiver serializes the first Asset* (which is always the root
> > object passed into operator&). This object serializes normally, as
> > per the boost serialization code, except that when subsequent
> > Asset*'s are encountered
> >
> ?which point to a previously serialized object? or any subsequent
> Asset *?
> >
> > I send the pointer into an AssetManager class and translate the
> > Asset* into a ResourceID, and serialize the ResourceID instead of the
> > object. This is done by directly calling operator& inside
> > save_override.
> >
> *******
> Hmmm this sounds to me exactly equivalent to what the serialization
> system does by default for tracked objects. Objects serialized
> through pointers are tracked by default. Your "ResourceID" seems a
> re-implementation of the "object id" used by the serialization library
> to track

This is true, but the serialization library's ids only have archive scope,
which means the same ids are used for totally different things when you use
multiple archives. My ResourceIDs have to be globally unique, in the sense
that they are GUIDs or relative pathnames, which can then be mapped to a
fully qualified pathname and loaded on demand. As far as the archive class
is concerned, it's a user-defined translation (proxy-to-real and
real-to-proxy) that exists for certain types.

So in this case it's conceptually an archive of archives, with the outermost
archive being the filesystem and the innermost being a single object (a
file). The file only contains one object, and all links to other objects
are handles (filenames). So when the innermost code is loading from the
filesystem, it knows that it wants a pointer to another object, but it only
has its name. It has to ask the filesystem class to translate the name
into an object, which it can do because it can look up whether the object is
already loaded and return that, or literally open another file-based archive,
read it in on demand, and return that. The archive class doesn't really care
about the specifics; it just needs the means to achieve the result.
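
A rough sketch of that name-to-object lookup, using only the standard library rather than Boost. Every name here (Asset, AssetStore, save_root, the one-line text "file" format) is made up for illustration, and handles are assumed to contain no spaces. For brevity it resolves links eagerly when a file is read; a real loader would presumably defer that until the pointer is actually needed.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical asset: a payload plus links to other assets (cycles allowed).
struct Asset {
    std::string id;              // globally unique handle, e.g. a relative path
    int payload = 0;
    std::vector<Asset*> links;
};

// Innermost "archive": the object itself in full, every link as a handle only.
inline std::string save_root(const Asset& root) {
    std::ostringstream os;
    os << root.payload << ' ' << root.links.size();
    for (const Asset* a : root.links) os << ' ' << a->id;
    return os.str();
}

// Outermost "archive": stands in for the filesystem class, one file per object.
class AssetStore {
public:
    std::map<std::string, std::string> files;  // handle -> file contents
    int loads = 0;                             // number of files actually read

    Asset* resolve(const std::string& id) {
        auto it = cache_.find(id);
        if (it != cache_.end()) return it->second.get();  // already loaded

        auto obj = std::make_unique<Asset>();
        Asset* raw = obj.get();
        raw->id = id;
        cache_[id] = std::move(obj);  // register before reading links so cycles terminate
        ++loads;

        std::istringstream is(files.at(id));
        std::size_t n = 0;
        is >> raw->payload >> n;
        for (std::size_t i = 0; i < n; ++i) {
            std::string handle;
            is >> handle;
            raw->links.push_back(resolve(handle));  // translate name -> object
        }
        return raw;
    }

private:
    std::map<std::string, std::unique_ptr<Asset>> cache_;
};

// Saves a two-object cycle as two "files", reloads it from the root,
// and returns how many files were read (or -1 if the graph came back wrong).
inline int cyclic_load_demo() {
    Asset a; a.id = "a"; a.payload = 1;
    Asset b; b.id = "b"; b.payload = 2;
    a.links = {&b, &b};
    b.links = {&a};

    AssetStore store;
    store.files["a"] = save_root(a);
    store.files["b"] = save_root(b);

    Asset* root = store.resolve("a");
    bool ok = root->payload == 1 && root->links.size() == 2 &&
              root->links[0] == root->links[1] &&
              root->links[0]->links[0] == root;
    return ok ? store.loads : -1;
}
```

Each file is read once even though "b" is referenced twice and points back at "a", because the cache is consulted before any file is opened.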

So the archive's base class code should be doing this in a conceptual way:

for_each type X, if has_user_defined_translation<X*>, implement:
  on_save -> translate X* to ProxyX via a function
             ProxyX RealToProxy(X* x), then serialize the ProxyX
  on_load -> read a ProxyX, translate it to X* via a function
             X* ProxyToReal(ProxyX& px), then return the X*

Except that this on_save and on_load example needs yet another template
argument, controlling whether the first occurrence of X should be literally
saved or not, since we might be saving a hierarchy of Bar's that always
saves handles to Foo's.
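
To make the on_save / on_load idea concrete, here is a minimal stand-alone sketch. It is not Boost.Serialization code: proxy_traits, toy_oarchive, toy_iarchive and the registry are all hypothetical stand-ins, and it covers only the "always save a handle" case, not the extra "save the first occurrence literally" switch.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical trait: specialise it for types whose pointers are saved as proxies.
template <class T> struct proxy_traits : std::false_type {};

struct Foo { int value = 0; };

// Hypothetical registry standing in for the asset manager.
struct FooRegistry {
    std::map<std::string, Foo*> by_name;
    std::map<const Foo*, std::string> by_ptr;
    void add(const std::string& name, Foo* f) { by_name[name] = f; by_ptr[f] = name; }
};
inline FooRegistry& registry() { static FooRegistry r; return r; }

template <> struct proxy_traits<Foo> : std::true_type {
    using proxy_type = std::string;  // ProxyX
    static proxy_type real_to_proxy(const Foo* f) { return registry().by_ptr.at(f); }
    static Foo* proxy_to_real(const proxy_type& p) { return registry().by_name.at(p); }
};

// Toy output archive: pointers to proxied types are written as their proxy.
class toy_oarchive {
    std::ostream& os_;
public:
    explicit toy_oarchive(std::ostream& os) : os_(os) {}
    toy_oarchive& operator&(const int& v) { os_ << v << ' '; return *this; }
    template <class T> toy_oarchive& operator&(T* const& p) {
        static_assert(proxy_traits<T>::value, "sketch handles proxied pointers only");
        os_ << proxy_traits<T>::real_to_proxy(p) << ' ';
        return *this;
    }
};

// Toy input archive: reads the proxy and translates it back to a live pointer.
class toy_iarchive {
    std::istream& is_;
public:
    explicit toy_iarchive(std::istream& is) : is_(is) {}
    toy_iarchive& operator&(int& v) { is_ >> v; return *this; }
    template <class T> toy_iarchive& operator&(T*& p) {
        static_assert(proxy_traits<T>::value, "sketch handles proxied pointers only");
        typename proxy_traits<T>::proxy_type proxy;
        is_ >> proxy;
        p = proxy_traits<T>::proxy_to_real(proxy);
        return *this;
    }
};

// The class itself only knows "call operator& on each field".
struct Bar {
    int id;
    Foo* foo;
    template <class Archive> void serialize(Archive& ar) { ar & id & foo; }
};

// Round-trips a Bar whose Foo* is stored as the handle "textures/brick".
inline bool round_trip_demo() {
    static Foo texture;
    texture.value = 7;
    registry().add("textures/brick", &texture);

    Bar out{1, &texture};
    std::stringstream ss;
    toy_oarchive oa(ss);
    out.serialize(oa);

    Bar in{0, nullptr};
    toy_iarchive ia(ss);
    in.serialize(ia);
    return in.id == 1 && in.foo == &texture && ss.str() == "1 textures/brick ";
}
```

The point of the trait is that Bar::serialize stays archive-agnostic; only the archive and the trait specialisation know that a Foo* is written as a name.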

>
> > ...
> >
> > I also have to hard-code the full list of derived Asset types and
> > manually provide specializations for all of them in save_override and
> > load_override.
> >
> ****
> Well, since they're different - I would expect each of them to have a
> different serialize function. If all the serialize functions are the
> same, it would seem that something should be moved from the derived
> class to the base class.
> >
> > If I use the base class, I end up slicing my class down to its base,
> > and cannot serialize it.
> >
> ****
> serializing through a base class pointer solves this problem as well.

In my currently kind-of-working, hacked-up version of the archive classes, the
load_override methods are nearly identical when specializing for AssetModel,
AssetTexture, etc. The behavior is constant (proxy-to-real translation or
vice versa) but the type is not. I can slice them safely on saving,
but not on loading (since the 'ar & foo' in the C++ serialize method
is expecting a more-derived type to be filled in).

> > I can't make serialize virtual, since the intrusive serialize methods
> > are templates, but it certainly would solve the problem if it were
> > possible.
>
> ***
> I suspect that if the other changes suggested were implemented this
> would disappear as a problem. I don't think I've tried it, but
> rather than including boiler plate code in each derived class, one
> might try adding a "mix-in" base class which contains the serialize
> function.

I'll play around with alternatives; I basically spent the day learning the
archive templates by watching the code flow.

>
> > In addition the bodies of all of my overrides are completely
> > identical except for the classname (AssetModel, AssetTexture, etc).
> >
> > Which means I'll be wrapping the bodies of a generic save_override
> > and load_override in a macro, and have to manually add all Asset
> > derived classes to a list of classes inside my archive class. Which
> > means that my archiver cannot be generic, even though I have managed
> > to make it a template in the sense that the passed in asset manager
> > and base asset types are template parameters.
> >
> ***
> looks to me that you've gotten off on the wrong foot and stuck with it.

That isn't possible with learning new code :) This is what the path of
least resistance yielded, with the docs and examples provided by boost.
Basically this is as far as I got without having to directly hack on the
existing boost code, and having to deal with a crash course on the code flow
and internal data structure of everything.

> >
> *** To me this is exactly the wrong approach. Now you've coupled your
> classes to be serialized to a specific archive. This means you won't be
> able to use any other archive type and you've defeated one of the main
> benefits to the serialization library. Perhaps it wasn't a suitable
> library for your task.

The library can do what I want, because I have the source code :) Anyway,
the classes aren't coupled to the archive with what I've come up with so
far; it's the other way around. I definitely do not want my classes to
understand archiving beyond the very basic sense of having to call operator&
on most of their fields, since I plan on having several wildly different
archive classes calling the serialize methods on my classes.

So I have a working implementation, how do I make it better?
>
> ****
> Maybe you might try doing it in the simplest way.
>
> I can't see how what you want to do is different than what everyone else
> uses the library for. And I can't see how what you want to do is
> different than what the examples do.
>
>
I could get the behavior I want by altering the serialize methods, but then
it would be ill-formed for other archives. I could also specialize the
serialize methods for the archive in question, but then I would have to
write more than one. It's the archive's job to interpret what to do when you
call ar & foo.

I anticipate having more data than I can load, so I need to load and save at
an object level. But I still need to write the serialize methods as if they
all could fit in memory, since I plan on having other archive classes that
do operate on the graph of what is loaded at runtime (i.e. to compute
garbage collection).

In essence the archive classes need to be made to be programmable for these
behaviors to work:

Graph of Foo:
Saving: save the first Foo*, translate all further Foo*'s into a user
defined handle with a user defined function and save that instead.
Loading: load the first Foo*, assume all further Foo*'s are saved with a
user defined handle, translate them back into live objects on demand, and
also use the existing caching scheme to prevent having to translate the same
user-defined handle over and over.

Graph of Bar:
Saving: save all Bar's, but save all occurrences of Foo*'s as handles
Loading: load all Bar's, but load all occurrences of Foo*'s from handles

Garbage Collecting Foo:
'Saving' : archive an array of live root level Foo objects, build a list of
all Foo* that are reachable through serialization. Compare this list to
the full list of Foo objects, and unload the ones that are missing.
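
The garbage-collecting case can be sketched as an "archive" that stores nothing and only records which Foo* it was handed. This is a stand-alone illustration with hypothetical names (reachability_archive, find_garbage), not an existing Boost archive; Foo::serialize is written the way an ordinary intrusive serialize method would be.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

struct Foo {
    int payload = 0;
    std::vector<Foo*> links;

    // Ordinary-looking serialize: the class does not know what the archive does.
    template <class Archive> void serialize(Archive& ar) {
        ar & payload;
        for (Foo* f : links) ar & f;
    }
};

// Hypothetical archive that writes nothing and just marks every Foo* it sees.
class reachability_archive {
public:
    std::set<Foo*> reached;

    reachability_archive& operator&(int&) { return *this; }  // plain data is ignored
    reachability_archive& operator&(Foo*& f) {
        // Recurse only on first sight, so cycles terminate.
        if (f && reached.insert(f).second) f->serialize(*this);
        return *this;
    }
};

// Returns every object in `all` that is not reachable from `roots`.
inline std::vector<Foo*> find_garbage(std::vector<Foo*> roots,
                                      const std::vector<Foo*>& all) {
    reachability_archive ar;
    for (Foo* r : roots) ar & r;

    std::vector<Foo*> dead;
    for (Foo* f : all)
        if (ar.reached.count(f) == 0) dead.push_back(f);
    return dead;
}

// a <-> b form a live cycle; c -> d hang off nothing and should be collected.
inline std::size_t gc_demo() {
    Foo a, b, c, d;
    a.links = {&b};
    b.links = {&a};
    c.links = {&d};
    std::vector<Foo*> dead = find_garbage({&a}, {&a, &b, &c, &d});
    return dead.size();
}
```

With a as the only root, c and d are reported as unreachable while the a/b cycle is kept.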


