
Boost Users :

From: Robert Ramey (ramey_at_[hidden])
Date: 2008-09-03 12:45:37


Sebastian.Karlsson_at_[hidden] wrote:
> Hi,
>
> So I've got most of my serialization working now; the only problem
> is that it's still largely inefficient. The resulting binary file is
> 1.75 times larger than the source XML file which I parse from.
>
> This is probably largely because I still use default tracking and
> versioning for everything.

maybe, maybe not

> What I'd love to do however is to specify on an
> archive by archive basis if I want tracking / versioning. In fact,
> isn't this what one would normally want to do? When you specify
> tracking on a class by class basis you have no idea if there will be
> multiple pointers to the same object of that type. I found
> no_tracking, mentioned in a 2006 post on this very mailing list,
> however it still doesn't seem to affect archive size at least
> ( haven't checked deserialization timings for it ).

At one time I considered implementing this as a runtime flag. I eventually
concluded that this was not a good idea, as it would load down everyone's
code with the weight of a seldom-used feature. I concluded that it would be
better implemented as a template parameter, if at all.

Note that archives created with such a flag would be "write only"
as tracking is necessary to properly restore pointers.

> When I take a look at the deserialization the binary archive seems to
> store identifiers for types as their straight up string versions. This
> also seems largely inefficient, is there a reason for why this can't
> just be a hash?

I presume you're referring to exported types. If you don't like this
you can just "pre-register" these types with ar.register_type<T>(). This
will assign a small integer to the type which is valid for just this
archive.
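
The idea behind pre-registration can be pictured with a toy model. This is
not the library's implementation: `TypeRegistry`, its `register_type` method
and `first_tag_size` are hypothetical names, and the byte counts are
assumptions chosen only to show a name string being replaced by a small
integer.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Toy model only: hands out small integer ids, valid for one archive,
// in the order the types are registered.
class TypeRegistry {
public:
    // Returns the existing id for `name`, or assigns the next free one.
    std::uint16_t register_type(const std::string& name) {
        auto it = ids_.find(name);
        if (it != ids_.end()) return it->second;
        const std::uint16_t id = static_cast<std::uint16_t>(ids_.size());
        ids_.emplace(name, id);
        return id;
    }

    // Bytes this toy format needs to identify a type the first time it
    // appears: a 2-byte id if registered, otherwise a 4-byte length
    // followed by the name itself.
    std::size_t first_tag_size(const std::string& name) const {
        if (ids_.count(name) != 0) return sizeof(std::uint16_t);
        return sizeof(std::uint32_t) + name.size();
    }

private:
    std::map<std::string, std::uint16_t> ids_;
};
```

Because the ids depend only on registration order, the loading side has to
register the same types in the same order as the saving side.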

> Also, is binary archives streaming or do they wait until the entire
> archive is loaded?

streaming. The only storage used is for one data item at a time, the
same as for all archives.

> Basically I'm wondering what would be the fastest possible archive one
> could create if one doesn't care about portability.

binary_archive

Note that the binary archive stores just the raw bits. So if you have
a huge number of zeros on a 32-bit machine, you'll be saving 4 bytes for
each data item. In a text rendering, this would be just 2 bytes ('0' +
space). Hence, there is no reason to believe that the binary archive will
always be the smallest.
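
The arithmetic is easy to check with plain standard streams. The sketch
below does not use the serialization library at all and ignores archive
headers and class information; `raw_binary_size` and `text_size` are
hypothetical helpers that only count payload bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Bytes written when each value is stored as its raw in-memory bits.
std::size_t raw_binary_size(const std::vector<std::int32_t>& values) {
    std::ostringstream out(std::ios::binary);
    for (std::int32_t v : values)
        out.write(reinterpret_cast<const char*>(&v), sizeof v);
    return out.str().size();
}

// Bytes written when each value is stored as decimal text plus a space.
std::size_t text_size(const std::vector<std::int32_t>& values) {
    std::ostringstream out;
    for (std::int32_t v : values)
        out << v << ' ';
    return out.str().size();
}
```

For 1000 zeros this gives 4000 bytes raw against 2000 bytes as text, while
for 1000 copies of 1000000000 the text form grows to 11000 bytes and the
raw form stays at 4000. Which format is smaller depends on the data.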

If you're concerned about i/o time, you could use a streambuf with
data compression added on, but that is outside the scope of the
serialization library.

Robert Ramey

> // Sebastian Karlsson

