Subject: Re: [boost] [serialization] deserializing asynchronously serialized types
From: Robert Ramey (ramey_at_[hidden])
Date: 2011-09-04 02:32:59
Andrew Hundt wrote:
> I have an unusual use case for boost.serialization, and I was
> wondering if
> it would be possible to adapt it to my needs:
>
> - I have a set of over 100 types, and instances of each are generated
> asynchronously then serialized to a file in that order.
> - The most interesting serialized data will be written just before
> the power is unexpectedly cut.
> - I need to load in and run on as much data as possible when reading
> the serialized data back, ignoring incomplete data at the end (due to
> a power cut).
> - The basic Boost serialization examples require you to know the type
> of the next piece of data to be loaded when reading. Since these
> types are
> generated asynchronously they are not known in advance.
> - I need to write the data out immediately when it arrives because of
> the power issue.
> - Files will be getting up to around 150GB in size for binary
> archives, so
> it can't be marshaled in memory, it needs to be written immediately
> even if it is redundant.
>
> Is there a way to read in that serialized file using the facilities
> provided in boost.serialization?
>
> Here are my current ideas:
> - I tried using boost.variant, but it loses its will to compile when I
> increase the typelist limit to around ~60 types on gcc 4.4, and I
> have more than a hundred.
> - Use preprocessor metaprogramming to do something equivalent to
> boost.variant, but I would very much prefer a more pleasant option.
> - Serialize an index or custom headers indicating the next type to
> appear
> - One way of achieving some of these goals is writing one piece at a
> time using a binary archive to an fstream, with the index mentioned
> above separating data types.
>
> I don't know what aspects of my requirements will prove to be a
> problem, so if anyone can provide advice that would help me avoid a
> major pitfall, it would be greatly appreciated.
>
> Thanks for your thoughts.
>
> Cheers!
> Andrew Hundt
You could still make your own "special purpose variant". Look at the
section "Serialization Wrappers" in the documentation.
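    #include <boost/serialization/split_free.hpp>

    struct my_wrapper {
        unsigned m_i;          // tag: which type is held (0 = none)
        union {                // a union cannot hold references, so use pointers
            type1 *m_t1;
            type2 *m_t2;
            ....
        };
        my_wrapper() : m_i(0) {}
        my_wrapper(type1 &t1) : m_i(1), m_t1(&t1) {}
        my_wrapper(type2 &t2) : m_i(2), m_t2(&t2) {}
        ...
    };

    template<class Archive>
    void save(Archive &ar, const my_wrapper &w, const unsigned int version){
        ar << w.m_i;           // write the tag first
        switch(w.m_i){
        case 1:
            ar << *w.m_t1;
            break;
        ....
        }
    }

    template<class Archive>
    void load(Archive &ar, my_wrapper &w, const unsigned int version){
        ar >> w.m_i;           // the tag says which type follows
        switch(w.m_i){
        case 1:
            w.m_t1 = new type1;   // the caller owns the loaded object
            ar >> *w.m_t1;
            break;
        ....
        }
    }

    // save and load are separate free functions
    BOOST_SERIALIZATION_SPLIT_FREE(my_wrapper)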
So now you could just use

    const my_wrapper w(t);  // where t is any one of the 100 types
    ar << w;

This is basically a poor man's variant which picks the type with a
run-time switch instead of compile-time metaprogramming.
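
To show how a tag written ahead of each record also handles a file cut off by power loss, here is a minimal sketch that does not use Boost.Serialization at all. It writes raw bytes through standard streams, which only works for trivially copyable types and is not portable across machines. The names `type1`, `type2`, `TAG_TYPE1`, `TAG_TYPE2`, `write_record`, `read_pod`, `read_all`, `summary` and `demo_truncated` are all hypothetical and chosen only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-ins for two of the record types.
struct type1 { std::int32_t value; };
struct type2 { double x; double y; };

// Tag written ahead of every record; 0 is deliberately left unused.
enum : std::uint32_t { TAG_TYPE1 = 1, TAG_TYPE2 = 2 };

// Write one record as [tag][payload] and flush immediately,
// so that as little as possible is lost if power is cut.
template <class T>
void write_record(std::ostream &os, std::uint32_t tag, const T &t) {
    os.write(reinterpret_cast<const char *>(&tag), sizeof tag);
    os.write(reinterpret_cast<const char *>(&t), sizeof t);
    os.flush();
}

// Read sizeof(T) raw bytes into t; returns false on a short read.
template <class T>
bool read_pod(std::istream &is, T &t) {
    return static_cast<bool>(is.read(reinterpret_cast<char *>(&t), sizeof t));
}

struct summary {
    std::size_t n1 = 0;     // complete type1 records seen
    std::size_t n2 = 0;     // complete type2 records seen
    std::int64_t sum1 = 0;  // sum of type1::value over complete records
};

// Read records until end of file, an unknown tag, or a truncated record.
// Everything that was read completely before that point is kept.
summary read_all(std::istream &is) {
    summary s;
    std::uint32_t tag = 0;
    while (read_pod(is, tag)) {
        if (tag == TAG_TYPE1) {
            type1 t;
            if (!read_pod(is, t)) break;  // incomplete tail: stop
            ++s.n1;
            s.sum1 += t.value;
        } else if (tag == TAG_TYPE2) {
            type2 t;
            if (!read_pod(is, t)) break;  // incomplete tail: stop
            ++s.n2;
        } else {
            break;                        // unknown tag: stop
        }
    }
    return s;
}

// Usage: write three records, chop two bytes off the end to simulate
// a power cut in the middle of the last record, then read back.
summary demo_truncated() {
    std::stringstream out;
    write_record(out, TAG_TYPE1, type1{7});
    write_record(out, TAG_TYPE2, type2{1.0, 2.0});
    write_record(out, TAG_TYPE1, type1{35});
    std::string bytes = out.str();
    bytes.resize(bytes.size() - 2);
    std::istringstream in(bytes);
    return read_all(in);
}
```

In this sketch the reader keeps the first two records and drops the damaged third one. With a real archive the payload would be written by the library rather than copied byte for byte, and whether a half-written record is reported as a failed read or as an exception depends on the archive type.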
Another idea, trickier, would be to use a variant of variants
to get around the compiler limitations.
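
As a rough sketch of what a variant of variants could look like, the example below groups four hypothetical types into two inner variants and wraps those in an outer one. It uses C++17 `std::variant` purely as a stand-in for boost::variant so that it is self-contained; the grouping, the type names, and the helpers `flat_index` and `name_of` are assumptions made for this illustration, not part of either library.

```cpp
#include <cstddef>
#include <string>
#include <variant>

// Hypothetical record types; a real program would have many more.
struct type1 { int a; };
struct type2 { int b; };
struct type3 { int c; };
struct type4 { int d; };

// Each inner variant stays below the type-list limit of the compiler.
using group_a = std::variant<type1, type2>;
using group_b = std::variant<type3, type4>;

// The outer variant only has to list the groups.
using any_record = std::variant<group_a, group_b>;

// Map the two-level position to one flat index (0 for type1 .. 3 for type4),
// which could serve as the tag stored in front of a record.
inline std::size_t flat_index(const any_record &r) {
    const std::size_t base =
        r.index() == 0 ? 0 : std::variant_size_v<group_a>;
    return std::visit(
        [base](const auto &inner) -> std::size_t { return base + inner.index(); },
        r);
}

// A visitor with one overload per leaf type.
struct name_visitor {
    const char *operator()(const type1 &) const { return "type1"; }
    const char *operator()(const type2 &) const { return "type2"; }
    const char *operator()(const type3 &) const { return "type3"; }
    const char *operator()(const type4 &) const { return "type4"; }
};

// Visit in two steps: first the outer variant, then the inner one.
inline std::string name_of(const any_record &r) {
    return std::visit(
        [](const auto &inner) -> std::string {
            return std::visit(name_visitor{}, inner);
        },
        r);
}
```

A record is then built by naming its group explicitly, for example `any_record r = group_b(type4{9});`. The cost of this layout is that every visit goes through two dispatch steps and that the assignment of types to groups has to be kept stable once files have been written.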
Robert Ramey
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk