From: Daryle Walker (darylew_at_[hidden])
Date: 2008-08-13 17:28:56
I was thinking about adding serialization to some types I've been
working on in the sandbox. First I tried to recall how Mr. Ramey
said serialization can be tested. I couldn't find the specific post
I was thinking of, but other posts I found gave me the answer.
Reading those posts prompted me to ask more questions.
I could reduce the classes I'm working with to:
//=============================================
class computer;
class context
{
public:
typedef boost::array<boost::uint_least32_t, 4> value_type;
context(); // use auto copy-ctr, copy-=, dtr
void operator ()( bool ); // consumer
bool operator ==( context const & ) const; // equals
bool operator !=( context const & ) const; // not-equals
value_type operator ()() const; // producer
private:
friend class computer;
boost::uint_fast64_t length;
boost::array<boost::uint_fast32_t, 4> buffer;
boost::array<bool, 512> queue;
template < class Archive >
void serialize( Archive &ar, const unsigned int version );
};
class computer
: public convenience_methods_base<context>
{
// An object of type "context" is incorporated in this object
// due to the base class. A mutable/const pair of non-static
// member functions named "context()" gives access to the inner
// context object.
public:
typedef context::value_type value_type;
// Put various access member functions here that forward to the
// internals of the "context" type, which work because of the
// friend declaration.
private:
template < class Archive >
void serialize( Archive &ar, const unsigned int version );
};
//=============================================
I initially planned to have serialization functions for these two
classes, the "convenience_methods_base" base class template, plus two
other class templates (a base class and a support class) that
"convenience_methods_base" uses. But the e-mail search I mentioned
found a thread from May 2007 (on the main Boost list) that suggested
that the serialization of a non-primitive should match the user's
external representation of the type, and not the type's particular
internal structure. So I decided to keep the serialization protocol
just for the two public-facing classes, "context" and "computer."
I figured that the "computer" object can be serialized like:
//=============================================
template < class Archive >
inline void computer::serialize( Archive &ar, const unsigned int version )
{ ar & boost::serialization::make_nvp("context", this->context()); }
//=============================================
Which leaves how "context" objects are serialized. After thinking
about it for hours, I decided to just whip out something quick &
dirty and refine it later. So:
//=============================================
template < class Archive >
inline void context::serialize( Archive &ar, const unsigned int version )
{
ar & BOOST_SERIALIZATION_NVP( length )
& BOOST_SERIALIZATION_NVP( buffer )
& BOOST_SERIALIZATION_NVP( queue );
}
//=============================================
would give a final serialization, in my test file, of:
//=============================================
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE boost_serialization>
<boost_serialization signature="serialization::archive" version="5">
<test class_id="0" tracking_level="0" version="0">
<context class_id="1" tracking_level="0" version="0">
<length>1</length>
<buffer class_id="2" tracking_level="0" version="0">
<elems>
<count>4</count>
<item>1732584193</item>
<item>4023233417</item>
<item>2562383102</item>
<item>271733878</item>
</elems>
</buffer>
<queue class_id="3" tracking_level="0" version="0">
<elems>
<count>512</count>
<item>1</item>
<item>0</item>
<!-- I'll spare you, and the mail server, 509 more "<item>0</item>" lines -->
<item>0</item>
</elems>
</queue>
</context>
</test>
</boost_serialization>
//=============================================
Now I started refining, keeping the principle of not leaking
implementation details in mind. The problem here is the array
counts, which I don't need since they'll never change. The first one
I can fix by writing each element separately:
//=============================================
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE boost_serialization>
<boost_serialization signature="serialization::archive" version="5">
<test class_id="0" tracking_level="0" version="0">
<context class_id="1" tracking_level="0" version="0">
<length>1</length>
<buffer-A>1732584193</buffer-A>
<buffer-B>4023233417</buffer-B>
<buffer-C>2562383102</buffer-C>
<buffer-D>271733878</buffer-D>
<message-tail class_id="2" tracking_level="0" version="0">
<elems>
<count>512</count>
<item>1</item>
<item>0</item>
<!-- 509 more "<item>0</item>" lines -->
<item>0</item>
</elems>
</message-tail>
</context>
</test>
</boost_serialization>
//=============================================
I've always wanted to use something like a base-64 string encoding of
the bit array, because it's cool and it'd save space. I added
conversion functions between the bit array and a std::string, and
then (de)serialized the string. I also had to separate "serialize"
into "save" and "load" since the two conversions are complementary,
not identical. So now I have:
//=============================================
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE boost_serialization>
<boost_serialization signature="serialization::archive" version="5">
<test class_id="0" tracking_level="0" version="0">
<context class_id="1" tracking_level="0" version="0">
<length>1</length>
<buffer-A>1732584193</buffer-A>
<buffer-B>4023233417</buffer-B>
<buffer-C>2562383102</buffer-C>
<buffer-D>271733878</buffer-D>
<message-tail>g</message-tail>
</context>
</test>
</boost_serialization>
//=============================================
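For readers who want to see the idea in code, here is a minimal sketch of a bit-array-to-string converter. It is not the code from the sandbox; the names "base64url_alphabet" and "encode_bits" are hypothetical, and both the URL-safe alphabet and the most-significant-bit-first packing are assumptions inferred from the sample archives in this post (a single 1 bit encodes as "g").

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// URL-safe base-64 alphabet; an assumption inferred from the sample output.
static const char base64url_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

// Hypothetical helper: pack the first "count" bits, most significant bit
// first, six bits per character. A trailing partial sextet is zero-padded.
inline std::string encode_bits( const std::vector<bool> &bits,
                                std::size_t count )
{
    assert( count <= bits.size() );

    std::string  out;
    unsigned     sextet = 0;   // bits collected for the current character
    std::size_t  filled = 0;   // how many bits "sextet" holds so far

    for ( std::size_t i = 0; i < count; ++i )
    {
        sextet = ( sextet << 1 ) | ( bits[i] ? 1u : 0u );
        if ( ++filled == 6 )
        {
            out += base64url_alphabet[ sextet ];
            sextet = 0;
            filled = 0;
        }
    }

    // Left-align any leftover bits within a final character.
    if ( filled > 0 )
        out += base64url_alphabet[ sextet << (6 - filled) ];

    return out;
}
```

The matching "load" side needs the inverse conversion, which is why a single "serialize" member no longer suffices.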
Then I added tests for: exactly 6 bits (i.e. one base-64 letter); a
sextet (actually two) and a partial sextet together; filling a queue
to capacity (actually one short of that since a full queue
automatically activates a turnover); and going past capacity
resulting in a new hash buffer and an empty message-tail.
//=============================================
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE boost_serialization>
<boost_serialization signature="serialization::archive" version="5">
<test class_id="0" tracking_level="0" version="0">
<context class_id="1" tracking_level="0" version="0">
<length>1</length>
<buffer-A>1732584193</buffer-A>
<buffer-B>4023233417</buffer-B>
<buffer-C>2562383102</buffer-C>
<buffer-D>271733878</buffer-D>
<message-tail>g</message-tail>
</context>
</test>
</boost_serialization>
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE boost_serialization>
<boost_serialization signature="serialization::archive" version="5">
<test class_id="0" tracking_level="0" version="0">
<context class_id="1" tracking_level="0" version="0">
<length>6</length>
<buffer-A>1732584193</buffer-A>
<buffer-B>4023233417</buffer-B>
<buffer-C>2562383102</buffer-C>
<buffer-D>271733878</buffer-D>
<message-tail>q</message-tail>
</context>
</test>
</boost_serialization>
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE boost_serialization>
<boost_serialization signature="serialization::archive" version="5">
<test class_id="0" tracking_level="0" version="0">
<context class_id="1" tracking_level="0" version="0">
<length>14</length>
<buffer-A>1732584193</buffer-A>
<buffer-B>4023233417</buffer-B>
<buffer-C>2562383102</buffer-C>
<buffer-D>271733878</buffer-D>
<message-tail>qQg</message-tail>
</context>
</test>
</boost_serialization>
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE boost_serialization>
<boost_serialization signature="serialization::archive" version="5">
<test class_id="0" tracking_level="0" version="0">
<context class_id="1" tracking_level="0" version="0">
<length>511</length>
<buffer-A>1732584193</buffer-A>
<buffer-B>4023233417</buffer-B>
<buffer-C>2562383102</buffer-C>
<buffer-D>271733878</buffer-D>
<message-tail>ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_AAAAAAAAAAH__________g</message-tail>
</context>
</test>
</boost_serialization>
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE boost_serialization>
<boost_serialization signature="serialization::archive" version="5">
<test class_id="0" tracking_level="0" version="0">
<context class_id="1" tracking_level="0" version="0">
<length>512</length>
<buffer-A>2631642121</buffer-A>
<buffer-B>80961853</buffer-B>
<buffer-C>4033330630</buffer-C>
<buffer-D>497373075</buffer-D>
<message-tail></message-tail>
</context>
</test>
</boost_serialization>
//=============================================
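The size of each message-tail in the five archives above can be cross-checked from the "length" value alone. The sketch below assumes, based on the last archive, that the queue empties after every 512 bits, and that each character carries six bits. The helper names "queued_bits" and "tail_chars" are hypothetical, and <cstdint> assumes a newer compiler (boost/cstdint.hpp would do otherwise).

```cpp
#include <cassert>
#include <cstdint>

// Bits still waiting in the queue. Assumes a turnover every 512 bits,
// as the <length>512</length> archive with its empty tail suggests.
inline unsigned queued_bits( std::uint_fast64_t length )
{
    return static_cast<unsigned>( length % 512u );
}

// Characters needed to hold those bits at six bits per character,
// rounded up with integer arithmetic instead of ceil().
inline unsigned tail_chars( std::uint_fast64_t length )
{
    return ( queued_bits( length ) + 5u ) / 6u;
}
```

For the archives above this gives 1, 1, 3, 86, and 0 characters for lengths 1, 6, 14, 511, and 512, matching the strings shown.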
If you want to see the actual work, look at revision/change-set
#48131 in Boost's Subversion set-up. Now to the actual questions:
1. If there's only one sub-object, base or member, that has any
significant data, could someone call the "serialize" member function
of that sub-object directly in the wrapping class's "serialize"?
(This assumes that friendship is set up.) This would make the
wrapping class look identical to the sub-object's class, right? Is
this a good idea?
2. Before actually trying to serialize a string, I was worried that
the string's serialization would include a length count. This would
be unnecessary because the object's "length" attribute already
implies the length of the string (int( ceil( double( length % 512 ) /
6.0 ) )). Here, we see that the string's length isn't explicitly
included in the XML archive, so I have no worries. But what about
non-XML archives? Will the string's length be directly serialized,
wasting space? If so, how can I fix that?
3. Having to add std::string to support serialization makes my class
header heavier. My class uses fixed-sized arrays, so is there any way
that I can avoid allocating a string? For writing out, could I set
up a char-array with the encoding and write that out? For reading
in, can I read the string piecemeal into a char-array, just in case
someone added more characters than required? My converter currently
ignores illegal characters and stops when enough legal characters
have been read. If what I ask is possible, would the reading routine
have to seek to the end of the entry so further serialization isn't
messed up?
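To make question 3 concrete, here is a minimal sketch of the reading behavior described there, working on plain character and bool arrays with no std::string. It says nothing about what the archive classes allow; it only shows the converter side. The names "sextet_value" and "decode_bits" are hypothetical, and the URL-safe alphabet and most-significant-bit-first order are assumptions inferred from the sample archives.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Map one character to its 6-bit value, or -1 if it is not in the
// (assumed) URL-safe base-64 alphabet.
inline int sextet_value( char c )
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";
    // strchr would match the terminating NUL, so reject '\0' up front.
    const char *p = c ? std::strchr( alphabet, c ) : 0;
    return p ? static_cast<int>( p - alphabet ) : -1;
}

// Hypothetical helper: fill bits[0 .. bit_count) from "text", skipping
// illegal characters and stopping once enough legal ones have been read.
// Returns the number of bits actually written.
inline std::size_t decode_bits( const char *text, std::size_t text_len,
                                bool *bits, std::size_t bit_count )
{
    std::size_t written = 0;
    for ( std::size_t i = 0; i < text_len && written < bit_count; ++i )
    {
        const int v = sextet_value( text[i] );
        if ( v < 0 )
            continue;   // ignore illegal characters

        // Unpack most significant bit first; drop padding past bit_count.
        for ( int shift = 5; shift >= 0 && written < bit_count; --shift )
            bits[ written++ ] = ( (v >> shift) & 1 ) != 0;
    }
    return written;
}
```

Anything after the last needed character is simply never examined, so the caller would still have to make sure the archive itself has consumed the whole entry.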
-- Daryle Walker Mac, Internet, and Video Game Junkie darylew AT hotmail DOT com