From: Augustus Saunders (infinite_8_monkey_at_[hidden])
Date: 2002-12-12 19:16:29

Yitzhak Sapir wrote:

>I would like to offer the following definition (based on the
>definition given by Augustus Saunders):
>Serialization is the process of breaking up various
>containers of data into their components and serializing each
>one by one into a stream, in an agreed-upon intermediate exchange
>that enables an appropriate parser to reconstruct ("deserialize")
>necessary information by reading the stream. The containers may
>components of (1) a fixed quantity and type (arrays/bit arrays and
>integral types), (2) varying quantity and fixed type (lists/vectors), (3)
>fixed quantity and varying types (structs and classes), or (4) unit
>quantity and discriminated type (unions). Each component of the
>may be a container in itself, which is why the definition is
>recursive. Necessary information is defined by the needs of the
>Because it is not presumed that the parser (which may be a human
>shares apriori knowledge, it may be necessary to include meta-data
>regarding data types. Various structuring mechanisms may be used in
>the ,
>coming in various flavours--header, pre/post tags, post
>length prepended, packeted, etc. Metadata may be independent or mixed with

Ok, we basically agree here. You've attempted to enumerate what kinds
of data can be serialized, while I just assumed "anything that can be
represented in C++." I found your enumeration a little confusing, and
was wondering if you could rephrase it as what *cannot* be serialized
that can be represented in C++. If it's not obvious to me why
certain things can't be serialized, then I'll ask for clarification.

>I disagree that serialization is by necessity lossy, or that
>by necessity performs transformations and persistence does not.

Yes, I didn't mean to indicate that serialization *must* be lossy,
only that it can be while persistence cannot. Of course, this is a
bit of a fudge as well, seeing as how I only intend to include data
necessary for restoring application state. You might, for example,
not persist anything that is mutable, presuming that it is a cached
value that will be recalculated. However, serialization may output
data in such a way that it would be impossible to reconstruct the
original application state. A CSV, for example, might lose all
structural and type information.
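
As a rough illustration of the CSV point, here is a small sketch. It is
not taken from any library: Field, toCsv and csvCollides are made-up
names, and it assumes C++17 for std::variant. Two different in-memory
rows come out as the same line of text, so a reader of the CSV cannot
tell which one was written.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

// Hypothetical record field: in memory it knows whether it is an int or a string.
using Field = std::variant<int, std::string>;

// Flatten one row to a CSV line. Neither the field types nor the field
// boundaries inside string data are written out (no quoting, by assumption).
std::string toCsv(const std::vector<Field>& row) {
    std::string out;
    for (std::size_t i = 0; i < row.size(); ++i) {
        if (i != 0) out += ',';
        if (const int* n = std::get_if<int>(&row[i])) {
            out += std::to_string(*n);
        } else {
            out += std::get<std::string>(row[i]);
        }
    }
    return out;
}

// Two rows with different types and different structure that produce
// identical output: an int plus a string versus a single string.
bool csvCollides() {
    std::vector<Field> a{Field{42}, Field{std::string("7")}};
    std::vector<Field> b{Field{std::string("42,7")}};
    return toCsv(a) == toCsv(b);
}
```

A real CSV writer would quote embedded commas, which rescues the field
boundaries but still not the types.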

>Persistence may perform non-lossy transformation. If the data is to
>persist in a file, pointers may have to be appropriately
>(Maybe I don't understand the meaning of transformation in the given

I was trying to be brief, and so I had to rely on people's intuitive
sense of some terms. What I was trying to imply was that
serialization might transform the data. Imagine the following class:

class Bitmap {
  Rect mRect;                   // bounds of the image
  std::vector<Pixel> mvPixels;  // the pixel data
};

When serializing this, you might actually transform it into a BMP,
PNG, or JPEG or some such. Now, this is a class that you may prefer
to serialize and deserialize rather than persisting, but if you *do*
persist it, then its members would be stored, as is.
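
To make the distinction concrete, here is a minimal sketch under some
loud assumptions: Rect and Pixel are not defined anywhere in this
thread, so the definitions below are invented, and plain-text PGM (P2)
stands in for BMP/PNG/JPEG only because it is short enough to write by
hand. persist, restore and serializeAsPgm are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins; the post does not define Rect or Pixel.
struct Rect { int left, top, width, height; };
using Pixel = std::uint8_t;  // assumed: 8-bit grayscale

struct Bitmap {
    Rect mRect;
    std::vector<Pixel> mvPixels;  // assumed row-major, width * height entries
};

// "Persisting": write the members as they are, in declaration order.
std::string persist(const Bitmap& b) {
    std::ostringstream out;
    out << b.mRect.left << ' ' << b.mRect.top << ' '
        << b.mRect.width << ' ' << b.mRect.height << ' ' << b.mvPixels.size();
    for (Pixel p : b.mvPixels) out << ' ' << static_cast<int>(p);
    return out.str();
}

// Reading the persisted form back restores every member (symmetric).
Bitmap restore(const std::string& s) {
    std::istringstream in(s);
    Bitmap b{};
    std::size_t n = 0;
    in >> b.mRect.left >> b.mRect.top >> b.mRect.width >> b.mRect.height >> n;
    b.mvPixels.resize(n);
    for (Pixel& p : b.mvPixels) {
        int v = 0;
        in >> v;
        p = static_cast<Pixel>(v);
    }
    return b;
}

// "Serializing": transform into an exchange format, here plain-text PGM.
// PGM has no field for the rectangle's position, so left/top are dropped.
std::string serializeAsPgm(const Bitmap& b) {
    assert(b.mvPixels.size() ==
           static_cast<std::size_t>(b.mRect.width) * b.mRect.height);
    std::ostringstream out;
    out << "P2\n" << b.mRect.width << ' ' << b.mRect.height << "\n255\n";
    for (Pixel p : b.mvPixels) out << static_cast<int>(p) << '\n';
    return out.str();
}
```

restore(persist(b)) gives back the same members, while the PGM output
can be read by an image viewer but no longer says where the rectangle
was positioned.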

>Serialization may perform lossy transformations, or it may
>not. It may be symmetric, or it may not. But serialization always
>involves data fed serially into a stream. (A stream being defined as a
>medium that maintains serial data). Serialization always involves a
>format in which that serial data represents the original data. And
>serialization is the process of connecting between the data (and
>metadata) itself, the format, and the stream.

Ok, and persistence may perform transformations that are not lossy,
say an object-relational model, or converting int to BigInt, and
persistence must be symmetric.


