
From: quendezus (quendez_at_[hidden])
Date: 2002-01-24 09:16:22


I would be very interested in discussing a proposal for a persistence
library. However, I find the few messages already posted on the
subject unclear. It seems many important issues are being
taken for granted.

1-

Is this a common need? What should the library do?

2-

I have the feeling that a persistence library would need the
following mechanisms:
- an RTTI+ system that allows object creation at runtime, based
on type information (I call it RTTI+ to indicate that it offers more
functionality than standard C++ RTTI).
- a serialization system that would give facilities to serialize and
deserialize objects into a kind of "storage".
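
To make the RTTI+ idea concrete, here is a minimal sketch of runtime
object creation from a type name. It assumes modern C++ (C++11 or
later), and every name in it (Object, TypeRegistry, Circle, Square) is
hypothetical, made up for illustration. It is not a proposed interface.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical common base class for everything the registry can create.
struct Object {
    virtual ~Object() = default;
    virtual std::string type_name() const = 0;
};

// Maps a type name to a function that default-constructs that type.
class TypeRegistry {
public:
    using Creator = std::function<std::unique_ptr<Object>()>;

    // Registers T under `name`; registering the same name again replaces it.
    template <class T>
    void add(const std::string& name) {
        creators_[name] = [] { return std::unique_ptr<Object>(new T()); };
    }

    // Returns a new instance, or a null pointer if the name is unknown.
    std::unique_ptr<Object> create(const std::string& name) const {
        auto it = creators_.find(name);
        if (it == creators_.end()) {
            return nullptr;
        }
        return it->second();
    }

private:
    std::map<std::string, Creator> creators_;
};

// Two example types, standing in for plug-in or script-visible classes.
struct Circle : Object {
    std::string type_name() const override { return "Circle"; }
};

struct Square : Object {
    std::string type_name() const override { return "Square"; }
};
```

After registering Circle and Square, a call such as
registry.create("Square") yields a Square through an Object pointer,
which is the part standard RTTI does not provide.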

In my work as a developer, I need these two mechanisms on their own
more often than I need a persistence library. In any case, I don't
think a persistence library can be built without them.

3-

Assuming a persistence library is needed, I think RTTI+ and
serialization should each have their own Boost library. The reasons are:
- I find each of these domains difficult, with many issues,
especially if you want to provide simple and generic code.
- Boost users could benefit from such libraries. For example, I use
RTTI+ heavily for plug-in systems and script parsing (without any
need for persistence). I use serialization without persistence when I
want to implement simple load/save code.

What do you think about that?

Sylvain

--- In boost_at_y..., Tom Becker <voidampersand_at_f...> wrote:
> On Wed, 23 Jan 2002 10:24:51 +0100, Matthias Troyer
> <troyer_at_i...> wrote:
> >I want to bring up these issues here:
> >
> >i) I often have to (de)serialize large arrays of numbers, for which
> >an optimized function should exist that can (de)serialize a C-array
> >in one function call. This also allows support for data formats such
> >as HDF
>
> That's a good feature. It can make a huge difference in performance.
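
As an illustration of point i), here is a rough sketch of an archive
with a bulk array path next to the per-value one. ByteBuffer and its
member functions are hypothetical names. The sketch assumes trivially
copyable element types and writes them in native byte order, so the
output is not portable between platforms.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical in-memory binary archive (native byte order, not portable).
class ByteBuffer {
public:
    // Writes a single value by forwarding to the bulk path.
    template <class T>
    void write(const T& value) { write_array(&value, 1); }

    // Bulk path: appends `count` contiguous elements in one call.
    template <class T>
    void write_array(const T* data, std::size_t count) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "bulk copy requires a trivially copyable type");
        const unsigned char* p = reinterpret_cast<const unsigned char*>(data);
        bytes_.insert(bytes_.end(), p, p + count * sizeof(T));
    }

    // Reads `count` elements back into `out`, advancing the read position.
    template <class T>
    void read_array(T* out, std::size_t count) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "bulk copy requires a trivially copyable type");
        const std::size_t n = count * sizeof(T);
        if (n > bytes_.size() - pos_) {
            throw std::out_of_range("not enough data in buffer");
        }
        if (n == 0) {
            return;
        }
        std::memcpy(out, bytes_.data() + pos_, n);
        pos_ += n;
    }

    std::size_t size() const { return bytes_.size(); }

private:
    std::vector<unsigned char> bytes_;
    std::size_t pos_ = 0;  // read position
};
```

The point of the separate write_array entry is that a format backend
can override it with a single block transfer instead of looping over
elements.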
>
> >ii) (De)serialization of pointers
>
> The simple case is to serialize the pointed-to data. The interesting
> case is references to shared data, where it would be inefficient and
> possibly harmful to deserialize the same object more than once. A
> typical implementation has a data structure associating pointers and
> object IDs. If a pointer doesn't have an ID, you assign it one, and
> serialize both the ID and the pointed-to data. After that, you only
> have to serialize the ID. One nice trick is that references to known
> constant data can be assigned known IDs, so their data never has to
> be serialized. Deserialization simply reverses the process. The
> details of what the IDs look like, and how the IDs are stored
> relative to the pointed-to data, will depend on the framework and
> file format(s) the application needs to work with. Pointer
> serialization should be a separate mechanism that is layered on top
> of basic serialization.
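
The ID scheme described above could be sketched roughly as follows.
Node, PointerWriter, PointerReader, and the "new"/"ref" text tags are
all assumptions made for the example. Null pointers and the
known-constant trick are left out to keep it short.

```cpp
#include <map>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical payload type.
struct Node {
    int value = 0;
};

// Writes "new <id> <data>" the first time a pointer is seen,
// and only "ref <id>" on every later occurrence.
class PointerWriter {
public:
    explicit PointerWriter(std::ostream& out) : out_(out) {}

    void write(const Node* p) {
        auto it = ids_.find(p);
        if (it != ids_.end()) {
            out_ << "ref " << it->second << ' ';
            return;
        }
        const unsigned id = next_id_++;
        ids_[p] = id;
        out_ << "new " << id << ' ' << p->value << ' ';
    }

private:
    std::ostream& out_;
    std::map<const Node*, unsigned> ids_;  // pointer -> object ID
    unsigned next_id_ = 1;
};

// Reverses the process: "new" allocates and remembers the object,
// "ref" returns the object already created for that ID.
class PointerReader {
public:
    explicit PointerReader(std::istream& in) : in_(in) {}

    std::shared_ptr<Node> read() {
        std::string tag;
        unsigned id = 0;
        if (!(in_ >> tag >> id)) {
            throw std::runtime_error("truncated input");
        }
        if (tag == "new") {
            auto node = std::make_shared<Node>();
            if (!(in_ >> node->value)) {
                throw std::runtime_error("missing object data");
            }
            objects_[id] = node;
            return node;
        }
        if (tag != "ref") {
            throw std::runtime_error("unknown tag: " + tag);
        }
        auto it = objects_.find(id);
        if (it == objects_.end()) {
            throw std::runtime_error("reference to unknown object ID");
        }
        return it->second;
    }

private:
    std::istream& in_;
    std::map<unsigned, std::shared_ptr<Node>> objects_;  // object ID -> object
};
```

Writing the same Node twice stores its data once, and reading the
stream back gives two pointers to a single object.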
>
> >iii) using runtime polymorphism with the persistence library. At the
> >moment only compile time polymorphism is implemented, and the
> >Reader/Writer needs to be chosen at compile time. This is a problem
> >in my applications, where the (de)serialization is controlled from
> >an application framework, which calls a virtual save/load function
> >of a simulation object. For this to work the save and load functions
> >for the basic data types need to be virtual functions too.
>
> The persistence library needs to serialize some type information
> along with the data. When deserializing, the caller is expecting a
> pointer to a particular type. It's okay if it actually gets a pointer
> to a derived type. The persistence library just has to read the type
> information and allocate the actual type, whatever it is.
>
> I like the approach where there is a reader function and a writer
> function registered for each record format in the persistent data.
> This way an object can support multiple record formats as necessary.
> All the other approaches, such as using reader and writer objects, or
> calling virtual save/load functions, can be used from the reader and
> writer functions. It's by far the most flexible approach. The
> downside is the functions have to be registered. I think it's easiest
> to do that by hand, but there are ways it can be done automatically
> and the choice can be left up to the framework or application
> developer.
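
One possible reading of the registered reader/writer approach, as a
sketch only: each record format name maps to a reader function and a
writer function, and the format name is stored ahead of the record.
Point, PointFormats, make_point_formats, and the "point_v1" and
"point_v2" formats are hypothetical. Registration is done by hand here.

```cpp
#include <functional>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical object that exists in two record formats.
struct Point {
    int x = 0;
    int y = 0;
};

// Registry of reader and writer functions, keyed by record format name.
class PointFormats {
public:
    using Reader = std::function<Point(std::istream&)>;
    using Writer = std::function<void(std::ostream&, const Point&)>;

    void add(const std::string& format, Reader reader, Writer writer) {
        readers_[format] = std::move(reader);
        writers_[format] = std::move(writer);
    }

    // Writes the format name, then the record in that format.
    void save(const std::string& format, std::ostream& out, const Point& p) const {
        auto it = writers_.find(format);
        if (it == writers_.end()) {
            throw std::runtime_error("unknown format: " + format);
        }
        out << format << ' ';
        it->second(out, p);
    }

    // Reads the format name and dispatches to the matching reader.
    Point load(std::istream& in) const {
        std::string format;
        if (!(in >> format)) {
            throw std::runtime_error("truncated input");
        }
        auto it = readers_.find(format);
        if (it == readers_.end()) {
            throw std::runtime_error("unknown format: " + format);
        }
        return it->second(in);
    }

private:
    std::map<std::string, Reader> readers_;
    std::map<std::string, Writer> writers_;
};

// Hand-written registration: the old format stores only x, the new one x and y.
inline PointFormats make_point_formats() {
    PointFormats formats;
    formats.add("point_v1",
                [](std::istream& in) { Point p; in >> p.x; return p; },
                [](std::ostream& out, const Point& p) { out << p.x << ' '; });
    formats.add("point_v2",
                [](std::istream& in) { Point p; in >> p.x >> p.y; return p; },
                [](std::ostream& out, const Point& p) { out << p.x << ' ' << p.y << ' '; });
    return formats;
}
```

A stream that mixes old and new records can then be read through the
same load call, which is how one object ends up supporting several
record formats.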
>
> >Any ideas/comments how to proceed with the persistence library,
> >which seems to me a very important one?
>
> I'd like to see a persistence library that can replace the
> persistence code in all the application frameworks that are out
> there. At the least, it should have a design that allows writing
> adapters so it can be data format compatible with other persistence
> frameworks.
>
> A good place to start would be understanding the inputs and outputs
> of the most commonly used existing persistence mechanisms. I'm fairly
> familiar with most of the persistence approaches that are used or
> have been used on the Mac. If there are others who are interested in
> doing a general solution, let's talk.
>
> Regards,
>
> Tom
>
> --
> Tom Becker                "Within C++, there is a much smaller and
> <voidampersand_at_f...>   cleaner language struggling to get out."
>                                            -- Bjarne Stroustrup


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk