From: Joaquín Mª López Muñoz (joaquin_at_[hidden])
Date: 2005-09-23 02:10:12


David Abrahams wrote:

> Joaquín Mª López Muñoz <joaquin_at_[hidden]> writes:
> > Suppose we are implementing serialization for a custom container:
> > save and load are straightforward enough:
> >
> > class container{
> > save(...)
> > {
> > for(const_iterator it=begin,it_end=end();it!=it_end;++it){
> > ar<<*it;
> > }
> > }
> > load(...)
> > {
> > clear();
> > for(iterator it=begin,it_end=end();it!=it_end;++it){
> > value_type v;
> > ar>>v;
> > push_back(v);
> > ar.reset_object_address(&v,&back());
>
> Assuming value_type is default constructible and push_back doesn't
> invalidate any addresses of other objects, I guess so. But in that
> case I'd still preallocate enough elements and deserialize them in
> place.

Some details were omitted for brevity. v is not default
constructed, but rather constructed via serialization::load_construct_data_adl.
Elements are actually stable in multi_index_container. push_back() is
just pseudocode for your convenience; insertion is done by other means.
The preallocation scheme does not work for several reasons: one of them is
that an associative (set-like) index cannot be pre-filled with elements
without knowing what their values will be.
If you'd like to see the real thing, have a look at the member functions
save_ and load_ in boost/multi_index_container.hpp.

> > }
> > }
> > };
> >
> > Now we want to add serialization for iterators. One ugly way would be
> > as follows:
> >
> > class iterator{
> > save(...){
> > ar<<&(operator*()); // save pointer to element
> > }
> > load(...){
> > value_type* pv;
> > ar>>pv;
> > node* pn;
> > // cast from pv to pn: possibly nonportable.
> > assign(pn);
>
> Clearly. The "right thing to do" is to serialize all the nodes as part of
> serializing the container. Then this "just works," no?
>
> > }
> > };
> >
> > This is potentially nonportable and, besides, won't work for nontracked
> > value_types.
> > What we want is to archive pointers to the internal nodes, rather than the values:
>
> Right.
>
> > class iterator
> > {
> > save(...){
> > ar<<node_ptr;
> > }
> > load(...){
> > node* pn;
> > ar>>pn;
> > node_ptr=pn;
> > }
> > };
> >
> > But for this to work, nodes must be serialized first so that they can be tracked
> > later.
> >
> > class container{
> > save(...)
> > {
> > for(const_iterator it=begin,it_end=end();it!=it_end;++it){
> > ar<<*it; // save value
> > ar<<*it.node_ptr; // save node
>
> Why wouldn't your node just implement serialization that serializes
> its contained value?

I knew you'd ask that :) It wouldn't work on loading. Take a look again
at the loading part of the container:

/* 1 */ value_type v;
/* 2 */ ar>>v;
/* 3 */ push_back(v);
/* 4 */ ar.reset_object_address(&v,&back());
/* 5 */ ar>>*(--end()).node_ptr; // "load" node

It is not until /* 3 */ is executed that the node comes into existence. But for
the node to be created, we must first know its associated value. Remember,
I cannot just insert a node with a "dummy" value into an associative container:
the value crucially determines where the node ends up in the container.
Your next question could be: well, why don't you load the node out of place,
value and all, and then link it into the container? This would work, but it would
force me to implement an insertion method different from the one currently used,
in which the value is not assigned to the node until the node is properly linked.
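
To make the ordering constraint concrete, here is a minimal sketch that uses
std::set<int> as a stand-in for an ordered index of multi_index_container. It
does not use Boost.Serialization at all: load_into_set is a hypothetical helper,
and the addresses it records are simply the ones that a call such as
reset_object_address would have to be told about.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <vector>

// Hypothetical helper, not part of Boost: reads whitespace-separated ints
// from `in`, inserts each one into `out`, and returns the final address of
// every element in the order in which it was loaded.
std::vector<const int*> load_into_set(std::istream& in, std::set<int>& out)
{
  std::vector<const int*> final_addresses;
  out.clear();
  int v;                                    // plays the role of `value_type v`
  while (in >> v) {                         // plays the role of `ar>>v`
    auto res = out.insert(v);               // the node is created only here,
                                            // at a position chosen by v
    final_addresses.push_back(&*res.first); // where the value now lives
  }
  return final_addresses;
}
```

Feeding it "3 1 2" leaves the set ordered as 1, 2, 3, while the recorded
addresses follow load order, so the element loaded second is the one found at
begin(). No node could have been placed before its value was known.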

>
> > }
> > }
> > load(...)
> > {
> > clear();
> > for(iterator it=begin,it_end=end();it!=it_end;++it){
> > value_type v;
> > ar>>v;
> > push_back(v);
> > ar.reset_object_address(&v,&back());
> > ar>>*(--end()).node_ptr; // "load" node
> > }
> > }
> > };
> >
> > That's the purpose of node serialization stuff. The implementation does nothing
> > except signalling Boost.Serialization where later node pointers must
> > point to.
>
> Well, I could probably get this if I thought hard enough about it, but
> I don't yet. Of course I could be missing something, it seems like
> a hack to me. Serializing and deserializing the nodes directly seems
> a lot cleaner.

Hopefully this is answered in my previous paragraph.

> > Let T be a serializable type and Pred an associated equality
> > predicate inducing an equivalence relationship on T. Then T is said
> > to be EquivalentSerializable (under Pred) if
> >
> > p(x,y)==true
> >
> > for all p of type Pred
>
> You said Pred was a predicate; now you're saying it's a type. I think
> you were right the first time. You'll never satisfy that for all p of
> type bool(*)(int,int) for example.

A predicate is a type, at least according to

http://www.sgi.com/tech/stl/BinaryPredicate.html

(BTW, it is BinaryPredicate that I meant, rather than UnaryPredicate.)
I'm naming types with an uppercase first letter and objects in lowercase.

> > and x and y of type T such that y is a restored copy of x.
> >
> > This leaves to the implementor of an UDT the open task of giving the
> > appropriate associated equality predicate (by default we can assume
> > std::equal_to).
>
> I think you mean ==

== is not a BinaryPredicate (in the SGI sense): it is not even a type.
I really mean std::equal_to (std::equal_to<T>, to be precise).

> > Then we can rewrite the postcondition on std::vector as
> >
> > if T is EquivalentSerializable under Pred, std::vector<T> is
> > EquivalentSerializable.
>
> Nope. You have to say under what predicate it is
> EquivalentSerializable. And when a nonstandard predicate is used for
> T there may not be any such predicate for the vector.

I think we agree here: when I said above "by default we can
assume std::equal_to" what I meant is: EquivalentSerializable is short for
EquivalentSerializable under std::equal_to. Otherwise we'd say
EquivalentSerializable under Pred and specify Pred. So I think my
statement about std::vector is correct and addresses your concerns.

> > (The statement is a little more complex if we take a Pred other than
> > the default.)

This addresses your point about a "nonstandard predicate". I hope
my meaning is clear.
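
As an illustration of the difference between the default predicate and an
explicitly specified one, here is a rough sketch. Everything in it is assumed
for the example: restored_copy imitates saving and loading with a plain text
stream rather than a Boost archive, and holds_for and same_text are
hypothetical names. Checking a single value does not, of course, establish the
property for all x.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>

// Hypothetical stand-in for "save x to an archive, then load y from it":
// the value is written to a text stream and read back.
template <typename T>
T restored_copy(const T& x)
{
  std::stringstream ss;
  ss << x;
  T y{};
  ss >> y;
  return y;
}

// Evaluates p(x,y) for one x and its restored copy y.
// Pred defaults to std::equal_to<T>, as in the convention above.
template <typename T, typename Pred = std::equal_to<T>>
bool holds_for(const T& x, Pred p = Pred())
{
  return p(x, restored_copy(x));
}

// Hypothetical non-default predicate: two doubles are equivalent when they
// produce the same text at the stream's default precision.
struct same_text
{
  bool operator()(double a, double b) const
  {
    std::ostringstream sa, sb;
    sa << a;
    sb << b;
    return sa.str() == sb.str();
  }
};
```

With this toy round trip, an int comes back equal under std::equal_to, whereas
0.1 + 0.2 is written as "0.3" and comes back as a different double: the check
fails under std::equal_to<double> but passes under same_text.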

> > Of course, this EquivalentSerializable concept does
> > not save us the task of first providing archive compatibility and
> > Serializable concepts the hard way
>
> Of course not.
>
> > and it is only applicable intraprogram.
>
> That's only true if you consider Pred to be a callable C++ predicate
> rather than a logical one.

I feel comfortable referring to C++ entities alone. Such a logical predicate
accepting arguments from different program executions would involve
moving up to a higher ontological level, which is a murky area. You know,
Ockham's razor and all that stuff. This is related to my objection #2
from a couple of posts back.

BTW, there's a method by which we can extend EquivalentSerializable
to the interprogram domain without invoking non-C++ predicates. But
we can defer this to a later time.

> > Does this sound good to you?
>
> Yes and no. It's crafty, but you have a pretty big gaping hole as
> demonstrated by the vector example. I would be very happy with the
> good old fuzzy notion of equivalence here, but if you can close the
> hole, I don't mind adding predicates to the mix.
>
> Okay, how about this: the predicate is tightly bound to the type. So
> the predicate for vector<T> is defined to be that the two vectors have
> the same length and that each corresponding element of the two vectors
> satisfies the predicate that's bound to T.

Yes, this is an acceptable refinement. If I understand you correctly, you
propose to fix the Pred type for each Serializable type. My proposal is more of
a "concept template" ranging over a free Pred type.
Let me write this idea down explicitly, so that we can agree we are
talking about the same thing:

* Let T be a Serializable type and Pred an associated BinaryPredicate type
(http://www.sgi.com/tech/stl/BinaryPredicate.html) inducing an equivalence
relationship on T. Then T is said to be *EquivalentSerializable under Pred* if

  p(x,y)==true

for all p of type Pred, and x and y of type T such that y is a restored
copy of x.

* A Serializable type T may provide a designated type X such that
T is EquivalentSerializable under X. In this case, we use the notation
SerializationEquality[T] to refer to such X.

* We say that T is *EquivalentSerializable* if SerializationEquality[T] is
provided.

* [vector example] if T is EquivalentSerializable, then
SerializationEquality[std::vector<T>] is the type:

struct {
  bool operator()(const std::vector<T>& x,const std::vector<T>& y)const{
    return
      x.size()==y.size()&&
      std::equal(x.begin(),x.end(),y.begin(),SerializationEquality[T]());
  }
};
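
The SerializationEquality[T] notation above is not C++. One possible spelling,
offered only as a sketch, is a traits template that is specialized for each
EquivalentSerializable type. The name serialization_equality and the choice of
std::equal_to<int> for int are assumptions made for this example and are not
part of any library.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical trait spelling of SerializationEquality[T]. The primary
// template is left undefined: a type is EquivalentSerializable only if a
// specialization provides the designated predicate type.
template <typename T>
struct serialization_equality;

// Assumption for this sketch: int is EquivalentSerializable under
// std::equal_to<int>.
template <>
struct serialization_equality<int>
{
  using type = std::equal_to<int>;
};

// The vector rule: same length, and corresponding elements satisfy the
// predicate bound to T.
template <typename T>
struct serialization_equality<std::vector<T>>
{
  struct type
  {
    bool operator()(const std::vector<T>& x, const std::vector<T>& y) const
    {
      using elem_pred = typename serialization_equality<T>::type;
      return x.size() == y.size() &&
             std::equal(x.begin(), x.end(), y.begin(), elem_pred());
    }
  };
};
```

Because the vector specialization refers back to the trait for T, the rule
composes: serialization_equality<std::vector<std::vector<int>>>::type is
obtained without any further code.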

Is this what you propose? Well, I'd say your idea and mine are mere
variations on the same theme; I couldn't say which one is more convenient.

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo

