From: JOAQUIN LOPEZ MUÑOZ (joaquin_at_[hidden])
Date: 2005-06-25 00:22:20


----- Original message -----
From: David Abrahams <dave_at_[hidden]>
Date: Saturday, June 25, 2005 1:39 am
Subject: Re: [boost] [serialization] a proposal for an alternative
        to the new const-saving rule

> "JOAQUIN LOPEZ MUÑOZ" <joaquin_at_[hidden]> writes:
>
>
> > From: David Abrahams <dave_at_[hidden]>
>
> >> But all of this misses the high-level problem: the author of the
> >> code doesn't know what he's doing. You simply can't serialize
> >> objects from distinct scopes with tracking into the same archive,
> >> because there may be aliasing.
> >
> > Totally agreed, this is what we are trying to detect
> > in order to protect the author of the code.
> >
> >> And there's nothing we can reasonably do to detect that
> >> problem when the aliased objects have the same type
> >
> > Nothing? I'm afraid I don't get you. A perfect
> > aliasing detection mechanism is probably impossible
> > to implement, but the hash test at least approximates it.
> > This is better than providing no safety mechanism,
> > as I understand you advocate.
>
> I'm not sure it is. There's an imposition on users: all the types
> they want to serialize have to support hashing.

No, no. Please check the piece of pseudocode in my first
message: the hash code is built automatically by
Boost.Serialization, without any intervention from the user,
and certainly without any requirement that the type be hashable
(in the sense of providing a hash_value overload or something
like that).

Let me illustrate with an example:

#include <string>

struct labelled_point
{
  int x;
  int y;
  std::string label;

  template <class Archive>
  void serialize(Archive & ar, const unsigned int version)
  {
    ar & x;
    ar & y;
    ar & label;
  }
};

struct labelled_segment
{
  labelled_point p0,p1;

  template <class Archive>
  void serialize(Archive & ar, const unsigned int version)
  {
    ar & p0;
    ar & p1;
  }
};

So, given an object seg of type labelled_segment, the framework
automatically constructs the following associated hash value:
  
  combine(
    combine(
      boost::hash<int>()(seg.p0.x),
      boost::hash<int>()(seg.p0.y),
      boost::hash<std::string>()(seg.p0.label)),
    combine(
      boost::hash<int>()(seg.p1.x),
      boost::hash<int>()(seg.p1.y),
      boost::hash<std::string>()(seg.p1.label)))

(combine(...) is shorthand for the obvious combination
formula of hash values using boost::hash_combine.)

The process recurses down to primitive types, as specified
in my proposed pseudocode, and only those need to be
hashable, which fortunately they are.
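
The recursion can be sketched with a toy archive. This is only an
illustration under stated assumptions, not the Boost.Serialization
implementation: hash_archive, hash_of and demo_equal_state are
hypothetical names, std::hash stands in for boost::hash so that the
sketch compiles without Boost, and the values are folded in one flat
sequence rather than nested as in the combine(...) expression above.
Only int and std::string are treated as primitives here.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical archive: instead of writing data, it folds every
// primitive it is handed into a running hash value.
class hash_archive
{
public:
  hash_archive() : seed_(0) {}

  std::size_t value() const { return seed_; }

  // Primitives are hashed directly and mixed into the seed.
  hash_archive & operator&(int & v)
  { return mix(std::hash<int>()(v)); }

  hash_archive & operator&(std::string & v)
  { return mix(std::hash<std::string>()(v)); }

  // Any other type recurses through its own serialize() member, so the
  // user never writes a hash function for it.
  template <class T>
  hash_archive & operator&(T & v)
  {
    v.serialize(*this, 0u);
    return *this;
  }

private:
  hash_archive & mix(std::size_t h)
  {
    // Classic hash_combine-style mixing step.
    seed_ ^= h + 0x9e3779b9 + (seed_ << 6) + (seed_ >> 2);
    return *this;
  }

  std::size_t seed_;
};

struct labelled_point
{
  int x;
  int y;
  std::string label;

  template <class Archive>
  void serialize(Archive & ar, const unsigned int /*version*/)
  {
    ar & x;
    ar & y;
    ar & label;
  }
};

struct labelled_segment
{
  labelled_point p0, p1;

  template <class Archive>
  void serialize(Archive & ar, const unsigned int /*version*/)
  {
    ar & p0;
    ar & p1;
  }
};

// Hypothetical helper: hash an object through its serialize() function.
template <class T>
std::size_t hash_of(T & obj)
{
  hash_archive ar;
  ar & obj;
  return ar.value();
}

// Usage sketch: objects with equal serialized state hash equally.
void demo_equal_state()
{
  labelled_segment a = {{0, 0, "origin"}, {3, 4, "tip"}};
  labelled_segment b = a;
  assert(hash_of(a) == hash_of(b));
}
```

Note that labelled_point and labelled_segment supply nothing beyond
the serialize() members they already need for saving.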

>
> It is nice that the serialization library automatically takes care of
> hashing aggregated types and leaving out the unserialized data...
>
> uh, wait: this will never work unless you plan only to do shallow
> hashing. Otherwise you will get an exponential explosion for some
> object graphs. Is that your intention?

I don't follow you. The hash value is calculated as part
of the saving process itself, so it has the very same complexity.
I have a hunch you might mean something else;
could you please elaborate?
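
To make the complexity claim concrete, here is a minimal sketch of a
saving archive that hashes each primitive in the same call that writes
it. It is an assumption-laden illustration, not library code:
hashing_text_oarchive, point, segment and demo_single_pass are
hypothetical names, std::hash replaces boost::hash, and only value
members are covered (pointers and object tracking are left out).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <sstream>
#include <string>

// Hypothetical saving archive: every primitive is written out and mixed
// into the hash during the same visit, so hashing adds no traversal.
class hashing_text_oarchive
{
public:
  hashing_text_oarchive() : seed_(0), visits_(0) {}

  hashing_text_oarchive & operator&(int & v)
  {
    out_ << v << ' ';            // the save itself
    mix(std::hash<int>()(v));    // the hash, computed in the same visit
    return *this;
  }

  // Non-primitive types recurse through their serialize() member.
  template <class T>
  hashing_text_oarchive & operator&(T & v)
  {
    v.serialize(*this, 0u);
    return *this;
  }

  std::string text() const { return out_.str(); }
  std::size_t hash() const { return seed_; }
  std::size_t visits() const { return visits_; }

private:
  void mix(std::size_t h)
  {
    seed_ ^= h + 0x9e3779b9 + (seed_ << 6) + (seed_ >> 2);
    ++visits_;                   // counts hashed primitives
  }

  std::ostringstream out_;
  std::size_t seed_;
  std::size_t visits_;
};

struct point
{
  int x;
  int y;

  template <class Archive>
  void serialize(Archive & ar, const unsigned int /*version*/)
  {
    ar & x;
    ar & y;
  }
};

struct segment
{
  point p0, p1;

  template <class Archive>
  void serialize(Archive & ar, const unsigned int /*version*/)
  {
    ar & p0;
    ar & p1;
  }
};

// Usage sketch: four primitives saved, four primitives hashed.
void demo_single_pass()
{
  segment s = {{1, 2}, {3, 4}};
  hashing_text_oarchive ar;
  ar & s;
  assert(ar.text() == "1 2 3 4 ");
  assert(ar.visits() == 4);
}
```

In this sketch the number of hash steps equals the number of
primitives written, which is the sense in which the hash has the same
complexity as the save.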

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo

