Boost logo

Boost :

From: JOAQUIN LOPEZ MU?Z (joaquin_at_[hidden])
Date: 2005-06-25 13:38:27


Hi Dave,

----- Mensaje original -----
De: David Abrahams <dave_at_[hidden]>
Fecha: Sábado, Junio 25, 2005 2:09 pm
Asunto: Re: [boost] [serialization] a proposal for an
        alternative tothenewconst-saving rule

> "JOAQUIN LOPEZ MU?Z" <joaquin_at_[hidden]> writes:
>
> > David Abrahams <dave_at_[hidden]> wrote:
> > to thenewconst-saving rule
> >
> >> "JOAQUIN LOPEZ MU?Z" <joaquin_at_[hidden]> writes:
> >>
> >> > De: David Abrahams <dave_at_[hidden]>
> >>
> >> I'm not sure it is. There's an imposition on users: all the types
> >> they want to serialize have to support hashing.
> >
> > No, no. Please check the piece of pseudocode on my first
> > message: the hash code is automatically built by
> > Boost.Serialization, without any intervention from the user,
>
> I understand that.
>
> > and certainly without any requirement that the type be hashable
> > (in the sense of providing a hash_value overload or something
> > like this.)
>
> I was talking about the type's sub-objects.
>
> > Let me illustrate with an example:
>
> <snip>
>
> I actually did understand that, as you can see from...
>
> > The process recursively goes down to primitive types,
> > as specified in my proposed pseudocode, and only these
> > have to be hashable, but fortunately they are.
> >
> >>
> >> It is nice that the serialization library automatically takes
> care of
> >> hashing aggregated types and leaving out the unserialized
> data...
>
> ...this remark, and the remarks at the end of my message, which you
> snipped.

OK, excuse my verbosity, but now I'm satisfied that
we both agree on the terms the algorithm is laid out.

>
> >> uh, wait: this will never work unless you plan only to do shallow
> >> hashing. Otherwise you will get an exponential explosion for some
> >> object graphs. Is that your intention?
> >
> > I'm not getting you. The hash value is calculated as part of the
> > saving process itself, so it has the very same complexity.
> > I've got the hunch you might be meaning something else,
> > could you please elaborate?
>
> The question is, when encountering a pointer or reference member that
> needs to be hashed, do you do what the serialization process does and
> hash the thing that it refers to, or do you just hash its address?
> The former is deep hashing; the latter is shallow hashing. In a graph
> with tracking, you serialize an object X once, and all subsequent
> times you encounter X in the graph you just serialize an id. If you
> add hashing, the first time you encounter X you'll hash it and
> serialize it. For this one time it doesn't matter if hashing is deep
> or shallow from a big-O perspective. However, when you encounter X
> again, if hashing is deep, you'll do a lot of extra work.
> Actually, a
> potentially infinite amount if there are cycles, so hashing really
> does have to be shallow. I guess I answered my own question.
>

I guess so. Correct, pointer hashing has to be shallow.

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk