Boost logo

Boost :

From: Vladimir Prus (ghost_at_[hidden])
Date: 2005-06-27 01:05:14


David Abrahams wrote:

>> I'm not getting you. The hash value is calculated as part of the
>> saving process itself, so it has the very same complexity.
>> I've got the hunch you might be meaning something else,
>> could you please elaborate?
>
> The question is, when encountering a pointer or reference member that
> needs to be hashed, do you do what the serialization process does and
> hash the thing that it refers to, or do you just hash its address?
> The former is deep hashing; the latter is shallow hashing. In a graph
> with tracking, you serialize an object X once, and all subsequent
> times you encounter X in the graph you just serialize an id. If you
> add hashing, the first time you encounter X you'll hash it and
> serialize it. For this one time it doesn't matter if hashing is deep
> or shallow from a big-O perspective. However, when you encounter X
> again, if hashing is deep, you'll do a lot of extra work. Actually, a
> potentially infinite amount if there are cycles, so hashing really
> does have to be shallow. I guess I answered my own question.

If hashing is shallow, then what does it check? The library already has
mechanism to check that the same address is serialized, and the point of
the proposal, as I understand it, is to detect cases like:

 
   A* a = A()
   archive << a;
   // modify a
   archive << a;

In this case, deep hashing is needed. Yea, it can be costly -- the more
reasons to make it optional, maybe even at runtime.

- Volodya

 


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk