
Boost Users :

From: Joaquin M Lopez Munoz (joaquin_at_[hidden])
Date: 2007-04-26 17:24:10


Ovanes Markarian <om_boost <at> keywallet.com> writes:
>
> On Thu, April 26, 2007 12:29, Joaquín Mª López Muñoz wrote:
>
> > Ovanes Markarian <om_boost <at> keywallet.com> writes:
[...]
> Sorry for the long explanation and then the short idea, hope this
> is of interest to others...

FWIW, I think your explanation of how boost::hash works
with strings, char *s and char []s is correct.

[...]
> > This puzzles me a lot: Given that your types_map container is
> > indexed on a std::string, things should be the other way
> > around: it is the "find(name)" version that should work AFAICS.
> > Could you please double-check?
> >
> I double checked it and both give the correct hash value. I think
> this is string dependent issue, where the string uses COW idiom
> to save performance, Herb Sutter wrote about it in his Guru of the
> Week (http://www.gotw.ca/gotw/043.htm). Therefore if I use
> const char* to initialize the string probably it is used in the
> string as long as possible until the string is not modified. But
> if this is so, there is more or less no reliable way to hash
> strings since these can be implicitly converted from const char*
> to std::string and afterwards used as a hash key and return a
> wrong hash result. Unfortunately I cannot step inside of the
> string constructors in MSVC 8.0 to see how these are really
> implemented.

I think this has nothing to do with COW strings: even if this
optimization is in effect, hashing a pointer will never yield
the same value as hashing the associated contents, so if the
index is based on std::strings (COW-based or not) you cannot
expect to locate a given string str by using the hash value
of a pointer to str's contents --your own experiments with
boost::hash described above must have shown you precisely this.

When I said before that the std::string-based create_type
must work and the one based on const char * must fail, I made a
mistake: I've examined the issue more carefully and realized
that *both* versions work, albeit not because of COW-related
reasons. When you define a std::string-keyed index, that index
stores internally an object of type boost::hash<std::string>,
let's call it h. Now, when you issue a call like

  types_.get<hash>().find(name.c_str());

the internal code of Boost.MultiIndex calculates the hash value of the
argument you've passed by invoking

  h(arg); // arg is the argument passed, i.e. name.c_str()

As this h is of type boost::hash<std::string>, its operator()
accepts arguments of type std::string, and we're passing a
const char*. Given that const char* is implicitly convertible
to std::string, a temporary string is created automatically
on the fly with the same contents as those pointed to by arg,
so the correct hash value is computed and no pointer
is actually hashed.

I'm sorry for having wrongly stated that passing a const char*
must not work --it works, although the way it does is
admittedly a little convoluted.

So, if we agree on this, there's a little mystery left: since
after double-checking we both agree the two versions of
create_type (based on std::string and on const char *) should
work, what was your original problem about then?

> This is not a trivial issue to hash strings I think. Thanks
> for your time.

If there's something still unclear about the above explanation,
please tell me so. Best,

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo

