Boost logo

Boost Users :

Subject: [Boost-users] [hash] hashing of smart pointer types - prevent conversion to bool?
From: Alastair Rankine (arsptr_at_[hidden])
Date: 2010-07-06 08:15:36


Hi Boosters,

I'm sure I'm not the first person to be caught out by this. If you try
to use Boost.Functional/Hash on a smart pointer type such as
boost::shared_ptr, the results may be, well, underwhelming:

    typedef boost::shared_ptr<int> IPtr;
    IPtr i1 = IPtr(new int(1234));
    IPtr i2 = IPtr(new int(4321));

    std::cout << std::hex
              << "hash(i1): " << boost::hash<IPtr>()(i1) << "\n"
              << "hash(i2): " << boost::hash<IPtr>()(i2) << "\n";

which produces:

    hash(i1): 1
    hash(i2): 1

This is because the shared_ptr is being silently converted to a boolean
type before being hashed. Now maybe this is all documented and so forth
but I would argue that it's not really expected behaviour. Or to put it
another way, it's very easy to go wrong and use smart pointers in a
hashed container such as a boost::unordered_set. The performance of
such a set obviously degrades considerably, when compared to a more
intelligent hash function. Here's a simple snippet to show the problem:

    typedef boost::shared_ptr<int> IPtr;

    template <typename T>
    struct PtrHash
    : public std::unary_function<boost::shared_ptr<T>, std::size_t>
    {
        std::size_t operator() (boost::shared_ptr<T> const& ptr) const
        {
            return boost::hash<T*>()(ptr.get());
        }
    };

    template <typename Set>
    double TimeInserts(int i)
    {
        Set s;
        boost::timer t;
        while (i--)
            s.insert(IPtr(new int(i)));
        return t.elapsed();
    }

    int main()
    {
        typedef boost::unordered_set<IPtr> IPtrSetDefaultHash;
        typedef boost::unordered_set<IPtr, PtrHash<int> > IPtrSetPtrHash;
        const int iterations = 10000;
        std::cout << std::dec
                  << "IPtrSetDefaultHash: " <<
TimeInserts<IPtrSetDefaultHash>(iterations) << "s\n"
                  << "IPtrSetPtrHash: " <<
TimeInserts<IPtrSetPtrHash>(iterations) << "s\n";
        return 0;
    }

Which outputs:

    IPtrSetDefaultHash: 2.14526s
    IPtrSetPtrHash: 0.006855s

... a noticable performance difference.

OK so obviously it is reasonable to want to question the wisdom of
using a smart pointer type in a hashed container like this. But I want
to ignore that question for now, and instead ask whether it is possible
to *prevent* the conversion of smart pointers to bool for the purposes
of calculating hash values.

The context for this question is a fairly large codebase with extensive
use of hashed containers. Not everyone on the development team is aware
of this particular limitation of smart pointers. So not only do I want
to detect the use of "dangerous" hashing functions but also prevent
them being used in the future. Instead I really want to either fix or
prevent the conversion to bool for smart pointers when used in a
hashing function (either through the boost:: or tr1:: namespace).

One idea is to patch the boost headers to add a hash function for each
of the smart pointer types (as per the example above) but there must be
a better answer than this?

Does this problem affect anyone else? How do people deal with this?


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net