Boost logo

Boost :

Subject: [boost] [unordered] equal_range is O(bucket_count()), not O(size())
From: Brad Higgins (bhiggins_at_[hidden])
Date: 2011-04-07 11:52:59

TR1 specifies that unordered_multimap::equal_range(k) worst case complexity is O(size()). However, in practice, I find that the worst case performance of boost unordred_multimap's equal_range is O(bucket_count()). Is this known?

I believe the issue is with the second iterator returned by equal_range(). Specifically, if there are sparse elements in the hash table, the implementation may have to traverse many buckets to find the second iterator.

I have a test that demonstrates the issue. The test makes 2 runs over an unordered_multimap, each using the same number of elements. In the second run, it calls rehash() in order to grow the bucket count by a large amount. Notice that the first run is quite fast, while the second run takes more than 10 seconds. Here are the results from my test:

$ ./main
insert 1000 elements
built hash, with 1543 buckets and 1000 elements.  Now, get some ranges
time to obtain 1000 elements: 00:00:00.001366
insert 1000 elements
built hash, with 6291469 buckets and 1000 elements.  Now, get some ranges
time to obtain 1000 elements: 00:00:10.818002
Here is my code for the test:
#include <iostream>
#include <boost/unordered_map.hpp>
#include <boost/date_time/posix_time/posix_time.hpp>
typedef boost::unordered_multimap<size_t, size_t> hash_int_type;
typedef hash_int_type::iterator hash_int_itr;
typedef std::pair<size_t, size_t> hash_insert_type;
run_test(const size_t data_size,
         const size_t rehash_size)
    hash_int_type hash;
    std::cout << "rehash(" << rehash_size << ")\n";
    std::cout << "insert " << data_size << " elements\n";
    for (size_t i=0; i<data_size; i++) {
        hash_insert_type ins(i,i);
    std::cout << "built hash, with "
        << hash.bucket_count()
        << " buckets and "
        << hash.size()
        << " elements.  Now, get some ranges\n";
    boost::posix_time::ptime start(boost::posix_time::microsec_clock::local_time());
    for (size_t i=data_size-1; i!=0; i--) {
        typedef std::pair<hash_int_itr, hash_int_itr> ret_pair;
        ret_pair ret = hash.equal_range(i);
        if (ret.first != ret.second) {
    boost::posix_time::ptime stop(boost::posix_time::microsec_clock::local_time());
    boost::posix_time::time_duration diff_time(stop-start);
    std::cout << "time to obtain " << data_size << " elements: " << diff_time << "\n";
main(int argc, char* argv[])
    return 0;

Boost list run by bdawes at, gregod at, cpdaniel at, john at