Boost logo

Boost Users :

Subject: Re: [Boost-users] Boost or Standard when there is the choice?
From: Joaquin M Lopez Munoz (joaquin_at_[hidden])
Date: 2014-02-16 12:25:12


Daniel James <dnljms <at> gmail.com> writes:

>
> On 15 February 2014 21:11, gast128 <gast128 <at> hotmail.com> wrote:
> >> my experience with Visual Studio where the Dinkumware implementation
> >> almost always outperforms or is similar to a Boost implementation
> >
> > I had one big bad experience with std::unordered_map in the past[...]
>
> According to Joaquin's recent benchmarks the recent standard
> implementations are generally faster than the boost implementation,
> which IMO is how it should be.

I think this nos the story the benchmarks tell. 15 different scenarios were
tested:

* Insertion without rehashing, unique elements
* Insertion with rehashing, unique elements
* Insertion without rehashing, duplicate elements
* Insertion without rehashing, duplicate elements, high load factor
* Insertion with rehashing, duplicate elements
* Insertion with rehashing, duplicate elements, high load factor
* Erasure, unique elements
* Erasure, dupicate elements
* Erasure, dupicate elements, high load factor
* Successful lookup, unique elements
* Unsuccessful lookup, unique elements
* Successful lookup, duplicate elements
* Unsuccessful lookup, duplicate elements
* Successful lookup, duplicate elements, high load factor
* Unsuccessful lookup, duplicate elements, high load factor

and results do not show any definite winner (< means "better than"):

* VS 2012
http://bannalia.blogspot.com/2014/01/a-better-hash-table.html

  - insert_norehash_unique: BMI < Dinkum < BU
  - insert_rehash_unique: BMI < Dinkum < BU
  - insert_norehash_duplicate: Dinkum < BMI < BU
  - insert_norehash_duplicate_highlf: Dinkum < BMI < BU
  - insert_rehash_duplicate: Dinkum < BMI ~ BU
  - insert_rehash_duplicate_highlf: Dinkum ~ BMI ~ BU
  - erase_unique: BMI < Dinkum < BU
  - erase_duplicate: Dinkum < BMI < BU
  - erase_duplicate_highlf: Dinkum < BMI < BU
  - ok_lookup_unique: BMI < Dinkum < BU
  - nok_lookup_unique: Dinkum < BMI < BU
  - ok_lookup_duplicate: BMI < Dinkum < BU
  - nok_lookup_duplicate: BMI < BU < Dinkum
  - ok_lookup_duplicate_highlf: BMI < BU < Dinkum
  - nok_lookup_duplicate_highlf: BU ~ BMI < Dinkum

* GCC 4.8.2
http://bannalia.blogspot.com/2014/01/a-better-hash-table-gcc.html

  - insert_norehash_unique: BMI < BU < libstdc++
  - insert_rehash_unique: BMI < BU < libstdc++
  - insert_norehash_duplicate: libstdc++ < BMI < BU
  - insert_norehash_duplicate_highlf: libstdc++ < BMI < BU
  - insert_rehash_duplicate: libstdc++ < BMI < BU
  - insert_rehash_duplicate_highlf: libstdc++ < BMI < BU
  - erase_unique: libstdc++ < BMI < BU
  - erase_duplicate: libstdc++ < BMI < BU
  - erase_duplicate_highlf: libstdc++ < BMI < BU
  - ok_lookup_unique: BMI < BU < libstdc++
  - nok_lookup_unique: BMI < BU <libstdc++
  - ok_lookup_duplicate: BMI < BU < libstdc++
  - nok_lookup_duplicate: BMI ~ BU < libstdc++
  - ok_lookup_duplicate_highlf: BMI < BU < libstdc++
  - nok_lookup_duplicate_highlf: BMI ~ BU < libstdc++

* Clang 3.4 with libstdc++
http://bannalia.blogspot.com/2014/01/a-better-hash-table-clang.html

  - insert_norehash_unique: BMI < BU < libstdc++
  - insert_rehash_unique: libstdc++ ~ BU ~ BMI
  - insert_norehash_duplicate: libstdc++ < BMI < BU
  - insert_norehash_duplicate_highlf: libstdc++ < BMI < BU
  - insert_rehash_duplicate: libstdc++ < BMI < BU
  - insert_rehash_duplicate_highlf: libstdc++ < BMI < BU
  - erase_unique: libstdc++ < BMI < BU
  - erase_duplicate: libstdc++ < BMI < BU
  - erase_duplicate_highlf: libstdc++ < BMI < BU
  - ok_lookup_unique: BMI < BU < libstdc++
  - nok_lookup_unique: BMI < BU < libstdc++
  - ok_lookup_duplicate: BMI < BU < libstdc++
  - nok_lookup_duplicate: BMI < BU < libstdc++
  - ok_lookup_duplicate_highlf: BMI < BU < libstdc++
  - nok_lookup_duplicate_highlf: BMI < BU < libstdc++

* Clang 3.4 with libc++
http://bannalia.blogspot.com/2014/01/a-better-hash-table-clang.html

  - insert_norehash_unique: BMI < BU < libc++
  - insert_rehash_unique: BMI < BU < libc++
  - insert_norehash_duplicate: BMI < BU < libc++
  - insert_norehash_duplicate_highlf: BMI < BU < libc++
  - insert_rehash_duplicate: BMI < BU < libc++
  - insert_rehash_duplicate_highlf: BMI < BU < libc++
  - erase_unique: BMI < libc++ < BU
  - erase_duplicate: BMI < BU < libc++
  - erase_duplicate_highlf: BMI < BU < libc++
  - ok_lookup_unique: BMI < BU < libc++
  - nok_lookup_unique: BMI < BU < libc++
  - ok_lookup_duplicate: BMI < BU < libc++
  - nok_lookup_duplicate: BMI ~ BU < libc++
  - ok_lookup_duplicate_highlf: BMI < BU < libc++
  - nok_lookup_duplicate_highlf: BMI ~ BU < libc++

In particular, Boost containers (Unordered and MultiIndex) fare better
on lookup. Also, they are the only ones using a special data structure
to skip duplicate elements in constant time, which shows in much faster
lookups for high load factors. Dinkum is generally faster at erasure,
but does so at the expense of potentially throwing where no throwing
is permitted (Stephan already knows about this). Some fixed conditions
for the tests, like the fact that easy-to-hash uints are used, do
certainly bias some results in favor of containers *not* storing
hash values (Dinkum, BMI, libstdc++ in some cases). But, summarizing,
I wouldn't say standard library implementations can be unconditionally
be deemed superior.

Joaquín M López Muñoz
Telefónica Digital


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net