Boost logo

Boost :

From: Maxim Yegorushkin (maxim.yegorushkin_at_[hidden])
Date: 2006-05-29 14:24:00


Joaquín Mª López Muñoz wrote:

[]

> The metaprogramming scaffolding constructed by B.MI with the
> aim of allowing multi-indexing is not always completely optimized
> away by the compiler, basically, most operations go through a lot more
> (ideally inlined) functions. GCC is traditionally poorer at optimizing
> than say MSVC or ICC. But the range of degradation I expect is
> 0-10%, not as high as the 250+% on insert that your tests show for
> mi_hash...
>
> ...which has gotten me examining the test to determine what's going
> weird. I think I found it: your insert test consists of invocations of
> the form
>
> h.insert(begin, end);
>
> for which std_hash (and presumably ext_hash) are smart enough to
> pre-rehash based on n=std::distance(begin,end) when this distance's
> calculation is affordable. B.MI hashed indices don't do that, so the
> same call incurs several rehashing ops from size()=0 till size()=n.
> Could you do me a favor? Please replace your local copy of
> hashed indices range insert member function (at line 249 of
> the RC_1_34_0 version of boost/multi_index/hashed_index.hpp):

[code snipped]

> and rerun your tests? If my analysis is correct, figures for
> insert(mi_hash) should be then on par with or only slightly worse
> than std_hash and ext_hash. If this hunch proves right, it is certainly
> an opportunity for huge performance improvements and I will
> deal with it (in a proper way, the patch above is only a hack meant
> for testing purposes).

Here are results for cvs head version, note the memory usage for mi_set:

99994 elements; 1 iterations
insert test (lower is better)
ext_hash [ 35.29%] OOOOOOOOOOOOOO 15105 usec
std_hash [ 38.83%] OOOOOOOOOOOOOOO 16619 usec
mi_hash [ 88.74%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 37984 usec
mi_set [ 98.82%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 42299 usec
std_set [100.00%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 42804 usec
find hit test (lower is better)
ext_hash [ 13.13%] OOOOO 4730 usec
std_hash [ 14.15%] OOOOO 5096 usec
mi_hash [ 15.56%] OOOOOO 5606 usec
mi_set [ 93.48%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 33675 usec
std_set [100.00%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 36024 usec
find miss test (lower is better)
std_set [ 56.75%] OOOOOOOOOOOOOOOOOOOOOO 4275 usec
mi_set [ 61.10%] OOOOOOOOOOOOOOOOOOOOOOOO 4603 usec
mi_hash [ 82.40%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 6207 usec
ext_hash [ 84.10%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 6335 usec
std_hash [100.00%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 7533 usec
iterate test (lower is better)
std_hash [ 51.22%] OOOOOOOOOOOOOOOOOOOO 4542 usec
mi_hash [ 72.95%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOO 6469 usec
mi_set [ 83.39%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 7395 usec
ext_hash [ 84.66%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 7508 usec
std_set [100.00%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 8868 usec
erase test (lower is better)
std_hash [ 21.42%] OOOOOOOO 12797 usec
ext_hash [ 22.85%] OOOOOOOOO 13650 usec
mi_hash [ 22.93%] OOOOOOOOO 13699 usec
mi_set [ 94.28%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 56324 usec
std_set [100.00%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 59740 usec
memory test (lower is better)
std_hash [ 61.58%] OOOOOOOOOOOOOOOOOOOOOOOO 1231544 bytes
ext_hash [ 79.32%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 1586404 bytes
mi_hash [ 79.33%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 1586416 bytes
mi_set [ 80.00%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 1599920 bytes
std_set [100.00%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 1999880 bytes

Here are results for cvs head version with your modification, which show that
your analysis was correct:

99994 elements; 1 iterations
insert test (lower is better)
mi_hash [ 31.71%] OOOOOOOOOOOO 13467 usec
ext_hash [ 35.01%] OOOOOOOOOOOOOO 14871 usec
std_hash [ 38.78%] OOOOOOOOOOOOOOO 16471 usec
mi_set [ 97.50%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 41409 usec
std_set [100.00%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 42471 usec
find hit test (lower is better)
ext_hash [ 12.83%] OOOOO 4604 usec
std_hash [ 14.17%] OOOOO 5086 usec
mi_hash [ 16.04%] OOOOOO 5757 usec
mi_set [ 92.12%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 33056 usec
std_set [100.00%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 35885 usec
find miss test (lower is better)
std_set [ 57.50%] OOOOOOOOOOOOOOOOOOOOOOO 4291 usec
mi_set [ 62.88%] OOOOOOOOOOOOOOOOOOOOOOOOO 4692 usec
mi_hash [ 81.33%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 6069 usec
ext_hash [ 83.49%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 6230 usec
std_hash [100.00%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 7462 usec
iterate test (lower is better)
std_hash [ 50.29%] OOOOOOOOOOOOOOOOOOOO 4425 usec
mi_hash [ 73.62%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOO 6478 usec
mi_set [ 82.21%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 7234 usec
ext_hash [ 84.88%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 7469 usec
std_set [100.00%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 8799 usec
erase test (lower is better)
std_hash [ 21.48%] OOOOOOOO 12728 usec
ext_hash [ 22.77%] OOOOOOOOO 13492 usec
mi_hash [ 23.35%] OOOOOOOOO 13837 usec
mi_set [ 94.06%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 55744 usec
std_set [100.00%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 59264 usec
memory test (lower is better)
std_hash [ 61.58%] OOOOOOOOOOOOOOOOOOOOOOOO 1231544 bytes
ext_hash [ 79.32%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 1586404 bytes
mi_hash [ 79.33%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 1586416 bytes
mi_set [ 80.00%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 1599920 bytes
std_set [100.00%] OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO 1999880 bytes

BTW, the in-house container I tested originally missed the same optimization
opportunity.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk