Boost logo

Boost Users :

Subject: Re: [Boost-users] [unordered] Merging unordered maps
From: Pete Bartlett (pete_at_[hidden])
Date: 2014-05-29 17:24:26


Daniel James wrote:
> Pete Bartlett wrote:
>> I have a performance hotspot when merging a few unordered_maps of the
>> same type into a master unordered_map (also of the same type)
>>
>> Currently I am doing
>>
>> foreach( auto const& submap ,submaps )
>> master.insert( submap.begin() , submap.end() );
>>
>> This seems to be somewhat wasteful because the insert is (I think)
computing hashes that the >>submaps already "know".
>>
>> Can I improve on this
>> - without changing Boost?

>Not as far as I know.

Thank you very much for your very helpful comments.

I realise now that in my case, where the hash calculation is expensive
relative to node insertion cost, it is worthwhile doing a sort of
memoization technique: If my original key type is K and hash is h, I can
change the key type to pair<K,size_t> where the keys are now (k,h(k)) and
then give the unordered_map an adapted hash that simply returns pair.second.
I now only pay the hash calculation cost when first inserting into the
submaps. As you point out below, this is only safe for non-random hashes but
works for me.

>> - some more intrusive change?

>Maybe. Could write a function that inserts from another container, rather
than using >iterators. It is now possible to tell if the same hash function
type was used, but it isn't >possible to tell that they're equal (eg. when
using a randomized hash function, the hash >values might not be the same for
two different instances). It might be possible to create a >type trait to
detect if the hash function has an equality operator - and assume they're
not >equal if it doesn't.

Having said the above, I fancy a crack at this just for my own education.
I'll let you know if I get anywhere.

Thanks again,
Pete


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net