Boost logo

Boost :

Subject: Re: [boost] Interest in a Bloom Filter?
From: Dean Michael Berris (mikhailberis_at_[hidden])
Date: 2009-06-15 04:13:07


Hi Milosz,

On Mon, Jun 15, 2009 at 2:49 PM, Milosz Marian HULBOJ<mhulboj_at_[hidden]> wrote:
>
> - It would be nice to have the option of using `double-hashing' schema,
> where all the hashes are generated according to:
> hash(i) = h1(x) + i*h2(x)
> or the squared/cubed variant:
> hash(i) = h1(x) + i*h2(x) + i*i (*i)
>

This is interesting. I'd think this would have to be implemented by
the user of the bloom_filter in the HashFunctions template parameter.

> - one hash value can be potentially used to generate many hash values. For
> example, if you need 16-bit hashes, then from the hash function producing
> 32-bit hash-value you could get 2 values. This can be in turn combined with
> the previous point - double hashing.
>

Right. But this is something that would have to be provided by the user IMO.

> - From my point of view bloom filters are used in performance-critical
> applications, and each tiny optimisation counts.
>

I would agree, and I've tried to optimize as much as I can (or as much
as I know how to) by having a statically-sized version and another
version that has a runtime-constructed Boost.Dynamic_bitset contained.
If you have specific ideas about how I can still optimize parts of the
implementation, I'd definitely like to hear them. :)

> - intersection/union/difference operations

This sounds like it should be easy to do, but I'd need to look into
the protocol implementation for this. I say this because I might run
into a place where instead of just doing simple naive
intersection/union/difference on bitsets, I might have to do some
expression templates with Boost.Proto -- and as much as I'd love to
use Boost.Proto and learn how to implement real protocols with it, I'd
rather stick with the simple implementations first.

At any rate, this would be interesting/easy to implement as free
function operator overloads.

Thanks for your thoughts and I hope this helps!

-- 
Dean Michael Berris | Software Engineer, Friendster, Inc.
blog.cplusplus-soup.com | twitter.com/mikhailberis |
linkedin.com/in/mikhailberis | profiles.friendster.com/mikhailberis |
deanberris.com

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk