Boost logo

Boost :

Subject: Re: [boost] Request interest in Bloom filters
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2009-03-23 09:16:15


vicente.botet wrote:
> I need for my application bloom filters. Do you know of a good boostified
> generic implementation?
> Is there any interest in such a generic bloom filters library in Boost?

Hi Vicente,

I have a simple generic Bloom filter here:

     http://svn.chezphil.org/libpbe/trunk/include/bloom_filter.hh

This is GPL licensed but I'd be happy to change that to the Boost
license if someone wanted to make a Boost library using it.

After writing this code I eventually decided to use a different method,
so I'm not currently using this code anywhere.

The code is very straightforward. I spent much longer learning the
theory of Bloom filters than I did implementing it. Unfortunately I've
now forgotten much of the theory and all that is left is the code...

What's not included is hash functions; you have to supply these as a
template parameter. I used Boost.CRC and the hash function from std::
unordered containers. I seem to have lost the example code that did
this. One general problem with this approach is to find enough
independent hash functions. Say I'm storing strings and I need four
hash functions; the temptation is to have one function and to compute
e.g. f(x), f("123"+x), f("456"+x), f("789"+x). These will give
different answers and may look like they're independent functions, but
it's not really certain how independent they are.

Cheers, Phil.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk