Boost logo

Boost :

Subject: Re: [boost] Interest in a Bloom Filter?
From: Milosz Marian HULBOJ (mhulboj_at_[hidden])
Date: 2009-06-15 02:49:12


Dean Michael Berris wrote:
> Hi Everyone,
>
> During the weekend, I got acquainted with and excited about Bloom
> Filters and the utility of such a data structure. I then proceeded to
> look at generic implementations of a bloom filter and I thought about
> implementing one myself instead as a simple exercise.
>
> Attached is what I came up with which I'm submitting as a library for
> review (and uploading to the vault). Also attached is a sample program
> which uses the bloom_filter.

Hello,

Few comments:

- It would be nice to have the option of using `double-hashing' schema,
where all the hashes are generated according to:
hash(i) = h1(x) + i*h2(x)
or the squared/cubed variant:
hash(i) = h1(x) + i*h2(x) + i*i (*i)

- one hash value can be potentially used to generate many hash values.
For example, if you need 16-bit hashes, then from the hash function
producing 32-bit hash-value you could get 2 values. This can be in turn
combined with the previous point - double hashing.

- From my point of view bloom filters are used in performance-critical
applications, and each tiny optimisation counts.

- intersection/union/difference operations

Regards,
Milosz


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk