Boost logo

Boost :

Subject: Re: [boost] Boost.Bloom_filter now in Sandbox (WAS Re: Interest in a Bloom Filter?)
From: Dean Michael Berris (mikhailberis_at_[hidden])
Date: 2009-06-18 01:50:39

Hi Arash,

On Thu, Jun 18, 2009 at 7:00 AM, Arash Partow<arash_at_[hidden]> wrote:
> 1. Can you please explain why you need multiple "hashes"?

You don't need multiple hashes, but if you do want to use more hash
functions (K) you can. The point is the user should be able to provide
the hash functions that the bloom filter will use internally.

> 2. Your code doesn't trivially exit on the contains

It used to be that it did (in the version that didn't use
Boost.Fusion). I can try and optimize still in cases when it
encounters a '0' bit in between the hash function results.

> 3. Your solution doesn't provide a means to determine the various optimal
> parameter combinations

What do you mean by this?

> 4. Your solution doesn't provide a means for counting modes

Modes for bloom filters? I'm not sure I understand what you mean.

> 5. Your solution doesn't provide for erasures (do not confuse with removal
> of inserted items)

.clear() is supposed to clear the internal bitset and clear the bloom filter.

> 6. Your solution doesn't provide for self superimposed compression mode
> (similar to union but doubling of fpp)

The idea of the bloom_filter is to provide precisely that, a bloom
filter -- any compression you might want to do in transmitting a
bloom_filter would IMO be beyond the scope of the library. There has
been some suggestions of implementing a compressed_bloom_filter that
maintains a compressed bitset, but afar from that I must say I have no
idea what self superimposed compression mode.

Dean Michael Berris | Software Engineer, Friendster, Inc. | | | |

Boost list run by bdawes at, gregod at, cpdaniel at, john at