Subject: Re: [boost] [GSoC] Request for Feedback on Boost.Bloom Filter Project
From: Arash Partow (arash_at_[hidden])
Date: 2011-06-25 14:43:05
Phil Endecott wrote:
> Sorry Arash but I don't understand your reply; what exactly
> is correct?
My point was that you are correct, the "and" does not provide anything of value. Only "or" and "xor" are practical for non-counting bloom filters.
> I believe I have convinced myself that a bitwise AND on bloom
> filters gives something that is a superset of the intersection
> of the contents, i.e. the false positive rate is increased. A
> proof would be good. Regarding xor, it superficially looks like
> it might give the symetric set difference i.e. those elements
> in A or B but not both, but when you consider overlapping hashes
> I think it results in false negatives.
You're correct there as well, I should have qualified the difference type, as it is indeed a symmetric difference. That said, I'm not sure what you mean by overlapping hashes, the only rule for such operations is that the BFs be constructed similarly - that is:
1. same types of hash functions
2. same number of hash functions
3. same filter sizes
4. same hash value quantization value
Otherwise such set operations simply wont work and become meaningless.