|
Boost : |
Subject: Re: [boost] [GSoC] Request for Feedback on Boost.Bloom Filter Project
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2011-06-25 13:53:18
Arash Partow wrote:
> Phil Endecott wrote:
>> For some reason I have a mental block about how intersection works. Union is OK; I can see that
>>
>> set a;
>> set b;
>> bloom_filter(a) union bloom_filter(b) == bloom_filter( a union b );
>>
>> but it's not clear to me if
>>
>> bloom_filter(a) intersect bloom_filter(b) == bloom_filter( a intersect b );
>>
>> Can you convince me?
>
> That is correct, afaik only union (or) and difference (xor) can be applied to non-counting BFs.
Sorry Arash but I don't understand your reply; what exactly is correct?
I believe I have convinced myself that a bitwise AND on bloom filters
gives something that is a superset of the intersection of the contents,
i.e. the false positive rate is increased. A proof would be good.
Regarding xor, it superficially looks like it might give the symetric
set difference i.e. those elements in A or B but not both, but when you
consider overlapping hashes I think it results in false negatives.
Regards, Phil.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk