Boost :

Date view	Thread view	Subject view	Author view

Subject: Re: [boost] [GSoC] Request for Feedback on Boost.Bloom Filter Project
From: Alejandro Cabrera (cpp.cabrera_at_[hidden])
Date: 2011-06-26 18:41:30

Next message: Paul Mensonides: "Re: [boost] [preprocessor] Warning: Incoming"
Previous message: Paul Mensonides: "Re: [boost] [preprocessor] pp-strings"
In reply to: Arash Partow: "Re: [boost] [GSoC] Request for Feedback on Boost.Bloom Filter Project"
Next in thread: Phil Endecott: "Re: [boost] [GSoC] Request for Feedback on Boost.Bloom Filter Project"
Reply: Phil Endecott: "Re: [boost] [GSoC] Request for Feedback on Boost.Bloom Filter Project"

Arash Partow wrote:
>
> Alejandro Cabrera wrote:
>> I'm writing to request feedback, suggestions, and advice from the Boost
>> community. I've been working on a Bloom filter data structure for about a
>> month and a half now. At this point, the following is complete and
>> available in the Boost Sandbox:
>> - working, basic prototype (insert & query operations, union & intersect)
>> - a test suite covering the public interface
>> - five examples
>> - documentation covering many aspects of the project (tested on Chrome 12
>> and 13, Firefox 4 and 6)
>>
>
> Looks ok, but one important question - Why is the BF typed? Its not
> necessary and in fact there are many use-cases where one might want to
> insert and/or test membership for a range of different types using the
> same BF - all that those types require are that they be hashable.
>

Though not necessary, I believe having a typed BF is safer. If there are
users that want to eschew type safety to allow operations on multiple types,
I believe this can be accomplished either by using a Bloom filter that
accepts insertions of type void *.

Arash Partow wrote:
>
> Another issue, along the lines of what Phil mentioned wrt naming of the
> "contains" method. It should really be called "contains" for two reasons,
> firstly its a transitive verb - "a doing" label which is indicative of how
> it would be used in code, and secondly you've provided the method
> false_positive_rate which coupled with the fact that a BF is a
> probabilistic set, indicates to the user that any result will inherently
> have a false positive probability assigned with it, which is naively
> proportional to a function of the bits in the BF and the number of
> elements inserted into the BF (and not necessarily the element's
> bit-length).
>

I agree that the transitive verb property is crucial. It communicates the
intent of the operation. I've opted to go with the name "probably_contains"
for the time being. Even though it is slightly longer to type, it highlights
its probabilistic nature. I want there to be minimal confusion for users new
to Bloom filters about the nature of the data structure.

Arash Partow wrote:
>
> All in all a good start - keep up the good work!
>

Thank you for the feedback!

--
View this message in context: http://boost.2283326.n4.nabble.com/GSoC-Request-for-Feedback-on-Boost-Bloom-Filter-Project-tp3614026p3626643.html
Sent from the Boost - Dev mailing list archive at Nabble.com.

Next message: Paul Mensonides: "Re: [boost] [preprocessor] Warning: Incoming"
Previous message: Paul Mensonides: "Re: [boost] [preprocessor] pp-strings"
In reply to: Arash Partow: "Re: [boost] [GSoC] Request for Feedback on Boost.Bloom Filter Project"
Next in thread: Phil Endecott: "Re: [boost] [GSoC] Request for Feedback on Boost.Bloom Filter Project"
Reply: Phil Endecott: "Re: [boost] [GSoC] Request for Feedback on Boost.Bloom Filter Project"

Date view	Thread view	Subject view	Author view

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk