Subject: Re: [boost] Boost.Bloom_filter now in Sandbox (WAS Re: Interest in a Bloom Filter?)
From: Joaquin M Lopez Munoz (joaquin_at_[hidden])
Date: 2009-06-08 16:36:07


Dean Michael Berris <mikhailberis <at> gmail.com> writes:

> The implementation is now available in Boost's Sandbox. You can check
> out the implementation along with the documentation via:
>
> [...]
>
> I'd also like to request to add this to the formal review queue (if
> this qualifies for a fast track review, that would also be an option
> I'm open to).

With all due respect, I think your submission is at too early
a stage to request a formal review. I'd say you are in the
"refinement" stage as described in:

http://www.boost.org/development/submissions.html

So you can still get a lot of feedback until the design
reaches a mature state. IMHO there are plenty of
opportunities for improvement. Here are a handful of
observations, in no particular order:

1. Your material lacks proper tests and examples, and
the docs are sketchy (a formal reference would be
needed at least).
2. bloom_filter should have an allocator template
parameter, just like any other container around.
3. Have you considered letting the user specify
M and K at construction time rather than at
compile time, and the pros and cons of both
approaches?
4. It'd be nice to have a way to specify the
desired false positive probability instead of
M and K.
5. It'd be nice if entry-level users could get
a default set of hash functions so that they
need not specify them --maybe a parameterized
hash formula h(x,N) so that you can just use
h(x,0), h(x,1),... h(x,N-1).
6. You're not providing a free function
swap(bloom_filter&,bloom_filter&).
7. bloom_filters are not currently testable for
equality.
8. bloom_filters are not currently serializable.
9. Union and intersection of bloom_filters are not
provided (according to the Wikipedia entry these
ops are trivial to implement).
10. For completeness, seems like constructing a
bloom_filter from a bitset_type would be nice to
have.
11. You'd have to decide whether you want to make
this look more like a std::map<T,bool> or not.
If you do, some boilerplate typedefs like
value_type and member functions like size()
could be provided.
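
To make points 3 to 7 and 9 a bit more concrete, here is a
rough sketch of what such an interface might look like. It is
only an illustration under stated assumptions, not the Sandbox
implementation: the names runtime_bloom_filter, with_fpp,
bit_count and hash_count are hypothetical, the storage is a
plain std::vector<bool> with no allocator parameter, and the
default hash family h(x,i) = h1(x) + i*h2(x) mod M (double
hashing on top of std::hash) is just one possible choice.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch only: names and layout are illustrative and are
// not taken from the Sandbox bloom_filter.
template <typename T, typename Hash = std::hash<T>>
class runtime_bloom_filter {
public:
    // Point 3: M (bits) and K (hash count) chosen at construction time.
    runtime_bloom_filter(std::size_t m, std::size_t k) : bits_(m, false), k_(k) {
        assert(m > 0 && k > 0);
    }

    // Point 4: derive M and K from the expected element count n and a
    // target false positive probability p, using the usual estimates
    //   M = -n ln(p) / (ln 2)^2   and   K = (M / n) ln 2.
    static runtime_bloom_filter with_fpp(std::size_t n, double p) {
        assert(n > 0 && p > 0.0 && p < 1.0);
        const double ln2 = std::log(2.0);
        const double m = std::ceil(-static_cast<double>(n) * std::log(p) / (ln2 * ln2));
        const double k = std::max(1.0, std::round(m / static_cast<double>(n) * ln2));
        return runtime_bloom_filter(static_cast<std::size_t>(m),
                                    static_cast<std::size_t>(k));
    }

    void insert(const T& x) {
        for (std::size_t i = 0; i < k_; ++i) bits_[index(x, i)] = true;
    }

    // May return true for an element never inserted (false positive),
    // but never returns false for an inserted element.
    bool contains(const T& x) const {
        for (std::size_t i = 0; i < k_; ++i)
            if (!bits_[index(x, i)]) return false;
        return true;
    }

    // Point 9: union is a bitwise OR; both filters must share M and K.
    runtime_bloom_filter& operator|=(const runtime_bloom_filter& o) {
        assert(bits_.size() == o.bits_.size() && k_ == o.k_);
        for (std::size_t i = 0; i < bits_.size(); ++i)
            bits_[i] = bits_[i] || o.bits_[i];
        return *this;
    }

    // Point 9: intersection is a bitwise AND. The result only
    // approximates a filter built from the true set intersection.
    runtime_bloom_filter& operator&=(const runtime_bloom_filter& o) {
        assert(bits_.size() == o.bits_.size() && k_ == o.k_);
        for (std::size_t i = 0; i < bits_.size(); ++i)
            bits_[i] = bits_[i] && o.bits_[i];
        return *this;
    }

    // Point 7: equality compares K and the bit arrays.
    friend bool operator==(const runtime_bloom_filter& a, const runtime_bloom_filter& b) {
        return a.k_ == b.k_ && a.bits_ == b.bits_;
    }

    // Point 6: free swap, found through argument-dependent lookup.
    friend void swap(runtime_bloom_filter& a, runtime_bloom_filter& b) {
        a.bits_.swap(b.bits_);
        std::swap(a.k_, b.k_);
    }

    std::size_t bit_count() const { return bits_.size(); }
    std::size_t hash_count() const { return k_; }

private:
    // Point 5: default family h(x,i) = h1 + i*h2 (mod M). h2 is a mixed
    // copy of h1, forced odd; this mixing step is an assumption of the
    // sketch, not a recommendation.
    std::size_t index(const T& x, std::size_t i) const {
        const std::uint64_t h1 = Hash()(x);
        std::uint64_t h2 = h1 ^ (h1 >> 33);
        h2 *= 0xff51afd7ed558ccdULL;
        h2 ^= h2 >> 33;
        h2 |= 1;
        return static_cast<std::size_t>((h1 + i * h2) % bits_.size());
    }

    std::vector<bool> bits_;
    std::size_t k_;
};
```

With this sketch,
runtime_bloom_filter<std::string>::with_fpp(1000, 0.01)
yields M = 9586 bits and K = 7 hash functions. Fixing M and K
at run time costs the compile-time guarantee that two filters
are compatible, which is why the union and intersection
operators have to check sizes themselves.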

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo

