Boost logo

Boost :

Subject: Re: [boost] Re quest interest in Bloom filters
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2009-03-23 12:15:01


Vicente Botet Escriba wrote:
> Phil Endecott-48 wrote:
>>
>> vicente.botet wrote:
>>> I need for my application bloom filters. Do you know of a good boostified
>>> generic implementation?
>>> Is there any interest in such a generic bloom filters library in Boost?
>>
>> Hi Vicente,
>>
>> I have a simple generic Bloom filter here:
>>
>> http://svn.chezphil.org/libpbe/trunk/include/bloom_filter.hh

>> I seem to have lost the example code

Now found and checked in here:

     http://svn.chezphil.org/libpbe/trunk/examples/spellcheck.cc

> The hard part (the hash functions) is left out quite elegantly. I was
> thinking about defining some hash traits wich will allow the user to set the
> hash functions in a independent way. Something like
>
> template <template T, size_t N> struct hash_traits;
>
> template <> hash_traits<SpecificUserType,0> {
> static unsigned hash(T& value) {
> // specific code
> }
> }
>
> BTW, besides the tr1::hash function, are there other hash generic functions
> in Boost?

You can use Boost.CRC. What sort of key do you have? I once did some
investigation of hash functions for strings for use with unordered
containers; I was considering using a degenerate hash that only looks
at the first N characters.

Something I didn't mention before is that if you need, say, 10-bit hash
values then a 32-bit CRC could give you three of them. In this case
your hash_traits would presumably compute the same CRC three times and
extract different portions each time, which is not ideal.

> I would need also the union and intersection functions. I'm wondering if
> replacing the tr1::array by a boost::dynamic_bitset should make eassier
> this.

Well it's not exactly difficult with the array. I've never used
dynamic_bitset and being dynamic is not a requirement here. I was
concerned (as normal) about getting a fast implementation and the
simple bit manipulation in my code is probably about as fast as you can get.

Cheers, Phil.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk