Subject: Re: [boost] [GSoC] Request for Feedback on Boost.Bloom Filter Project
From: Alejandro Cabrera (cpp.cabrera_at_[hidden])
Date: 2011-06-28 22:25:36


Arash,

Arash Partow wrote:
>
> Vicente Botet wrote:
> > Could you give a concrete example of insertion of different types? any
> > type?
> >
>
> A valid use-case could be that you have a struct called ipv4 and another
> called ipv6 which represent ip addresses (though not necessarily
> inheriting from a common base). Now it is true that they are nothing more
> than arrays of chars (or a string), but let's assume we'd like to treat
> them as the objects they are without knowing their underlying detail.
> Following on from this I'd like to be able to do something like the
> following:
>
> bloom_filter<AHashFunc> bf(.....);
>
> ipv4 i4(....);
> ipv6 i6(....);
>
> bf.insert(i4);
> bf.insert(i6);
>
>
> Where the only requirement is that the types ipv4 and ipv6 have the
> correct hash function specialization in the namespace std::tr1
>

I can see the convenience of this approach. Why would you prefer to use one
Bloom filter to store two related types rather than two Bloom filters? For
example:

    bloom_filter<ip4, 10000> ip4_bloom;
    bloom_filter<ip6, 10000> ip6_bloom;

    ip4 x4(...);
    ip6 x6(...);

    if ( /* is ip4, perhaps using type-traits or some form of polymorphism */ )
        ip4_bloom.insert(x4);
    else if ( /* is ip6 */ )
        ip6_bloom.insert(x6);
    else
        ; // error

    //...

    if (!ip4_bloom.contains(...))
        ; // load resource from high latency storage
    else
        ; // fetch from cache

This is how I would typically expect the current Bloom filter to be used in
multi-type scenarios.
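
The if/else dispatch above can also be resolved at compile time with overloads. What follows is only a minimal sketch under stated assumptions: toy_bloom_filter is a single-hash stand-in written for this example and is not the project's bloom_filter, and the ip4/ip6 layouts, the ip4_hash/ip6_hash functors, and the address_filters wrapper are all hypothetical names. It assumes a C++11 compiler.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// Toy single-hash filter: a stand-in for illustration, NOT the project's API.
template <typename T, std::size_t Size, typename Hasher>
class toy_bloom_filter {
    static_assert(Size > 0, "filter needs at least one bit");
public:
    void insert(const T& t) { bits_.set(Hasher()(t) % Size); }
    bool contains(const T& t) const { return bits_.test(Hasher()(t) % Size); }
private:
    std::bitset<Size> bits_;
};

// Hypothetical address types; the real ones could look quite different.
struct ip4 { std::uint32_t addr; };
struct ip6 { std::uint64_t hi; std::uint64_t lo; };

struct ip4_hash {
    std::size_t operator()(const ip4& a) const {
        return std::hash<std::uint32_t>()(a.addr);
    }
};

struct ip6_hash {
    std::size_t operator()(const ip6& a) const {
        // Simple combination of the two halves; good enough for a sketch.
        return std::hash<std::uint64_t>()(a.hi) * 31u
             + std::hash<std::uint64_t>()(a.lo);
    }
};

// One filter per type; overload resolution picks the right one,
// so the caller never writes the if/else chain by hand.
struct address_filters {
    toy_bloom_filter<ip4, 10000, ip4_hash> v4;
    toy_bloom_filter<ip6, 10000, ip6_hash> v6;

    void insert(const ip4& a) { v4.insert(a); }
    void insert(const ip6& a) { v6.insert(a); }
    bool contains(const ip4& a) const { return v4.contains(a); }
    bool contains(const ip6& a) const { return v6.contains(a); }
};

// Usage: a Bloom filter has no false negatives, so an inserted
// address is always reported as (possibly) present.
inline void usage_example() {
    address_filters f;
    ip4 a = {0x7f000001u};
    ip6 b = {0u, 1u};
    f.insert(a);
    f.insert(b);
    assert(f.contains(a));
    assert(f.contains(b));
}
```

Keeping the two filters separate means each one can be sized for its own expected element count, at the cost of one wrapper overload per type.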

Arash Partow wrote:
>
> Something to ponder over, when inserting a boost::any is it the any-type as
> a whole, the underlying type or the serialized type that will be inserted
> into the BF? What about a Person class? Would it be the Person type as a
> whole, or the serialized form of every field in the Person type that will
> be inserted? - I suppose it all depends on the way the hash input is
> generated.
>

This really depends on the implementation of the Hasher (hash function
callable). A Hasher could target a specific field of an object to generate
its fingerprint. Alternatively, the Hasher could treat an object as an array
of bytes and generate the fingerprint in that fashion. I've left this aspect
of the design very open to customization.
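
The two Hasher styles described above could look roughly like this. It is a sketch, not the project's code: Person, ipv4_addr, hash_person_by_id, hash_raw_bytes and the fnv1a helper are hypothetical names introduced here, and FNV-1a is just a convenient stand-in for whatever hash a real Hasher would use. It assumes a C++11 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>

// FNV-1a over a byte range (64-bit constants), used by both Hashers below.
inline std::size_t fnv1a(const unsigned char* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    std::uint64_t h = 14695981039346656037ull;   // FNV offset basis
    for (std::size_t i = 0; i < len; ++i) {
        h ^= data[i];
        h *= 1099511628211ull;                   // FNV prime
    }
    return static_cast<std::size_t>(h);
}

// Hypothetical record type, for illustration only.
struct Person {
    std::uint32_t id;
    std::string   name;
};

// Style 1: fingerprint a specific field. Two Persons with the same id
// hash identically, whatever their names are.
struct hash_person_by_id {
    std::size_t operator()(const Person& p) const {
        return fnv1a(reinterpret_cast<const unsigned char*>(&p.id),
                     sizeof p.id);
    }
};

// Style 2: treat the whole object as an array of bytes. Only meaningful
// for trivially copyable types WITHOUT padding bytes or pointers, since
// padding contents are unspecified and pointers hash the address.
template <typename T>
struct hash_raw_bytes {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw-byte hashing needs a trivially copyable type");
    std::size_t operator()(const T& t) const {
        return fnv1a(reinterpret_cast<const unsigned char*>(&t), sizeof t);
    }
};

// Hypothetical padding-free address type that suits style 2.
struct ipv4_addr {
    unsigned char octets[4];
};
```

Style 1 fits types like Person, where std::string makes a byte-wise hash meaningless; style 2 fits flat value types like ipv4_addr.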

-Alej

--
View this message in context: http://boost.2283326.n4.nabble.com/GSoC-Request-for-Feedback-on-Boost-Bloom-Filter-Project-tp3614026p3631842.html
Sent from the Boost - Dev mailing list archive at Nabble.com.