|
Boost-Commit : |
Subject: [Boost-commit] svn:boost r73238 - in sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html: . reference reference/hash style tut
From: cpp.cabrera_at_[hidden]
Date: 2011-07-19 02:47:19
Author: alejandro
Date: 2011-07-19 02:47:17 EDT (Tue, 19 Jul 2011)
New Revision: 73238
URL: http://svn.boost.org/trac/boost/changeset/73238
Log:
A lot of git history was lost as a result of a 'git svn dcommit' error. However, the short of it is that the tutorial modules are complete, and a few CSS rules have been added.
Added:
sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/reference/counting_bloom.html (contents, props changed)
sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/reference/hash/murmurhash3.html (contents, props changed)
sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/joining.html
- copied, changed from r73237, /sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/tips.html
Text files modified:
sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/design.html | 5 +-
sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/history.html | 12 +++---
sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/index.html | 8 +--
sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/reference.html | 10 +++--
sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/style/my.css | 4 ++
sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/inserting.html | 19 ++++++++++
sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/joining.html | 37 ++++++++++++++++++-
sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/querying.html | 16 ++++++++
sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/removing.html | 42 +++++++++++++++++++++-
sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/tips.html | 73 ++++++++++++++++++++++++++++++++++++++-
sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tutorial.html | 60 +++++++-------------------------
11 files changed, 215 insertions(+), 71 deletions(-)
Modified: sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/design.html
==============================================================================
--- sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/design.html (original)
+++ sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/design.html 2011-07-19 02:47:17 EDT (Tue, 19 Jul 2011)
@@ -39,14 +39,15 @@
<div class="toc">
<ul>
- <li>Overview</li>
+ <li>Basic Bloom Filter</li>
+ <li>Dynamic Basic Bloom Filter</li>
+ <li>Counting Bloom Filter</li>
<li>Default Hash Functions</li>
<li>C++0x Considerations</li>
<li>Future Directions</li>
</ul>
</div>
- <a name="overview"></a>
<h4>Overview</h4>
<p>
This Bloom filter implementation is designed to allow compile-time determination of storage space(in bits), insertion/query type, and hash functions. The design goals were:
Modified: sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/history.html
==============================================================================
--- sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/history.html (original)
+++ sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/history.html 2011-07-19 02:47:17 EDT (Tue, 19 Jul 2011)
@@ -36,7 +36,7 @@
</div>
<h1 class="title">Version history</h1>
- <h3>v1.9 (July 2nd)</h3>
+ <h3>v0.3.0 (July 2nd)</h3>
<p>
Big version jump. The dynamic Bloom filter is completed ahead of schedule. All that's left to do is a few documentation updates.
</p>
@@ -61,10 +61,10 @@
</li>
</ul>
<p>
- Version 2.0 is expected to arrive by July 4th with the introduction of a solid interface for the counting Bloom filter as well as up to date documentation.
+ Version 0.4.0 is expected to arrive by July 4th with the introduction of a solid interface for the counting Bloom filter as well as up to date documentation.
</p>
- <h3>v1.3 (June 24th)</h3>
+ <h3>v0.2.0 (June 24th)</h3>
<ul>
<li><strong>Test Changes</strong>
<ul>
@@ -76,7 +76,7 @@
</li>
</ul>
- <h3>v1.2 (June 23rd)</h3>
+ <h3>v0.1.2 (June 23rd)</h3>
<ul>
<li><strong>API Changes</strong>
<ul>
@@ -88,7 +88,7 @@
</li>
</ul>
- <h3>v1.1 (June 21st)</h3>
+ <h3>v0.1.1 (June 21st)</h3>
<ul>
<li><strong>API Changes</strong>
<ul>
@@ -99,7 +99,7 @@
</li>
</ul>
- <h2>v1.0 (early June)</h2>
+ <h3>v0.1.0 (early June)</h3>
<ul>
<li>Initial release.</li>
<li>Features:
Modified: sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/index.html
==============================================================================
--- sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/index.html (original)
+++ sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/index.html 2011-07-19 02:47:17 EDT (Tue, 19 Jul 2011)
@@ -35,18 +35,16 @@
<div class="toc">
<p>Table of Contents</p>
<ul>
- <li>Introduction</li>
<li>Tutorial</li>
<li>Reference</li>
<li>Testing</li>
- <li>Bibliography</li>
<li>Version History</li>
<li>Design</li>
<li>Acknowledgements</li>
+ <li>Bibliography</li>
</ul>
</div>
- <a name="introduction"></a>
<h2>Introduction</h2>
<p>
A Bloom filter is a probabilistic data structure that allows the compact representation of a set of elements. Bloom filters are used where an application can gain a performance advantage by avoiding an expensive lookup. For example, one might use a Bloom filter to ask whether a file may be stored on the network. If the Bloom filter returns false, the network lookup can be skipped. A few known applications of Bloom filters:
@@ -60,7 +58,7 @@
Internally, a Bloom filter stores its elements using a bit set with all bits initially set to 0. When an element is to be inserted into the Bloom filter, it is passed to a series of hash functions. The indices generated by the hash functions are all set to 1. When the Bloom filter is queried for an element's set membership, the element is passed through that same set of hash functions. If all the positions given by the hash functions are set to 1, then the element is reported to be in the set.
</p>
<p>
- A powerful property of a Bloom filter is that it is capable of representing the universe of items in a given set without generating any false negatives - if the element is in the Bloom filter, the Bloom filter will never report that the element is not contained within. However, a Bloom filter may report false positives, with a given probability based on the interaction between the number of bits the Bloom filter uses for element storage, the number of bits set, and the number of hash functions used.
+ A powerful property of a Bloom filter is that it is capable of representing the universe of items in a given set without generating any false negatives - if the element is in the Bloom filter, the Bloom filter will never report that the element is not contained within. However, a Bloom filter may report false positives with a given probability based on the interaction between the number of bits the Bloom filter uses for element storage, the number of bits set, and the number of hash functions used.
</p>
<hr/>
@@ -72,7 +70,7 @@
<div>
<p>
- Last revised: <time datetime="2011-07-02">July 2, 2011</time>.
+ Last revised: <time datetime="2011-07-19">July 19, 2011</time>.
</p>
<p class="copyright">
Modified: sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/reference.html
==============================================================================
--- sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/reference.html (original)
+++ sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/reference.html 2011-07-19 02:47:17 EDT (Tue, 19 Jul 2011)
@@ -24,7 +24,7 @@
<hr/>
<div class="spirit-nav">
- <a accesskey="p" href="tutorial.html">
+ <a accesskey="p" href="tut/tips.html">
<img src="../../../../doc/src/images/prev.png" alt="Prev"/>
</a>
<a accesskey="h" href="index.html">
@@ -40,18 +40,20 @@
<div class="toc">
<h3>Classes</h3>
<ul>
- <li>bloom_filter</li>
- <li>dynamic_bloom_filter</li>
+ <li>basic_bloom_filter</li>
+ <li>dynamic_basic_bloom_filter</li>
+ <li>counting_bloom_filter</li>
</ul>
<h3>Hashers</h3>
<ul>
<li>boost_hash (default hasher)</li>
+ <li>murmurhash3</li>
</ul>
</div>
<hr/>
<div class="spirit-nav">
- <a accesskey="p" href="tutorial.html">
+ <a accesskey="p" href="tut/tips.html">
<img src="../../../../doc/src/images/prev.png" alt="Prev"/>
</a>
<a accesskey="h" href="index.html">
Added: sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/reference/counting_bloom.html
==============================================================================
--- (empty file)
+++ sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/reference/counting_bloom.html 2011-07-19 02:47:17 EDT (Tue, 19 Jul 2011)
@@ -0,0 +1,418 @@
+<!DOCTYPE html>
+<html xmlns="http://www.w3.org/1999/xhtml">
+ <head>
+ <meta http-equiv="content-type" content="text/html; charset=utf-8"/>
+ <link rel="stylesheet" type="text/css"
+ href="../../../../../doc/src/boostbook.css"/>
+ <link rel="stylesheet" type="text/css" href="../style/my.css"/>
+
+ <title>Boost.BloomFilter</title>
+ </head>
+
+ <body>
+ <div class="header">
+ <img src="../../../../../boost.png" width="277" height="86"
+ alt="Boost C++ Libraries"/>
+ <p>
+ Home
+ Libraries
+ People
+ FAQ
+ More
+ </p>
+ </div>
+
+ <hr/>
+
+ <div class="spirit-nav">
+ <a accesskey="p" href="../reference.html">
+ <img src="../../../../../doc/src/images/prev.png" alt="Prev"/>
+ </a>
+ <a accesskey="u" href="../reference.html">
+ <img src="../../../../../doc/src/images/up.png" alt="Up"/>
+ </a>
+ <a accesskey="h" href="../index.html">
+ <img src="../../../../../doc/src/images/home.png" alt="Home"/>
+ </a>
+ <a accesskey="n" href="dynamic_bloom.html">
+ <img src="../../../../../doc/src/images/next.png" alt="Next"/>
+ </a>
+ </div>
+
+ <h1>Basic Bloom Filter</h1>
+ <div class="listing">
+ <code class="c_keyword">template</code> <<code class="c_keyword">typename</code> <code class="c_type">T</code>,
+ <code class="c_type">size_t</code> <code class="c_type">Size</code>,
+ <code class="c_keyword">class</code> <code class="c_type">HashFunctions</code> = <code class="c_namespace">mpl::</code><code class="c_type">vector</code><<code class="c_type">boost_hash</code><<code class="c_type">T</code>, 3> > >
+ <code class="c_keyword">class</code> <code class="c_id">bloom_filter</code> {
+ <code class="c_comment">// exported typedefs</code>
+ <code class="c_keyword">typedef</code> <code class="c_type">T value_type</code>;
+ <code class="c_keyword">typedef</code> <code class="c_type">T key_type</code>;
+ <code class="c_keyword">typedef</code> <code class="c_type">HashFunctions hash_function_type</code>;
+
+ <code class="c_comment">// constructors</code>
+ <code class="c_func">bloom_filter</code>();
+ <code class="c_func">bloom_filter</code>(<code class="c_keyword">const</code> <code class="c_namespace">std::initializer_list</code><<code class="c_type">T</code>>&); <code class="c_comment">// requires C++11</code>
+
+ <<code class="c_keyword">typename</code> <code class="c_type">InputIterator</code>>
+ <code class="c_func">bloom_filter</code>(<code class="c_keyword">const</code> <code class="c_type">InputIterator</code> <code class="c_id">start</code>, <code class="c_keyword">const</code> <code class="c_type">InputIterator</code> <code class="c_id">end</code>);
+
+ <code class="c_comment">// data structure metadata query functions</code>
+ <code class="c_keyword">static constexpr</code> <code class="c_type">size_t</code> <code class="c_func">bit_capacity</code>() <code class="c_keyword">const</code>;
+ <code class="c_keyword">static constexpr</code> <code class="c_type">size_t</code> <code class="c_func">num_hash_functions</code>() <code class="c_keyword">const</code>;
+ <code class="c_type">double</code> <code class="c_func">false_positive_rate</code>() <code class="c_keyword">const</code>;
+ <code class="c_type">size_t</code> <code class="c_func">count</code>() <code class="c_keyword">const</code>;
+ <code class="c_type">bool</code> <code class="c_func">empty</code>() <code class="c_keyword">const</code>;
+
+ <code class="c_comment">// data structures core ops</code>
+ <code class="c_type">void</code> <code class="c_func">insert</code>(<code class="c_keyword">const</code> <code class="c_type">T</code>&);
+
+ <<code class="c_keyword">typename</code> <code class="c_type">InputIterator</code>>
+ <code class="c_type">void</code> <code class="c_func">insert</code>(<code class="c_keyword">const</code> <code class="c_type">InputIterator</code> <code class="c_id">start</code>, <code class="c_keyword">const</code> <code class="c_type">InputIterator</code> <code class="c_id">end</code>);
+
+ <code class="c_type">bool</code> <code class="c_func">probably_contains</code>(<code class="c_keyword">const</code> <code class="c_type">T</code>&) <code class="c_keyword">const</code>;
+
+ <code class="c_comment">// auxilliary ops</code>
+ <code class="c_type">void</code> <code class="c_func">clear</code>();
+ <code class="c_type">void</code> <code class="c_func">swap</code>(<code class="c_type">bloom_filter</code>&);
+
+ <code class="c_comment">// union assign/intersect assign</code>
+ <code class="c_type">bloom_filter</code>& <code class="c_func">operator|=</code>(<code class="c_keyword">const</code> <code class="c_type">bloom_filter</code>&);
+ <code class="c_type">bloom_filter</code>& <code class="c_func">operator&=</code>(<code class="c_keyword">const</code> <code class="c_type">bloom_filter</code>&);
+ };
+
+ <code class="c_comment">// union</code>
+ <code class="c_keyword">template</code> <<code class="c_keyword">typename</code> <code class="c_type">T</code>, <code class="c_type">size_t</code> <code class="c_type">Size</code>,
+ <code class="c_keyword">class</code> <code class="c_type">HashFunctions</code>>
+ <code class="c_type">bloom_filter</code><<code class="c_type">T, Size, HashFunctions</code>>
+ <code class="c_func">operator|</code>(<code class="c_keyword">const</code> <code class="c_type">bloom_filter</code><<code class="c_type">T, Size, HashFunctions</code>>& <code class="c_id">lhs</code>,
+ <code class="c_keyword">const</code> <code class="c_type">bloom_filter</code><<code class="c_type">T, Size, HashFunctions</code>>& <code class="c_id">rhs</code>);
+
+ <code class="c_comment">// intersect</code>
+ <code class="c_keyword">template</code> <<code class="c_keyword">typename</code> <code class="c_type">T</code>, <code class="c_type">size_t</code> <code class="c_type">Size</code>,
+ <code class="c_keyword">class</code> <code class="c_type">HashFunctions</code>>
+ <code class="c_type">bloom_filter</code><<code class="c_type">T, Size, HashFunctions</code>>
+ <code class="c_func">operator&</code>(<code class="c_keyword">const</code> <code class="c_type">bloom_filter</code><<code class="c_type">T, Size, HashFunctions</code>>& <code class="c_id">lhs</code>,
+ <code class="c_keyword">const</code> <code class="c_type">bloom_filter</code><<code class="c_type">T, Size, HashFunctions</code>>& <code class="c_id">rhs</code>);
+
+ <code class="c_keyword">template</code> <<code class="c_keyword">typename</code> <code class="c_type">T</code>, <code class="c_type">size_t</code> <code class="c_type">Size</code>,
+ <code class="c_keyword">class</code> <code class="c_type">HashFunctions</code>>
+ <code class="c_type">void</code> <code class="c_func">swap</code>(<code class="c_type">bloom_filter</code><<code class="c_type">T, Size, HashFunctions</code>>& <code class="c_id">lhs</code>,
+ <code class="c_type">bloom_filter</code><<code class="c_type">T, Size, HashFunctions</code>>& <code class="c_id">rhs</code>);
+
+ <code class="c_comment">// equality</code>
+ <code class="c_keyword">template</code> <<code class="c_keyword">typename</code> <code class="c_type">T</code>, <code class="c_type">size_t</code> <code class="c_type">Size</code>,
+ <code class="c_keyword">class</code> <code class="c_type">HashFunctions</code>>
+ <code class="c_type">bool</code>
+ <code class="c_func">operator==</code>(<code class="c_keyword">const</code> <code class="c_type">bloom_filter</code><<code class="c_type">T, Size, HashFunctions</code>>& <code class="c_id">lhs</code>,
+ <code class="c_keyword">const</code> <code class="c_type">bloom_filter</code><<code class="c_type">T, Size, HashFunctions</code>>& <code class="c_id">rhs</code>);
+
+ <code class="c_keyword">template</code> <<code class="c_keyword">typename</code> <code class="c_type">T</code>, <code class="c_type">size_t</code> <code class="c_type">Size</code>,
+ <code class="c_keyword">class</code> <code class="c_type">HashFunctions</code>>
+ <code class="c_type">bool</code>
+ <code class="c_func">operator!=</code>(<code class="c_keyword">const</code> <code class="c_type">bloom_filter</code><<code class="c_type">T, Size, HashFunctions</code>>& <code class="c_id">lhs</code>,
+ <code class="c_keyword">const</code> <code class="c_type">bloom_filter</code><<code class="c_type">T, Size, HashFunctions</code>>& <code class="c_id">rhs</code>);
+ </div>
+ <p>
+ The basic Bloom filter data structure. Supports insertion and query. Allows setting of bit capacity, type, and hash functions at compile-time. This structure is best used when:
+ </p>
+ <ul>
+ <li>element removal is not needed</li>
+ <li>the number of insertions can be reliably upper-bounded ahead of time</li>
+ <li>storage requirements are strict</li>
+ </ul>
+
+ <h2>Template Parameters Reference</h2>
+ <dl>
+ <dt>T</dt>
+ <dd>The type used for all Bloom filter operations.</dd>
+ <dt>Size</dt>
+ <dd>The size in bits of the underlying std::bitset used by the Bloom filter.</dd>
+ <dt>HashFunctions</dt>
+ <dd>The set of hash functions used by the Bloom filter. Currently required to be an mpl::vector of Hashers.</dd>
+ </dl>
+
+ <div class="toc">
+ <p>Function Reference</p>
+ <ul>
+ <li>bloom_filter()</li>
+ <li>bloom_filter(start, end)</li>
+ <li>bloom_filter(initializer_list)</li>
+ <li>bit_capacity()</li>
+ <li>num_hash_functions()</li>
+ <li>false_positive_rate()</li>
+ <li>empty()</li>
+ <li>count()</li>
+ <li>insert(T)</li>
+ <li>insert(start, end)</li>
+ <li>probably_contains(T)</li>
+ <li>clear()</li>
+ <li>swap(bloom)</li>
+ <li>operator|=(bloom)</li>
+ <li>operator&=(bloom)</li>
+ <li>operator|(bloom, bloom)</li>
+ <li>operator&(bloom, bloom)</li>
+ <li>swap(bloom, bloom)</li>
+ <li>operator==(bloom, bloom)</li>
+ <li>operator!=(bloom, bloom)</li>
+ </ul>
+ </div>
+
+ <br/>
+
+ <div class="func_ref">
+ <a name="default_constructor"></a>
+ <div class="ref_listing"><code class="c_func">bloom_filter</code>();</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Constructs a bloom_filter object with all bits set to 0.</dd>
+ </dl>
+ </div>
+
+ <div class="func_ref">
+ <a name="range_constructor"></a>
+ <div class="ref_listing"><code class="c_func">bloom_filter</code>(<code class="c_keyword">const</code> <code class="c_type">InputIterator</code> <code class="c_id">start</code>, <code class="c_keyword">const</code> <code class="c_type">InputIterator</code> <code class="c_id">end</code>);</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Constructs a bloom_filter by inserting all the elements in the range (start, end).</dd>
+ </dl>
+ </div>
+
+ <div class="func_ref">
+ <a name="ilist_constructor"></a>
+ <div class="ref_listing"><code class="c_func">bloom_filter</code>(<code class="c_keyword">const</code> <code class="c_namespace">std::initializer</code><<code class="c_type">T</code>>&);</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Constructs a bloom_filter by inserting all elements in the initializer list.</dd>
+ <dt>Warning</dt>
+ <dd>Only available in C++11-supporting compilers.</dd>
+ </dl>
+ </div>
+
+ <div class="func_ref">
+ <a name="bit_capacity"></a>
+ <div class="ref_listing"><code class="c_type">size_t</code> <code class="c_func">bit_capacity</code>() <code class="c_keyword">const</code>;</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Returns the number of bits used internally by the Bloom filter..</dd>
+ <dt>Returns</dt>
+ <dd>Value of template parameter Size.</dd>
+ </dl>
+ </div>
+
+
+ <div class="func_ref">
+ <a name="num_hash_functions"></a>
+ <div class="ref_listing"><code class="c_type">size_t</code><code class="c_func"> num_hash_functions</code>()<code class="c_keyword"> const</code>;</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Returns the number of hash functions used by the Bloom filter.</dd>
+ <dt>Returns</dt>
+ <dd>mpl::size<HashFunctions>.</dd>
+ </dl>
+ </div>
+
+ <div class="func_ref">
+ <a name="false_positive_rate"></a>
+ <div class="ref_listing"><code class="c_type">double </code><code class="c_func">false_positive_rate</code>() <code class="c_keyword">const</code>;</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Returns the current false positive rate based upon the number of hash functions used (k), the number of bits available (m), and the number of bits set (n).</dd>
+ <dt>Returns</dt>
+ <dd>A double precision value calculated as (1 - e^(k*n/m))^k.</dd>
+ </dl>
+ </div>
+
+ <div class="func_ref">
+ <a name="empty"></a>
+ <div class="ref_listing"><code class="c_type">bool</code> <code class="c_func">empty</code>() <code class="c_keyword">const</code>;</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Returns true if no elements have been inserted into the Bloom filter.</dd>
+ <dt>Returns</dt>
+ <dd>std::bitset.count() == 0</dd>
+ </dl>
+ </div>
+
+ <div class="func_ref">
+ <a name="count"></a>
+ <div class="ref_listing"><code class="c_type">size_t</code> <code class="c_func">count</code>() <code class="c_keyword">const</code>;</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Returns the number of bits currently set in the Bloom filter.</dd>
+ <dt>Returns</dt>
+ <dd>std::bitset.count().</dd>
+ </dl>
+ </div>
+
+ <div class="func_ref">
+ <a name="insert"></a>
+ <div class="ref_listing"><code class="c_type">void</code> <code class="c_func">insert</code>(<code class="c_keyword">const</code> <code class="c_type">T&</code>);</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Inserts an element into the Bloom filter.</dd>
+ <dt>Post-condition</dt>
+ <dd>At least 1 bit and at most num_hash_functions() bits of the Bloom filter will have been set.</dd>
+ </dl>
+ </div>
+
+ <div class="func_ref">
+ <a name="range_insert"></a>
+ <div class="ref_listing"><code class="c_type">void</code> <code class="c_func">insert</code>(<code class="c_keyword">const</code> <code class="c_type">InputIterator</code> <code class="c_id">start</code>, <code class="c_keyword">const</code> <code class="c_type">InputIterator</code> <code class="c_id">end</code>);</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Inserts all elements in range (start, end).</dd>
+ <dt>Post-condition</dt>
+ <dd>At least n bits and at most (n * num_hash_functions()) bits of the Bloom filter will have been set, where n = distance(start, end).</dd>
+ </dl>
+ </div>
+
+ <div class="func_ref">
+ <a name="probably_contains"></a>
+ <div class="ref_listing"><code class="c_type">bool</code> <code class="c_func">probably_contains</code>(<code class="c_keyword">const</code> <code class="c_type">T&</code>) <code class="c_keyword">const</code>;</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Queries an element in the Bloom filter, checking each hash function value against the set bits.</dd>
+ <dt>Returns</dt>
+ <dd>True if all hash functions agree with the bitset; false otherwise.</dd>
+ <dt>Warning</dt>
+ <dd>It may return a false positive if there are hash collisions - this means that depending on the Size of the Bloom filter and the hash functions used, there is a probability that the Bloom filter will return true even though the element was never inserted.</dd>
+ </dl>
+ </div>
+
+ <div class="func_ref">
+ <a name="clear"></a>
+ <div class="ref_listing"><code class="c_type">void</code> <code class="c_func">clear</code>();</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Sets all bits in the Bloom filter to 0, effectively canceling all insertions.</dd>
+ <dt>Post-condition</dt>
+ <dd>this->count() == 0</dd>
+ </dl>
+ </div>
+
+ <div class="func_ref">
+ <a name="swap"></a>
+ <div class="ref_listing"><code class="c_type">void</code> <code class="c_func">swap</code>(<code class="c_type">bloom_filter</code>&);</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Swaps the bits in this Bloom filter with the argument Bloom filter.</dd>
+ </dl>
+ </div>
+
+ <div class="func_ref">
+ <a name="union_assign"></a>
+ <div class="ref_listing"><code class="c_type">bloom_filter&</code> <code class="c_func">operator|=</code>(<code class="c_keyword">const</code> <code class="c_type">bloom_filter&</code>);</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Performs a union.</dd>
+ <dt>Returns</dt>
+ <dd>Returns a reference to <code class="c_keyword">this</code>.</dd>
+ <dt>Post-condition</dt>
+ <dd>The number of bits set in <code class="c_keyword">this</code> will be greater than or equal to the number of bits set before the operation.</dd>
+ </dl>
+ </div>
+
+ <div class="func_ref">
+ <a name="intersect_assign"></a>
+ <div class="ref_listing"><code class="c_type">bloom_filter&</code> <code class="c_func">operator&=</code>(<code class="c_keyword">const</code> <code class="c_type">bloom_filter&</code>);</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Performs an intersection.</dd>
+ <dt>Returns</dt>
+ <dd>Returns a reference to <code class="c_keyword">this</code>.</dd>
+ <dt>Post-condition</dt>
+ <dd>The number of bits set in <code class="c_keyword">this</code> will be less than or equal to the number of bits set before the operation.</dd>
+ </dl>
+ </div>
+
+ <div class="func_ref">
+ <a name="union"></a>
+ <div class="ref_listing"><code class="c_type">bloom_filter&</code> <code class="c_func">operator|</code>(<code class="c_keyword">const</code> <code class="c_type">bloom_filter&</code> <code class="c_id">lhs</code>, <code class="c_keyword">const</code> <code class="c_type">bloom_filter&</code> <code class="c_id">rhs</code>);</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Performs a union.</dd>
+ <dt>Returns</dt>
+ <dd>Returns a new, stack-allocated Bloom filter that is the result of performing a union between the lhs and the rhs.</dd>
+ <dt>Post-condition</dt>
+ <dd>The number of bits set in the new filter will be greater than or equal to the number of bits set in max(lhs.count(), rhs.count()).</dd>
+ </dl>
+ </div>
+
+ <div class="func_ref">
+ <a name="intersect"></a>
+ <div class="ref_listing"><code class="c_type">bloom_filter&</code> <code class="c_func">operator&</code>(<code class="c_keyword">const</code> <code class="c_type">bloom_filter&</code> <code class="c_id">lhs</code>, <code class="c_keyword">const</code> <code class="c_type">bloom_filter&</code> <code class="c_id">rhs</code>);</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Performs an intersection.</dd>
+ <dt>Returns</dt>
+ <dd>Returns a new, stack-allocated Bloom filter that is the result of performing an intersect between the lhs and the rhs.</dd>
+ <dt>Post-condition</dt>
+ <dd>The number of bits set in the new filter will be less than or equal to the number of bits set in min(lhs.count(), rhs.count()).</dd>
+ </dl>
+ </div>
+
+ <div class="func_ref">
+ <a name="global_swap"></a>
+ <div class="ref_listing"><code class="c_type">void</code> <code class="c_func">swap</code>(<code class="c_type">bloom_filter</code>&, <code class="c_type">bloom_filter</code>&);</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Swaps the bits between the argument Bloom filters.</dd>
+ </dl>
+ </div>
+
+ <div class="func_ref">
+ <a name="operator_equality"></a>
+ <div class="ref_listing"><code class="c_type">bool</code> <code class="c_func">operator==</code>(<code class="c_keyword">const</code> <code class="c_type">bloom_filter&</code> <code class="c_id">lhs</code>, <code class="c_keyword">const</code> <code class="c_type">bloom_filter&</code> <code class="c_id">rhs</code>);</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Compares the bits of the argument Bloom filters for equality.</dd>
+ <dt>Returns</dt>
+ <dd>True if the bits match, false otherwise.</dd>
+ </dl>
+ </div>
+
+ <div class="func_ref">
+ <a name="operator_inequality"></a>
+ <div class="ref_listing"><code class="c_type">bool</code> <code class="c_func">operator!=</code>(<code class="c_keyword">const</code> <code class="c_type">bloom_filter&</code> <code class="c_id">lhs</code>, <code class="c_keyword">const</code> <code class="c_type">bloom_filter&</code> <code class="c_id">rhs</code>);</div>
+ <dl>
+ <dt>Description</dt>
+ <dd>Compares the bits of the argument Bloom filters for equality.</dd>
+ <dt>Returns</dt>
+ <dd>True if the bits don't match, false otherwise.</dd>
+ </dl>
+ </div>
+ <hr/>
+
+ <div class="spirit-nav">
+ <a accesskey="p" href="../reference.html">
+ <img src="../../../../../doc/src/images/prev.png" alt="Prev"/>
+ </a>
+ <a accesskey="u" href="../reference.html">
+ <img src="../../../../../doc/src/images/up.png" alt="Up"/>
+ </a>
+ <a accesskey="h" href="../index.html">
+ <img src="../../../../../doc/src/images/home.png" alt="Home"/>
+ </a>
+ <a accesskey="n" href="dynamic_bloom.html">
+ <img src="../../../../../doc/src/images/next.png" alt="Next"/>
+ </a>
+ </div>
+
+ <div>
+ <p class="copyright">
+ Copyright © 2011
+ <a href="mailto:cpp.cabrera_at_[hidden]">Alejandro Cabrera</a>
+ </p>
+
+ <p class="copyright">
+ Distributed under the Boost Software License, Version 1.0. (See
+ accompanying file
+ LICENSE_1_0.txt or
+ copy at
+ http://www.boost.org/LICENSE_1_0.txt)
+ </p>
+ </div>
+
+ </body>
+</html>
Added: sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/reference/hash/murmurhash3.html
==============================================================================
--- (empty file)
+++ sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/reference/hash/murmurhash3.html 2011-07-19 02:47:17 EDT (Tue, 19 Jul 2011)
@@ -0,0 +1,84 @@
+<!DOCTYPE html>
+<html xmlns="http://www.w3.org/1999/xhtml">
+ <head>
+ <meta http-equiv="content-type" content="text/html; charset=utf-8"/>
+ <link rel="stylesheet" type="text/css"
+ href="../../../../../../doc/src/boostbook.css"/>
+ <link rel="stylesheet" type="text/css" href="../../style/my.css"/>
+
+ <title>Boost.BloomFilter</title>
+ </head>
+
+ <body>
+ <div class="header">
+ <img src="../../../../../../boost.png" width="277" height="86"
+ alt="Boost C++ Libraries"/>
+ <p>
+ Home
+ Libraries
+ People
+ FAQ
+ More
+ </p>
+ </div>
+
+ <hr/>
+ <div class="spirit-nav">
+ <a accesskey="p" href="../dynamic_bloom.html">
+ <img src="../../../../../../doc/src/images/prev.png" alt="Prev"/>
+ </a>
+ <a accesskey="u" href="../../reference.html">
+ <img src="../../../../../../doc/src/images/up.png" alt="Up"/>
+ </a>
+ <a accesskey="h" href="../../index.html">
+ <img src="../../../../../../doc/src/images/home.png" alt="Home"/>
+ </a>
+ <a accesskey="n" href="../../testing.html">
+ <img src="../../../../../../doc/src/images/next.png" alt="Next"/>
+ </a>
+ </div>
+
+ <h1 class="title">Default Hash Function</h1>
+ <div class="listing">
+ <code class="c_keyword">template</code> <<code class="c_keyword">typename</code> <code class="c_id">T</code>, <code class="c_type">size_t</code> <code class="c_id">Size</code>>
+ <code class="c_keyword">struct</code> <code class="c_id">boost_hash</code> {
+ <code class="c_type">size_t</code> <code class="c_func">operator</code>()(<code class="c_keyword">const</code> <code class="c_type">T&</code>) const;
+ };
+ </div>
+ <p>
+ A generic Hasher provided by using boost::functional::hash_value. Used by the Bloom filter as a default template parameter for HashFunctions.
+ </p>
+
+ <hr/>
+ <div class="spirit-nav">
+ <a accesskey="p" href="../dynamic_bloom.html">
+ <img src="../../../../../../doc/src/images/prev.png" alt="Prev"/>
+ </a>
+ <a accesskey="u" href="../../reference.html">
+ <img src="../../../../../../doc/src/images/up.png" alt="Up"/>
+ </a>
+ <a accesskey="h" href="../../index.html">
+ <img src="../../../../../../doc/src/images/home.png" alt="Home"/>
+ </a>
+ <a accesskey="n" href="../../testing.html">
+ <img src="../../../../../../doc/src/images/next.png" alt="Next"/>
+ </a>
+ </div>
+
+ <div>
+ <p class="copyright">
+ Copyright © 2011
+ <a href="mailto:cpp.cabrera_at_[hidden]">Alejandro Cabrera</a>
+ </p>
+
+ <p class="copyright">
+ Distributed under the Boost Software License, Version 1.0. (See
+ accompanying file
+ LICENSE_1_0.txt or
+ copy at
+ http://www.boost.org/LICENSE_1_0.txt)
+ </p>
+ </div>
+
+ </body>
+</html>
Modified: sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/style/my.css
==============================================================================
--- sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/style/my.css (original)
+++ sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/style/my.css 2011-07-19 02:47:17 EDT (Tue, 19 Jul 2011)
@@ -34,6 +34,10 @@
font-family: sans;
}
+.warning {
+ color: red;
+}
+
.header img {
float: left;
position: absolute; top: 4px; left: 4px;
Modified: sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/inserting.html
==============================================================================
--- sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/inserting.html (original)
+++ sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/inserting.html 2011-07-19 02:47:17 EDT (Tue, 19 Jul 2011)
@@ -40,6 +40,25 @@
<h1 class="title">Inserting Elements</h1>
+ <p>
+ Instantiation is the hardest part about using a Bloom filter.
+ Once that step is complete, insertion, querying, and removal
+ proceed smoothly. The example below shows how to insert an
+ element into the Bloom filter:
+ </p>
+
+ <div class="example">
+ <code class="c_type">basic_bloom_filter</code><<code class="c_type">int</code>, 1000> <code class="c_id">bloom</code>;
+
+ bloom.insert(1);
+ bloom.insert(<code class="c_namespace">std::</code><code class="c_type">string</code>(<code class="c_str">"x"</code>)); <code class="c_comment">// compile-time error</code>
+ bloom.insert(1.0); <code class="c_comment">// fine: implicit cast float -> int</code>
+ </div>
+
+ <p>
+ Range insertion is also supported for all Bloom filter types.
+ </p>
+
<hr/>
<div class="spirit-nav">
<a accesskey="p" href="creating.html">
Copied: sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/joining.html (from r73237, /sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/tips.html)
==============================================================================
--- /sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/tips.html (original)
+++ sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/joining.html 2011-07-19 02:47:17 EDT (Tue, 19 Jul 2011)
@@ -33,12 +33,43 @@
<a accesskey="h" href="../index.html">
<img src="../../../../../doc/src/images/home.png" alt="Home"/>
</a>
- <a accesskey="n" href="../reference.html">
+ <a accesskey="n" href="tips.html">
<img src="../../../../../doc/src/images/next.png" alt="Next"/>
</a>
</div>
- <h1 class="title">Warnings and Best Practices</h1>
+ <h1 class="title">Intersections and Unions</h1>
+
+ <p>
+ As stated previously, the union of two Bloom filters is the
+ set of elements that occur in <strong>either</strong> Bloom filter.
+ The intersection of two Bloom filters is the set of elements that
+ occur in <strong>both</strong> Bloom filters. The examples
+ below demonstrate how to use this:
+ </p>
+
+ <div class="example">
+ <code class="c_type">basic_bloom_filter</code><<code class="c_type">int</code>, 1000> <code class="c_id">bloom1</code>;
+ <code class="c_type">basic_bloom_filter</code><<code class="c_type">int</code>, 1000> <code class="c_id">bloom2</code>;
+ <code class="c_type">basic_bloom_filter</code><<code class="c_type">int</code>, 1000> <code class="c_id">bloom_union</code>;
+ <code class="c_type">basic_bloom_filter</code><<code class="c_type">int</code>, 1000> <code class="c_id">bloom_intersect</code>;
+
+ bloom1.insert(1);
+ bloom1.insert(2);
+ bloom2.insert(2);
+ bloom2.insert(3);
+
+ bloom_union = bloom1 | bloom2;
+ bloom_intersect = bloom1 & bloom2;
+
+ bloom_union.probably_contains(1); <code class="c_comment">// true</code>
+ bloom_union.probably_contains(2); <code class="c_comment">// true</code>
+ bloom_union.probably_contains(3); <code class="c_comment">// true</code>
+
+ bloom_intersect.probably_contains(1); <code class="c_comment">// false</code>
+ bloom_intersect.probably_contains(2); <code class="c_comment">// true</code>
+ bloom_intersect.probably_contains(3); <code class="c_comment">// false</code>
+ </div>
<hr/>
<div class="spirit-nav">
@@ -51,7 +82,7 @@
<a accesskey="h" href="../index.html">
<img src="../../../../../doc/src/images/home.png" alt="Home"/>
</a>
- <a accesskey="n" href="../reference.html">
+ <a accesskey="n" href="tips.html">
<img src="../../../../../doc/src/images/next.png" alt="Next"/>
</a>
</div>
Modified: sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/querying.html
==============================================================================
--- sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/querying.html (original)
+++ sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/querying.html 2011-07-19 02:47:17 EDT (Tue, 19 Jul 2011)
@@ -40,6 +40,22 @@
<h1 class="title">Querying Elements</h1>
+ <p>
+ Querying a Bloom filter is also straightforward. The function to
+ remember is <em>probably_contains</em>, named as such to emphasize
+ the probabilistic nature of the data structure.
+ </p>
+
+ <div class="example">
+ <code class="c_type">basic_bloom_filter</code><<code class="c_type">int</code>, 1000> <code class="c_id">bloom</code>;
+
+ bloom.probably_contains(1); <code class="c_comment">// false</code>
+ bloom.insert(1);
+ bloom.probably_contains(1); <code class="c_comment">// true</code>
+ bloom.probably_contains(2); <code class="c_comment">// false</code>
+ bloom.probably_contains(1001); <code class="c_comment">// true, false positive</code>
+ </div>
+
<hr/>
<div class="spirit-nav">
<a accesskey="p" href="inserting.html">
Modified: sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/removing.html
==============================================================================
--- sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/removing.html (original)
+++ sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/removing.html 2011-07-19 02:47:17 EDT (Tue, 19 Jul 2011)
@@ -33,13 +33,51 @@
<a accesskey="h" href="../index.html">
<img src="../../../../../doc/src/images/home.png" alt="Home"/>
</a>
- <a accesskey="n" href="tips.html">
+ <a accesskey="n" href="joining.html">
<img src="../../../../../doc/src/images/next.png" alt="Next"/>
</a>
</div>
<h1 class="title">Removing Elements</h1>
+ <p>
+ Element removal is an operation supported only by the counting
+ Bloom filters. <em>remove</em> can introduce false negatives, so
+ it must be used with caution. The list below highlights the
+ possibilities:
+ </p>
+
+ <ul>
+ <li>underflow: element removal caused a bin to go from 0 to BIN_MAX. </li>
+ <li>overflow: element insertion caused a bin to go from BIN_MAX to 0. </li>
+ </ul>
+
+ <p>
+ This implementation does not check for either error.
+ </p>
+
+ <div class="example">
+ <code class="c_type">counting_bloom_filter</code><<code class="c_type">int</code>, 1000, 2> <code class="c_id">bloom</code>;
+
+ bloom.insert(1);
+ bloom.probably_contains(1); <code class="c_comment">// true</code>
+ bloom.remove(1);
+ bloom.probably_contains(1); <code class="c_comment">// false</code>
+
+ bloom.insert(1); <code class="c_comment">// bin[1] = 01 </code>
+ bloom.insert(1); <code class="c_comment">// bin[1] = 10 </code>
+ bloom.insert(1); <code class="c_comment">// bin[1] = 11 </code>
+ bloom.insert(1); <code class="c_comment">// bin[1] = 00 </code>
+ bloom.probably_contains(1); <code class="c_comment">// overflow, false</code>
+
+ bloom.remove(1); <code class="c_comment">// bin[1] = 11 </code>
+ bloom.probably_contains(1); <code class="c_comment">// underflow, true</code>
+ </div>
+
+ <p>
+ Range removal is also supported for counting Bloom filter types.
+ </p>
+
<hr/>
<div class="spirit-nav">
<a accesskey="p" href="querying.html">
@@ -51,7 +89,7 @@
<a accesskey="h" href="../index.html">
<img src="../../../../../doc/src/images/home.png" alt="Home"/>
</a>
- <a accesskey="n" href="tips.html">
+ <a accesskey="n" href="joining.html">
<img src="../../../../../doc/src/images/next.png" alt="Next"/>
</a>
</div>
Modified: sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/tips.html
==============================================================================
--- sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/tips.html (original)
+++ sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tut/tips.html 2011-07-19 02:47:17 EDT (Tue, 19 Jul 2011)
@@ -24,7 +24,7 @@
<hr/>
<div class="spirit-nav">
- <a accesskey="p" href="removing.html">
+ <a accesskey="p" href="joining.html">
<img src="../../../../../doc/src/images/prev.png" alt="Prev"/>
</a>
<a accesskey="u" href="../tutorial.html">
@@ -40,9 +40,78 @@
<h1 class="title">Warnings and Best Practices</h1>
+ <h3>Managing the False Positive Rate</h3>
+ <p>
+ The best possible case for managing your false positive rate is
+ a situation where you can reliably predict the number
+ of elements that your Bloom filter will be responsible for
+ representing. In this case, all you have to do is select a reasonable
+ number of hash functions and an adeqaute bit capacity.
+ </p>
+ <p>
+ If you cannot predict the number of elements reliably or the
+ bit capacity requirement is greater than the size of the
+ stack, then you should prefer
+ to use the dynamic Bloom filter variants.
+ </p>
+
+ <a name="stack_overflow"></a>
+ <h3>Stack Overflow</h3>
+ <p>
+ <span class="warning">WARNING</span>: all Bloom filters that allow
+ you to set the bit capacity as a compile-time parameter are susceptible
+ to stack overflows. The stack size depends on the operating system and
+ compiler used. If your Bloom filter requires more than 1MB in
+ bit capacity or you are getting mysterious segmentation faults,
+ switch over to a dynamic variant.
+ </p>
+
+ <a name="resize"></a>
+ <h3>Resizing Wisely</h3>
+ <p>
+ If it is possible to reliably predict the bit capacity requirement,
+ then it is best to avoid resize operations entirely. A resize operation
+ clears a Bloom filter of all history - it is akin to flushing the cache.
+ However, if you resizing is in the best interest of your application,
+ then do so when the false positive rate becomes unacceptable.
+ </p>
+ <p>
+ As an aside, there exist theoretical Bloom filter variants that can maintain
+ a specific false positive rate as the number of elements increases. These
+ "scalable" Bloom filters do not suffer from the resize problem stated above.
+ Refer to the bibliography for more details.
+ </p>
+
+ <h3>False Negatives</h3>
+ <p>
+ A false negative can occur in any Bloom filter that supports a <em>remove</em>
+ operation.
+ </p>
+
+ <h3>Hasher Set Trade-Offs</h3>
+ <p>
+ A fast Bloom filter will prefer to use fewer and faster hashers. If your
+ application is mostly CPU bound or memory bound, then your Bloom filter
+ should probably prefer this approach.
+ </p>
+ <p>
+ If your application is mostly disk bound, you can afford to use more
+ hashers to limit your false positive rate. You may see significant
+ performance improvements by increasing the number of hashers.
+ </p>
+ <p>
+ If your application has requirements on being able to limit the number
+ of collisions regardless of input characteristics, prefer cryptographic
+ hashers.
+ </p>
+ <p>
+ If ever in doubt, benchmark. This family of Bloom filters was designed to
+ be easy to experiment with.
+ </p>
+
<hr/>
<div class="spirit-nav">
- <a accesskey="p" href="removing.html">
+ <a accesskey="p" href="joining.html">
<img src="../../../../../doc/src/images/prev.png" alt="Prev"/>
</a>
<a accesskey="u" href="../tutorial.html">
Modified: sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tutorial.html
==============================================================================
--- sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tutorial.html (original)
+++ sandbox/bloom_filter/trunk/libs/bloom_filter/doc/html/tutorial.html 2011-07-19 02:47:17 EDT (Tue, 19 Jul 2011)
@@ -47,6 +47,7 @@
<li>Inserting Elements</li>
<li>Querying Elements</li>
<li>Removing Elements</li>
+ <li>Intersections and Unions</li>
<li>Tips & Best Practices</li>
</ul>
</div>
@@ -55,61 +56,26 @@
<h3>Overview</h3>
<p>
- There are two primitive operations that are supported by all Bloom filters: <em>insert</em> and <em>probably_contains</em>. A third operation, <em>remove</em>, is supported by counting Bloom filters, and a few similar variants. In order to control the false positive rate of a Bloom filter, you can control the number of hash functions used, as well as the bit capacity.
+ There are two primitive operations that are supported by all Bloom
+ filters: <em>insert</em> and <em>probably_contains</em>. A third
+ operation, <em>remove</em>, is supported by counting Bloom filters.
+ To effectively use a Bloom filter, the only other concept to understand
+ is the false positive rate.
</p>
- <h3>Types of Bloom Filters Provided</h3>
- <ul>
- <li>Basic Bloom filter (basic_bloom_filter, dynamic_bloom_filter)</li>
- <li>Counting Bloom filter (counting_bloom_filter)</li>
- </ul>
-
- <h4>Basic Bloom Filter</h4>
-
<p>
- To use this Bloom filter, there are two template parameters to be aware of and two functions to utilize:
+ Bloom filters also support the intersect and union operations.
+ The intersection of two Bloom filters represents the elements
+ that are in both bloom filters. The union of two Bloom filters
+ represents the elements that are in either Bloom filter.
</p>
+ <h3>Types of Bloom Filters Provided</h3>
<ul>
- <li>Template parameters:
- <ul>
- <li>Type: the type of elements this Bloom filter will store</li>
- <li>Size: the number of bits used to store the Bloom filter state</li>
- </ul>
- </li>
- <li>Basic functions:
- <ul>
- <li><code>void insert(const T&);</code></li>
- <li><code>bool probably_contains(const T&);</code></li>
- </ul>
- </li>
- </ul>
-
- <ul>
- <li>Advanced template parameters:
- <ul>
- <li><code>HashFunctions</code>:
- a boost::mpl::vector of Hasher objects.</li>
- </ul>
- </li>
- <li>Advanced state monitoring functions:
- <ul>
- <li><code>size_t size() const</code>: returns the value of the template parameter Size</li>
- <li><code>size_t num_hash_functions() const</code>: returns the number of hash functions used to define this Bloom filter.</li>
- <li><code>double false_positive_rate() const </code>: returns the current false positive rate</li>
- </ul>
- </li>
+ <li>Basic Bloom filter (basic_bloom_filter, dynamic_bloom_filter)</li>
+ <li>Counting Bloom filter (counting_bloom_filter)</li>
</ul>
-
- <p>
- All the details of insertion and querying the Bloom filter are handled
- transparently by the insert() and probably_contains() functions, regardless
- of the type, bit size, and hash functions used. This makes it very
- convenient to experiment with different Bloom filter parameters in
- projects you commit to.
- </p>
-
<hr/>
<div class="spirit-nav">
<a accesskey="p" href="index.html">
Boost-Commit list run by bdawes at acm.org, david.abrahams at rcn.com, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk