Boost logo

Boost :

From: Hervé Brönnimann (hbr_at_[hidden])
Date: 2006-05-04 09:32:52


Thorsten and anyone interested in this proposal (available at
<http://tinyurl.com/lprgp> for those joining this thread): I'd have
been highly interested in mentoring it, but that's fine, I can help
from the sidelines. This is how:

I'm just finishing supervising an MS thesis on Space-efficient
Geometric Algorithms and Data Structures. In the process, we had to
implement the following:

* Beap: fully implicit data structure (encoded as a permutation of a
range [first,last) much like a binary heap) that supports search,
insertion, and deletion in O(sqrt n) time.

* Dynamic Sorted Vector: better than sorted vector and beaps for
moderate sizes. Is a fully implicit data structure (again, no extra
storage than the objects themselves, stored as a permutation of a
sequence much like binary heap). The idea is to dynamize sorted
vector using decomposable data structuring due to Bentley and Saxe
(see Mehlhorn's book). Search, predecessor and successor is in O
(log^2 n) worst case, but insertion and deletion are in O(log^2 n)
amortized time (and perhaps better, I haven't looked it up in a while).

* Munro's implicit ordered dictionary data structure (Journal Comput.
System Science, 1986): this one provides a trade-off of 16s extra
pointers with search, insertion and deletion in O(log n + s*s) time.
We're still in the final stages of debugging it. It's highly non-
trivial!!!

Of course, none of these structures provide stable external
references (iterators), or rather, like heaps, they provide iterators
with invalidation of all iterators upon any modifying operation
(change-key, insert or delete). For providing iterators, it is
possible to do by storing the objects in a list and having the DS
stored in an array of pointers to the list nodes. This is what
Dietmar Kuehl had done for his small collection of priority queues
(which I still use fondly, btw, Dietmar; too bad those were never
submitted to Boost!) The space required is 2n extra pointers and you
lose flexibility: the DS no longer operates on an external range, but
instead manages its own storage.

In addition, as a research project, Pat Morin, Jyrki Katajainen (head
of the Copenhagen STL project) and myself are discussing various
ideas for lowering the extra storage requirements to (1+epsilon)
pointers per element, and still provide external references. I'd
love to have a programmer for those ideas.

To finish off, I have a few utilities for compacting bits into
pointers (such as the red/black bit) and saving space in red/black
tree implementation (because of alignment, the extra bool for the
color of a node ends up costing as much as a pointer even in the
latest gcc implementation!). And I have many implementations of
binary search trees all suitable for implementing an STL map, which
have various tradeoffs. There are two which are still unimplemented
and would be a great contribution to this project:
* a BST with three pointers per node providing O(1) time iterators
increment and decrement, as well as all the other red/black tree
guarantees (worst case O(log n) per operations).
* a BST with two pointers per node providing O(1) time iterators and
expected O(log n) time per operations. Note that expectation is NOT
based on element order of insertion or distribution. It's a fully
randomized algorithm (threaded treaps, with the priority being a hash
of the node's address, if you must know).

The MS student who did the work on implicit DS described above is no
longer enrolled, and works full-time at some company, so not
eligible. However, I would gladly contribute his work to the project
for inclusion. There will be a bit of work to clean up the code and
make it fit homogeneously in the project, but that's what the SoC
student gets paid for, no? And it will be highly informative and
cutting-edge research, surely leading to publications.

Hope that gets enough candidates excited!

--
Hervé Brönnimann
CIS, Polytechnic University
hbr_at_[hidden]
On Apr 24, 2006, at 3:34 PM, Thorsten Ottosen wrote:
> Rene Rivera wrote:
>> To Whomever added the " Associative containers specialized for low
>> memory footprint and fast lookup" item to the project list
>
> It was me...I better explanation shuold be in place now.
>
>> <http://tinyurl.com/lprgp>. I curious which containers in  
>> particular you
>> have in mind? I'm asking because I'm working on such a container,  
>> both
>> for myself and for Boost, bjam, and binutils/ld/bfd. The particular
>> container is the radix tree <http://en.wikipedia.org/wiki/ 
>> Radix_tree>.
>
> Nope it wasn't that I was thinking of. See wiki.
>
> -Thorsten
> _______________________________________________
> Unsubscribe & other changes: http://lists.boost.org/mailman/ 
> listinfo.cgi/boost
>

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk