|
Boost : |
Subject: Re: [boost] [review] [heap] Summary
From: Tim Blechmann (tim_at_[hidden])
Date: 2011-06-19 09:22:02
hi andrew and list,
> Other comments received during the course of the review were generally
> positive, with a number of technical issues identified that need to to
> be addressed. The final version of the library can be fully accepted
> once Tim has addressed the technical issues identified during the
> review.
i will try to address the issues in the next few weeks and post an updated
version when it is ready ...
> - The performance of b_heap vs. d_ary_heap vs. priority_queue needs to
> be better characterized. Under what circumstances does one outperform
> the other. If no convincing scenario can be found where b_heap
> outperforms the others, then it should also be placed in an
> experimental directory.
i have been thinking about a `best practices' section, discussing the
performance aspects in more detail.
> - The heap concepts need to be revisited as per comments on the review
> tool. The concepts should also describe semantics of the operations,
> not just the interfaces.
ok
> - There were several comments on the implementation of heap stability.
> Three alternatives were identified: ânaturally stableâ heaps (of which
> no examples were given), the current counter based approach, and a
> chaining approach. No consensus on the best implementation strategy.
> Although the counter-based approach is probably adequate for most
> applications, it is not universally desirable. A more thorough
> investigation of alternative strategies is encouraged.
in order to implement stability without a version count, we need to be able to
find out, if a value is already inside the heap. since a heap is not a search
tree the possible solutions are (a) iterate the complete heap (O(n)), (b) use a
treap (mixture of heap and binary search tree, O(log(n))) or (c) maintain a
separate hash table O(1). then nodes can be ordered in `natural stable' order or
they could be chained.
even compared with a hash table, the version count will probably be more time-
efficient (no hashing required) and most likely more memory efficient ...
> - The notion of Mergeable heaps needs to be better addressed, at least
> to differentiate between heaps that merge in O(n) or slower time and
> those that can merge in O(lg n) time. The proposed solution is to
> write merge as a general heap algorithm and specialize it for
> Mergeable heaps by calling the member function merge. Also, merge
> should be destructive, so merge and clear should not be needed.
more or less done
> - The documentation must be substantially improved with respect to the
> overall design of heap data structures (i.e., concepts), especially
> mutable heaps and best practices. It also needs to provide both
> theoretical and performance comparisons for the different heaps.
already started to work on this
> - Include documentation of the benchmarking tool in the documentation.
> It could be instructive for users to evaluate different heap
> implementations on their host or target architectures.
will give some more details about my benchmarking procedure. but like all
synthetic benchmarks, they should be taken with a grain of salt ...
> - Document the complexity of == and < for heaps.
ok
> - Address the comments and issues in the Review Tool.
ok
> - Phil Endecott suggested rewriting some heaps (b_heap and d_ary_heap)
> in terms of more general algorithms operating on a (Random Access)
> Range in order to support the formulation of heap structures on other
> memory structures. Doing this would also help support the definition
> of heaps that satisfied Thorsten's request.
i will have a look when i addressed the other issues.
i would like to thank the reviewers for their feedback and the valuable
discussions about the boost.heap library, andrew for running the review and
google for supporting this project as part of the summer of code program.
cheers, tim
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk