|
Boost : |
Subject: [boost] [Itl] Review
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2010-02-23 14:26:47
My review of the proposed Interval Template Library follows. This is
based only on studying the documentation, and previous discussion of
the library on the Boost mailing list (e.g.
http://lists.boost.org/Archives/boost/2009/03/149396.php ).
Joachim has clearly put a lot of work into what is a substantial and
well-documented library. Unfortunately, it seems to me that it is
built on a weak foundation.
One of the most well-known problems involving intervals is to determine
which of a given set of intervals contains a particular point. How to
store a set of intervals in such a way as to make this problem
efficient has been extensively studied. If an "interval tree" (See
e.g. Wikipedia: http://en.wikipedia.org/wiki/Interval_tree) is used,
then the problem can be answered in logarithmic time and linear space.
(More precisely: if there are N intervals in the container and M
overlap the point of interest, finding them should take O(M + log N)
time and the data structure should take O(N) space.)
I was hoping to find that this library was an implementation of an
interval tree. But it is not: it is built on top of normal
standard-library containers. The implementation used can find the
intervals that overlap a point in O(M + log N) time, but to do so it
needs worst-case O(N^2) space and I think it needs O(N^2 log N) time to
build that structure.
Presumably this complexity is not an issue in Joachim's application.
Unfortunately, library code will be used in many different applications
including those where optimal complexity is required. For example,
consider trying to store the lifespans of people in a genealogy
program: recording a few thousand people living over a few hundred
years would approach the worse-case space complexity and need megabytes
of storage.
Perhaps other reviewers will feel that there are sufficient
applications where this library is suitable to justify its inclusion in
Boost. My feeling is that since the interval tree data structure is so
well known and its performance is so much better, we should not accept
this library.
Ideally it would be possible for Joachim to replace ITL's current
standard containers with an interval tree implementation. Could an
interval tree be implemented with an interval sufficiently similar to a
std::set/map to make that possible, without losing the complexity benefits?
Once again, I believe that apart from this weak foundation this looks
like an excellent submission that is comprehensive and well-documented.
Regards, Phil.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk