|
Boost : |
From: joel de guzman (djowel_at_[hidden])
Date: 2002-01-02 12:12:11
----- Original Message -----
From: "Nathan Myers" :
>
> BTW^2 (speaking of work), trees keyed on strings group entries with identical
> prefixes together, so that as you walk down them or rebalance them you are
> comparing the same substrings over and over before you get to the parts that
> differ. Specializing trees for the case of string keys to identify where
> the difference begins, should speed up string-keyed trees dramatically.
> (I think the extremum of such an optimization is a lexicographic tree.)
BTW, symbol tables in Spirit are implemented using ternary state
trees (TSTs). Common prefixes are shared. Searching for a string
of length k in a ternary search tree with n strings will require at most
O(log n+k) *character* comparisons. Quite nifty indeed. The data
structure is faster than hashing for many typical search problems
especially when the search interface is iterator based. Instead of
passing in a string in its "find" function, you can just pass in first-last
iterators straight from the stream, no need to pre-extract anything.
TSTs are many times faster than hash tables for unsuccessful searches
since mismatches are discovered earlier after examining only a few
characters. Hash tables always examine an entire key when searching.
--Joel
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk