|
Boost Users : |
From: Elisha Berns (e.berns_at_[hidden])
Date: 2005-09-17 23:57:35
Thanks for the reply,
Your point is well taken. Well if that's so then why does boost::spirit
implement a TST?
And just to get a bit more clarity here, at the site for the TST their
data set looks much like a pretty random collection of words (a typical
dictionary), so I guess the question is which container works best for
implementing a) a very small dictionary of under 100 terms that are
comprised of the Unicode invariant characters (about 60 characters)
which is even a subset of ASCII and b) a larger dictionary of thousands
of terms comprised of the entire Unicode BMP set of characters?
Thanks,
Elisha
>
> >> 2) Would its performance be much better than a general purpose
> >> associative container?
> >
> > Performance-wise, it should be about the same as a hash_map.
> > Memory-wise, a trie should be much better.
> >
> > AFAIK there is nothing resembling a trie in Boost, but there are
other
> > free implementations available.
> >
>
> Spirit has a TST as a internal class. It is used in
boost::spirit::symbols
> class.
>
> For information about TST see
<http://www.cs.princeton.edu/%7Ers/strings/>
> or a recent copy of Sedgewick's "Algorithms" book.
>
> Memory-wise a trie could be much worse than a hash_map, it depends on
the
> data set. If enough of your keys are prefixes of one another then
memory
> usage is not that bad. However if your keys are unique or the end of
the
> key is the common part then an excessive amount of memory can be used.
> Since for a TST a typical node will be at least 16 bytes on a 32bit
> machine. That is 16 bytes per character, not per string.
>
> Nathan E. Moore
> _______________________________________________
> Boost-users mailing list
> Boost-users_at_[hidden]
> http://lists.boost.org/mailman/listinfo.cgi/boost-users
Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net