|
Boost : |
Subject: Re: [boost] [ boost ] [ Trie ]
From: Cosmin Boaca (boost.cosmin.boaca_at_[hidden])
Date: 2015-03-14 07:22:57
On 14 March 2015 at 10:18, Antony Polukhin <antoshkka_at_[hidden]> wrote:
> 2015-03-13 23:13 GMT+03:00 Cosmin Boaca <boost.cosmin.boaca_at_[hidden]>:
> OK. Not a problem, we're not in a rush.
>
> By the way, think of making `parent` a private variable and adding
> (get|set)_parent() functions. We can later try to make node even smaller by
> removing `parent` variable and getting parent node by calling
> intrusive::set::container_from_iterator(iterator);
>
>
> > I have thought a little about the container used for children. I think
> that
> > a container with less insertion / find overhead would be better. I think
> > that maintaining a sorted intrusive list / vector of pointers would
> perform
> > better if the number of children for a node is small (something like <
> 50 ,
> > maybe < 100). We may tune this further on anyway, maybe using some kind
> of
> > policy for the container.
> >
>
> I'm thinking of a slightly different approach: we can make a specific
> "leaf" node, that contains the remaining part of the value. For example, if
> we have two values in trie "hello word" and "hell's bells", then the trie
> would look like:
>
> node['h']->node['e']->node['l']->node['l']->(leaf["o world"] | leaf['''s
> bells']);
>
> But this must be done only after the existing code would be polished
> enough.
>
Related to this, I was thinking at changing the data structure to radix
tree [0]. It would be more compact. You're idea it's something like a mix
between trie and radix tree. In a radix tree the two strings would be
represented like :
node["hell"] -> (node["o world"] | node["'s bells"]). I think there are
specific cases in which it performs worse than a trie, for instance when
you add all the strings that may be formed by an alphabet, but this is a
quite rare case, but in the general case it's more compact and I think it
performs better.
Cosmin
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk