Boost logo

Boost :

Subject: Re: [boost] A possible GSOC 2011 proposal idea
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2011-03-25 11:33:41


Chad Seibert wrote:
> I have a new proposal, however. There is no doubt that Suffix Tree's are
> useful and I think it would be a great addition to the Boost library. Over
> the GSOC timeframe, I'll implement a RAM based suffix tree algorithm such
> as Ukkonen's algorithm. The nice thing about implementing Ukkonen's
> algorithm is that it isn't difficult to get a simple implementation running
> and to speed it up later. Please see the implementation section of "
> http://en.wikipedia.org/wiki/Suffix_tree" for more information.

I'm sure that a good suffix tree library would be welcomed. I can't
say whether the amount of work is sane for a GSoC project, though.

Have you considered suffix arrays? Personally I have found more
real-world applications for these, rather than trees. Actually the
variant that I use currently is a suffix array that indexes only the
start of each word; I'm not sure how practical it would be to make a
generic suffix array that could be specialised with a predicate to
match in that way.

Do you have a motivating application? Or at least something real to
benchmark with? If you do, that makes it much easier to choose between
implementation options etc.

Regards, Phil.

(p.s. you might like to change the message subject, if you want this
new idea to get noticed...)


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk