
Chad Seibert wrote:
I have a new proposal, however. There is no doubt that Suffix Tree's are useful and I think it would be a great addition to the Boost library. Over the GSOC timeframe, I'll implement a RAM based suffix tree algorithm such as Ukkonen's algorithm. The nice thing about implementing Ukkonen's algorithm is that it isn't difficult to get a simple implementation running and to speed it up later. Please see the implementation section of " http://en.wikipedia.org/wiki/Suffix_tree" for more information.
I'm sure that a good suffix tree library would be welcomed. I can't say whether the amount of work is sane for a GSoC project, though. Have you considered suffix arrays? Personally I have found more real-world applications for these, rather than trees. Actually the variant that I use currently is a suffix array that indexes only the start of each word; I'm not sure how practical it would be to make a generic suffix array that could be specialised with a predicate to match in that way. Do you have a motivating application? Or at least something real to benchmark with? If you do, that makes it much easier to choose between implementation options etc. Regards, Phil. (p.s. you might like to change the message subject, if you want this new idea to get noticed...)