Subject: Re: [boost] [gsoc]built-in support for dictionary words
From: Arash Partow (arash_at_[hidden])
Date: 2009-03-29 11:00:27
Mathias Gaunard wrote:
> Thankfully, the Unicode standard defines representation and a lot of
> operations. The funny thing is that some languages, such as Thai,
> actually require a dictionary to tell words apart from each other, since
> there are no explicit word boundaries (alternatively, it can be done
> using machine learning algorithms to perceive word-like constructs, there
> are quite a few research papers on that topic).
I would take that further: rather than supporting only spoken languages,
generalize the concept to accept any form of bit grouping, so that it could
also be useful in other areas of computer science.
Furthermore, Bloom filters would be only one aspect of such a library; the
underlying data structures would also require tries, wide-column stores, etc.
That seems more like a few GSoC projects than a single one.
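
To make the "any form of bit groupings" idea concrete, here is a minimal
sketch of a Bloom filter keyed on raw byte sequences rather than on
characters or words. Everything in it is an assumption for illustration: the
class name ByteBloomFilter, the FNV-1a hash, and the double-hashing scheme
are hypothetical choices, not part of any existing or proposed Boost
interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sketch: a Bloom filter over arbitrary byte sequences.
// It never reports a false negative, but it may report false positives.
class ByteBloomFilter {
public:
    ByteBloomFilter(std::size_t bit_count, std::size_t hash_count)
        : bits_(bit_count, false), hash_count_(hash_count) {
        assert(bit_count > 0 && hash_count > 0);
    }

    // Record a key given as a raw byte range.
    void insert(const void* data, std::size_t length) {
        const std::uint64_t h1 = fnv1a(data, length, kSeedA);
        const std::uint64_t h2 = fnv1a(data, length, kSeedB) | 1u;
        for (std::size_t i = 0; i < hash_count_; ++i)
            bits_[(h1 + i * h2) % bits_.size()] = true;
    }

    // Returns false if the key was definitely never inserted,
    // true if it was possibly inserted.
    bool may_contain(const void* data, std::size_t length) const {
        const std::uint64_t h1 = fnv1a(data, length, kSeedA);
        const std::uint64_t h2 = fnv1a(data, length, kSeedB) | 1u;
        for (std::size_t i = 0; i < hash_count_; ++i)
            if (!bits_[(h1 + i * h2) % bits_.size()])
                return false;
        return true;
    }

    // Convenience overloads: a string is just another byte sequence.
    void insert(const std::string& s) { insert(s.data(), s.size()); }
    bool may_contain(const std::string& s) const {
        return may_contain(s.data(), s.size());
    }

private:
    // Two arbitrary starting values give two hash functions; the k probe
    // positions are then derived as h1 + i * h2 (double hashing).
    static constexpr std::uint64_t kSeedA = 0xcbf29ce484222325ULL;
    static constexpr std::uint64_t kSeedB = 0x84222325cbf29ce4ULL;

    // FNV-1a over the bytes, starting from the given seed value.
    static std::uint64_t fnv1a(const void* data, std::size_t length,
                               std::uint64_t seed) {
        const unsigned char* p = static_cast<const unsigned char*>(data);
        std::uint64_t h = seed;
        for (std::size_t i = 0; i < length; ++i) {
            h ^= p[i];
            h *= 0x100000001b3ULL;
        }
        return h;
    }

    std::vector<bool> bits_;
    std::size_t hash_count_;
};
```

Because the key is only a pointer and a length, the same filter accepts
UTF-8 text, packed binary records, or any other byte grouping. Exact lookups
and prefix queries would still need something like a trie behind it.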
Be one who knows what they don't know,
Instead of being one who knows not what they don't know,
Thinking they know everything about all things.