Subject: Re: [boost] [gsoc]built-in support for dictionary words
From: Arash Partow (arash_at_[hidden])
Date: 2009-03-29 11:00:27


Mathias Gaunard wrote:
>
> Thankfully, the Unicode standard defines representation and a lot of
> operations. The funny thing is that some languages, such as Thai,
> actually require a dictionary to tell words apart from each other, since
> there are no explicit word boundaries (alternatively, it can be done
> using machine learning algorithms to perceive word-like constructs; there
> are quite a few research papers on that topic).
>

I would take that further: rather than supporting only spoken languages,
generalize the concept to operate on any form of bit grouping, so that it
could also be useful in other areas of comp-sci.

Furthermore, Bloom filters would be only one aspect of such a library; the
underlying data structures would also require tries, wide-column stores etc.
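
As a rough illustration of the "any bit grouping" idea, here is a minimal
sketch of a Bloom filter keyed on raw bytes rather than on characters. It is
not an existing Boost interface: the class name bloom_filter, its members, and
the choice of FNV-1a with double hashing are all assumptions made for the
example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch, not a Boost API. Keys are arbitrary byte ranges,
// so the same filter works for words, packets, k-mers, and so on.
class bloom_filter {
public:
    // bit_count must be greater than zero; hash_count is the number of
    // bit positions set (and later probed) per key.
    bloom_filter(std::size_t bit_count, std::size_t hash_count)
        : bits_(bit_count, false), hash_count_(hash_count) {}

    // Record a key by setting hash_count_ bit positions derived from it.
    void insert(const void* data, std::size_t length) {
        const std::uint64_t h1 = fnv1a(data, length, 14695981039346656037ULL);
        const std::uint64_t h2 = fnv1a(data, length, 0x9E3779B97F4A7C15ULL) | 1;
        for (std::size_t i = 0; i < hash_count_; ++i)
            bits_[(h1 + i * h2) % bits_.size()] = true;
    }

    // Returns false if the key was definitely never inserted;
    // returns true if it was inserted or on a false positive.
    bool may_contain(const void* data, std::size_t length) const {
        const std::uint64_t h1 = fnv1a(data, length, 14695981039346656037ULL);
        const std::uint64_t h2 = fnv1a(data, length, 0x9E3779B97F4A7C15ULL) | 1;
        for (std::size_t i = 0; i < hash_count_; ++i)
            if (!bits_[(h1 + i * h2) % bits_.size()])
                return false;
        return true;
    }

private:
    // FNV-1a over the raw bytes, starting from a caller-supplied basis.
    static std::uint64_t fnv1a(const void* data, std::size_t length,
                               std::uint64_t basis) {
        const unsigned char* p = static_cast<const unsigned char*>(data);
        std::uint64_t h = basis;
        for (std::size_t i = 0; i < length; ++i) {
            h ^= p[i];
            h *= 1099511628211ULL;  // 64-bit FNV prime
        }
        return h;
    }

    std::vector<bool> bits_;
    std::size_t hash_count_;
};
```

A filter like this only answers "possibly present" or "definitely absent", so
an exact structure such as a trie would still be needed behind it to confirm
a hit.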

Seems more like a few GSOCs.

Arash Partow
________________________________________________________
Be one who knows what they don't know,
Instead of being one who knows not what they don't know,
Thinking they know everything about all things.
http://www.partow.net

