Subject: [boost] [gsoc] unicode tools and an unicode string type
From: Mathias Gaunard (mathias.gaunard_at_[hidden])
Date: 2009-03-29 21:40:36


I plan to submit my Summer of Code proposal on Unicode during the
week.

I plan to provide:
- iterator adaptors that traverse a sequence encoded in UTF-8, UTF-16,
UCS-2 or UTF-32/UCS-4 as code units, code points or graphemes, and
eventually more
- miscellaneous utilities, such as categorization of code points
- normalization functions
- comparisons, but not collation
- substring search algorithms
- and finally, a Unicode string type
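
As a rough illustration of the first item, here is a minimal sketch of
an adaptor that walks a UTF-8 code unit sequence and yields code points.
The names utf8_code_point_iterator and code_points are hypothetical and
are not part of any existing Boost interface; the sketch only covers
UTF-8 input and reports malformed input by throwing.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical adaptor (illustrative name): reads UTF-8 code units from
// an underlying iterator range and yields one code point per step.
template <typename It>
class utf8_code_point_iterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = char32_t;
    using difference_type = std::ptrdiff_t;
    using pointer = const char32_t*;
    using reference = char32_t;

    utf8_code_point_iterator(It pos, It end) : pos_(pos), end_(end) {}

    // Decode the code point at the current position without advancing.
    char32_t operator*() const { It next = pos_; return decode(next, end_); }

    // Skip over every code unit of the current code point.
    utf8_code_point_iterator& operator++() { decode(pos_, end_); return *this; }

    bool operator==(const utf8_code_point_iterator& o) const { return pos_ == o.pos_; }
    bool operator!=(const utf8_code_point_iterator& o) const { return pos_ != o.pos_; }

private:
    static char32_t decode(It& it, It end) {
        assert(it != end && "cannot decode past the end of the sequence");
        const unsigned char lead = static_cast<unsigned char>(*it++);
        if (lead < 0x80) return static_cast<char32_t>(lead);  // ASCII

        int extra = 0;         // number of continuation bytes expected
        char32_t cp = 0;       // code point being assembled
        char32_t smallest = 0; // smallest value allowed for this length
        if ((lead & 0xE0) == 0xC0)      { extra = 1; cp = lead & 0x1F; smallest = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; smallest = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; smallest = 0x10000; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        for (int i = 0; i < extra; ++i) {
            if (it == end) throw std::runtime_error("truncated UTF-8 sequence");
            const unsigned char c = static_cast<unsigned char>(*it++);
            if ((c & 0xC0) != 0x80) throw std::runtime_error("invalid continuation byte");
            cp = (cp << 6) | static_cast<char32_t>(c & 0x3F);
        }
        // Reject overlong forms, surrogates and values beyond U+10FFFF.
        if (cp < smallest || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::runtime_error("ill-formed UTF-8 sequence");
        return cp;
    }

    It pos_;
    It end_;
};

// Convenience helper (also hypothetical): collect all code points of a
// UTF-8 encoded std::string.
inline std::vector<char32_t> code_points(const std::string& s) {
    using iter = utf8_code_point_iterator<std::string::const_iterator>;
    std::vector<char32_t> out;
    const iter last(s.end(), s.end());
    for (iter it(s.begin(), s.end()); it != last; ++it) out.push_back(*it);
    return out;
}
```

The adaptor is templated on the underlying iterator, so the same code
could wrap a std::string, a std::vector<char> or a raw buffer. Grapheme
iteration would need Unicode property data and is not attempted here.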

I am well aware that defining yet another string type is controversial,
but I believe it would be useful. A dedicated type can maintain
invariants, such as keeping its contents in a specific normalization
form.
I also believe it is possible to come up with a string design that
integrates easily with other existing string types, such as the ones
from the standard library or Qt.
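
To make the invariant idea concrete, here is a minimal sketch of a
wrapper that re-establishes a normalization invariant after every
mutation. Everything in it is an assumption made for illustration:
normalized_string and toy_compose are hypothetical names, and
toy_compose is only a toy stand-in that composes a single pair
('e' followed by U+0301), not a real normalization form, which would
need the Unicode Character Database.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy stand-in for a real normalizer: composes only 'e' + U+0301
// (combining acute accent) into U+00E9. NOT a real NFC implementation.
struct toy_compose {
    std::u32string operator()(const std::u32string& in) const {
        std::u32string out;
        for (std::size_t i = 0; i < in.size(); ++i) {
            if (in[i] == U'e' && i + 1 < in.size() && in[i + 1] == U'\u0301') {
                out.push_back(U'\u00E9');
                ++i;  // the combining mark has been consumed as well
            } else {
                out.push_back(in[i]);
            }
        }
        return out;
    }
};

// Hypothetical string type: its contents are always in the form
// produced by Normalizer, whatever sequence of operations built them.
template <typename Normalizer>
class normalized_string {
public:
    explicit normalized_string(const std::u32string& s) : data_(Normalizer{}(s)) {
        check();
    }

    // Concatenation can create a composable pair at the boundary, so the
    // whole contents are renormalized here (simple, not efficient).
    void append(const std::u32string& s) {
        data_ = Normalizer{}(data_ + s);
        check();
    }

    const std::u32string& str() const { return data_; }

    // Both sides hold the invariant, so comparing code points suffices.
    friend bool operator==(const normalized_string& a, const normalized_string& b) {
        return a.data_ == b.data_;
    }

private:
    // Normalizing already normalized contents must change nothing.
    void check() const { assert(Normalizer{}(data_) == data_); }

    std::u32string data_;
};
```

With this sketch, a string built from the decomposed form and one built
piecewise through append() end up with identical contents, so equality
reduces to a plain code point comparison.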