Subject: Re: [boost] RFC: interest in Unicode codecs?
From: Mathias Gaunard (mathias.gaunard_at_[hidden])
Date: 2009-07-18 01:43:33


OvermindDL1 wrote:
> On Sat, Feb 14, 2009 at 10:07 AM, Phil Endecott
> <spam_from_boost_dev_at_[hidden]> wrote:
>> /* snip */
>> Yes, a Unicode character properties library is important to those who are
>> writing text editors and similar applications. Perhaps Boost should have
>> one. I have personally used the Unicode properties tables for doing
>> "approximate matching" of e.g. accented characters with their base
>> characters when searching. But I can do that equally well in UTF-8 as in
>> UTF-32.
>
> If you are all interested in other opinions, I would love for boost to
> have a UTF8(16/32) helper library.

There is a Google Summer of Code project for a Unicode library, which
I'm working on.

It allows handling of Unicode text in any of the UTF-8, UTF-16 or UTF-32
encodings, bundles a small-ish Unicode character database, and supports
grapheme boundaries, composition/decomposition and normalization. It
does not support "approximate matching", collation or case folding, at
least for the time being.
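
To give a rough idea of what handling the same text in any of the three
encodings involves, here is a minimal sketch that decodes UTF-8 into code
points (UTF-32) and re-encodes them as UTF-16. The names decode_utf8 and
encode_utf16 are hypothetical and made up for this illustration; this is
not the interface of the library described above, and it assumes a
compiler with char32_t and char16_t support.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not a Boost API): decode a UTF-8 byte string into
// a sequence of Unicode code points, i.e. UTF-32.
std::vector<char32_t> decode_utf8(const std::string& in)
{
    // Smallest code point that legitimately needs a sequence of each length;
    // anything below it is an overlong encoding.
    static const char32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};

    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t len = 0;

        // The lead byte tells us how many bytes the sequence has.
        if (b0 < 0x80)                { cp = b0;        len = 1; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + len > in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte contributes its low six bits.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);
        }

        // Reject overlong forms, surrogates and values past U+10FFFF.
        if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point in UTF-8 input");

        out.push_back(cp);
        i += len;
    }
    return out;
}

// Hypothetical helper (not a Boost API): encode code points as UTF-16.
// Assumes every input value is a valid Unicode scalar value.
std::u16string encode_utf16(const std::vector<char32_t>& cps)
{
    std::u16string out;
    for (char32_t cp : cps) {
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

This only covers the encoding layer. Grapheme boundaries and normalization
additionally need data from the Unicode character database, which is why
a library would bundle one.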

