Subject: Re: [boost] [rfc] Unicode GSoC project
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2009-05-13 06:55:17


Hi Mathias,

Mathias Gaunard wrote:
> I have been working on range adaptors to iterate over code points in
> an UTF-x string as well as converting back those code points to UTF-y
> for the past week

I would be interested to see this code. I encourage you to share what
you have done as soon as possible, to get prompt feedback.

> short documentation
> http://mathias.gaunard.emi.u-bordeaux1.fr/unicode/doc/html/

Some feedback based on that document:

     UTF-16
     ....
     This is the recommended encoding for dealing with Unicode.

Recommended by whom? It's not the encoding that I would normally recommend.

     make_utf8(Range&& range);
     Assumes range range is a properly encoded UTF-8 range in
     Normalization Form C.
     Iterating the range may throw an exception if it isn't.

     as_utf8(Range&& range);
     Return type is a model of UnicodeRange whose value type is uchar8_t.

To me, the word "make" suggests that the former is actually doing a
conversion. But it's the latter, "as", that does that. Can we think
of something better? (Can anyone suggest any precedents?)
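
To make the distinction concrete, here is a minimal sketch of the two
behaviours as I read them from the document. It is not the proposed
implementation: the signatures are hypothetical simplifications that
take and return whole strings instead of lazy ranges, the validity
check is structural only (it does not reject overlong forms or encoded
surrogates, and does not check Normalization Form C), and it runs
eagerly, not during iteration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the documented make_utf8: no conversion.
// The input is assumed to be UTF-8 already; it is only checked
// (structurally) and handed back unchanged.
std::string make_utf8(const std::string& bytes) {
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra;  // number of continuation bytes expected
        if (lead < 0x80) extra = 0;
        else if (lead >= 0xC2 && lead <= 0xDF) extra = 1;
        else if (lead >= 0xE0 && lead <= 0xEF) extra = 2;
        else if (lead >= 0xF0 && lead <= 0xF4) extra = 3;
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (extra > bytes.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
        }
        i += extra + 1;
    }
    return bytes;
}

// Hypothetical stand-in for the documented as_utf8: a real conversion,
// here from a sequence of code points to UTF-8 code units.
std::string as_utf8(const std::u32string& code_points) {
    std::string out;
    for (char32_t cp : code_points) {
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("not a Unicode scalar value");
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With these stand-ins, make_utf8 returns its input byte-for-byte or
throws, while as_utf8 produces new bytes from code points. That is the
asymmetry I find surprising given the names.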

Regards, Phil.

