Subject: Re: [boost] [General] Always treat std::strings as UTF-8
From: Patrick Horgan (phorgan1_at_[hidden])
Date: 2011-01-15 01:11:35


On 01/14/2011 07:16 PM, Dave Abrahams wrote:
> On Fri, Jan 14, 2011 at 9:35 PM, Patrick Horgan <phorgan1_at_[hidden]> wrote:
>> ... elision ...
>> I don't understand. UCS-32 (I assume you meant encoded as UTF-32) is a
>> fixed width encoding so the n-th character is just 4n away from the
>> beginning of the string. Right?
> No. The nth code point is 4n bytes from the beginning of the string,
> but characters may be made of a combination of adjacent code points.
Ahhhh! Of course this occurred to me moments after clicking send.
lol! There should be a name for that phenomenon. Some correlation to
staircase wit.
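
For anyone skimming the archive, here is a minimal sketch of Dave's point, assuming a C++11 compiler with char32_t support. The helpers is_combining_mark, count_characters and sample are hypothetical names made up for illustration, and the "character" test is a deliberate simplification that only looks at the Combining Diacritical Marks block (U+0300..U+036F). Real user-perceived character boundaries are defined by much more involved rules, so treat this as a demonstration of the problem rather than a solution.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true for code points in the Combining Diacritical
// Marks block (U+0300..U+036F). This is a simplification; many other
// code points also attach to a preceding base character.
bool is_combining_mark(char32_t cp) {
    return cp >= 0x0300 && cp <= 0x036F;
}

// Hypothetical helper: counts characters under the simplification above.
// Every code point that is not a combining mark starts a new character.
std::size_t count_characters(const std::u32string& s) {
    std::size_t n = 0;
    for (char32_t cp : s) {
        if (!is_combining_mark(cp)) {
            ++n;
        }
    }
    return n;
}

// 'e' followed by U+0301 COMBINING ACUTE ACCENT, followed by 'x'.
// Three UTF-32 code units, but a reader sees two characters.
std::u32string sample() {
    return U"e\u0301x";
}
```

With this input, sample().size() is 3 while count_characters(sample()) is 2, and sample()[1] is the combining accent rather than the second character. So indexing by 4n bytes finds the nth code point, not the nth character.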

Patrick

