Subject: Re: [boost] [general] What will string handling in C++ look like in the future [was Always treat ... ]
From: Chad Nelson (chad.thecomfychair_at_[hidden])
Date: 2011-01-20 09:21:15


On Thu, 20 Jan 2011 00:05:47 -0800
Patrick Horgan <phorgan1_at_[hidden]> wrote:

>>> Inevitably a Unicode standard will be adopted where every character
>>> of every language will be represented by a single fixed length
>>> number of bits. [...]
>>
>> I'm no Unicode expert, but the reason this hasn't happened might be
>> combinatorial explosion. In which case it might never happen. But I
>> could well be wrong. And I hope I am, the design you outline is
>> something I'd love to see.
>
> It's already here and has been for a long time. That's just UCS
> encoded as UTF-32. [...]

The problem, in my uninformed view, is combining characters. Any time a
single character can require more than one code point, you can't assume
that a fixed number of bits will be able to represent every character.

I may be wrong, and I hope I am. If a character were guaranteed never to
consist of more than X code points, it would be simple to offer a
fixed-width character type, even if the width were huge by comparison to
the eight-bit char type. But from what I've seen, I don't think that's
the case.
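
To make the concern concrete, here is a rough sketch, assuming a C++11
compiler with char32_t and std::u32string. The helper names are
hypothetical, made up for this illustration, and the combining-mark test
covers only the U+0300..U+036F block, which is nowhere near the full
Unicode rules. It shows the same visible character occupying one UTF-32
element in one spelling and two in another:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true only for the Combining Diacritical Marks
// block (U+0300..U+036F). Unicode defines many more combining
// characters; this range is just enough for the illustration.
inline bool is_combining_mark(char32_t cp) {
    return cp >= 0x0300 && cp <= 0x036F;
}

// Counts every code point that is not in that block, so a base code
// point and the marks that follow it are counted as one character.
inline std::size_t count_user_characters(const std::u32string& s) {
    std::size_t n = 0;
    for (char32_t cp : s) {
        if (!is_combining_mark(cp)) ++n;
    }
    return n;
}

// Two UTF-32 spellings of the same visible character, e with acute:
inline std::u32string precomposed() { return U"\u00E9"; }  // one code point
inline std::u32string decomposed()  { return U"e\u0301"; } // 'e' + combining acute
```

Here precomposed().size() is 1 and decomposed().size() is 2, while
count_user_characters() gives 1 for both. So even in UTF-32, one array
element is not necessarily one character as the user sees it.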

-- 
Chad Nelson
Oak Circle Software, Inc.



Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk