Subject: Re: [boost] [UTF String] UTF String library 1.5 ready for perusal
From: Jeremy Maitin-Shepard (jeremy_at_[hidden])
Date: 2011-02-12 14:00:31


On 02/12/2011 05:57 AM, Chad Nelson wrote:
> On Fri, 11 Feb 2011 20:23:58 -0800
> Scott McMurray<me22.ca+boost_at_[hidden]> wrote:
>
>> On Thu, Feb 10, 2011 at 21:41, Chad Nelson
>> <chad.thecomfychair_at_[hidden]> wrote:
>>
>>>> I understand why it's useful to know how long it is in encoding
>>>> units, but the number of code points seems quite useless to me.
>>>>
>>>> Can you elaborate?
>>>
>>> The size in code-points *is* the size of the string, according to the
>>> view of the string that the class exposes.
>>
>> Ok, but what would I actually want to use that for?
>
> What do you use string.length() for? :-) Efficiently providing an
> answer to that is one of several things the UTF string classes keep
> track of it for.

std::string::length gives the amount of memory required to represent the
string as encoded, and is useful if you intend to pass it to something
else as a (char array, length) pair. Given that the number of code points
is directly related neither to the memory required nor to the number of
logical characters or glyphs (or the space the string will take up when
displayed), it seems unlikely to be useful in many cases. In cases where
there is a limit on the maximum length of a string, I believe that limit
is almost certainly going to be expressed as the encoded length in a
particular encoding (e.g. UTF-8 or UTF-16), rather than in code points.
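
To make the distinction concrete, here is a minimal sketch that compares
the three counts for a UTF-8 string held in a std::string. The helper
names count_code_points and utf16_length are hypothetical (they are not
part of the proposed UTF string library), and both assume the input is
already valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in a UTF-8 string by counting
// every byte that is not a continuation byte (10xxxxxx).
// Assumes the input is valid UTF-8; no validation is performed.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

// Hypothetical helper: number of UTF-16 code units needed for the same
// text. A 4-byte UTF-8 sequence (lead byte >= 0xF0) encodes a code point
// above U+FFFF, which takes a surrogate pair (2 units); every other code
// point takes 1 unit. Assumes valid UTF-8.
std::size_t utf16_length(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) == 0x80) {
            continue;  // continuation byte, already accounted for
        }
        n += (c >= 0xF0) ? 2 : 1;
    }
    return n;
}
```

For "e\xCC\x81" (the letter e followed by U+0301 COMBINING ACUTE ACCENT),
std::string::size() is 3, the code point count is 2, and the UTF-16
length is 2, yet the text displays as a single character. For
"\xF0\x9F\x98\x80" (U+1F600), size() is 4, the code point count is 1, and
the UTF-16 length is 2. In neither case does the code point count match
the storage required or what the user sees.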

