Subject: Re: [boost] [UTF String] UTF String library 1.5 ready for perusal
From: Chad Nelson (chad.thecomfychair_at_[hidden])
Date: 2011-02-12 15:14:38

Next message: Chad Nelson: "Re: [boost] [UTF String] UTF String library 1.5 ready for perusal"
Previous message: Matus Chochlik: "Re: [boost] [UTF String] UTF String library 1.5 ready for perusal"
In reply to: Jeremy Maitin-Shepard: "Re: [boost] [UTF String] UTF String library 1.5 ready for perusal"
Next in thread: Anders Dalvander: "Re: [boost] [UTF String] UTF String library 1.5 ready for perusal"

On Sat, 12 Feb 2011 11:00:31 -0800
Jeremy Maitin-Shepard <jeremy_at_[hidden]> wrote:

>>>> The size in code-points *is* the size of the string, according to
>>>> the view of the string that the class exposes.
>>>
>>> Ok, but what would I actually want to use that for?
>>
>> What do you use string.length() for? :-) Efficiently providing an
>> answer to that is one of several things the UTF string classes keep
>> track of it for.
>
> std::string::length specifies the amount of memory required to
> represent it as encoded, and is useful if you intend to pass it to
> something else as a char array, length pair. Given that number of
> code points is directly related to neither the memory required nor the
> number of logical characters/glyphs/size it will take up to display,
> it seems it is unlikely to be useful in many cases.

But for those few cases where it *would* be useful, I see no reason not
to provide it. It costs essentially nothing, since the count is
originally provided by the same function that validates the encoded
data when it's put into a UTF type, and is used for other things as
well. And people are used to being able to retrieve the size of a
string; eliminating that function would discomfort some developers.

> In cases where there is a limit of the maximum length of a string, I
> believe that is almost certainly going to be in terms of the encoded
> length in a particular encoding (e.g. UTF-8 or UTF-16), rather than
> in code points.

Well, that's easily available too, via T.coded().length().
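
To make the two lengths concrete, here is a minimal sketch of the idea, not
the library's actual code. The names counted_utf8 and validate_and_count are
hypothetical, and the validation is deliberately simplified: it checks lead
and continuation bytes only, and does not reject overlong forms, surrogates,
or values above U+10FFFF. It shows how one pass over the bytes can both
validate them and produce the code-point count, so storing the count adds
almost nothing.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: validates UTF-8 structure and counts code points
// in a single pass. Simplified: overlong encodings, surrogates and values
// above U+10FFFF are NOT rejected here.
inline bool validate_and_count(const std::string& bytes, std::size_t& count) {
    std::size_t n = 0;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t len = 0;
        if (lead < 0x80)                len = 1;  // ASCII
        else if ((lead & 0xE0) == 0xC0) len = 2;
        else if ((lead & 0xF0) == 0xE0) len = 3;
        else if ((lead & 0xF8) == 0xF0) len = 4;
        else return false;                        // stray continuation or bad lead byte
        if (len > bytes.size() - i) return false; // sequence truncated at end of input
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return false; // expected a continuation byte
        }
        i += len;
        ++n;                                      // one complete code point
    }
    count = n;
    return true;
}

// Hypothetical stand-in for a UTF-8 string type: validates on construction
// and remembers the code-point count that validation produced.
class counted_utf8 {
public:
    explicit counted_utf8(const std::string& bytes) : bytes_(bytes), count_(0) {
        if (!validate_and_count(bytes_, count_))
            throw std::invalid_argument("malformed UTF-8");
    }
    std::size_t length() const { return count_; }      // size in code points
    const std::string& coded() const { return bytes_; } // underlying encoded bytes
private:
    std::string bytes_;
    std::size_t count_;
};
```

With this sketch, a counted_utf8 built from "na\xC3\xAFve" reports length()
as 5 and coded().length() as 6, since the one non-ASCII code point takes two
bytes.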

-- 
Chad Nelson
Oak Circle Software, Inc.

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk