Boost :


From: Jeremy Maitin-Shepard (jbms_at_[hidden])
Date: 2004-04-13 15:09:17

Next message: Miro Jurisic: "[boost] Re: Boost Unicode support ideas"
Previous message: Val Samko: "Re[2]: [boost] Date-Time"
In reply to: Miro Jurisic: "[boost] Re: Boost Unicode support ideas"
Next in thread: Miro Jurisic: "[boost] Re: Boost Unicode support ideas"
Reply: Miro Jurisic: "[boost] Re: Boost Unicode support ideas"

Miro Jurisic <macdev_at_[hidden]> writes:

> [snip]

> You are forgetting that abstract Unicode characters are defined as sequences of
> code points (even if those code points are 32-bit) and string manipulation has
> to take this into account (there are numerous combinations of characters and
> combining marks that must be treated as single units for the purpose of searching,
> collation, etc.). A single encoded character type may be 32 bits, but encoded
> characters are often not the level on which the clients need to manipulate
> strings.

Right, it will certainly be necessary to provide a
grapheme_cluster_iterator (with value_type = the Unicode string
type). ICU should help with this. Nonetheless, it is useful to
represent a single code point, for several reasons:

- For the purpose of string construction, the Unicode specification
   explicitly states that any sequence of code points is well formed,
   and so this provides the smallest unit by which
   guaranteed-well-formed strings can be formed.

- It would be useful to provide functions for querying the Unicode
   properties of individual code points, and this code_point type
   would be the only suitable parameter type.
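
To make the two points above concrete, here is a rough C++ sketch. All
of the names in it (code_point, code_point_string, is_combining_mark,
grapheme_clusters) are hypothetical; none of them is an existing Boost
or ICU interface. The property query only knows about the Combining
Diacritical Marks block (U+0300..U+036F), and the cluster split only
attaches such marks to the preceding code point, so both are stand-ins
for the real Unicode data and segmentation rules that a library such as
ICU would supply.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical code_point type: holds one value in the Unicode code
// space 0..0x10FFFF. A stricter design might also reject the surrogate
// range 0xD800..0xDFFF; this sketch accepts it.
class code_point {
public:
    explicit code_point(std::uint32_t v) : value_(v) {
        if (v > 0x10FFFF)
            throw std::out_of_range("value outside the Unicode code space");
    }
    std::uint32_t value() const { return value_; }
private:
    std::uint32_t value_;
};

// A string built one code point at a time.
typedef std::vector<code_point> code_point_string;

// Simplified property query (assumption: only the Combining
// Diacritical Marks block is treated as combining).
bool is_combining_mark(code_point cp) {
    return cp.value() >= 0x0300 && cp.value() <= 0x036F;
}

// Simplified grapheme cluster split: each cluster is one code point
// followed by any combining marks that come directly after it.
// Each cluster is itself a string, matching the idea that the
// iterator's value_type is the string type.
std::vector<code_point_string> grapheme_clusters(const code_point_string& s) {
    std::vector<code_point_string> clusters;
    for (code_point cp : s) {
        if (clusters.empty() || !is_combining_mark(cp))
            clusters.emplace_back();   // start a new cluster
        clusters.back().push_back(cp); // extend the current cluster
    }
    return clusters;
}
```

With this sketch, the sequence U+0065, U+0301, U+0078 is three code
points but splits into two clusters: one holding "e" plus the combining
acute accent, and one holding "x".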

I do agree, however, that for almost any output formatting, the
locale-specific or user-specified fill text/symbols should be specified
as strings, rather than as individual characters.
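
As an illustration of taking the fill as a string, here is a minimal
sketch. The function pad_left and its parameters are hypothetical, and
it simply repeats the fill a given number of times rather than
measuring any display width.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: prepend `count` copies of `fill` to `text`.
// Because `fill` is a string, it may consist of several code points,
// for example a base character followed by a combining mark.
std::u32string pad_left(const std::u32string& text,
                        const std::u32string& fill,
                        std::size_t count) {
    std::u32string out;
    out.reserve(fill.size() * count + text.size());
    for (std::size_t i = 0; i < count; ++i)
        out += fill;
    out += text;
    return out;
}
```

A fill such as "x" followed by U+0302 could not be passed at all if the
parameter were a single character.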

-- 
Jeremy Maitin-Shepard

