From: Marshall Clow (marshall_at_[hidden])
Date: 2008-02-20 13:56:11


At 1:41 PM -0500 2/20/08, Frank Mori Hess wrote:
>I don't have a lot of experience using non-ascii strings in my internal code,
>aside from occasional forays into utf-8 for special characters, but wouldn't
>using ucs-4 for the "core" encoding be the sane thing to do? With a ucs-4
>encoding, you could use a
>
>basic_string<wchar_t>
>
>and continue using the familiar api without worrying about the complications
>and confusion caused by variable length encodings.

You are making an unwarranted assumption - that wchar_t is big enough
to hold a ucs-4 code point (or, in fact, that wchar_t has a
particular size).

That assumption does not hold: the size of wchar_t is
implementation-defined. On some compilers, sizeof(wchar_t) == 2, while
on others, sizeof(wchar_t) == 4. (Other compilers may use other values
as well, but I've never seen them.)
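
To make the point concrete, here is a minimal sketch of how code could
check the width of wchar_t at compile time rather than assume it. It
assumes a C++11 compiler (which postdates this message) for constexpr
and char32_t. The names kMaxCodePoint, fits_in_wchar, wchar_holds_ucs4
and ucs4_string are made up for illustration and are not part of Boost
or of any proposal in this thread.

```cpp
#include <limits>
#include <string>

// Largest code point a UCS-4 string may need to store (U+10FFFF).
constexpr unsigned long kMaxCodePoint = 0x10FFFFul;

// True when this platform's wchar_t can represent the given code point.
// Hypothetical helper, named here for illustration only.
constexpr bool fits_in_wchar(unsigned long code_point) {
    return code_point <=
           static_cast<unsigned long>(std::numeric_limits<wchar_t>::max());
}

// True only where wchar_t is wide enough for every code point;
// typically false where sizeof(wchar_t) == 2.
constexpr bool wchar_holds_ucs4 = fits_in_wchar(kMaxCodePoint);

// Hypothetical alias: one element per code point on every platform,
// because char32_t is required to be at least 32 bits wide.
using ucs4_string = std::basic_string<char32_t>;
```

Code that really does want basic_string<wchar_t> as a fixed-width
encoding could put static_assert(wchar_holds_ucs4, "...") next to that
use, so a platform with a 2-byte wchar_t fails to build instead of
silently truncating code points.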

-- 
-- Marshall
Marshall Clow     Idio Software   <mailto:marshall_at_[hidden]>
It is by caffeine alone I set my mind in motion.
It is by the beans of Java that thoughts acquire speed,
the hands acquire shaking, the shaking becomes a warning.
It is by caffeine alone I set my mind in motion.
