Boost :

Subject: Re: [boost] [General] Always treat std::strings as UTF-8? (was [Process] List of small issues)
From: Chad Nelson (chad.thecomfychair_at_[hidden])
Date: 2011-01-13 12:17:05

Next message: Tim Blechmann: "Re: [boost] Improving review process"
Previous message: KTC: "Re: [boost] Improving review process"
In reply to: Artyom: "[boost] [Process] List of small issues"
Next in thread: Artyom: "Re: [boost] [General] Always treat std::strings as UTF-8"
Reply: Artyom: "Re: [boost] [General] Always treat std::strings as UTF-8"
Reply: Alexander Lamaison: "Re: [boost] [General] Always treat std::strings as UTF-8? (was [Process] List of small issues)"

On Thu, 13 Jan 2011 06:35:53 -0800 (PST)
Artyom <artyomtnk_at_[hidden]> wrote:

[...]
> Notes:
>
> 1. You can also always assume that strings under windows are UTF-8
> and always convert them to wide string before system calls.
>
> This is I think better approach, but it is different from what
> most of boost does.
[...]

An interesting thought... I developed a set of ASCII/UTF-8/16/32
classes for my company not too long ago, and I became fairly familiar
with the UTF-8 encoding scheme. There was only one issue that stopped
me from assuming that all std::string values are UTF-8-encoded: what if
the string *isn't* meant to be UTF-8-encoded, and contains characters
with the high bit set?

There's nothing technically stopping that from happening, and there's
no way to determine with complete certainty whether even a string that
happens to be valid UTF-8 was intended that way, or whether the
UTF-8-like byte sequences are really meant as individual "high-ASCII"
characters in some 8-bit encoding.
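
To make the problem concrete, here is a minimal sketch of a well-formedness
check. The function name is_valid_utf8 is hypothetical and is not taken from
my classes or from Boost. It only answers "could these bytes be UTF-8?",
which is exactly the limitation described above: it says nothing about what
the author of the string intended.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if s is well-formed UTF-8.
// Rejects stray or missing continuation bytes, overlong forms,
// surrogate code points, and code points above U+10FFFF.
bool is_valid_utf8(const std::string& s)
{
    // Smallest code point that legitimately needs a sequence of each length.
    static const unsigned long min_cp[] = { 0, 0, 0x80, 0x800, 0x10000 };

    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        unsigned long cp = 0;

        if (b0 < 0x80)                { len = 1; cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else return false;  // stray continuation byte or invalid lead byte

        if (n - i < len) return false;  // sequence is cut off

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }

        if (len > 1 && cp < min_cp[len]) return false;   // overlong form
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate
        if (cp > 0x10FFFF) return false;                 // out of range

        i += len;
    }
    return true;
}
```

The two bytes 0xC3 0xA9 pass this check and decode to U+00E9 (e with an
acute accent) as UTF-8, yet the same two bytes are also a perfectly legal
ISO-8859-1 string of two characters (A with a tilde, followed by the
copyright sign). A lone 0xE9 byte fails the check, so validation can prove
that a string is *not* UTF-8, but never that it *is* meant to be.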

Maybe you know something I don't, that would allow me to change it? I
hope so, it would simplify some of the code greatly.

-- 
Chad Nelson
Oak Circle Software, Inc.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk