Subject: Re: [boost] [general] What will string handling in C++ look like in the future [was Always treat ... ]
From: Matus Chochlik (chochlik_at_[hidden])
Date: 2011-01-21 04:54:15


On Fri, Jan 21, 2011 at 10:37 AM, Alexander Lamaison <awl03_at_[hidden]> wrote:
> On Thu, 20 Jan 2011 23:26:35 -0800, Patrick Horgan wrote:
>
>> On 01/20/2011 07:43 AM, Alexander Lamaison wrote:
>>>
>>> I imagine you wouldn't have UTF-16 and UTF-32 strings being passed about as
>>> a matter of course.  For instance, a UTF-16 string should only be used
>>> just before making a Windows API call.
>>>
>>> If this is the case, it makes sense to make the common case (UTF-8 string)
>>> have a nice name like boost::string and the others which are used for
>>> special situations can have something less snappy like boost::u16string and
>>> boost::u32string.
>> What would you use for a regular string where you just had, essentially
>> a vector of char, wchar_t, char8_t, char16_t, char32_t, or unsigned
>> char, but didn't care about encoding?  I want to differentiate between
>> this case and the case where I know that there's a particular encoding.
>> A lot of times you just know you got a string from one system call and
>> you're passing it to another and you don't care about encoding.
> [..]
>
> Good point! boost::u8string then?

Why not boost::string (explicitly stating in the docs that it is UTF-8-based)?
The name u8string suggests to me that it is meant for some special case
of character encoding, and that the (encoding-agnostic/native) std::string
is still the way to go.

IMO we should send the message that UTF-8 is
"normal"/"(semi-)standard"/"de-facto-standard",
and that the other encodings, like native_t (or even ansi_t, ibm_cp_xyz_t,
string16_t, string32_t, ...), are the special cases and should be treated
as such.
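
A rough sketch of the naming I have in mind, just to make it concrete.
Everything here is hypothetical: the namespace, the utf8_t tag and the
basic_encoded_string template are names made up for illustration, not an
existing Boost API. Only native_t and ansi_t come from the discussion above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {  // hypothetical namespace, not part of Boost

// Encoding tags. utf8_t is an assumed name; native_t and ansi_t are the
// special-case tags mentioned in the discussion.
struct utf8_t {};
struct native_t {};
struct ansi_t {};

// A byte string whose encoding is part of its type. No validation or
// transcoding is done here; the tag only records the caller's claim.
template <typename Encoding>
class basic_encoded_string {
public:
    explicit basic_encoded_string(std::string bytes)
        : bytes_(std::move(bytes)) {}

    const std::string& bytes() const { return bytes_; }

private:
    std::string bytes_;
};

// The common case gets the short name...
using string = basic_encoded_string<utf8_t>;

// ...and the special cases get the longer, explicit ones.
using native_string = basic_encoded_string<native_t>;
using ansi_string = basic_encoded_string<ansi_t>;

// Accepts UTF-8 only; a native_string has to be converted explicitly.
inline std::size_t byte_length(const string& s) { return s.bytes().size(); }

// Differently tagged strings do not convert into each other implicitly.
static_assert(!std::is_convertible<native_string, string>::value,
              "no implicit conversion between encodings");

inline void usage_example() {
    string s("caf\xC3\xA9");      // "café" as UTF-8: 3 ASCII bytes + 2 bytes
    assert(byte_length(s) == 5);  // counts bytes, not code points
}

}  // namespace sketch
```

The point is only that the unadorned name goes to the UTF-8 type, so the
other encodings read as the exceptions at the call site.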

Matus


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk