Subject: Re: [boost] [strings][unicode] Proposals for Improved String Interoperability in a Unicode World
From: Beman Dawes (bdawes_at_[hidden])
Date: 2012-01-29 09:43:36


On Sun, Jan 29, 2012 at 3:21 AM, Keith Burton <kb_at_[hidden]> wrote:
> -----Original Message-----
> [snip]
> These proposals are the Boost version of the TR2 proposals made in
> N3336, Adapting Standard Library Strings and I/O to a Unicode World.
> See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3336.html.
>
> I'm very interested in hearing comments about either the Boost or the
> TR2 proposal
>
> [snip]
> -----Original Message-----
>
> Beman
>
> I do not understand how the converting c_str template can be useful in what,
> for me, is the normal usage of the c_str function.
>
> Given existing code
>
> std::string stdstr;
> const char * cstr = stdstr.c_str();
>
> third_party_api( cstr );
>
> and moving to general use of a wide string type e.g.
>
> std::u32string stdstr;
> const char * cstr = stdstr.c_str< char >();                     // ?????????

That's a compile-time error. The unspecified iterator type returned
will not be const char*. It will be a conversion iterator with a value
type of char, and thus only useful directly in purpose-written code or
in generic algorithms templated on iterator type.
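
To make the distinction concrete, here is a minimal sketch of what such a
conversion iterator might look like. It is not code from the proposal or
from N3336: the names utf8_from_utf32_iterator and to_utf8 are hypothetical,
and the sketch assumes every element of the u32string is a valid Unicode
scalar value (no error handling). The static_assert shows why assigning the
iterator to a const char* cannot compile.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <type_traits>

// Hypothetical input iterator: walks UTF-32 code points and yields the
// corresponding UTF-8 bytes one char at a time.
// Assumption: the input contains only valid Unicode scalar values.
class utf8_from_utf32_iterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = char;
    using difference_type = std::ptrdiff_t;
    using pointer = const char*;
    using reference = char;

    utf8_from_utf32_iterator(const char32_t* pos, const char32_t* end)
        : pos_(pos), end_(end) { load(); }

    char operator*() const {
        assert(idx_ < len_);  // must not dereference the end iterator
        return buf_[idx_];
    }

    utf8_from_utf32_iterator& operator++() {
        if (++idx_ == len_) { ++pos_; load(); }  // move to next code point
        return *this;
    }
    utf8_from_utf32_iterator operator++(int) {
        utf8_from_utf32_iterator tmp = *this;
        ++*this;
        return tmp;
    }

    bool operator==(const utf8_from_utf32_iterator& o) const {
        return pos_ == o.pos_ && idx_ == o.idx_;
    }
    bool operator!=(const utf8_from_utf32_iterator& o) const {
        return !(*this == o);
    }

private:
    // Encode the current code point into buf_ (1 to 4 bytes).
    void load() {
        idx_ = 0;
        len_ = 0;
        if (pos_ == end_) return;
        const char32_t c = *pos_;
        if (c < 0x80) {
            buf_[len_++] = static_cast<char>(c);
        } else if (c < 0x800) {
            buf_[len_++] = static_cast<char>(0xC0 | (c >> 6));
            buf_[len_++] = static_cast<char>(0x80 | (c & 0x3F));
        } else if (c < 0x10000) {
            buf_[len_++] = static_cast<char>(0xE0 | (c >> 12));
            buf_[len_++] = static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            buf_[len_++] = static_cast<char>(0x80 | (c & 0x3F));
        } else {
            buf_[len_++] = static_cast<char>(0xF0 | (c >> 18));
            buf_[len_++] = static_cast<char>(0x80 | ((c >> 12) & 0x3F));
            buf_[len_++] = static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            buf_[len_++] = static_cast<char>(0x80 | (c & 0x3F));
        }
    }

    const char32_t* pos_;
    const char32_t* end_;
    char buf_[4] = {};
    int idx_ = 0;
    int len_ = 0;
};

// The iterator has value type char, but it is a class, not a pointer.
static_assert(!std::is_convertible<utf8_from_utf32_iterator, const char*>::value,
              "a conversion iterator is not a const char*");

// Generic code templated on iterator type (here std::string's range
// constructor) can consume it and materialize a contiguous buffer.
std::string to_utf8(const std::u32string& s) {
    const char32_t* first = s.data();
    const char32_t* last = first + s.size();
    return std::string(utf8_from_utf32_iterator(first, last),
                       utf8_from_utf32_iterator(last, last));
}
```

Under these assumptions, a call such as third_party_api(to_utf8(stdstr).c_str())
would work, because the const char* comes from a materialized std::string
rather than from the iterator itself.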

>
> third_party_api( cstr );
>
>
> clearly it is possible to make third_party_api( stdstr.c_str< char >.c_str() )
> work but surely that would also permit the above invalid use.

One possible problem with conversion iterators with a value type of
char is that they can be passed to functions that don't work with
UTF-8 encoded data because of its multibyte nature. But UTF-8 is so
craftily designed that many functions do work as intended, even though
the functions were designed without multibyte encodings in mind.
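
A small sketch of that last point, using two hypothetical helpers
(after_first_slash and count_code_points are illustrative names, not part of
any proposal). Searching for an ASCII byte with std::strchr is safe on UTF-8
because every byte of a multibyte sequence has its high bit set, so it can
never be mistaken for '/'. std::strlen, on the other hand, counts bytes
rather than characters, which is the kind of function that does not work as
a naive caller might expect.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Returns the text after the first '/', or an empty string if there is none.
// std::strchr knows nothing about UTF-8, yet it cannot match inside a
// multibyte sequence: lead and continuation bytes are all >= 0x80.
std::string after_first_slash(const char* utf8) {
    const char* p = std::strchr(utf8, '/');
    return p ? std::string(p + 1) : std::string();
}

// Counts code points by skipping continuation bytes (bit pattern 10xxxxxx).
// Assumption: the input is well-formed UTF-8.
std::size_t count_code_points(const char* utf8) {
    std::size_t n = 0;
    for (; *utf8 != '\0'; ++utf8) {
        if ((static_cast<unsigned char>(*utf8) & 0xC0) != 0x80) ++n;
    }
    return n;
}
```

For the UTF-8 encoding of "café" (bytes c, a, f, 0xC3, 0xA9), std::strlen
reports 5 while count_code_points reports 4.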

--Beman

