Subject: Re: [boost] [General] Always treat std::strings as UTF-8
From: Mostafa (mostafa_working_away_at_[hidden])
Date: 2011-01-19 17:32:31


On Wed, 19 Jan 2011 05:55:38 -0800, Chad Nelson
<chad.thecomfychair_at_[hidden]> wrote:

> On Wed, 19 Jan 2011 12:10:35 +0000
> Alexander Lamaison <awl03_at_[hidden]> wrote:
>
>> On Tue, 18 Jan 2011 20:37:44 -0500, Chad Nelson wrote:
>>> My utf8_t class lets you get the std::string with operator*, so it's
>>> easy to use with such encoding-agnostic functions as well.
>>
>> I meant to mention this: please, no ;) Can we make it .raw()
>> or .str() or something, anything but an operator overload?
>
> operator* has a long history of providing the contents of a variable,
> even in C, and is a lot less typing to boot. But if you have any
> technical arguments against it, I'm listening.

Can we stick to std::string conventions as closely as possible? That makes
any new string library easier to use, clearer, and more maintainable. From
usage alone, it's not readily apparent what operator* is supposed to do in
the context of strings, i.e.,

utf8_t myStr(...);
some_api_foo(*myStr);

Even as an experienced programmer, if I'm new to whatever library makes
use of some_api_foo, I would be scratching my head at "*myStr", and I
would be forced to look up utf8_t::operator* or some_api_foo to figure
it out.

What about:

utf8_t::cu_str

where cu_str stands for code-unit string.

I'm a big fan of conveying your intent in code. For the same reason I
strongly disagree with utf8_t::str. utf8_t is already a string class, and a
generic-sounding "str" method on it doesn't convey what kind of string it
returns.
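
To make the suggestion concrete, here is a minimal sketch of a named
accessor. It assumes utf8_t simply wraps a std::string of UTF-8 code
units; the real class is not shown in this thread, so the constructor,
the member layout, and the signature of some_api_foo are all hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for the utf8_t class under discussion.
// Assumption: it stores its contents as UTF-8 code units in a std::string.
class utf8_t {
public:
    explicit utf8_t(std::string code_units) : data_(std::move(code_units)) {}

    // Named accessor: the name says that the result is the raw
    // code-unit string, not a decoded or converted form.
    const std::string& cu_str() const { return data_; }

private:
    std::string data_;
};

// Hypothetical encoding-agnostic API that only needs the raw bytes.
std::size_t some_api_foo(const std::string& s) { return s.size(); }
```

With something like this, the call site becomes
some_api_foo(myStr.cu_str()), which says what is being passed without
a trip to the documentation.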

And whichever you choose, can we have one and only one way of doing it?
Again, for the sake of code maintainability.

My thoughts/suggestions.

Mostafa

