Subject: Re: [boost] [string] proposal
From: Dean Michael Berris (mikhailberis_at_[hidden])
Date: 2011-01-27 02:34:15

On Thu, Jan 27, 2011 at 3:19 PM, Patrick Horgan <phorgan1_at_[hidden]> wrote:
> On 01/26/2011 07:54 PM, Dean Michael Berris wrote:
>> ... elision by patrick ...
>> Yes, but really I think the view<encoding>  is the encoding-aware
>> string type mostly because if you convert it to an std::string for
>> example or into a buffer and look at it like a `char const *` or even
>> `wchar_t const *` then you basically get what you'd need for the C or
>> OS APIs.
>> I just prefer calling a spade a spade and not say `string` when I
>> really mean a `view<encoding>` -- because largely I think everyone
>> would agree that the string data structure really doesn't have an
>> intrinsic property that relates to an 'encoding'.
> But what some are talking about is a utf-8_string.  I know it's not what
> you're talking about, but saying that everyone would agree would be a bit
> disingenuous and discount much of the preceding discussion.

So you're saying that utf8_string is not view<utf8_encoding>, as I've
already described it?

> I really wish this discussion would split into two, because the discussion
> about the benefits of an immutable string, and the discussions of an utf
> encoded string are two completely different discussions and you keep butting
> heads each saying, no, but that's not what I'm talking about.

Really, if you read the recent discussions, you will see that we're
talking about the same thing: a data structure that somehow knows its
encoding. That "somehow" has already been determined (and agreed upon)
to be suitably modeled by a view<...> that wraps a string, for a
suitable definition of string. Note that the string *has no encoding
that is intrinsic to it*.
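
To make that concrete, here is a rough sketch of the idea. All of the names
(`latin1_encoding`, `utf8_encoding`, `count_code_points`, and this particular
`view`) are made up for illustration and are not an existing Boost interface.
It assumes well-formed UTF-8 and only shows that the same bytes yield a
different answer depending on which view you look through:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical encoding tags, invented for this sketch.
struct latin1_encoding {};
struct utf8_encoding {};

// Latin-1: every byte is exactly one code point.
inline std::size_t count_code_points(const std::string& bytes, latin1_encoding) {
    return bytes.size();
}

// UTF-8 (assumed well-formed): count every byte that is not a
// continuation byte of the form 10xxxxxx.
inline std::size_t count_code_points(const std::string& bytes, utf8_encoding) {
    std::size_t n = 0;
    for (unsigned char c : bytes) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// The view carries the encoding; the underlying string is just bytes.
// It holds a pointer, so the string must outlive the view.
template <class Encoding>
class view {
public:
    explicit view(const std::string& s) : bytes_(&s) {}
    std::size_t length() const { return count_code_points(*bytes_, Encoding()); }
    const char* raw() const { return bytes_->c_str(); }  // what a C or OS API would get
private:
    const std::string* bytes_;
};

// Usage: one string, two interpretations.
inline void example_same_bytes() {
    const std::string bytes = "caf\xC3\xA9";  // 5 bytes: 'c' 'a' 'f' 0xC3 0xA9
    assert(view<latin1_encoding>(bytes).length() == 5);
    assert(view<utf8_encoding>(bytes).length() == 4);
}
```

Nothing about `bytes` changes between the two calls; only the view does.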

> That's right.  There were several threads, but everyone's jumped onto this
> one which I believe was started by Mr. Berris to talk about the benefits of
> an immutable string.  Please, please, separate these threads again.

So Mr. Berris is saying right now, in case you didn't see the point: your
"utf8_string" is really just a typedef for view<utf8_encoding>. The
only *reasonably efficient* way of achieving this view design is to
have immutable strings. The thread has already hashed out *why*
mutable strings are a bad thing (performance- and design-wise) for
encoding-aware algorithms. I don't see why we need to go back to that.
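
A minimal sketch of what I mean, with the caveat that `istring`, its
`shares_storage_with` member, and this `view` are hypothetical names chosen
for the example and not a proposed interface. The assumption is that an
immutable string can share its bytes between copies, so putting a view on top
of it never has to copy or re-validate the data:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

struct utf8_encoding {};  // hypothetical encoding tag

// Hypothetical immutable string: the bytes are fixed at construction
// and shared (not copied) whenever the string is copied.
class istring {
public:
    explicit istring(std::string bytes)
        : data_(std::make_shared<const std::string>(std::move(bytes))) {}
    const char* c_str() const { return data_->c_str(); }
    std::size_t size() const { return data_->size(); }
    bool shares_storage_with(const istring& other) const {
        return data_ == other.data_;
    }
private:
    std::shared_ptr<const std::string> data_;
};

// The view owns a (cheap) copy of the immutable string. Since no one can
// mutate the bytes, nothing the view computes about them can go stale.
template <class Encoding>
class view {
public:
    explicit view(istring s) : str_(std::move(s)) {}
    const istring& base() const { return str_; }
private:
    istring str_;
};

// "utf8_string" is then nothing more than a name for one particular view.
typedef view<utf8_encoding> utf8_string;

// Usage: copying views never copies the underlying bytes.
inline void example_shared_views() {
    istring s("immutable");
    utf8_string a(s);
    utf8_string b = a;
    assert(a.base().shares_storage_with(s));
    assert(b.base().shares_storage_with(s));
    assert(b.base().c_str() == s.c_str());
}
```

With a mutable string underneath, each view would either need its own copy of
the bytes or some way to detect that they changed behind its back.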

At any rate, feel free to convince me that immutable strings
wouldn't be a good thing for
encoding/transcoding/string-or-text-centric algorithms. ;)

Dean Michael Berris
