Subject: Re: [boost] [string] proposal
From: Dean Michael Berris (mikhailberis_at_[hidden])
Date: 2011-01-27 02:34:15


On Thu, Jan 27, 2011 at 3:19 PM, Patrick Horgan <phorgan1_at_[hidden]> wrote:
> On 01/26/2011 07:54 PM, Dean Michael Berris wrote:
>>
>> ... elision by patrick ...
>>
>> Yes, but really I think the view<encoding>  is the encoding-aware
>> string type mostly because if you convert it to an std::string for
>> example or into a buffer and look at it like a `char const *` or even
>> `wchar_t const *` then you basically get what you'd need for the C or
>> OS APIs.
>>
>> I just prefer calling a spade a spade and not say `string` when I
>> really mean a `view<encoding>` -- because largely I think everyone
>> would agree that the string data structure really doesn't have an
>> intrinsic property that relates to an 'encoding'.
>
> But what some are talking about is a utf-8_string.  I know it's not what
> you're talking about, but saying that everyone would agree would be a bit
> disingenuous and discount much of the preceding discussion.
>

So you're saying that utf8_string is not view<utf8_encoding> as I've
already described it?

> I really wish this discussion would split into two, because the discussion
> about the benefits of an immutable string, and the discussions of an utf
> encoded string are two completely different discussions and you keep butting
> heads each saying, no, but that's not what I'm talking about.
>

If you read the recent discussions, you will see that we're really
talking about the same thing: a data structure that knows its encoding
somehow. That "somehow" has already been determined (and agreed upon)
to be suitably modeled by a view<...> that takes a string, for a
suitable definition of string. Note that the string *has no encoding
that is intrinsic to it*.
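
A minimal sketch of the idea above, assuming a C++11 compiler. All names
here (immutable_string, the encoding tags, code_points) are hypothetical
and only illustrate the shape of the design; this is not an existing
Boost API. The underlying string is plain immutable bytes, and the
encoding lives only in the view's template parameter:

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical encoding tags (illustrative only).
struct utf8_encoding {};
struct latin1_encoding {};

// Assumption: the "string" is just immutable bytes with no encoding attached.
typedef std::shared_ptr<const std::string> immutable_string;

// A view interprets the same bytes according to its Encoding tag.
template <class Encoding>
class view {
public:
    explicit view(immutable_string s) : data_(s) {}

    // Raw access, e.g. for C or OS APIs.
    const char* c_str() const { return data_->c_str(); }
    std::size_t size_in_bytes() const { return data_->size(); }
    const std::string& bytes() const { return *data_; }

private:
    immutable_string data_;
};

// UTF-8: count every byte that is not a continuation byte (10xxxxxx).
inline std::size_t code_points(const view<utf8_encoding>& v) {
    std::size_t n = 0;
    const std::string& s = v.bytes();
    for (std::size_t i = 0; i < s.size(); ++i) {
        if ((static_cast<unsigned char>(s[i]) & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Latin-1: one byte per code point.
inline std::size_t code_points(const view<latin1_encoding>& v) {
    return v.size_in_bytes();
}
```

Two views over the same bytes can then give different answers without
the string itself changing or being copied; only the interpretation
differs.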

> That's right.  There were several threads, but everyone's jumped onto this
> one which I believe was started by Mr. Berris to talk about the benefits of
> an immutable string.  Please, please, separate these threads again.
>

So Mr. Berris is saying right now, if you didn't see the point: your
"utf8_string" is really just a typedef to view<utf8_encoding>. The
only *reasonably efficient* way of achieving this view design is to
have immutable strings. The thread has already hashed out *why*
mutable strings are a bad thing (performance- and design-wise) for
encoding-aware algorithms. I don't see why we need to go back to that
*again*.
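
To make the typedef point concrete, here is a rough sketch, again with
hypothetical names and assuming C++11 (std::shared_ptr stands in for
whatever the immutable string would really use). Because the bytes can
never change, copying a view only bumps a reference count; no byte is
copied and no locking is needed to keep the views consistent:

```cpp
#include <memory>
#include <string>
#include <utility>

struct utf8_encoding {};  // hypothetical encoding tag

template <class Encoding>
class view {
public:
    explicit view(std::shared_ptr<const std::string> s)
        : data_(std::move(s)) {}

    // The shared, read-only bytes behind this view.
    const std::string& bytes() const { return *data_; }

    // Number of shared_ptr owners of the buffer (for illustration only).
    long owners() const { return data_.use_count(); }

private:
    std::shared_ptr<const std::string> data_;
};

// Under this sketch, "utf8_string" is nothing more than a name for a view.
typedef view<utf8_encoding> utf8_string;
```

With a mutable string underneath, each view would have to either copy
the buffer or guard against it changing, which is where the efficiency
argument comes from.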

At any rate, feel free to convince me that immutable strings
wouldn't be a good thing for
encoding/transcoding/string-or-text-centric algorithms. ;)

-- 
Dean Michael Berris
about.me/deanberris

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk