
Subject: Re: [boost] [string] proposal
From: Dean Michael Berris (mikhailberis_at_[hidden])
Date: 2011-01-24 05:25:18


On Mon, Jan 24, 2011 at 1:41 PM, David Bergman
<David.Bergman_at_[hidden]> wrote:
> On Jan 24, 2011, at 12:13 AM, Dean Michael Berris wrote:
>
>> On Mon, Jan 24, 2011 at 11:51 AM, David Bergman
>> <David.Bergman_at_[hidden]> wrote:
>>> On Jan 23, 2011, at 9:34 PM, Dean Michael Berris wrote:
>>>
>>>>
>>>> I think I disagree with this. A string is by definition a sequence of
>>>> something -- a string of integers, a string of events, a string of
>>>> characters. Encoding is not an intrinsic property of a string.
>>>
>>> Ok... it feels like you are changing the rules as we play, instead of admitting "defeat" ;-)
>>>
>>> Or, did you indeed talk about *generic sequences* this whole time? If so, why the focus on encoding strategies for characters?
>>>
>>
>> Huh?
>
> Why did you suddenly mention string as being an alias for what we usually denote a 'sequence' (of which std::vector is a model, by the way)?
>

Hmmm... In English, it's usually used to frame a discussion, by saying
what you *think* is being discussed before you lay your
argument down. This is normal for making a sound argument
for/against a given position. I was merely saying that a string of
something means that it's a collection of something -- stating a basis
for the arguments that follow.

Also, actually, I wasn't even going down the "generic sequence" route.
The reason basic_string is even a template in the STL is precisely
so that you can choose the "character" type it is instantiated with. It's
perfectly fine if someone comes up with a character-like type that you
can feed into basic_string<> (something like wchar_t, or your own
"character-like" type). I wasn't suggesting replacing std::vector
because what I was interested in solving is the problem of dealing
with strings of characters, or in the context of the discussion in
general "text" -- which just so happens to be concerned (quite
narrowly I believe) with the encoding/decoding of data in a certain format.
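
To illustrate the point about basic_string<> being parameterized on the
character type, here is a minimal sketch. The helper names code_units and
basic_string_demo are made up for this example; only std::basic_string and
its standard aliases come from the library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts code units for any character type.
// Nothing here knows or cares which encoding the code units are in.
template <class CharT>
std::size_t code_units(const std::basic_string<CharT>& s) {
    return s.size();
}

// Hypothetical usage: the same template serves narrow, wide and
// 16-bit character types.
inline void basic_string_demo() {
    std::basic_string<char> narrow = "text";      // same type as std::string
    std::basic_string<wchar_t> wide = L"text";    // same type as std::wstring
    std::basic_string<char16_t> u16 = u"text";    // same type as std::u16string

    assert(code_units(narrow) == 4);
    assert(code_units(wide) == 4);
    assert(code_units(u16) == 4);
}
```

The template only fixes the element type; what those elements mean as text
is left to whoever reads them.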

>> I've always been pointing out that strings should just be immutable
>> and agnostic of the encoding and have the encoding enforced externally
>> to the string.
>>
>> Are you confusing me for someone else?
>
> No, not at all. There were two - for me - pretty awkward statements made by you, indicating a lack of coherence:
>
> 1. Your answer regarding what you meant by 'smarter iterator', which was a tautology adding no information at all:
>
>        "The way I was thinking about it, "smarter" would mean something along
> the lines of "knows more than your average <thing>" where <thing> is a
> bare iterator."
>
>        Yes, that touches at the definition of 'smarter'... but surely you understand that we (or he...) wondered *in what way* it was smarter; you *did* in fact extend a little bit on that later, I can give you that...
>

So I don't understand what you're saying here... that my trying to
define what I mean by 'smarter' is... is awkward?

Also, smarter can be interpreted in many different ways -- it could mean
clever, contrived, has more capabilities, etc. I was simply stating
that I used the word 'smarter' in a sense that implies that it knows
more than the average iterator. And I did describe what a smarter
iterator would look like.

So now, I think *I'm* confused about you saying I'm changing the rules
when all I've been saying has been consistent towards a different
implementation (and semantics) for what a string should be.

> 2. Your sudden proclamation that a string is a sequence of anything; indicating that you have been talking about a new sequence concept (variant) all this time, capable of holding stuff that are quite distinct from characters.
>
>        Yes, ok, that is one meaning of 'string' in a strict sense, yes, but (I hope it was clear) that it is not the meaning used in this specific discussion; so, that switch of interpretation of the term does probably not make the discussion more focused.
>

I wasn't implying that the discussion should go anywhere other than
the string which has to do with "characters", for whatever definition
of the word "character" there is. And if you look at my points, they've
all been towards the definition of "a std::string that is immutable, has
value semantics, is lightweight, and can be the basis of
encoding/decoding algorithms".
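
As a rough sketch of what "immutable, value semantics, lightweight" could
look like in code: the class below is hypothetical (the name
immutable_string and all its members are invented for illustration and are
not taken from any proposal in this thread). It assumes shared, read-only
storage so that copies are cheap.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical sketch only; not an existing Boost or standard class.
class immutable_string {
public:
    immutable_string() : data_(std::make_shared<std::string>()) {}
    explicit immutable_string(std::string s)
        : data_(std::make_shared<std::string>(std::move(s))) {}

    // Read-only access; there are deliberately no mutating members.
    std::size_t size() const { return data_->size(); }
    const char* begin() const { return data_->data(); }
    const char* end() const { return data_->data() + data_->size(); }

    // "Modification" yields a new value; both operands stay untouched.
    friend immutable_string operator+(const immutable_string& a,
                                      const immutable_string& b) {
        return immutable_string(*a.data_ + *b.data_);
    }

    // Equality compares contents, not storage identity.
    friend bool operator==(const immutable_string& a,
                           const immutable_string& b) {
        return *a.data_ == *b.data_;
    }

    // True when two values refer to the same underlying buffer.
    bool shares_storage_with(const immutable_string& other) const {
        return data_ == other.data_;
    }

private:
    // Copying the object copies this pointer, never the characters.
    std::shared_ptr<const std::string> data_;
};
```

Because the contents can never change after construction, sharing the
buffer between copies is not observable, which is what makes the cheap copy
compatible with value semantics in this sketch.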

You took one sentence and chose to read that out of context and then
assume that I've somehow been "defeated"?

>> My assertion has been from the beginning:
>>
>> 1. Let's focus on a string class first that is (arguably) better than
>> std::string by making it efficient, immutable, and does proper value
>> semantics.
>>
>> 2. Once we have this then let's build upon it to allow for multiple
>> ways of interpreting the *contents* of the string.
>>
>> I'm inclined to think you're confusing me for someone else while
>> replying to my message above.
>
> No, I did not. Sorry. By what you said above, you also add this point, which de-coheres the picture quite a bit:
>
> 3. This string class should be able to manifest sequences of anything, including events or arbitrary objects.
>

I didn't say that it *should* be able to manifest sequences of
anything -- I was pointing out *one* definition of *string*. I was
doing this to point out that nowhere in that definition does
"encoding" become intrinsic -- and even if you look at the string in
the context of std::string, neither is encoding intrinsic to that
string. Also, to the point, UTF doesn't even imply *strings*; it implies
characters and character encodings, so I don't see why the encoding of
a string of characters should be considered an inherent property of a
string *type*.

Maybe you're assuming I'm making the #3 point above when in fact I was
just framing the discussion to assert that the encoding of the data
encapsulated by a string is not intrinsic to the string's type.
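
A small sketch of encoding living outside the string type: the same
std::string is handed to two free functions that each apply a different
interpretation. The function names are hypothetical, and the UTF-8 counter
assumes well-formed input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Interpret the bytes as Latin-1: every byte is exactly one character.
inline std::size_t length_as_latin1(const std::string& bytes) {
    return bytes.size();
}

// Interpret the bytes as UTF-8: count every byte that is not a
// continuation byte (bit pattern 10xxxxxx). Assumes well-formed UTF-8.
inline std::size_t length_as_utf8(const std::string& bytes) {
    std::size_t n = 0;
    for (unsigned char c : bytes) {
        if ((c & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}
```

Given the five bytes "caf\xC3\xA9", length_as_latin1 reports 5 characters
and length_as_utf8 reports 4, while the std::string itself is the same
object of the same type in both calls.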

-- 
Dean Michael Berris
about.me/deanberris

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk