Subject: Re: [boost] [string] proposal
From: Dean Michael Berris (mikhailberis_at_[hidden])
Date: 2011-01-29 11:17:01


On Sat, Jan 29, 2011 at 11:44 PM, Artyom <artyomtnk_at_[hidden]> wrote:
>>
>> I wonder where you got that notion. I framed the discussion around my
>> definition of `string` to be a sequence. In that context (in an
>> earlier post) I was basically saying "a string is a data structure for
>> holding things, [FOR EXAMPLE] a string of events, a string of
>> characters, ..." just to frame the definition properly and identify
>> that I was talking about a data structure.
>>

And you snipped the part explaining that I used that statement for
framing a discussion, as in laying down the foundations for arguments.
*sigh*

>
> I'm sorry:
>
> Let's see:
>
> - Java String - one meaning text, UTF-16 encoded

Nope, in Java a String is a data type that derives from Object and
stores an immutable sequence of 16-bit chars. Not necessarily *text*;
it just so happens to use UTF-16 as its encoding. AFAIK you can still
stuff arbitrary 16-bit values into a String object -- try reading from
a binary file and see what I mean.

> - C# string - one meaning text, UTF-16 encoded

Sequence of UTF-16 characters. Not sure if it's immutable.

> - C++/GTKmm ustring - one meaning text, UTF-8 encoded

Sequence of characters, just so happens to be UTF-8 encoded.

> - C++/Qt QString - one meaning text, UTF-16 encoded

Sequence of UTF-16 characters. May or may not be *text* i.e. I can
still fill the buffer with *garbage*.

> - C++/wxWidgets wxString - one meaning text, Unicode (don't remember encoding
> type)

Same thing, sequence of characters. Just so happens to choose a
default encoding, but still you can fill the thing with garbage.

> - Vala string - text UTF-8 encoded

I have no idea what a Vala string is.

> - Python 3 str[ing] - text UTF-16 or UTF-32 encoded
>

Python 3 chose to deal with strings as sequences of UTF-16 or UTF-32
characters (there's a move to make this largely transparent to the
user, whatever the platform). I can still fill a str with garbage even
in Python.
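
A minimal Python 3 sketch of what I mean by "garbage" (the names
`garbage` and `is_encodable` are made up for illustration): a str
happily accepts an unpaired surrogate, which is not well-formed text
and cannot be encoded as UTF-8.

```python
# A str is a sequence of code points; nothing forces it to be valid text.
garbage = "abc" + chr(0xD800)  # unpaired high surrogate appended


def is_encodable(s):
    """Return True if s can be encoded as UTF-8, False otherwise."""
    try:
        s.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


print(len(garbage))           # the surrogate counts as one element
print(is_encodable("abc"))    # ordinary text encodes fine
print(is_encodable(garbage))  # the lone surrogate does not
```

The str type stores the sequence without complaint; the failure only
shows up when something tries to treat the contents as text.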

> Is this clear enough?
>

I didn't get the point. You were enumerating these data types... to
convince me that 'string' only denotes text?

> When you say string you mean TEXT not more not less.

No, maybe *you* say string when TEXT is what you mean. I OTOH view a
string as a sequence of characters for whatever suitable meaning of
character exists.

Also, TEXT can be represented in many different ways as well, not just
with strings. And TEXT is largely a human idiom referring to letters
and words visible on some medium. This has nothing to do with
computers because to computers, guess what: it's all *bytes*.
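
As a rough Python 3 illustration (the variable names are only for this
sketch): the same two bytes come out as different "text" depending on
which encoding you assume when reading them.

```python
# Two bytes; the machine attaches no meaning to them.
raw = bytes([0xC3, 0xA9])

# Decoded as UTF-8 they form one character, U+00E9.
as_utf8 = raw.decode("utf-8")

# Decoded as Latin-1 they form two characters, U+00C3 and U+00A9.
as_latin1 = raw.decode("latin-1")

print(len(as_utf8), len(as_latin1))
print(as_utf8 == as_latin1)
```

The bytes never changed; only the interpretation did.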

>
> And yes in C++ you can store arbitrary data in char buffers
> or in std::string, but this does not change the meaning
> of string word - it means "text"
>

No, sorry. I really think it's either bad computer science or bad
English (or a bad translation of concepts, FWIW).

> Don't try to reinvent the meaning of string word in CS context.
> It is not about English, it is about concept.
>

And the concept of a string in computer science is a sequence of
characters for whatever suitable definition of character exists.

-- 
Dean Michael Berris
about.me/deanberris
