Subject: Re: [boost] [string] proposal
From: Chad Nelson (chad.thecomfychair_at_[hidden])
Date: 2011-01-21 21:43:10
On Sat, 22 Jan 2011 01:56:36 +0800
Dean Michael Berris <mikhailberis_at_[hidden]> wrote:
>>> I think strings are different from the encoding they're interpreted
>>> as. Let's fix the problem of a string data structure first then tack
>>> on encoding/decoding as something that depends on the string
>>> abstraction first.
>> That gets back to the problem that I was originally trying to solve
>> with the UTF types: that a string needs a way to carry around its
>> encoding. A UTF-8 type could be built on such a thing very easily.
> Hmm... I OTOH don't think the encoding should be part of the string.
> The encoding is really external to the string, more like a function
> that is applied to the string.
It's a property of the string. It may change, but some encoding (even
if it's just "none") should be associated with a particular string
throughout its existence. Otherwise you might as well use the existing
> If you can wrap the string in a UTF-8, UTF-16, UTF-32 encoder/decoder
> then that should be the way to go. However building it into the string
> is not something that will scale in case there are other encodings
> that would be supported -- think about not just Unicode, but things
> like Base64, Zip, <insert encoding here>.
I assume that there is some unique identification for each language and
encoding, or that one could be created. But that's too big a task for
one volunteer developer, so my UTF classes are intended only to handle
the three types that can encode any Unicode code-point.
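
To make "a string that carries around its encoding" concrete, here is a
minimal sketch. It is not the actual UTF classes; every name in it
(encoding, tagged_string, concat) is hypothetical, and it only shows the
tag travelling with the bytes so that mismatches can be caught.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical tag; a real design would need an agreed identifier per encoding.
enum class encoding { none, utf8, utf16, utf32 };

// Hypothetical string type: raw bytes plus the encoding they are meant to be read as.
class tagged_string {
public:
    tagged_string(std::string bytes, encoding enc)
        : bytes_(std::move(bytes)), enc_(enc) {}

    const std::string& bytes() const { return bytes_; }
    encoding enc() const { return enc_; }

    // The tag may change over the string's lifetime, but there is always one.
    void retag(encoding enc) { enc_ = enc; }

private:
    std::string bytes_;
    encoding enc_;
};

// Because the tag is part of the value, mixing encodings can be rejected
// instead of silently producing garbage.
inline tagged_string concat(const tagged_string& a, const tagged_string& b) {
    if (a.enc() != b.enc())
        throw std::invalid_argument("encoding mismatch");
    return tagged_string(a.bytes() + b.bytes(), a.enc());
}
```

With this sketch, concatenating two utf8-tagged strings yields a
utf8-tagged result, while concatenating a utf8-tagged string with one
tagged none throws.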
> Ultimately the underlying string should be efficient and could be
> operated upon in a predictable manner. It should be lightweight so
> that it can be referred to in many different situations and there
> should be an infinite number of possibilities for what you can use a
> string for.
You've just described std::string. Or alternately, std::vector<char>.
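
For illustration, here is what treating the encoding as an external
function over a plain std::string looks like. The helper name
utf8_code_points is hypothetical, and the sketch assumes the bytes are
already valid UTF-8; it does no validation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts UTF-8 code points by skipping continuation
// bytes (those of the form 10xxxxxx). Assumes well-formed UTF-8 input.
inline std::size_t utf8_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

std::string itself only knows about bytes: for the UTF-8 bytes of "café"
(with é encoded as 0xC3 0xA9), size() reports 5 while the helper reports
4. Nothing in the std::string says which of the two answers the caller
should want.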
-- 
Chad Nelson
Oak Circle Software, Inc.