Subject: Re: [boost] [General] Always treat std::strings as UTF-8
From: Alexander Lamaison (awl03_at_[hidden])
Date: 2011-01-18 08:05:18


> On Mon, 17 Jan 2011 23:50:18 -0800 (PST), Artyom wrote:
>
> >
> > We'll have to agree to disagree there. The whole point to these classes
> > was to provide the compiler -- and the programmer using them -- with
> > some way for the string to carry around information about its encoding,
> > and allow for automatic conversions between different encodings.
>
> This is a totally different problem. If so, you need a container like this:
>
> class specially_encoded_string {
> public:
>     std::string encoding() const
>     {
>         return encoding_;
>     }
>     std::string to_utf8() const
>     {
>         return convert(content_,encoding_,"UTF-8");
>     }
>     // Not const: this member assigns to content_.
>     void from_utf8(std::string const &input)
>     {
>         content_ = convert(input,"UTF-8",encoding_);
>     }
>     std::string const &raw() const
>     {
>         return content_;
>     }
> private:
>     std::string encoding_; /// <----- VERY IMPORTANT
>                            /// may have values such as: ASCII, Latin1,
>                            /// ISO-8859-8, Shift-JIS or Windows-1255
>     std::string content_;  /// <----- The raw string
> };
>
> Creating an "ascii_t" container, or anything that does not carry
> the REAL encoding name with it, would lead to bad things.

I thought the point of using different types was to avoid tagging a
string with an encoding name. In other words, a utf8_t would always hold
its std::string content_ in UTF-8 format.
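
A minimal sketch of that idea, to make the contrast with the tagged class
above concrete. Everything here is an assumption for illustration: the names
encoded_string, utf8_tag, latin1_tag and latin1_t are hypothetical and not
part of any proposed Boost interface, and Latin-1 is used only because its
conversion to UTF-8 fits in a few lines. The encoding lives in the type, so
no encoding name is stored at run time.

```cpp
#include <string>
#include <utility>

// Hypothetical tag types: one empty struct per encoding.
struct utf8_tag {};
struct latin1_tag {};

// The encoding is part of the type, so no encoding name is stored.
template <typename EncodingTag>
class encoded_string {
public:
    explicit encoded_string(std::string bytes) : content_(std::move(bytes)) {}
    std::string const &raw() const { return content_; }
private:
    std::string content_; // always in the encoding named by EncodingTag
};

typedef encoded_string<utf8_tag> utf8_t;
typedef encoded_string<latin1_tag> latin1_t;

// Already UTF-8: nothing to convert.
inline utf8_t to_utf8(utf8_t const &in) { return in; }

// Latin-1 bytes map directly to code points U+0000..U+00FF, so each byte
// becomes either one ASCII byte or a two-byte UTF-8 sequence.
inline utf8_t to_utf8(latin1_t const &in)
{
    std::string out;
    for (unsigned char c : in.raw()) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return utf8_t(out);
}
```

With this layout, passing a latin1_t where a utf8_t is expected is a compile
error, and the conversion is selected by overload resolution rather than by
comparing encoding names at run time. The trade-off is that an encoding only
known at run time cannot be represented this way.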

Alex

-- 
Easy SFTP for Windows Explorer (http://www.swish-sftp.org)
