Subject: Re: [boost] [general] What will string handling in C++ look like in the future [was Always treat ... ]
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-01-19 15:50:17
> From: Dave Abrahams <dave_at_[hidden]>
>
> Matus Chochlik wrote:
> >
> > On Wed, Jan 19, 2011 at 5:26 PM, Dave Abrahams <dave_at_[hidden]> wrote:
> > > Matus Chochlik wrote:
> > >
> > > *Scenario D:* We try for scenario A. and people still use Qstrings,
> > > wxStrings, etc.
> >
> > 'I think maybe you underestimate our influence.' :)
>
> Our influence, if we introduce new library components, is very great,
> because they're on a de-facto fast track to standardization, and an
> improved string library is exactly the sort of thing that would be
> adopted upstream. If we simply agree to a programming convention,
> that will have some impact, but much less.
>
Dave,
Most existing projects and frameworks have already settled
on one single, useful encoding:
- C++:
+ Qt UTF-16 using QString
+ GtkMM UTF-8 using ustring
+ MFC UTF-16 using CString (when compiled in "Unicode mode")
+ ICU UTF-16 using UnicodeString
- C:
+ Gtk UTF-8 string
- Java: UTF-16 String
- C#: UTF-16 string
- Vala: UTF-8 string (using "char *")
And so on...
If you take a look at all the C++ frameworks, they
all have a way to convert their string to std::string
and back.
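A minimal sketch of what such a conversion amounts to for a UTF-16
framework string. It assumes std::u16string as a stand-in for the
framework's own type, and utf16_to_utf8 is a hypothetical name, not
any framework's actual API:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: transcode UTF-16 code units into a UTF-8 std::string.
// std::u16string stands in for a framework string type here.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: it must be followed by a low surrogate.
            if (i + 1 >= in.size())
                throw std::invalid_argument("truncated surrogate pair");
            char32_t low = in[i + 1];
            if (low < 0xDC00 || low > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            ++i;  // the low surrogate has been consumed
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::invalid_argument("unpaired low surrogate");
        }
        // Encode the code point as 1 to 4 UTF-8 bytes.
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

Real frameworks ship their own, better tested, versions of this; the
point is only that the bridge to std::string is small and well defined
once the target encoding is fixed.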
C++ hasn't picked one yet, but C++ does have a string,
and a very good one, and every existing
project has an interface to it.
The problem is that we haven't decided on its encoding.
Yes, we can't tell the standard that std::string is UTF-8,
but we can say other things.
Just as the standard deprecated auto_ptr (which I think is a crime, but
that is another story), it should deprecate all non-Unicode-aware
uses of std::string and say that the default is UTF-8.
It already has u8"שלום", which creates a UTF-8
string using "char *"; the only remaining thing
is to adopt it.
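A small sketch of what the u8 prefix gives you: the literal's bytes are
UTF-8 regardless of the compiler's execution character set. The helper
name utf8_bytes is hypothetical, and the template is written so that it
also compiles under C++20, where u8 literals have element type char8_t
rather than char:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copy the UTF-8 bytes of a u8 literal into a std::string.
// CharT is char up to C++17 and char8_t from C++20 on.
template <typename CharT, std::size_t N>
std::string utf8_bytes(const CharT (&literal)[N]) {
    static_assert(sizeof(CharT) == 1, "u8 literals use one-byte code units");
    // N counts the terminating null, so copy N - 1 bytes.
    return std::string(reinterpret_cast<const char*>(literal), N - 1);
}
```

For example, utf8_bytes(u8"\u05E9\u05DC\u05D5\u05DD") holds four Hebrew
letters in eight bytes, two bytes per letter.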
All frameworks have decided how they use Unicode and which
string they use.
Boost can and **should** decide, as all the frameworks did:
we use Unicode, and we use UTF-8.
Decide and be done with it, just as Boost decided not to
use tabs in source code and to use the BSL for its entire
code base.
This would only do good.
Sometimes it is bad to support every bad decision
that was ever made.
Many Boost developers and users enjoy the
fact that Boost is in constant evolution, so we
can evolve and decide:
On Windows, char */std::string etc. is UTF-8;
if you don't agree, don't use Boost.
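To sketch what that convention would mean in practice: a library that
treats std::string as UTF-8 has to widen to UTF-16 only at the point
where it calls a wide-character system API. The function below is a
hypothetical, portable illustration of that widening step; it uses
std::u16string instead of the Windows wchar_t types so that it stays
self-contained, and it is not taken from any Boost library:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode a UTF-8 std::string into UTF-16 code units.
std::u16string utf8_to_utf16(const std::string& in) {
    // Smallest code point allowed for each sequence length (rejects overlong forms).
    static const char32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;
        if (lead < 0x80)              { cp = lead;        len = 1; }
        else if ((lead >> 5) == 0x06) { cp = lead & 0x1F; len = 2; }
        else if ((lead >> 4) == 0x0E) { cp = lead & 0x0F; len = 3; }
        else if ((lead >> 3) == 0x1E) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + len > in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point in UTF-8 input");
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}
```

With something like this at the boundary, everything above the system
calls can keep passing plain char */std::string around.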
Artyom