Subject: Re: [boost] [General] Always treat std::strings as UTF-8
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-01-15 04:39:02
> From: Dave Abrahams <dave_at_[hidden]>
>
> Peter Dimov wrote:
> >
> > Alexander Lamaison wrote:
> > > I'm opposed to this strategy simply because it differs from the way
> > > existing libraries treat narrow strings.
> >
> > It differs from them because it's right, and existing libraries are
> > wrong. Unfortunately, they'll continue being wrong for a long time,
> > because of this same argument.
>
> Does the "right" strategy come with some policies/practices that can
> allow it to coexist with the existing "wrong" libraries? If so, I'm
> all +1 on it.
>
Combining old libraries with new ones:
======================================
It would be simple to combine a library that follows the old
policy with ones that follow the new policy, given conversion
helpers such as:
namespace boost {
std::string utf8_to_ansi(std::string const &s);
std::string ansi_to_utf8(std::string const &s);
std::wstring utf8_to_wide(std::string const &s);
std::string wide_to_utf8(std::wstring const &s);
}
- If the old library supports wide strings, call boost::utf8_to_wide
**on Windows** and nothing is lost.
- If it supports only narrow strings:
a) If it is encoding agnostic, like a unit test that only opens
files with ASCII names, you can safely pass UTF-8 strings as
ASCII and ASCII strings as UTF-8, because ASCII is a subset
of UTF-8.
b) Otherwise, do the following:
1. File a bug with the library owner about the lack of
Unicode string support on Windows.
2. Use utf8_to_ansi/ansi_to_utf8 to pass strings
to this library on Windows.
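To make the utf8_to_wide/wide_to_utf8 pair above more concrete, here is a minimal, portable sketch of what they might look like. It is only an illustration, not a proposed implementation: the namespace name `sketch` is made up, the code assumes wchar_t holds UTF-16 when it is 16 bits wide and UTF-32 otherwise, and it does not reject overlong sequences or lone surrogates. A real Windows version would more likely call MultiByteToWideChar/WideCharToMultiByte.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

namespace sketch { // hypothetical namespace, for illustration only

// Decode UTF-8 into wchar_t units (UTF-16 if wchar_t is 16 bits, else UTF-32).
inline std::wstring utf8_to_wide(std::string const &s)
{
    std::wstring out;
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        std::uint32_t cp;
        std::size_t extra; // number of continuation bytes expected
        if (c < 0x80)                { cp = c;        extra = 0; }
        else if ((c & 0xE0) == 0xC0) { cp = c & 0x1F; extra = 1; }
        else if ((c & 0xF0) == 0xE0) { cp = c & 0x0F; extra = 2; }
        else if ((c & 0xF8) == 0xF0) { cp = c & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (extra > s.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char t = static_cast<unsigned char>(s[i + k]);
            if ((t & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (t & 0x3F);
        }
        i += extra + 1;
        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of range");
        if (sizeof(wchar_t) == 2 && cp >= 0x10000) {
            // Split into a UTF-16 surrogate pair.
            cp -= 0x10000;
            out += static_cast<wchar_t>(0xD800 + (cp >> 10));
            out += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
        } else {
            out += static_cast<wchar_t>(cp);
        }
    }
    return out;
}

// Encode wchar_t units back into UTF-8.
inline std::string wide_to_utf8(std::wstring const &w)
{
    std::string out;
    for (std::size_t i = 0; i < w.size(); ++i) {
        std::uint32_t cp = static_cast<std::uint32_t>(w[i]);
        // Combine a UTF-16 surrogate pair when wchar_t is 16 bits wide.
        if (sizeof(wchar_t) == 2 && cp >= 0xD800 && cp <= 0xDBFF && i + 1 < w.size()) {
            std::uint32_t lo = static_cast<std::uint32_t>(w[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

} // namespace sketch
```

With helpers like these, the glue for a wide-only library would be a single call at the boundary, for example passing `sketch::utf8_to_wide(name).c_str()` where the old library expects a `wchar_t const *`.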
Current State of Using Wide/ANSI API in Boost:
==============================================
I did a quick search to find which libraries use which API:
The following use both types of API:
------------------------------------
thread
asio
system
iostreams
regex
filesystem
According to the new policy, they should replace the
ANSI API with the wide API plus conversion between UTF-8 and UTF-16.
The following libraries use only the ANSI API
---------------------------------------------
interprocess
spirit
test
random
They should replace their ANSI API with the wide one,
using simple glue based on utf8_to_wide/wide_to_utf8.
The following libraries use STL functions that are not aware of Unicode under Windows
-------------------------------------------------------------------------------------
std::fstream
- Serialization
- Graph
- wave
- datetime
- property_tree
- program_options
fopen
- gil
- spirit
- python
- regex
These need to be replaced with something like:
boost::fstream
and
boost::fopen
that work with UTF-8 under Windows.
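As a rough idea of what such an fopen replacement could look like, here is a minimal sketch. The names `sketch::utf8_fopen` and `sketch::utf8_to_wide` are made up for illustration and are not an existing Boost API. On Windows the sketch converts the UTF-8 name to UTF-16 with MultiByteToWideChar and calls _wfopen; elsewhere it assumes the narrow API already takes UTF-8 and simply forwards to std::fopen.

```cpp
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <wchar.h>
#include <windows.h>
#endif

namespace sketch { // hypothetical namespace, for illustration only

#ifdef _WIN32
// Convert UTF-8 to UTF-16 via the Win32 API.
// Returns an empty string if the input is empty or invalid.
inline std::wstring utf8_to_wide(std::string const &s)
{
    if (s.empty())
        return std::wstring();
    int len = static_cast<int>(s.size());
    int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                s.data(), len, NULL, 0);
    if (n <= 0)
        return std::wstring();
    std::wstring w(static_cast<std::size_t>(n), L'\0');
    MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                        s.data(), len, &w[0], n);
    return w;
}
#endif

// Open a file whose name is given as a UTF-8 encoded narrow string.
inline std::FILE *utf8_fopen(char const *name, char const *mode)
{
#ifdef _WIN32
    std::wstring wname = utf8_to_wide(name);
    std::wstring wmode = utf8_to_wide(mode);
    if (wname.empty() || wmode.empty())
        return NULL; // empty or invalid UTF-8 input
    return _wfopen(wname.c_str(), wmode.c_str());
#else
    // Assumption: the platform's narrow API accepts UTF-8 as is.
    return std::fopen(name, mode);
#endif
}

} // namespace sketch
```

A library that currently calls fopen(name, mode) could then switch to `sketch::utf8_fopen(name, mode)` without changing its own interface. An fstream replacement would follow the same pattern.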
The rest of the libraries seem to be encoding agnostic.
Artyom
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk