Subject: Re: [boost] [unicode] Interest Check / Proof of Concept
From: James Porter (porterj_at_[hidden])
Date: 2008-11-19 19:29:38
Eric Niebler wrote:
> Agree. Thanks Zach. I'm discouraged that every time the issue of a
> Unicode library comes up, the discussion immediately descends into a
> debate about how to design yet another string class. Such a high level
> wrapper *might* be useful (strong emphasis on "might"), but the core
> must be the Unicode algorithms, and the design for a Unicode library
> must start there.
Since it seems like there's a lot of concern with making a new string
type, how about the following (off-the-cuff):
* Iterator filters a la Zach's message:
  typedef std::basic_string<char16_t> utf16_string;
  utf16_string u_string = /*...*/;
  std::string std_string = /*...*/;
  typedef boost::recoding_iterator<boost::utf16, boost::utf8>
      utf16_to_utf8_iter;
  std::copy(utf16_to_utf8_iter(u_string.begin()),
            utf16_to_utf8_iter(u_string.end()),
            std::back_inserter(std_string));
* Runtime-defined filters:
  typedef boost::recoding_iterator<boost::utf16, boost::runtime>
      utf16_to_any_iter;
  boost::runtime *my_codec = /*...*/;
  std::copy(utf16_to_any_iter(u_string.begin(), my_codec),
            utf16_to_any_iter(u_string.end(), my_codec),
            std::back_inserter(std_string));
* Shorthand for the above two points:
  boost::transcode(u_string, boost::utf16(),
                   std_string, boost::utf8());
* String views that can wrap up the encoding type and the data (a
container of some kind: strings, vector<char>s, ropes, etc.):
  boost::estring_view<boost::utf8> my_utf8_string(std_string);
  boost::estring_view<> my_rt_string(str, my_codec);
  boost::transcode(my_utf8_string, my_rt_string);
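
To make the compile-time case concrete, here is a minimal sketch of what the UTF-16 to UTF-8 step behind the first and third bullets might do. Nothing here is an existing Boost API: the namespace `sketch` and the names `append_utf8` and `utf16_to_utf8` are hypothetical, the function converts eagerly rather than through an iterator adaptor, and replacing unpaired surrogates with U+FFFD is an assumed error policy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch {

// Hypothetical helper: append one code point to `out` as UTF-8.
inline void append_utf8(char32_t cp, std::string& out) {
    assert(cp <= 0x10FFFF);  // largest valid Unicode code point
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical stand-in for the transcode shorthand: decode UTF-16 code
// units into code points, then re-encode each one as UTF-8.
// Assumed policy: unpaired surrogates become U+FFFD.
inline std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {  // high surrogate
            if (i + 1 < in.size() &&
                in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                // Combine the surrogate pair into one code point.
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {  // stray low surrogate
            cp = 0xFFFD;
        }
        append_utf8(cp, out);
    }
    return out;
}

} // namespace sketch
```

An iterator filter would perform the same decode and re-encode steps lazily, one code point per increment, instead of building the whole output string at once.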
Luckily, most of the work I've done is in making the encoding facets
extensible and choosable at runtime, so I wouldn't mourn the loss of my
(frankly none-too-zazzy) string class.
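
For readers unfamiliar with the idea, a runtime-chosen codec can be sketched roughly as below. This is not the facet design mentioned above, only an illustration under assumptions: the names `codec`, `latin1_codec`, `utf8_codec`, `find_codec`, and `encode_with` are all hypothetical, input is taken as already-decoded code points, and the Latin-1 codec substitutes '?' for anything it cannot represent.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Hypothetical runtime codec interface: one virtual call per code point.
struct codec {
    virtual ~codec() {}
    // Append the encoding of code point `cp` to `out`.
    virtual void encode(char32_t cp, std::string& out) const = 0;
};

struct latin1_codec : codec {
    void encode(char32_t cp, std::string& out) const override {
        // Assumed policy: code points outside Latin-1 become '?'.
        out += cp < 0x100 ? static_cast<char>(cp) : '?';
    }
};

struct utf8_codec : codec {
    void encode(char32_t cp, std::string& out) const override {
        assert(cp <= 0x10FFFF);  // largest valid Unicode code point
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
};

// Look up a codec by name at run time; returns nullptr if the name is unknown.
inline const codec* find_codec(const std::string& name) {
    static const latin1_codec latin1{};
    static const utf8_codec utf8{};
    if (name == "latin1") return &latin1;
    if (name == "utf-8") return &utf8;
    return nullptr;
}

// Encode a sequence of code points with whichever codec was selected.
inline std::string encode_with(const std::u32string& in, const codec& c) {
    std::string out;
    for (char32_t cp : in) c.encode(cp, out);
    return out;
}

} // namespace sketch
```

The trade-off is the usual one: the compile-time version can be inlined per encoding pair, while the runtime version pays a virtual call per code point in exchange for choosing the encoding from, say, a configuration string.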
- Jim