Subject: Re: [boost] Boost.Locale (was Re: [SQL-Connectivity] Is Boost interested in CppDB?)
From: Matus Chochlik (chochlik_at_[hidden])
Date: 2010-12-15 02:20:11


On Tue, Dec 14, 2010 at 8:25 PM, Mathias Gaunard
<mathias.gaunard_at_[hidden]> wrote:
>
> My library can do that kind of conversion with arbitrary ranges, and
> possibly lazily as it is being iterated.
>
> Artyom's library can probably do it too, but only eagerly and with
> contiguous memory segments.
>
> My Unicode library would be in the review queue if people had manifested
> sufficient interest, but I was quite disappointed to see none last time I
> asked for comments.
> I did send a submission to boostcon 2011 about it though, to present its
> approach to Unicode and discuss it.
>

I *am* interested in a good (semi-)standard Unicode handling library
for C++ since it is IMO long overdue (not counting all the C libraries)
and working with text at character level in C++ is nowadays
a real pain if you are not limited to ASCII.

Eager / lazy iteration and traversing noncontiguous sequences
are cool, but I would also welcome some high-level one-line
tools for convenient conversion between std::strings and std::wstrings on
different platforms, most notably Windows, where using std::strings in
Unicode builds with functions that take only LPWSTR is a nightmare.

IMO a lot of people would find something like this extremely useful
(even if not extremely efficient).

std::string s = get_utf8_string();
WhatEverWinapiFunc(..., convert_to<std::basic_string<TCHAR>>(s).c_str(), ...);

or

std::wstring ws = get_string();
AnotherWinapiFunc(..., convert_to<std::basic_string<TCHAR>>(ws).c_str(), ...);
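A minimal sketch of how such a convert_to helper might be put together, assuming narrow strings hold UTF-8 and wide strings hold UTF-16 or UTF-32 depending on sizeof(wchar_t). The helper names (decode_utf8, append_wide, convert and so on) are hypothetical and not taken from any existing library, and the sketch does no validation of malformed input, which a real implementation would need.

```cpp
#include <cstddef>
#include <string>

// Decode one code point from UTF-8 at s[i] and advance i.
// Assumption: input is well-formed UTF-8 (no validation here).
inline char32_t decode_utf8(const std::string& s, std::size_t& i) {
    unsigned char b = static_cast<unsigned char>(s[i++]);
    if (b < 0x80) return b;
    int extra = (b >= 0xF0) ? 3 : (b >= 0xE0) ? 2 : 1;
    char32_t cp = b & (0x3F >> extra);
    while (extra-- > 0 && i < s.size())
        cp = (cp << 6) | (static_cast<unsigned char>(s[i++]) & 0x3F);
    return cp;
}

// Decode one code point from a wide string (UTF-16 if wchar_t is 16 bits).
inline char32_t decode_wide(const std::wstring& s, std::size_t& i) {
    char32_t u = static_cast<char32_t>(s[i++]);
    if (sizeof(wchar_t) == 2 && u >= 0xD800 && u < 0xDC00 && i < s.size()) {
        char32_t lo = static_cast<char32_t>(s[i++]);
        return 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00);
    }
    return u;
}

// Append a code point as UTF-8.
inline void append_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// Append a code point as one wchar_t, or as a surrogate pair when
// wchar_t is 16 bits wide and the code point is outside the BMP.
inline void append_wide(std::wstring& out, char32_t cp) {
    if (sizeof(wchar_t) >= 4 || cp < 0x10000) {
        out.push_back(static_cast<wchar_t>(cp));
    } else {
        cp -= 0x10000;
        out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
        out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
    }
}

// One overload per (source, target) pair; same-type pairs just copy.
inline void convert(const std::string& in, std::string& out) { out = in; }
inline void convert(const std::wstring& in, std::wstring& out) { out = in; }
inline void convert(const std::string& in, std::wstring& out) {
    for (std::size_t i = 0; i < in.size();) append_wide(out, decode_utf8(in, i));
}
inline void convert(const std::wstring& in, std::string& out) {
    for (std::size_t i = 0; i < in.size();) append_utf8(out, decode_wide(in, i));
}

// The one-liner: the target type is explicit, the source type is deduced.
template <class Target, class Source>
Target convert_to(const Source& in) {
    Target out;
    convert(in, out);
    return out;
}
```

Because the same-type overloads simply copy, convert_to<std::basic_string<TCHAR>>(s) would compile in both ANSI and Unicode builds. On Windows a production version would more likely delegate to MultiByteToWideChar / WideCharToMultiByte than hand-roll the decoding.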

Another thing is some kind of adaptor for std::(w)string providing begin()/end()
functions returning an iterator that traverses the code points instead
of the UTF-8/16 code units ("chars"), e.g. in C++0x:

std::string s = get_utf8_string();
auto as = adapt(s);
auto i = as.begin(), e = as.end();
while(i != e)
{
   char32_t c = *i;
   ...
   *i = transform(c);
   ++i;
}
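A rough sketch of what a read-only version of such an adaptor could look like for UTF-8 std::strings. The class name utf8_code_points is hypothetical, adapt() here is only an illustration of the idea and not an existing Boost.Unicode function, and the input is assumed to be well-formed UTF-8. Writing through the iterator (*i = transform(c)) is deliberately left out: a replacement code point may need a different number of bytes, so that would require a proxy object that resizes the underlying string.

```cpp
#include <cstddef>
#include <iterator>
#include <string>

// Read-only view of a UTF-8 std::string as a sequence of code points.
// The view stores a pointer, so the string must outlive it.
class utf8_code_points {
public:
    class iterator {
    public:
        using iterator_category = std::input_iterator_tag;
        using value_type = char32_t;
        using difference_type = std::ptrdiff_t;
        using pointer = const char32_t*;
        using reference = char32_t;

        iterator(const std::string* s, std::size_t pos) : s_(s), pos_(pos) {}

        // Decode the code point starting at the current byte position.
        char32_t operator*() const {
            unsigned char b = static_cast<unsigned char>((*s_)[pos_]);
            if (b < 0x80) return b;
            int extra = length(b) - 1;
            char32_t cp = b & (0x3F >> extra);
            for (int k = 1; k <= extra && pos_ + k < s_->size(); ++k)
                cp = (cp << 6) |
                     (static_cast<unsigned char>((*s_)[pos_ + k]) & 0x3F);
            return cp;
        }

        // Skip over all bytes of the current code point.
        iterator& operator++() {
            pos_ += length(static_cast<unsigned char>((*s_)[pos_]));
            if (pos_ > s_->size()) pos_ = s_->size();  // truncated tail
            return *this;
        }

        bool operator==(const iterator& o) const { return pos_ == o.pos_; }
        bool operator!=(const iterator& o) const { return pos_ != o.pos_; }

    private:
        // Sequence length implied by the lead byte (well-formed input assumed).
        static std::size_t length(unsigned char b) {
            return b < 0x80 ? 1 : b >= 0xF0 ? 4 : b >= 0xE0 ? 3 : 2;
        }
        const std::string* s_;
        std::size_t pos_;
    };

    explicit utf8_code_points(const std::string& s) : s_(&s) {}
    iterator begin() const { return iterator(s_, 0); }
    iterator end() const { return iterator(s_, s_->size()); }

private:
    const std::string* s_;
};

// Hypothetical helper mirroring the adapt() call in the example above.
inline utf8_code_points adapt(const std::string& s) {
    return utf8_code_points(s);
}
```

With this, `for (char32_t c : adapt(s))` visits one code point per iteration regardless of how many bytes each one occupies.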

I only skimmed the docs for Boost.Unicode some time ago,
so maybe this is already there and I missed it. If so, links to some
examples showing this would be appreciated.

