From: Howard Hinnant (hinnant_at_[hidden])
Date: 2003-08-04 16:41:19


At 03:35 AM 7/30/2003, Christophe Meessen wrote:

>would anybody be interested in a stdC++ string variant class using UTF8
>as native encoding ?

On Wednesday, July 30, 2003, at 08:25 AM, Beman Dawes wrote:

> Yes, although I'd be much more interested in a string variant that
> could handle other multi-byte encodings too. That would be very
> useful, IMO.

...

> Dinkumware has a commercial library which does conversions, based on
> the standard's codecvt mechanism, IIUC.

Fwiw, Metrowerks also ships UTF-8 codecvt facets, as well as facets for
other encodings (which is how Dinkumware packages this functionality).
The Metrowerks version comes bundled with their std::lib. I'm not sure
whether other C++ vendors are doing this as well, but you might base a
multibyte string on that assumption. The main stumbling block to
portability would appear to be the names of the codecvt facets. For
example, our UTF-8 codecvt facet is spelled std::__utf_8<charT>, where
charT is normally wchar_t but could be a short or long (if you
weren't happy with sizeof(wchar_t), for example). That is, our UTF-8
facet will adapt to a 16- or 32-bit wide character.
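
To make the portability point concrete, here is a rough sketch of decoding
UTF-8 through the codecvt interface with the facet type left as a template
parameter, so that only one line names a specific facet. The helper names
decode_with and utf8_to_wide are made up for illustration. The sketch also
assumes std::codecvt_utf8<wchar_t>, which was standardized later (C++11,
deprecated in C++17), as a stand-in for a vendor facet such as
std::__utf_8<wchar_t>; it is not the Metrowerks or Dinkumware implementation.

```cpp
#include <cassert>
#include <codecvt>    // std::codecvt_utf8 (C++11; deprecated since C++17)
#include <cwchar>     // std::mbstate_t
#include <locale>     // std::codecvt_base
#include <stdexcept>
#include <string>

// Hypothetical helper: decode a multibyte string into wide characters
// using any facet that exposes the standard codecvt "in" member.
template <class Facet>
std::wstring decode_with(const Facet& facet, const std::string& bytes)
{
    if (bytes.empty())
        return std::wstring();

    // Each input byte yields at most one wide character, so a buffer of
    // bytes.size() elements is always large enough.
    std::wstring out(bytes.size(), L'\0');
    std::mbstate_t state = std::mbstate_t();
    const char* from_next = nullptr;
    wchar_t* to_next = nullptr;

    std::codecvt_base::result r = facet.in(
        state,
        bytes.data(), bytes.data() + bytes.size(), from_next,
        &out[0], &out[0] + out.size(), to_next);

    // Anything other than "ok" means malformed or truncated input here.
    if (r != std::codecvt_base::ok)
        throw std::runtime_error("multibyte to wide conversion failed");

    out.resize(static_cast<std::size_t>(to_next - &out[0]));
    return out;
}

// The only place that names a specific facet. A vendor-specific facet
// would be swapped in on this one line (assumption: it can be constructed
// directly, as std::codecvt_utf8 can).
inline std::wstring utf8_to_wide(const std::string& bytes)
{
    std::codecvt_utf8<wchar_t> facet;
    return decode_with(facet, bytes);
}
```

A multibyte string class built this way would only need a typedef or
traits parameter per vendor to pick the facet; everything else goes
through the standard codecvt members.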

-Howard

