From: dietmar_kuehl_at_[hidden]
Date: 2001-09-26 07:48:33


Hi,
--- In boost_at_y..., Beman Dawes <bdawes_at_a...> wrote:
> [suggested approach to add Unicode characters to the C++ Standard
> Library]

Note: I didn't propose adding anything to the Standard C++ Library!
I just described how the existing library is intended to be used. That is,
on platforms where 'wchar_t' is sufficient, nothing special is to
be done except that appropriate code conversion facets are to be
used. I also described how the standard library is to be used for
"not supported" character types (ie. character types not explicitly
considered as characters). This stuff is all there and the user
just has to write down a few rather technical thingies and
everything just drops into place.
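
As a rough illustration of the "technical thingies" for a character
type the library does not support directly, here is a minimal sketch of
a user-defined traits class plugged into std::basic_string. The names
'uchar32', 'uchar32_traits' and 'uchar32_string' are made up for this
example, and the 32-bit width is an assumption. Stream I/O would
additionally need matching locale facets (such as a code conversion
facet), which are not shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <cwchar>
#include <ios>
#include <string>

// Hypothetical character type that the library does not treat as a
// character out of the box (assumed to be at least 32 bits wide).
typedef std::uint_least32_t uchar32;

// User-supplied traits: the operations basic_string needs for uchar32.
struct uchar32_traits {
    typedef uchar32             char_type;
    typedef std::uint_least32_t int_type;
    typedef std::streamoff      off_type;
    typedef std::streampos      pos_type;
    typedef std::mbstate_t      state_type;

    static void assign(char_type& dst, const char_type& src) { dst = src; }
    static bool eq(char_type a, char_type b) { return a == b; }
    static bool lt(char_type a, char_type b) { return a < b; }

    static int compare(const char_type* a, const char_type* b, std::size_t n) {
        for (std::size_t i = 0; i != n; ++i) {
            if (lt(a[i], b[i])) return -1;
            if (lt(b[i], a[i])) return 1;
        }
        return 0;
    }
    static std::size_t length(const char_type* s) {
        std::size_t n = 0;
        while (!eq(s[n], char_type())) ++n;   // count up to the terminator
        return n;
    }
    static const char_type* find(const char_type* s, std::size_t n,
                                 const char_type& c) {
        for (std::size_t i = 0; i != n; ++i)
            if (eq(s[i], c)) return s + i;
        return 0;
    }
    static char_type* move(char_type* dst, const char_type* src, std::size_t n) {
        if (n != 0) std::memmove(dst, src, n * sizeof(char_type)); // may overlap
        return dst;
    }
    static char_type* copy(char_type* dst, const char_type* src, std::size_t n) {
        for (std::size_t i = 0; i != n; ++i) dst[i] = src[i];
        return dst;
    }
    static char_type* assign(char_type* dst, std::size_t n, char_type c) {
        for (std::size_t i = 0; i != n; ++i) dst[i] = c;
        return dst;
    }

    // All bits set is not a Unicode code point, so it can serve as EOF.
    static int_type eof() { return static_cast<int_type>(-1); }
    static int_type not_eof(int_type c) {
        return eq_int_type(c, eof()) ? int_type(0) : c;
    }
    static char_type to_char_type(int_type c) { return c; }
    static int_type to_int_type(char_type c) { return c; }
    static bool eq_int_type(int_type a, int_type b) { return a == b; }
};

typedef std::basic_string<uchar32, uchar32_traits> uchar32_string;
```

With these definitions a 'uchar32_string' can be constructed from a
zero-terminated array of 'uchar32' values and then searched, compared
and appended to like any other basic_string.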

> There was recently lengthy discussion of this on one of the C++
> committee reflectors.

Apparently not on the std-lib reflector which is, for whatever
reason, the only one I get. I think they are archived, however,
and I will look for this when I find the time...

> Related question: Is the approach you propose something that is
> likely to be well-received by the committee, or is it
> controversial?

I proposed how to play within the rules set up by the standard.
Adding to or changing the standard to accommodate what I described
is basically changing the rules. While I think everybody would agree
that my description is playing according to the rules and is thus
well-received, I doubt that changing the rules would be received as
well: There is already a character type for Unicode, namely
'wchar_t', and I doubt that there is wide support for adding yet
another character type because someone messed things up, independent
of whether the "someone" is seen as the Unicode standardization
group or the compiler vendors making the optimistic choice of
Unicode having just 16 bits.
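
To make the 16-bit limitation concrete: Unicode code points run up to
U+10FFFF, so a single 16-bit unit cannot hold every character, and
UTF-16 represents the values from U+10000 upwards as a pair of
surrogate units. A small sketch of that encoding step follows;
'encode_utf16' is a hypothetical helper name, not part of the standard
library or Boost.

```cpp
#include <cstddef>
#include <cstdint>

// Encode one Unicode code point as UTF-16 code units.
// Returns the number of units written to 'out' (1 or 2), or 0 if 'cp'
// is not a valid Unicode scalar value.
inline std::size_t encode_utf16(std::uint_least32_t cp,
                                std::uint_least16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;                          // out of range or a surrogate value
    if (cp < 0x10000) {
        out[0] = static_cast<std::uint_least16_t>(cp);
        return 1;                          // fits into a single 16-bit unit
    }
    cp -= 0x10000;                         // 20 significant bits remain
    out[0] = static_cast<std::uint_least16_t>(0xD800 + (cp >> 10));   // high surrogate
    out[1] = static_cast<std::uint_least16_t>(0xDC00 + (cp & 0x3FF)); // low surrogate
    return 2;
}
```

A character such as U+0042 takes one unit, while anything at or above
U+10000 takes two, which is exactly the case a 16-bit 'wchar_t' cannot
store as one character.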

My feeling is that there are indeed three parties involved in the
whole story:
- The standard setting up the rules and a basic playing ground,
  eg. a framework to deal with strings, characters, and
  corresponding I/O.
- The user who wants to play the game according to the rules, eg.
  read/write Unicode character streams, possibly in the context of
  processing XML documents.
- A third party, eg. Boost, who provides the equipment for the
  players to play fairly according to the rules, eg. by providing
  character traits, facets, etc. for processing of Unicode
  characters.

What do other LWG members think of Unicode characters? Do we need
another character type, eg. with character and string literals
introduced by a 'U' like in U'B' or U"Boost"? We should discuss
this with Core. ...or just 'ucchar_t' support, wide enough to hold
Unicode, ie. what 'wchar_t' was intended for, and leave those stuck
with a 16 bit 'wchar_t' in the rain? Is there anybody who really
wants character or string literals not fitting into 16 bits?

My feeling is that Boost is an appropriate place to provide
something like 'ucchar_t' and, for the platforms requiring this
(ie. where 'ucchar_t' is not 'wchar_t'), the corresponding support
classes.
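
One way such a 'ucchar_t' could be selected is sketched below, assuming
a C++11 or later compiler. Only the name 'ucchar_t' comes from the text
above; the selection logic, the helper constant 'wchar_holds_unicode'
and the choice of 'std::uint_least32_t' as the fallback are assumptions
made for illustration.

```cpp
#include <climits>
#include <cstdint>
#include <cwchar>
#include <type_traits>

// True when every Unicode code point (up to U+10FFFF) fits into 'wchar_t'.
constexpr bool wchar_holds_unicode = (WCHAR_MAX >= 0x10FFFF);

// Use 'wchar_t' where it is wide enough, otherwise fall back to an
// integer type of at least 32 bits (assumed fallback).
typedef std::conditional<wchar_holds_unicode,
                         wchar_t,
                         std::uint_least32_t>::type ucchar_t;

// 21 bits are needed to represent values up to U+10FFFF.
static_assert(sizeof(ucchar_t) * CHAR_BIT >= 21,
              "ucchar_t must be able to hold any Unicode code point");
```

On a platform with a sufficiently wide 'wchar_t' this picks 'wchar_t'
itself, so the existing wide-character facilities keep working;
elsewhere the traits and facets for the replacement type would have to
be supplied separately.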

--
<mailto:dietmar_kuehl_at_[hidden]> <http://www.dietmar-kuehl.de/>
Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>
