Subject: Re: [boost] [locale] Formal review of Boost.Locale library EXTENDED
From: Matus Chochlik (chochlik_at_[hidden])
Date: 2011-04-20 04:21:51


On Wed, Apr 20, 2011 at 9:58 AM, Ryou Ezoe <boostcpp_at_[hidden]> wrote:

> On Wed, Apr 20, 2011 at 4:47 PM, Matus Chochlik <chochlik_at_[hidden]> wrote:
>> On Wed, Apr 20, 2011 at 3:54 AM, Edward Diener <eldiener_at_[hidden]> wrote:
>>> On 4/19/2011 9:05 AM, Artyom wrote:

> Why do some people think one encoding of UCS is better than others?

I could ask why some people think that one character
set (UCS) is better than others, but I think the answer
is obvious, and to me it is equally obvious why one
should use a single character encoding.

1) Because some of those people don't do just
Windows programming or just Mac programming
or just Linux programming. They do multi-platform
programming, and sometimes they do it for machines
with different byte orderings.

2) If you pick one encoding and use it consistently
everywhere, your application does not need to include
transcoding-related code or data. For example, you can
save a data file on a Mac and open it on Windows
just by reading it byte by byte. You don't have to solve
little-endian/big-endian problems, i.e. you don't need
byte order marks.
The only time you need to transcode is on platforms
where the OS/API uses another encoding, and such
APIs already provide means for transcoding from UTF-8.

3) You use the same algorithms everywhere. You don't
need to write the same algorithm n times (for UTF-8,
UTF-16, UTF-32), and I'm not even going to start talking
about maintaining and debugging all of that. You use
just one of them.

4) If someone on the other side of the globe uses the same
approach and wishes to use the output of your application,
or wants to feed it some input data, he/she does not need to
do any transcoding.

5) You don't need dumb macro-based character type
switching.

6) Your library plays well with other libraries using
char. Try to count the third-party libraries that use
char-based strings in their APIs and those that use
only wchar_t-based strings. Compare.
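
To make points 2 and 3 concrete, here is a minimal sketch.
The helper names utf16_bytes and utf8_length are made up for
this illustration and do not come from Boost.Locale or any
other library. The first helper shows that UTF-16 has to pick
a byte order when it is written out; the second is a single
byte-oriented routine that counts code points in UTF-8 and
behaves the same on any platform. It assumes well-formed
UTF-8 input and does no validation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: serialize UTF-16 code units with an explicit
// byte order. The caller must choose one, which is why UTF-16 files
// need a BOM or an out-of-band agreement.
std::vector<unsigned char> utf16_bytes(const std::u16string& s, bool big_endian) {
    std::vector<unsigned char> out;
    for (char16_t u : s) {
        unsigned char hi = static_cast<unsigned char>(u >> 8);
        unsigned char lo = static_cast<unsigned char>(u & 0xFF);
        if (big_endian) { out.push_back(hi); out.push_back(lo); }
        else            { out.push_back(lo); out.push_back(hi); }
    }
    return out;
}

// Hypothetical helper: count code points in a (well-formed) UTF-8
// string by counting every byte that is not a continuation byte
// (continuation bytes have the bit pattern 10xxxxxx).
std::size_t utf8_length(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Small usage example.
void demo() {
    // U+00E9 't' U+00E9, spelled as raw UTF-8 bytes: C3 A9 74 C3 A9.
    const std::string utf8 = "\xC3\xA9t\xC3\xA9";
    assert(utf8.size() == 5);        // five bytes on every platform
    assert(utf8_length(utf8) == 3);  // three code points

    // The same character in UTF-16 serializes differently per byte order.
    const std::u16string utf16 = u"\u00E9";
    assert(utf16_bytes(utf16, true) != utf16_bytes(utf16, false));
}
```

The UTF-8 string is just a sequence of bytes, so writing it
to a file and reading it back byte by byte gives the same
data regardless of the machine's byte ordering.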

[snip/]

Matus

