From: Beman Dawes (bdawes_at_[hidden])
Date: 2004-02-16 09:45:58

At 12:49 AM 2/16/2004, Walter Landry wrote:
>Beman Dawes <bdawes_at_[hidden]> wrote:
>> At 12:19 PM 2/15/2004, Paul Miller wrote:
>> > Presumably
>> >Linux still works with multi-byte characters.
>> >
>> >Is there progress toward a wchar_t-aware path?
>> Yes. I now have the outline of a design for the internationalization of
>> Boost.Filesystem paths.
>Care to share? I'm curious how you handle some of the legacy Japanese

The framework looks something like this:

The internal representation type may be char, wchar_t, or a user-defined
character type (UDT) meeting std::string requirements. These are handled by
path, wpath, or a basic_path class template, respectively. The encodings of
char and wchar_t are, of course, defined by the compiler. The encodings of
UDTs are defined by their implementations.
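
To make the relationship between path, wpath, and basic_path concrete, here
is a minimal sketch. It is only an illustration of the idea, not the actual
design: the template parameter, the member names (string_type, operator/=,
string()), the '/' separator handling, and the example() function are all
assumptions made for the sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: one class template parameterized on the internal
// string type. None of these members are taken from a real interface.
template <class String>
class basic_path
{
public:
    typedef String string_type;
    typedef typename String::value_type value_type;

    basic_path() {}
    explicit basic_path(const String& s) : m_path(s) {}

    // Append a name, inserting a '/' separator when one is needed.
    basic_path& operator/=(const String& name)
    {
        if (!m_path.empty() && m_path[m_path.size() - 1] != value_type('/'))
            m_path += value_type('/');
        m_path += name;
        return *this;
    }

    // The path in its internal representation.
    const String& string() const { return m_path; }

private:
    String m_path;
};

// The two built-in character types get convenience names.
typedef basic_path<std::string>  path;   // char internal representation
typedef basic_path<std::wstring> wpath;  // wchar_t internal representation

// Usage check: the same member functions work for both character types.
inline void example()
{
    path p("usr");
    p /= "lib";
    assert(p.string() == "usr/lib");

    wpath w(L"usr");
    w /= L"lib";
    assert(w.string() == L"usr/lib");
}
```

A UDT character type would plug in the same way, as basic_path instantiated
on a string of that type, provided the type can be constructed from '/'.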

There is usually one external representation type, though there are
exceptions. Each representation type may support multiple external path name
encodings, including user-defined encodings, subject to the operating
system's encoding limitations.

There will be a locale-based (i.e., codecvt) mechanism for converting between
the internal representation type and encoding, and the external
representation type and encoding. The mechanisms for default and explicit
locale operations will presumably be modeled on those of I/O streams.
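
As a rough sketch of the kind of conversion meant here, the following uses
the standard std::codecvt<wchar_t, char, std::mbstate_t> facet to convert
between a char external name and a wchar_t internal name. The function names
to_internal and to_external and the typedef path_codecvt are hypothetical,
chosen only for this example. The sketch assumes a wchar_t internal type and
a char external type, and it does not handle state-dependent encodings fully.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical name for the standard wide/narrow conversion facet.
typedef std::codecvt<wchar_t, char, std::mbstate_t> path_codecvt;

// External (char) name -> internal (wchar_t) name, using the facet in loc.
inline std::wstring to_internal(const std::string& external,
                                const std::locale& loc = std::locale())
{
    if (external.empty()) return std::wstring();
    const path_codecvt& cvt = std::use_facet<path_codecvt>(loc);
    std::mbstate_t state = std::mbstate_t();

    // Assumption: every wide character consumes at least one external byte.
    std::wstring internal(external.size(), L'\0');
    const char* from = external.data();
    const char* from_end = from + external.size();
    const char* from_next = from;
    wchar_t* to = &internal[0];
    wchar_t* to_next = to;

    std::codecvt_base::result r = cvt.in(
        state, from, from_end, from_next, to, to + internal.size(), to_next);
    if (r != std::codecvt_base::ok)
        throw std::runtime_error("cannot convert external path name");
    assert(from_next == from_end);  // ok means all input was consumed
    internal.resize(to_next - to);
    return internal;
}

// Internal (wchar_t) name -> external (char) name, using the facet in loc.
// Note: a state-dependent encoding would also need unshift(); omitted here.
inline std::string to_external(const std::wstring& internal,
                               const std::locale& loc = std::locale())
{
    if (internal.empty()) return std::string();
    const path_codecvt& cvt = std::use_facet<path_codecvt>(loc);
    std::mbstate_t state = std::mbstate_t();

    // max_length() bounds the external bytes produced per wide character.
    std::string external(internal.size() * cvt.max_length(), '\0');
    const wchar_t* from = internal.data();
    const wchar_t* from_end = from + internal.size();
    const wchar_t* from_next = from;
    char* to = &external[0];
    char* to_next = to;

    std::codecvt_base::result r = cvt.out(
        state, from, from_end, from_next, to, to + external.size(), to_next);
    if (r != std::codecvt_base::ok)
        throw std::runtime_error("cannot convert internal path name");
    external.resize(to_next - to);
    return external;
}
```

The default argument std::locale() picks up the current global locale, and
passing a locale explicitly overrides it, which is roughly how I/O streams
treat default versus imbued locales.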

So handling the legacy Japanese encodings works like this:

The programmer selects an internal type and encoding that can represent
those external types and encodings. Perhaps wchar_t, but perhaps some UDT.

The external type and encoding is presumably the operating system's
default. The default locale mechanism will provide the codecvt facet to
handle the conversions. So on a Japanese O/S, the external representation
may be one of the legacy encodings, and if so the correct conversions will
take place.
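
For illustration, this is one way a program might obtain the operating
system's default locale and the codecvt facet that comes with it, using only
standard library calls. The helper names external_locale and external_codecvt
are hypothetical. Whether std::locale("") actually yields a legacy Japanese
encoding depends entirely on how the system and the user's environment are
configured, so treat this as a sketch under that assumption.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>
#include <stdexcept>

// Hypothetical name for the standard wide/narrow conversion facet.
typedef std::codecvt<wchar_t, char, std::mbstate_t> path_codecvt;

// Hypothetical helper: the locale describing the system's default encoding.
inline std::locale external_locale()
{
    try {
        // The empty name selects the user's preferred locale from the
        // environment (for example, a Japanese locale on a Japanese system).
        return std::locale("");
    } catch (const std::runtime_error&) {
        // The environment names a locale this library cannot construct;
        // fall back to the classic "C" locale.
        return std::locale::classic();
    }
}

// Hypothetical helper: the conversion facet belonging to loc. The returned
// reference stays valid only while loc (or a copy of it) is alive.
inline const path_codecvt& external_codecvt(const std::locale& loc)
{
    assert(std::has_facet<path_codecvt>(loc));  // standard facet, always present
    return std::use_facet<path_codecvt>(loc);
}
```

A caller would keep the std::locale object alive for as long as it uses the
facet, for example by storing it alongside the path.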

Does that make sense?

