From: dietmar_kuehl (dietmar_kuehl_at_[hidden])
Date: 2002-03-05 13:53:28


Hi,
Beman Dawes wrote:
> In other words, an NT file system implementation would support both
> char and wchar_t, but POSIX (which if I understand correctly only
> supports char) would only supply specializations for char.

Both 'char' and 'wchar_t' interfaces to the file system library (and,
if another standard character type emerges, possibly a third one,
e.g. 'uchar_t' specified to use Unicode characters) are supported on
all systems, independent of whether the underlying system supports
them or not. The major difference between a system supporting wide
characters and a system not supporting them is the available choice
of file names: No name using a character which cannot be mapped 1:1
to the underlying character type shall be used. That is, on a POSIX
system which only supports 'char' it is impossible to use file names
containing characters not representable in a 'char' string. So, why
is there support for 'wchar_t' in the first place? Simply to allow
programs which are aware of wide characters to be written in a
portable fashion. Just the choice of file names is restricted,
nothing else.
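
To make the 1:1 mapping rule a bit more concrete, here is a minimal
sketch of how a 'wchar_t' name could be checked on a system which only
supports 'char'. The function name 'narrow_file_name' is hypothetical
and not part of any proposed interface, and the sketch simply assumes
that the underlying narrow character set is 7-bit ASCII; a real
implementation would have to consult the actual system encoding.

```cpp
#include <optional>
#include <string>

// Hypothetical helper, not part of any proposed Boost interface.
// Assumption: the underlying system's 'char' file names are 7-bit
// ASCII, so a wide character maps 1:1 only if its value is below 0x80.
std::optional<std::string> narrow_file_name(const std::wstring& wide_name)
{
    std::string narrow_name;
    narrow_name.reserve(wide_name.size());
    for (wchar_t wc : wide_name) {
        // Going through an unsigned type also rejects negative values
        // on platforms where wchar_t is signed.
        if (static_cast<unsigned long>(wc) >= 0x80UL) {
            return std::nullopt; // no 1:1 mapping: the name is not usable
        }
        narrow_name += static_cast<char>(wc);
    }
    return narrow_name;
}
```

A name which fails the check is rejected as a whole rather than
converted lossily, which matches the idea that only the choice of file
names is restricted.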

This is, BTW, a similar approach to the one taken by XML: "System
IDs" are even more restricted in the available choices of characters.
The specification above, however, allows users to possibly encode
names using UTF-8 if they want to do so (well, I think UTF-8 would
create certain reserved characters, notably '/', but this is an issue
users of the code have to address).
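
For illustration, a user who wants UTF-8 names could encode them along
the following lines before handing them to the 'char' interface. This
is only a sketch: 'to_utf8' is a hypothetical helper, and it takes the
name as a sequence of Unicode code points, which is an assumption made
for the example.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper; the name and signature are illustrative only.
// Encodes a sequence of Unicode code points as UTF-8 bytes.
std::string to_utf8(const std::u32string& code_points)
{
    std::string bytes;
    for (char32_t cp : code_points) {
        if (cp < 0x80) {
            // ASCII, including '/', is encoded as itself.
            bytes += static_cast<char>(cp);
        } else if (cp < 0x800) {
            bytes += static_cast<char>(0xC0 | (cp >> 6));
            bytes += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            if (cp >= 0xD800 && cp <= 0xDFFF) {
                throw std::invalid_argument("surrogate code point");
            }
            bytes += static_cast<char>(0xE0 | (cp >> 12));
            bytes += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            bytes += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp <= 0x10FFFF) {
            bytes += static_cast<char>(0xF0 | (cp >> 18));
            bytes += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            bytes += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            bytes += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            throw std::invalid_argument("code point out of range");
        }
    }
    return bytes;
}
```

Every byte of a multi-byte UTF-8 sequence has its high bit set, so the
byte value of '/' shows up in the encoded name only where the name
really contains U+002F. That character stays reserved as a separator,
and dealing with it is still up to the user.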

> If conversion was desired, the separate conversion library could be
> used.

I would add the conversion facilities to the file system classes
also for another reason: If a majority of the applications used
them, a change in the underlying system's support of character types
would become available to all these applications basically by just
switching to a file system class library which is aware of the
additionally supported character types.

> Wouldn't this answer the concern of Asian nations that we not
> mandate a particular filename conversion scheme?

That's something you have to ask the Asian nations or, short of
really being able to ask a nation, their representatives dealing
with such issues. I think it is a clean and viable technical approach
which at least suits my needs. Whether it also suits anybody else's
needs is a different issue, especially as I have a reasonable
transcription to ASCII for the letters I'm typically dealing with
(e.g. u-umlaut to "ue"). I'm not sure whether similar transcription
conventions exist for all Asian writing systems.

--
<mailto:dietmar_kuehl_at_[hidden]> <http://www.dietmar-kuehl.de/>
Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>
