From: Dylan Nicholson (dylan_nicholson_at_[hidden])
Date: 2002-03-04 18:02:51
--- dietmar_kuehl <dietmar_kuehl_at_[hidden]> wrote:
> dylan_nicholson wrote:
> > For POSIX however, assuming you go the ctype-narrow/widen approach,
> > the main issue is of course which locale to request. I would say
> > locale("") (ie the default "system" locale), but there probably
> > needs to be a once-off method of overriding this.
> I think the "C" locale is the right one to choose because the other
> locale depends on user preferences. Hence, filenames written by my
> Russian colleague may be pretty unreadable to me, although I can read
> Cyrillic letters - I just can't tell which Cyrillic characters the
> German ones I see stand for.
But from what I understand, on many Unices filenames are actually
stored on disk using a particular MBCS mapping. Assuming your
application deals in Unicode characters, you then have to know which
MBCS mapping to use. Unless the filesystem stores the mapping scheme
id along with each filename, you have to pick *some* mapping, so the
obvious choice would be the one defined by the user-specified system
locale. There is a risk this will generate undefined characters if
you save your files using one scheme then restore them using another,
which might be a problem for bilingual users. But I honestly don't
see another solution. How does current Unix software generally solve
this problem?

The fact that the latest POSIX standard still specifies nothing as
far as MBCS->Unicode mappings for filenames is amazing to me. I
assume protocols like NFS are all still char-only, and I'm guessing
not even NFS defines a mapping to use.
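
To make the ctype-narrow/widen idea concrete, here is a rough sketch of
what I have in mind. The helper names (filename_locale, widen_filename,
narrow_filename) are made up for illustration and are not part of any
existing or proposed library. Note that ctype works one char at a time,
so this only behaves sensibly for single-byte mappings; a real MBCS such
as UTF-8 would need something along the lines of a codecvt facet instead.

```cpp
#include <cassert>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: ask for the user's "system" locale, and fall
// back to the classic "C" locale if the environment names a locale
// that is not installed (std::locale("") throws in that case).
std::locale filename_locale() {
    try {
        return std::locale("");
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}

// Hypothetical helper: map the bytes of a filename to wide characters
// using the ctype<wchar_t> facet of the chosen locale. Each byte is
// widened on its own, so multibyte sequences are NOT decoded here.
std::wstring widen_filename(const std::string& bytes,
                            const std::locale& loc) {
    const std::ctype<wchar_t>& ct =
        std::use_facet<std::ctype<wchar_t> >(loc);
    std::wstring out(bytes.size(), L'\0');
    if (!bytes.empty())
        ct.widen(bytes.data(), bytes.data() + bytes.size(), &out[0]);
    return out;
}

// Hypothetical helper: the reverse direction. Wide characters with no
// single-byte representation in the locale become `fallback`.
std::string narrow_filename(const std::wstring& chars,
                            const std::locale& loc,
                            char fallback = '?') {
    const std::ctype<wchar_t>& ct =
        std::use_facet<std::ctype<wchar_t> >(loc);
    std::string out(chars.size(), '\0');
    if (!chars.empty())
        ct.narrow(chars.data(), chars.data() + chars.size(),
                  fallback, &out[0]);
    return out;
}
```

A caller would obtain the locale once via filename_locale() and pass it
to both helpers. Swapping in a different locale object at that one point
would be the "once-off method of overriding" mentioned in the quoted
text. The fallback character in narrow_filename is where the "undefined
characters" problem shows up when the mapping cannot represent a name.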