From: Beman Dawes (bdawes_at_[hidden])
Date: 2005-05-02 07:59:28
At 05:22 AM 3/31/2005, Vladimir Prus wrote:
>On Wednesday 23 March 2005 18:07, Beman Dawes wrote:
>> CVS now contains a branch "i18n" of the filesystem directories:
>>
>> * Class templates basic_path, basic_directory_iterator, etc, support
>> narrow, wide, and user-defined path types. Typedefs path,
>> directory_iterator, etc, are provided, so most existing code continues to
>> work.
>
>I recall we had a long discussion concerning basic_path vs. single path
>type. I don't think results of that discussion are present in i18n.html --
>essentially, there's no rationale for going with basic_path.

OK, I'll add rationale. Here is a first draft:

During preliminary internationalization discussion on the Boost
developer's list, a design was considered for a single path class which
could hold either narrow or wide character based paths. That design was
rejected because:

* There were technical issues with conversions when a narrow path was
appended to a wide path, and vice versa. The concern was that double
conversions could cause incorrect results, that conversions best left to
the operating system would be performed, and that the technical complexity
was too great in relation to perceived benefits. User-defined types would
only make the problem worse.

* The design was, for many applications, an over-generalization with
runtime memory and speed costs which would have to be paid for even when
not needed.

* There was concern that the design would be confusing to users, given
that the standard library already uses single-value-type strings, rather
than strings which morph value types as needed.

>...
>
>Also I note that there's no conversion from basic_path<char> to
>basic_path<wchar_t> or vice versa, as far as I can say. To recall my
>argument for conversion: say I have a library which exposes paths in the
>interface, should I use path or wpath in it? If I use path, then due to
>missing conversion, the library is unusable with other code that uses
>wpath. So I need to use wpath. And so basically, all libraries need to use
>wpath everywhere. So, why do you need path at all?

Applications which need wide-character internationalization will use wpath
or other wide-character basic_path types. Applications which don't need
wide-character internationalization will use path. Both are needed - they
serve different user needs.

>> * The POSIX wpath implementation assumes that UTF-8 is always the
>> operating system's preferred external path encoding. If any Boost
>> users are concerned about other encodings, please let me know.
>
>I certainly do. The standard encoding for Russian on Linux is koi8-r.
>Probably, we need to use the conversion facet that's part of global locale.
>...
>So using locale("") is the best guess, I think.

Hum... Point taken. I was hoping to avoid use of global locale because of
past unhappy experience with inconsistencies between different UNIX
flavors. Perhaps that situation has improved. Maybe a UTF-8 fallback could
be provided for systems where global locale is unreliable.

--Beman
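
To make the locale("") suggestion from the exchange above concrete, here is a minimal sketch of narrow-to-wide conversion through a locale's codecvt facet. It is not code from the i18n branch. The names environment_locale and to_wide are hypothetical, and the fallback used here is the classic "C" locale, standing in for the UTF-8 fallback mentioned above, which standard C++ does not supply directly.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: the user's preferred locale, or the classic "C"
// locale when the environment names a locale the system cannot construct.
// (Assumption: a real implementation might fall back to UTF-8 instead.)
inline std::locale environment_locale() {
    try {
        return std::locale("");
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}

// Hypothetical helper: convert an externally encoded narrow path string to
// wide characters using the codecvt facet of the given locale.
inline std::wstring to_wide(const std::string& narrow, const std::locale& loc) {
    typedef std::codecvt<wchar_t, char, std::mbstate_t> facet_type;
    const facet_type& facet = std::use_facet<facet_type>(loc);

    if (narrow.empty()) return std::wstring();

    // Each wide character consumes at least one byte of input, so the
    // input length is an upper bound on the output length.
    std::wstring wide(narrow.size(), L'\0');
    std::mbstate_t state = std::mbstate_t();
    const char* from_begin = narrow.data();
    const char* from_end = from_begin + narrow.size();
    const char* from_next = 0;
    wchar_t* to_begin = &wide[0];
    wchar_t* to_next = 0;

    std::codecvt_base::result r = facet.in(
        state, from_begin, from_end, from_next,
        to_begin, to_begin + wide.size(), to_next);

    if (r == std::codecvt_base::noconv) {
        // The facet performs no conversion: widen byte by byte.
        for (std::string::size_type i = 0; i < narrow.size(); ++i)
            wide[i] = static_cast<wchar_t>(static_cast<unsigned char>(narrow[i]));
        return wide;
    }
    if (r != std::codecvt_base::ok)
        throw std::runtime_error("path is not valid in the locale's encoding");

    assert(from_next == from_end);   // ok means all input was consumed
    wide.resize(to_next - to_begin); // drop the unused tail of the buffer
    return wide;
}
```

A caller would write something like to_wide(name, environment_locale()). On a system configured for koi8-r the facet decodes koi8-r bytes, and on one configured for UTF-8 it decodes UTF-8, which is the behaviour Vladimir is asking for. The result is only as reliable as the platform's locale support, which is the concern raised in the reply.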