From: Beman Dawes (bdawes_at_[hidden])
Date: 2005-05-02 07:59:28


At 05:22 AM 3/31/2005, Vladimir Prus wrote:
>On Wednesday 23 March 2005 18:07, Beman Dawes wrote:
>> CVS now contains a branch "i18n" of the filesystem directories:
>>
>> * Class templates basic_path, basic_directory_iterator, etc, support
>> narrow, wide, and user-defined path types. Typedefs path,
>> directory_iterator, etc, are provided, so most existing code continues
>> to work.
>
>I recall we had a long discussion concerning basic_path vs. single path
>type. I don't think results of that discussion are present in i18n.html --
>essentially, there's no rationale for going with basic_path.

OK, I'll add rationale. Here is a first draft:

During preliminary internationalization discussion on the Boost developers'
list, a design was considered for a single path class which could hold
either narrow or wide character based paths. That design was rejected
because:

* There were technical issues with conversions when a narrow path was
appended to a wide path, and vice versa. The concern was that double
conversions could cause incorrect results, that conversions best left to
the operating system would instead be performed by the library, and that
the technical complexity was too great in relation to perceived benefits.
User-defined types would only make the problem worse.

* The design was, for many applications, an over-generalization with
runtime memory and speed costs which would have to be paid for even when
not needed.

* There was concern that the design would be confusing to users, given that
the standard library already uses single-value-type strings, rather than
strings which morph value types as needed.

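To illustrate the single-value-type approach described in the draft rationale, here is a minimal sketch. It assumes a class template parameterized on the character type only; the namespace `sketch`, the member names, and the separator handling are hypothetical and are not the interface in the i18n branch. The point it shows is that appending is only defined between paths of the same character type, so no implicit narrow/wide conversion can occur.

```cpp
#include <string>

// Hypothetical sketch for illustration; not the actual Boost.Filesystem code.
namespace sketch {

template <class CharT>
class basic_path {
public:
    typedef std::basic_string<CharT> string_type;

    basic_path() {}
    basic_path(const string_type& s) : str_(s) {}
    basic_path(const CharT* s) : str_(s) {}

    // Appending accepts only a path of the same character type, so a
    // narrow path can never be silently mixed into a wide one.
    basic_path& operator/=(const basic_path& rhs) {
        if (!str_.empty() && !rhs.str_.empty()
            && str_[str_.size() - 1] != CharT('/'))
            str_ += CharT('/');
        str_ += rhs.str_;
        return *this;
    }

    const string_type& string() const { return str_; }

private:
    string_type str_;
};

// Convenience typedefs, so code written against "path" keeps working.
typedef basic_path<char>    path;
typedef basic_path<wchar_t> wpath;

} // namespace sketch
```

With this sketch, `sketch::path p("usr"); p /= "local";` compiles, while `p /= L"local";` is rejected at compile time because there is no conversion from a wide string to `basic_path<char>`.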
 >...
 >
 >Also I note that there's no conversion from basic_path<char> to
 >basic_path<wchar_t> or vice versa, as far as I can say. To recall my
 >argument
 >for conversion: say I have a library which exposes paths in the
 >interface,
 >should I use path or wpath in it? If I use path, then due to missing
 >conversion, the library is unusable with other code that uses wpath. So I 
 >need to use wpath. And so basically, all libraries need to use wpath
 >everywhere. So, why do you need path at all?
Applications which need wide-character internationalization will use wpath 
or other wide-character basic_path types. Applications which don't need 
wide-character internationalization will use path. Both are needed - they 
serve different user needs.
 >> * The POSIX wpath implementation assumes that UTF-8 is always the
 >> operating system's preferred external path encoding. If any Boost
 >> users are concerned about other encodings, please let me know.
 >
 >I certainly do. The standard encoding for Russian on Linux is koi8-r.
 >Probably, we need to use the conversion facet that's part of global
 >locale.
 >... So using locale("") is the best guess, I think.
Hum... Point taken. I was hoping to avoid use of global locale because of 
past unhappy experience with inconsistencies between different UNIX 
flavors. Perhaps that situation has improved. Maybe a UTF-8 fallback could 
be provided for systems where global locale is unreliable.
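As a rough sketch of the locale-based idea, the code below converts a narrow, externally encoded string to wide characters using the `std::codecvt` facet of a given locale. The helper names `preferred_locale` and `to_wide` are hypothetical and are not part of the filesystem library. Note one assumption in particular: when `std::locale("")` cannot be constructed, this sketch falls back to the classic "C" locale; a real UTF-8 fallback would replace that branch with a UTF-8 conversion facet.

```cpp
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helpers for illustration; not part of Boost.Filesystem.
namespace sketch {

// Returns the user's preferred locale, or the classic "C" locale when the
// environment names a locale the system cannot construct.
inline std::locale preferred_locale() {
    try {
        return std::locale("");
    } catch (const std::runtime_error&) {
        return std::locale::classic();  // placeholder for a UTF-8 fallback
    }
}

// Converts a narrow string to wide characters using the codecvt facet of
// the supplied locale. Throws if the input cannot be converted.
inline std::wstring to_wide(const std::string& src, const std::locale& loc) {
    typedef std::codecvt<wchar_t, char, std::mbstate_t> facet_type;
    const facet_type& cvt = std::use_facet<facet_type>(loc);

    if (src.empty()) return std::wstring();

    // Each wide character consumes at least one input byte.
    std::wstring out(src.size(), L'\0');
    std::mbstate_t state = std::mbstate_t();
    const char* from_next = 0;
    wchar_t* to_next = 0;

    std::codecvt_base::result r = cvt.in(
        state,
        src.data(), src.data() + src.size(), from_next,
        &out[0], &out[0] + out.size(), to_next);

    if (r == std::codecvt_base::error || r == std::codecvt_base::partial)
        throw std::runtime_error("narrow-to-wide conversion failed");
    if (r == std::codecvt_base::noconv)
        return std::wstring(src.begin(), src.end());

    out.resize(to_next - &out[0]);
    return out;
}

} // namespace sketch
```

A caller would write something like `sketch::to_wide(name, sketch::preferred_locale())`, so a koi8-r system would get koi8-r decoding if its locale is installed and correctly configured.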
--Beman
