From: Beman Dawes (bdawes_at_[hidden])
Date: 2004-11-11 20:11:46


At 12:41 PM 11/11/2004, Peter Dimov wrote:
>Beman Dawes wrote:
>> The critical technical change required is internationalization. The
>> plan is to provide a templated basic_path class, with typedefs for
>> path and wpath. In other words, an approach very similar to the current
>> std::basic_string, std::string, and std::wstring.
>>
>> Doing this adds a certain amount of complexity compared to the
>> current path class. For example, a path_traits class has to be
>> introduced to give basic_paths on user defined types a way to import
>> the conversions, delimiters, and other traits.
>
>If your basic_path takes a path_traits parameter - which means that you're,
>in fact, misnaming a policy class as 'path_traits' - this is likely to
>create basic_string-ish problems further down the path.

I think I had better post some actual code rather than try to explain the
approach taken. I'm definitely nervous about it. It will be a couple of
weeks, but I will post some of my proof-of-concept code.

>In addition, basic_path on user-defined types simply doesn't make sense (to
>me at least), because the user can't just define his own path class.

The user in fact can define his or her own path class, although granted
there are a lot of constraints.

>The set of supported paths is determined by the capabilities of the
>underlying filesystem layer.

Each O/S has an external data type and encoding which is used to represent
paths in the external filesystem. For example, POSIX uses 1-byte characters
in various implementation-defined encodings, and Windows uses 2-byte
characters in a Unicode encoding. The user can't change that. But inside a
program the user has more freedom.

>In particular, the user cannot define the conversions between the
>different path types, because they are implementation defined.

The default conversion is implementation defined, but users can supply
their own conversion. One use case I have in mind is a character based O/S
which uses some MBCS encoding of paths that isn't UTF-8, but the user
wishes to burn a CD with UTF-8 encoding. The user should be able to provide
such a conversion function, overriding the implementation defined default.
Note however that whether or not such a user supplied conversion will work
sensibly or at all is very operating system dependent. The filesystem
library can't do anything about what the O/S accepts or doesn't accept.
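
To make the idea of overriding the default conversion concrete, here is a
minimal sketch. It is not the proof-of-concept code mentioned above; every
name in it (default_traits, latin1_to_utf8_traits, to_external, external)
is hypothetical, and Latin-1 merely stands in for "some encoding that isn't
UTF-8" because its conversion fits in a few lines.

```cpp
#include <string>

// Hypothetical traits: none of these names come from the actual proposal.
struct default_traits {
    typedef std::string internal_string;  // type used inside the program
    typedef std::string external_string;  // type handed to the O/S
    // Stand-in for the implementation-defined default: pass bytes through.
    static external_string to_external(const internal_string& s) { return s; }
};

// User-supplied traits: treat the internal string as Latin-1 and emit UTF-8.
struct latin1_to_utf8_traits {
    typedef std::string internal_string;
    typedef std::string external_string;
    static external_string to_external(const internal_string& s) {
        external_string out;
        for (std::string::size_type i = 0; i < s.size(); ++i) {
            unsigned char c = static_cast<unsigned char>(s[i]);
            if (c < 0x80) {
                out += static_cast<char>(c);                // ASCII: 1 byte
            } else {
                out += static_cast<char>(0xC0 | (c >> 6));  // lead byte
                out += static_cast<char>(0x80 | (c & 0x3F)); // continuation
            }
        }
        return out;
    }
};

// Hypothetical path class: the traits decide how the external form is made.
template <class Traits>
class basic_path {
public:
    explicit basic_path(const typename Traits::internal_string& s) : s_(s) {}
    typename Traits::external_string external() const {
        return Traits::to_external(s_);
    }
private:
    typename Traits::internal_string s_;
};
```

With this sketch, basic_path<default_traits> hands its bytes to the O/S
unchanged, while basic_path<latin1_to_utf8_traits> re-encodes them first.
Whether the O/S accepts the result is, as noted above, outside the
library's control.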

Another case of particular interest is Windows, where the external type is
2 bytes wide and the user chooses path, which is char based, as the internal
type for directory iteration. What happens when a directory entry uses the
high-order byte? The default conversion supplied by the Windows API is
lossy; the high-order byte is simply discarded. An alternative conversion
function might consider this to be an error and throw. Whichever of those
approaches the filesystem library chooses as the default, some users will
prefer the other, and they should be permitted to supply such a conversion
function.
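
The two alternatives can be sketched as interchangeable conversion
policies. Again, this is only an illustration: truncating_narrow,
throwing_narrow and narrow_entry_name are hypothetical names, and the
truncating policy simply mimics the discard-the-high-byte behaviour
described above rather than calling any real Windows API.

```cpp
#include <stdexcept>
#include <string>

// Lossy policy: keep only the low-order byte of each wide character.
struct truncating_narrow {
    static std::string convert(const std::wstring& w) {
        std::string out;
        for (std::wstring::size_type i = 0; i < w.size(); ++i) {
            unsigned long c = static_cast<unsigned long>(w[i]);
            out += static_cast<char>(c & 0xFF);
        }
        return out;
    }
};

// Strict policy: refuse any character that does not fit in one byte.
struct throwing_narrow {
    static std::string convert(const std::wstring& w) {
        std::string out;
        for (std::wstring::size_type i = 0; i < w.size(); ++i) {
            unsigned long c = static_cast<unsigned long>(w[i]);
            if (c > 0xFF)
                throw std::range_error("character does not fit in char");
            out += static_cast<char>(c);
        }
        return out;
    }
};

// Hypothetical hook: the caller picks the policy applied to an entry name.
template <class Policy>
std::string narrow_entry_name(const std::wstring& name) {
    return Policy::convert(name);
}
```

For an entry name containing U+0141, the truncating policy silently
produces 'A' (0x41) in its place, while the throwing policy reports the
problem. Neither is obviously right for everyone, which is the argument for
letting the user choose.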

--Beman

