From: Peter Dimov (pdimov_at_[hidden])
Date: 2006-02-17 17:54:29
Beman Dawes wrote:
> "Peter Dimov" <pdimov_at_[hidden]> wrote in message
> news:00af01c6334a$b9ed2200$6407a8c0_at_pdimov2...
>> At the end I reverted the changes and just encoded the wide path into
>> UTF-8 at the very start, passed the UTF-8 string through the existing
>> code, then decoded the UTF-8 into a wstring at the very end,
>> immediately before calling the Windows API. It worked.
>
> Seems like a reasonable and practical approach. I've wondered several
> times if we wouldn't have been better off if Microsoft had chosen
> UTF-8 as their Windows external representation, too.
I don't think that they could have done that because of legacy FAT
filesystems that could have been using narrow paths with an arbitrary
encoding.

But my point is that the library can use UTF-8 as its _internal portable
encoding_, encoding into UTF-8 when it is given a path as a wstring or a
(string, encoding) pair, and decoding into the appropriate (string,
encoding) or wstring when it passes a path to the OS. Everything else can be
string-based.

With this approach, we can have a single path class that handles everything.
No need to choose between a narrow path and a wide path, and no need to
encode the character encoding into the path type.
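
For illustration, here is a minimal sketch of that design. It is not the code
from the patch discussed in this thread. It assumes C++11 and well-formed
input (no validation is done), and the names utf8_path, append_utf8, to_utf8
and from_utf8 are hypothetical, not Boost.Filesystem API. The (string,
encoding) constructor is left out to keep the example short.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: append one Unicode code point as UTF-8.
inline void append_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Wide -> UTF-8. Joins surrogate pairs when wchar_t is 16 bits (Windows).
inline std::string to_utf8(const std::wstring& w) {
    std::string out;
    for (std::size_t i = 0; i < w.size(); ++i) {
        char32_t cp = static_cast<char32_t>(w[i]);
        if (sizeof(wchar_t) == 2 && cp >= 0xD800 && cp <= 0xDBFF &&
            i + 1 < w.size()) {
            char32_t lo = static_cast<char32_t>(w[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }
        append_utf8(out, cp);
    }
    return out;
}

// UTF-8 -> wide. Assumes well-formed UTF-8; emits surrogate pairs when
// wchar_t is 16 bits.
inline std::wstring from_utf8(const std::string& s) {
    std::wstring out;
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char b = static_cast<unsigned char>(s[i++]);
        char32_t cp;
        int extra;  // number of continuation bytes that follow
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else                         { cp = b & 0x07; extra = 3; }
        for (; extra > 0 && i < s.size(); --extra, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(s[i]) & 0x3F);
        if (sizeof(wchar_t) == 2 && cp >= 0x10000) {
            cp -= 0x10000;
            out += static_cast<wchar_t>(0xD800 + (cp >> 10));
            out += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
        } else {
            out += static_cast<wchar_t>(cp);
        }
    }
    return out;
}

// One path type: UTF-8 inside, conversion only at the boundaries.
class utf8_path {
public:
    explicit utf8_path(const std::string& utf8) : utf8_(utf8) {}
    explicit utf8_path(const std::wstring& wide) : utf8_(to_utf8(wide)) {}

    // Everything in the middle is plain string manipulation.
    utf8_path& operator/=(const std::string& utf8_component) {
        if (!utf8_.empty() && utf8_.back() != '/') utf8_ += '/';
        utf8_ += utf8_component;
        return *this;
    }

    const std::string& string() const { return utf8_; }

    // Decode immediately before handing the path to a wide OS API.
    std::wstring native_wide() const { return from_utf8(utf8_); }

private:
    std::string utf8_;
};
```

A caller on Windows would construct a utf8_path from either a narrow UTF-8
string or a wstring, manipulate it as an ordinary string, and call
native_wide() only at the point where a wide API is invoked.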
I've tried to communicate this via code, apparently with mixed success. :-)