
From: Carl Daniel (cpdaniel_at_[hidden])
Date: 2002-02-23 11:52:13


From: "Beman Dawes" <bdawes_at_[hidden]>
>
> How does this work? The Win32 API traffics mostly in character strings.
>
> For example, say we have a function template:
>
> template< typename CharT >
> bool exists( const std::basic_string<CharT> & path );
>
> The Win32 char implementation might look like this:
>
> template<> // specialization for char
> bool exists<char>( const std::basic_string<char> & path )
> {
> return GetFileAttributes( path.c_str() ) != 0xFFFFFFFF;
> }
>
> What does the wchar_t implementation look like? There isn't a
> GetFileAttributes overloaded for wchar_t; GetFileAttributes itself is
> apparently Unicode enabled. I guess that means the wstring path argument
> has to be converted somehow before calling GetFileAttributes().
>
> I'd appreciate seeing the preferred code from someone who has actual
> experience with Unicode path names.

The Win32 API actually has two versions of nearly every function that accepts a character string, with the
"overloading" handled via macros (so that it works in C).

For example, continuing your example, GetFileAttributes is declared like this:

WINBASEAPI
DWORD
WINAPI
GetFileAttributesA(
    LPCSTR lpFileName
    );
WINBASEAPI
DWORD
WINAPI
GetFileAttributesW(
    LPCWSTR lpFileName
    );
#ifdef UNICODE
#define GetFileAttributes GetFileAttributesW
#else
#define GetFileAttributes GetFileAttributesA
#endif // !UNICODE

As you can see, there really isn't a function named GetFileAttributes, but rather a pair of functions and a macro that
picks one of them. When writing C++ code with overloads, it's appropriate and common to simply call the A- and
W-suffixed versions directly:

    template<> // specialization for char
    bool exists<char>( const std::basic_string<char> & path )
    {
      return GetFileAttributesA( path.c_str() ) != 0xFFFFFFFF;
    }

    template<> // specialization for wchar_t
    bool exists<wchar_t>( const std::basic_string<wchar_t> & path )
    {
      return GetFileAttributesW( path.c_str() ) != 0xFFFFFFFF;
    }
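
One caveat with the specializations above: a call such as exists("foo") will not compile, because CharT cannot be
deduced from a string literal. A variation (my own, not something from the original proposal) is to use a pair of
ordinary overloads instead. The sketch below assumes that approach; off Windows it substitutes stand-in versions of
GetFileAttributesA/W that pretend a single file named "present.txt" exists, purely so the example can be compiled
anywhere.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins so the sketch compiles off Windows. They are NOT the real API:
// they pretend that exactly one file, "present.txt", exists.
typedef unsigned long DWORD;
inline DWORD GetFileAttributesA(const char* path) {
    return std::string(path) == "present.txt" ? 0x80UL : 0xFFFFFFFFUL;
}
inline DWORD GetFileAttributesW(const wchar_t* path) {
    return std::wstring(path) == L"present.txt" ? 0x80UL : 0xFFFFFFFFUL;
}
#endif

// Narrow paths are passed to the "A" entry point.
inline bool exists(const std::string& path) {
    return GetFileAttributesA(path.c_str()) != 0xFFFFFFFF;
}

// Wide paths are passed to the "W" entry point.
inline bool exists(const std::wstring& path) {
    return GetFileAttributesW(path.c_str()) != 0xFFFFFFFF;
}
```

With overloads, both exists("present.txt") and exists(L"present.txt") resolve through the usual implicit conversion to
std::string or std::wstring, and neither depends on the UNICODE macro.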

Another possibility would be to create a whole new flavor of std::string:

namespace std
{
#ifdef UNICODE
    typedef wstring tstring;
#else
    typedef string tstring;
#endif // !UNICODE
}

And then write all your overloads in terms of TCHAR and tstring:

    template<> // specialization for TCHAR
    bool exists<TCHAR>( const std::tstring & path )
    {
      return GetFileAttributes( path.c_str() ) != 0xFFFFFFFF;
    }

This would be consistent with the way the Windows headers are set up. Personally, I think the way they handle Unicode
versus multi-byte is grotesque, if clever, but many Windows programmers program to TCHAR religiously and end up with
code that can be compiled for Unicode or multi-byte with the flip of a global #define.
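
To make the TCHAR style concrete, here is a minimal sketch of it. It assumes a tstring typedef declared outside
namespace std (the standard does not permit user code to add names there) and a single non-template exists(). Off
Windows, the TCHAR, TEXT, and GetFileAttributes definitions are stand-ins that I wrote to mimic the header layout shown
earlier; they pretend that one file, "present.txt", exists.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins that mimic the Windows header pattern; NOT the real API.
typedef unsigned long DWORD;
inline DWORD GetFileAttributesA(const char* path) {
    return std::string(path) == "present.txt" ? 0x80UL : 0xFFFFFFFFUL;
}
inline DWORD GetFileAttributesW(const wchar_t* path) {
    return std::wstring(path) == L"present.txt" ? 0x80UL : 0xFFFFFFFFUL;
}
#ifdef UNICODE
typedef wchar_t TCHAR;
#define TEXT(s) L##s
#define GetFileAttributes GetFileAttributesW
#else
typedef char TCHAR;
#define TEXT(s) s
#define GetFileAttributes GetFileAttributesA
#endif // !UNICODE
#endif // _WIN32

// One string type that follows the UNICODE setting (hypothetical name).
typedef std::basic_string<TCHAR> tstring;

// The same source line calls the A or the W function, depending on UNICODE.
inline bool exists(const tstring& path) {
    return GetFileAttributes(path.c_str()) != 0xFFFFFFFF;
}
```

A caller writes exists(TEXT("present.txt")) and gets the narrow or the wide build depending on whether UNICODE was
defined before the headers were included. The cost is that a given translation unit only ever sees one of the two.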

One last item to consider, which muddies the waters even further: NT-based systems (NT, 2000, XP) are 100% Unicode
internally. The "ANSI" functions convert their parameters to Unicode and then forward to the wide version (this is why
the ANSI versions are limited in the length of path they can support, for example - they won't allocate more than
MAX_PATH bytes of space for the conversion).
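
As a rough model of that forwarding (an illustration only, not the actual code inside Windows), the sketch below
converts a narrow path into a fixed-size wide buffer and hands it to a wide function. The names exists_wide,
exists_narrow, and kMaxPath are hypothetical; kMaxPath is set to 260, the value of MAX_PATH in the Windows headers, and
the conversion uses the standard std::mbstowcs rather than whatever the OS uses internally.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Same value as MAX_PATH in the Windows headers.
const std::size_t kMaxPath = 260;

// Hypothetical "real" wide implementation; it pretends that only
// "present.txt" exists.
inline bool exists_wide(const wchar_t* path) {
    return std::wstring(path) == L"present.txt";
}

// Hypothetical narrow entry point: widen into a fixed buffer, then forward.
inline bool exists_narrow(const char* path) {
    wchar_t buffer[kMaxPath];
    const std::size_t n = std::mbstowcs(buffer, path, kMaxPath);
    // (size_t)-1 means the input was not a valid multibyte string; n equal to
    // kMaxPath means the result did not fit together with its terminator.
    if (n == static_cast<std::size_t>(-1) || n >= kMaxPath) {
        return false;
    }
    return exists_wide(buffer);
}
```

The fixed buffer is what produces the length limit: a narrow path of 260 or more characters is rejected before the wide
function ever sees it, even though the wide function itself has no such restriction in this model.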

Under the Windows 95 progeny, the situation is reversed: internally, the OS is 99% ANSI based, and the wide functions
either convert their arguments to ANSI and forward to the narrow function, or they simply do nothing (astoundingly,
there are hundreds of "Wide" APIs under Windows 9x which do nothing at all - they don't even return failure). Microsoft
has recently released a "Unicode layer for Windows 9x" which implements most of the missing APIs, so it's now possible
to run a Unicode application under 9x.

-cd

