Subject: Re: [Boost-users] program_options and wstrings
From: Vladimir Prus (vladimir_at_[hidden])
Date: 2009-11-09 05:59:25


Yang Zhang wrote:

> On Sun, Nov 8, 2009 at 11:52 PM, Vladimir Prus
> <vladimir_at_[hidden]> wrote:
>> Yang Zhang wrote:
>>>
>>> - value vs. wvalue
>>
>> If your option type can be constructed from char* (using either a
>> custom validator or operator>>), you can use value.
>> If your option type can be constructed from wchar_t*, you can use wvalue.
>> If both, wvalue is better, since you won't lose data no matter what
>> kind of parser is used.
>
> Why would you ever lose data? UTF-8 and UTF-16 are both encodings of
> the same set of characters. Isn't that what codecvt converts between?

But char* strings are not necessarily UTF-8; they are in the local 8-bit
encoding, which might well be KOI8-R or something else. So, if you have
wchar_t* argv, and your final target is 'string', you have two possible
transformations:

        wchar_t* -> string
        wchar_t* -> char* -> string

With 'wvalue', the first conversion will be attempted, and will fail,
since string cannot be constructed from wchar_t*.
With 'value', the second conversion will be attempted, and might work,
or might lose some data.
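
To make the "might lose some data" case concrete, here is a minimal sketch
that does not use Boost at all. The helper names are hypothetical, and
ISO-8859-1 is only assumed as a stand-in for "the local 8-bit encoding"
(it could just as well be KOI8-R). The point is that any wide character
with no representation in the 8-bit encoding cannot survive the
wchar_t* -> char* step.

```cpp
#include <string>

// Hypothetical helper, not part of Boost.Program_options.
// Narrows a wide string to an 8-bit encoding. ISO-8859-1 is assumed here
// purely for illustration, because its code points map 1:1 to 0x00..0xFF.
// Characters that the 8-bit encoding cannot represent become '?'.
std::string narrow_to_latin1(const std::wstring& wide)
{
    std::string out;
    out.reserve(wide.size());
    for (wchar_t wc : wide) {
        const unsigned long code = static_cast<unsigned long>(wc);
        if (code <= 0xFF)
            out.push_back(static_cast<char>(static_cast<unsigned char>(code)));
        else
            out.push_back('?');  // data is lost at this point
    }
    return out;
}

// Hypothetical helper: true when every character fits the 8-bit encoding,
// i.e. when the wchar_t* -> char* step would keep all of the input.
bool narrows_losslessly(const std::wstring& wide)
{
    for (wchar_t wc : wide) {
        if (static_cast<unsigned long>(wc) > 0xFF)
            return false;
    }
    return true;
}
```

With these assumptions, narrow_to_latin1(L"abc") keeps all three
characters, while a Cyrillic letter such as L"\u0416" comes out as '?'.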

>
>> Given that wstring cannot be constructed from char*, you have to
>> use wvalue for wstring.
>
> You can't construct a vector<...> from a char* either, yet that's
> legal. See my confusion? :) This is why it's unclear to me what
> significance value vs. wvalue have - esp. since codecvt is doing
> conversions anyway.

Speaking of vector, you can create a custom validator to create a
vector from char*, or wchar_t*, or both -- and then you still
have to use value/wvalue to specify which path the data should travel.
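
The idea of "both validators exist, but you still pick the path" can be
modelled without Boost. The sketch below is only an analogy: parse_ints is
a hypothetical name, not a library function, and the comma-separated token
format is an assumption. One overload builds the vector from narrow text,
the other from wide text, and the character type of the token you hand in
decides which one runs, much as value/wvalue decides which kind of string
reaches the validator.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a validator on the narrow path:
// builds a vector<int> from a char-based token such as "1,2,3".
std::vector<int> parse_ints(const std::string& token)
{
    std::vector<int> result;
    std::istringstream in(token);
    std::string item;
    while (std::getline(in, item, ','))
        result.push_back(std::stoi(item));
    return result;
}

// Hypothetical stand-in for a validator on the wide path:
// same result type, but the token arrives as wchar_t-based text.
std::vector<int> parse_ints(const std::wstring& token)
{
    std::vector<int> result;
    std::wistringstream in(token);
    std::wstring item;
    while (std::getline(in, item, L','))
        result.push_back(std::stoi(item));
    return result;
}
```

Both overloads produce the same vector for the same digits; what differs
is the form in which the text arrives, and therefore whether an 8-bit
conversion happened on the way.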

- Volodya

