
Boost Users :

From: Daniel Krügler (dsp_at_[hidden])
Date: 2008-08-18 04:02:49


Robert Ramey wrote:
> Daniel Krügler wrote:
>
>>Robert Ramey wrote:
>>
>>>the wide character xml archives use UTF8.
>>>
>>>the narrow character xml archives use the currently set locale.
>>>
>>>Robert Ramey
>>
>>Sorry for asking offhand:
>>
>>What is the reasoning behind this different behaviour?
>
> I assumed that most programs built with narrow characters used
> the locale concept to deal with this.
>
> Wide character systems lend themselves to UTF coding so I
> used that for wide char archives. In order to do this, I used
> Ron Garcia's UTF code conversion facet for streams.
>
> It would be quite easy to generate UTF coding for narrow
> character archives. Just do the following:
>
> a) Build the UTF code conversion facet for narrow character
> input (it's templated on character type).
>
> b) When the stream is opened, attach this facet to the stream.
>
> Note that the output char format is not really a property of the
> serialization library, but rather an artifact of the way it has
> been used. That is, the serialization library depends on the
> standard stream library for this property.
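
For readers following along, step (b) above can be sketched roughly as below. This is only an illustration under stated assumptions: it uses the standard `std::codecvt_utf8<wchar_t>` facet (deprecated since C++17) as a stand-in for Ron Garcia's facet, it shows a wide stream rather than a narrow one, and the helper names `write_utf8` and `read_bytes` are hypothetical. It is not the serialization library's own code.

```cpp
#include <codecvt>   // std::codecvt_utf8: stand-in facet, deprecated since C++17
#include <fstream>
#include <iterator>
#include <locale>
#include <string>

// Hypothetical helper: write `text` to `path`, encoded as UTF-8 by a
// code conversion facet attached to the stream.
inline void write_utf8(const std::string& path, const std::wstring& text) {
    std::wofstream out;
    // Attach the facet before open() so it is in place for every conversion.
    // The locale takes ownership of the facet object.
    out.imbue(std::locale(std::locale::classic(),
                          new std::codecvt_utf8<wchar_t>));
    out.open(path.c_str(), std::ios::binary);
    out << text;
}

// Hypothetical helper: read the file back as raw bytes, with no conversion.
inline std::string read_bytes(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Writing a single U+00FC (the "ü" in my name) through `write_utf8` and reading it back with `read_bytes` should give the two bytes 0xC3 0xBC regardless of the global locale, which is the locale independence under discussion.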

Thanks for your thorough explanation, Robert. There remains a
slightly uneasy feeling in my stomach (apologies for the possibly
inappropriate metaphor): Many programs are written to be
compilable (and executable) in either narrow-character or
wide-character mode. The difference in the serialization library
described above unfortunately seems to mean that those two
builds cannot exchange the same persisted serialization output,
right? Or, to put it differently: if the programmer decides to
switch from one character mode to the other (a typical use case,
I think), (s)he has to take care of the extra steps needed to
keep the serialization I/O compatible. This is especially
cumbersome because the more typical switch is from
narrow-character to wide-character mode. In this case the harm
is already done: the old code created locale-dependent output,
while the newer code is free of this locale dependency but now
has the problem of interpreting the existing serialization
output.

Have I understood this effect correctly?

Thanks,

- Daniel

