From: Eric Niebler (eric_at_[hidden])
Date: 2004-10-21 14:08:18

Next message: Eric Niebler: "[boost] Re: Any interest in adding unicode support to boost?"
Previous message: Erik Wien: "[boost] Re: Any interest in adding unicode support to boost?"
In reply to: Erik Wien: "[boost] Re: Re: Any interest in adding unicode support to boost?"
Next in thread: Miro Jurisic: "[boost] Re: Any interest in adding unicode support to boost?"
Reply: Miro Jurisic: "[boost] Re: Any interest in adding unicode support to boost?"
Reply: Rogier van Dalen: "Re: [boost] Re: Any interest in adding unicode support to boost?"

Erik Wien wrote:
> "Rogier van Dalen" <rogiervd_at_[hidden]> wrote in message
>
>>I hadn't yet looked at it this way, but you are right from a
>>theoretical point of view at least. To get more to practical matters,
>>what do you think this should do:
>>
>>unicode::string s = ...;
>>s += 0xDC01; // An isolated surrogate, which is nonsense
>>
>>?
>>Should it throw, or convert the isolated surrogate to U+FFFD
>>REPLACEMENT CHARACTER (Unicode standard 4 Section 2.7), or something
>>else? And what should the member function with the opposite behaviour
>>be called?
>
>
> The best solution would be to never append single code units, but instead
> code points. The += operator would determine how many code units are required
> for the given code point.
>
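
To make the suggestion concrete, here is a minimal sketch of a code-point append over UTF-16 storage. The names utf16_length and append_code_point are hypothetical, std::u16string stands in for the proposed unicode::string, and throwing on a surrogate is only one of the policies Rogier lists (substituting U+FFFD would be the other).

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: number of UTF-16 code units needed for a code point.
inline std::size_t utf16_length(char32_t cp) {
    return cp < 0x10000 ? 1 : 2;
}

// Hypothetical helper: append one code point, emitting one or two code units.
// Assumed policy: reject anything that is not a Unicode scalar value.
inline void append_code_point(std::u16string& s, char32_t cp) {
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        throw std::invalid_argument("not a Unicode scalar value");
    if (cp < 0x10000) {
        // Basic Multilingual Plane: a single code unit.
        s.push_back(static_cast<char16_t>(cp));
    } else {
        // Supplementary plane: split into a high and a low surrogate.
        const char32_t v = cp - 0x10000;
        s.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));
        s.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));
    }
}
```

With this interface, the isolated surrogate 0xDC01 from the example above can never enter the string, at the cost of ruling out unit-level manipulation.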

I disagree. The user should be allowed to twiddle as many bits as she
pleases, and even be permitted to create an invalid UTF string. However,
operations that interpret the string as a whole (comparison,
canonicalization, etc.) should detect invalid strings and throw. The
reason is that people will need to manipulate strings at the bit level:
intermediate states may be invalid even though the final state is
valid. We shouldn't do too much nannying during these intermediate states.
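
A rough sketch of what I mean, assuming UTF-16 storage in a std::u16string. The names is_well_formed_utf16 and checked_equal are made up for illustration, and plain code-unit equality stands in for a real Unicode comparison.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: true if every surrogate in s belongs to a
// high/low pair, i.e. the string is well-formed UTF-16.
inline bool is_well_formed_utf16(const std::u16string& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t u = s[i];
        if (u >= 0xD800 && u <= 0xDBFF) {
            // High surrogate: must be followed by a low surrogate.
            if (i + 1 == s.size() || s[i + 1] < 0xDC00 || s[i + 1] > 0xDFFF)
                return false;
            ++i;  // skip the low surrogate just checked
        } else if (u >= 0xDC00 && u <= 0xDFFF) {
            // Low surrogate with no preceding high surrogate.
            return false;
        }
    }
    return true;
}

// Hypothetical whole-string operation: validates first, then compares.
// Appending raw code units is never checked; only this call is.
inline bool checked_equal(const std::u16string& a, const std::u16string& b) {
    if (!is_well_formed_utf16(a) || !is_well_formed_utf16(b))
        throw std::invalid_argument("ill-formed UTF-16 string");
    return a == b;
}
```

Appending 0xD834 alone leaves the string invalid, and appending 0xDD1E afterwards makes it valid again. Nothing complains in between unless a whole-string operation such as checked_equal is called on the intermediate state.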

-- 
Eric Niebler
Boost Consulting
www.boost-consulting.com


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk