Subject: Re: [boost] [General] Always treat std::strings as UTF-8
From: Robert Kawulak (robert.kawulak_at_[hidden])
Date: 2011-01-16 14:10:57


> From: Chad Nelson
> http://www.oakcircle.com/toolkit.html
>
> I've released it under the Boost license, so anyone may use it as they
> wish.

A very nice and useful utility. Anyway, I'll share some comments, just in case you want to hear some. ;-)

"Be warned, if you try to convert a UTF-coded value to ASCII, each decoded
character must fit into an unsigned eight-bit type. If it doesn't, the library
will throw an \c oakcircle::unicode::will_not_fit exception."

I don't think throwing is always the appropriate response. A better solution would be a policy-based class design, or an additional
conversion function that accepts an error policy. The user could then tell the converter to substitute a similar-looking character
or an "invalid" marker character instead of throwing when an exact conversion is not possible.
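
A rough sketch of the second option, to make the idea concrete. All names here (narrow, throw_on_error, replace_with, and the local
will_not_fit stand-in) are made up for illustration and are not the toolkit's actual interface. The sketch also assumes the input has
already been decoded to code points:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for oakcircle::unicode::will_not_fit.
struct will_not_fit : std::runtime_error {
    will_not_fit() : std::runtime_error("code point does not fit in eight bits") {}
};

// Policy 1: throw, mirroring the behaviour the documentation describes.
struct throw_on_error {
    char operator()(char32_t) const { throw will_not_fit(); }
};

// Policy 2: substitute a fixed "invalid" marker character.
struct replace_with {
    char marker;
    explicit replace_with(char m = '?') : marker(m) {}
    char operator()(char32_t) const { return marker; }
};

// Narrow decoded code points to an eight-bit string. The policy decides
// what happens to any code point above 0xFF.
template <typename ErrorPolicy = throw_on_error>
std::string narrow(const std::u32string& code_points,
                   ErrorPolicy on_error = ErrorPolicy()) {
    std::string out;
    out.reserve(code_points.size());
    for (char32_t cp : code_points) {
        if (cp <= 0xFF)
            out.push_back(static_cast<char>(static_cast<unsigned char>(cp)));
        else
            out.push_back(on_error(cp));
    }
    return out;
}
```

With this shape, narrow(text) keeps the throwing behaviour by default, while narrow(text, replace_with('?')) never throws. A policy
that maps to similar-looking characters would be one more function object with the same call signature.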

"Note that, like pointers, they can hold a null value as well, created by passing
\c boost::none to the type's constructor or setting it equal to that value."

I don't think an interface with pointer semantics is the most suitable one here. Are there any practical advantages to being able to
have a null string? Even if there are, one could use an actual pointer or boost::optional instead.
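
To illustrate, here is a minimal sketch in which the string type has plain value semantics and "no value" is expressed only where a
caller needs it. The names utf8_text and find_nickname are hypothetical, and std::optional (C++17) is used in place of
boost::optional so that the example compiles without Boost:

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical string type with value semantics: it always holds a
// string (possibly empty) and has no null state of its own.
class utf8_text {
public:
    utf8_text() = default;
    explicit utf8_text(std::string bytes) : bytes_(std::move(bytes)) {}
    const std::string& str() const { return bytes_; }
private:
    std::string bytes_;
};

// A function that may have nothing to return says so in its signature,
// instead of relying on a null state built into the string type.
std::optional<utf8_text> find_nickname(bool has_nickname) {
    if (!has_nickname) return std::nullopt;
    return utf8_text("Bob");
}
```

Code that never deals with missing values then works with utf8_text directly and has no null case to check.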

Moreover, it would be nice if correct encoding of the underlying string were a class invariant. Currently the classes cannot
guarantee this, because they allow direct access to the value, which the user may freely change without regard to the
encoding.
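
As a sketch of what such an invariant could look like: validate once on construction, hand out only read-only access, and offer
only mutators that cannot break the encoding. The names utf8_string and is_valid_utf8 are hypothetical and do not describe the
toolkit's classes:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Returns true if `s` is well-formed UTF-8. Rejects overlong forms,
// surrogate code points, and values above U+10FFFF.
inline bool is_valid_utf8(const std::string& s) {
    std::size_t i = 0;
    const std::size_t n = s.size();
    while (i < n) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        std::size_t len;
        char32_t cp, min;
        if (b < 0x80)               { len = 1; cp = b;        min = 0; }
        else if ((b >> 5) == 0x06)  { len = 2; cp = b & 0x1F; min = 0x80; }
        else if ((b >> 4) == 0x0E)  { len = 3; cp = b & 0x0F; min = 0x800; }
        else if ((b >> 3) == 0x1E)  { len = 4; cp = b & 0x07; min = 0x10000; }
        else return false;          // stray continuation or invalid lead byte
        if (n - i < len) return false;  // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += len;
    }
    return true;
}

class utf8_string {
public:
    utf8_string() = default;
    // The only way in from raw bytes is through validation.
    explicit utf8_string(std::string bytes) : bytes_(std::move(bytes)) {
        if (!is_valid_utf8(bytes_))
            throw std::invalid_argument("not well-formed UTF-8");
    }
    // Read-only access: callers cannot modify the bytes behind our back.
    const std::string& raw() const { return bytes_; }
    // Appending one valid UTF-8 string to another yields valid UTF-8.
    void append(const utf8_string& other) { bytes_ += other.bytes_; }
private:
    std::string bytes_;
};
```

Every utf8_string that exists is then known to be well-formed, so functions taking one do not need to re-validate their input.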

Best regards,
Robert

