From: Rogier van Dalen (rogiervd_at_[hidden])
Date: 2006-12-06 05:11:17


Dear Nemanja,

On 12/5/06, Nemanja Trifunovic <nemanja_trifunovic_at_[hidden]> wrote:
> This is the second call for the informal review of the UTF8 library. It is based on version 1.02 of UTF8-CPP: http://utfcpp.sourceforge.net/ and you can find it at

I like the functions you provide, and the "unchecked" namespace.
Unlike Hervé, I do think exceptions are the way to go. I seem to miss
a couple of things though.

In a recent discussion on this list there seemed to be a preference
for using iterators, which can be composed, for example to perform
UTF-8->UTF-16 conversion, or conversions to other codepages. Iterators
can be much more flexible than these free functions.

Is there any particular reason why you do not include similar
functions for UTF-16?

One of the most important uses for UTF must be IO. Shouldn't a
utf_codecvt be part of the library?

Hervé is right: reading UTF-8 can be optimised a lot using lookup
tables. I've got an implementation lying around that I'd be happy to
share. It took 30% less time than the straightforward implementation
and it did all the necessary checks.
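
To illustrate the kind of table-driven reading mentioned above, here is a
minimal sketch. It is not the implementation referred to in this message and
not part of UTF8-CPP; the namespace "sketch" and the names length_table and
next are hypothetical, and no performance claim is attached to it. A 256-entry
table maps each lead byte to its sequence length, and the decoder throws on
invalid lead or continuation bytes, truncated input, overlong forms,
surrogates, and values above U+10FFFF.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical names throughout; an illustration only.
namespace sketch {

// Build a table mapping each possible lead byte to the length of the
// sequence it starts. 0 marks a byte that cannot start a sequence.
inline std::array<std::uint8_t, 256> make_length_table() {
    std::array<std::uint8_t, 256> t{};
    for (std::size_t b = 0x00; b <= 0x7F; ++b) t[b] = 1;
    for (std::size_t b = 0xC2; b <= 0xDF; ++b) t[b] = 2;
    for (std::size_t b = 0xE0; b <= 0xEF; ++b) t[b] = 3;
    for (std::size_t b = 0xF0; b <= 0xF4; ++b) t[b] = 4;
    return t;
}

inline const std::array<std::uint8_t, 256>& length_table() {
    static const std::array<std::uint8_t, 256> table = make_length_table();
    return table;
}

// Decode one code point from [it, end). On success, advance it past the
// sequence; on failure, throw and leave it unchanged.
template <typename It>
std::uint32_t next(It& it, It end) {
    if (it == end) throw std::out_of_range("no input");
    const std::uint8_t lead = static_cast<std::uint8_t>(*it);
    const unsigned len = length_table()[lead];
    if (len == 0) throw std::invalid_argument("invalid lead byte");

    // Payload mask of the lead byte and smallest legal value, per length.
    static const std::uint32_t lead_mask[5] = {0, 0x7F, 0x1F, 0x0F, 0x07};
    static const std::uint32_t min_value[5] = {0, 0, 0x80, 0x800, 0x10000};

    std::uint32_t cp = lead & lead_mask[len];
    It p = it;
    ++p;
    for (unsigned i = 1; i < len; ++i, ++p) {
        if (p == end) throw std::invalid_argument("truncated sequence");
        const std::uint8_t c = static_cast<std::uint8_t>(*p);
        if ((c & 0xC0) != 0x80)
            throw std::invalid_argument("invalid continuation byte");
        cp = (cp << 6) | (c & 0x3F);
    }
    if (cp < min_value[len]) throw std::invalid_argument("overlong sequence");
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        throw std::invalid_argument("invalid code point");
    it = p;
    return cp;
}

}  // namespace sketch
```

Because next takes an iterator pair, the same function could sit underneath a
decoding iterator adaptor, which is one way the composition discussed earlier
in this message might be built.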

The final thing is, your functions try to maintain strings of
valid UTF-8. Why not provide a string type that maintains this
invariant?
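
A minimal sketch of what such a string type could look like follows. It is an
assumption made for illustration, not something the library under review
provides; the names sketch::utf8_string, is_valid_utf8, append and bytes are
hypothetical. The class validates every byte sequence before storing it, so
its contents are always well-formed UTF-8.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical names throughout; an illustration only.
namespace sketch {

// Return true if s is well-formed UTF-8: no stray or missing continuation
// bytes, no overlong forms, no surrogates, nothing above U+10FFFF.
inline bool is_valid_utf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        unsigned long cp = 0;
        unsigned long min = 0;
        if (b < 0x80)                { len = 1; cp = b;        min = 0; }
        else if ((b & 0xE0) == 0xC0) { len = 2; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { len = 3; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { len = 4; cp = b & 0x07; min = 0x10000; }
        else return false;                      // not a valid lead byte
        if (n - i < len) return false;          // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += len;
    }
    return true;
}

// A string whose contents are always valid UTF-8.
class utf8_string {
public:
    utf8_string() = default;

    // Throws std::invalid_argument if bytes is not valid UTF-8.
    explicit utf8_string(const std::string& bytes) : bytes_(bytes) {
        require_valid(bytes_);
    }

    // Validate first, then append, so a failed append changes nothing.
    // Concatenating two valid UTF-8 strings always yields valid UTF-8.
    void append(const std::string& bytes) {
        require_valid(bytes);
        bytes_ += bytes;
    }

    // Read-only access keeps callers from breaking the invariant.
    const std::string& bytes() const { return bytes_; }

private:
    static void require_valid(const std::string& bytes) {
        if (!is_valid_utf8(bytes))
            throw std::invalid_argument("invalid UTF-8");
    }

    std::string bytes_;
};

}  // namespace sketch
```

Only const access to the underlying bytes is exposed here; any mutating
operation would have to revalidate, or work in whole code points, to keep the
invariant.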

Conclusion: in my opinion a lot of things are missing from the library
at the moment.

Regards,
Rogier
