Subject: Re: [boost] [rfc] Unicode GSoC project
From: Graham (Graham_at_[hidden])
Date: 2009-05-15 18:34:55


>What kind of real-world use do people have for random access, anyways?
>Even UTF-32 isn't random access for the things I can think of that
>people would care about, what with combining codepoints and ligatures
>and other such things.
I wrote a couple of Unicode text editors that would have been a
nightmare if they had not been operating on UTF-32.
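
For illustration only (this is not code from either editor, and the function names are invented for the example), here is a minimal C++11 sketch of the difference at the code point level. It assumes well-formed UTF-8 and says nothing about combining sequences, which still need extra handling on top of UTF-32.

```cpp
#include <cstddef>
#include <string>

// UTF-32: one code unit per code point, so the n-th code point is a
// plain array lookup, O(1). Throws std::out_of_range on a bad index.
char32_t code_point_at_utf32(const std::u32string& text, std::size_t index) {
    return text.at(index);
}

// UTF-8: code points take 1 to 4 bytes, so finding where the n-th code
// point starts means scanning from the beginning, O(n).
// Assumes well-formed UTF-8; returns text.size() if index is past the end.
std::size_t byte_offset_utf8(const std::string& text, std::size_t index) {
    std::size_t seen = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const unsigned char byte = static_cast<unsigned char>(text[i]);
        if ((byte & 0xC0) != 0x80) {  // not a continuation byte: a code point starts here
            if (seen == index) {
                return i;
            }
            ++seen;
        }
    }
    return text.size();
}
```

An editor that maps a cursor position to a buffer offset does this kind of lookup constantly, which is where the fixed-width form pays off.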

>As an aside, I'd like to see comparisons between compressed UTF-8 and
>compressed UTF-16, since neither one is random-access anyways, and it
>seems to me that caring about size of text before compression is about
>as important as the performance of a program with the optimizer turned
>off.
Actually, in the few cases I have seen, it is not the compressed size but
the conversion cost [memory/CPU] that hurts. It is much better to pick
the right encoding for each use case.
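
As a rough sketch of where that conversion cost comes from (a hypothetical helper, not taken from any library, and assuming well-formed UTF-8 with no validation): converting UTF-8 to UTF-32 touches every byte of the input and allocates a second buffer of four bytes per code point.

```cpp
#include <cstddef>
#include <string>

// Decodes UTF-8 into UTF-32 in one pass over the input.
// Assumes well-formed input; no validation or error reporting is done.
std::u32string utf8_to_utf32(const std::string& in) {
    std::u32string out;
    out.reserve(in.size());  // upper bound: at most one code point per input byte
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;
        // The lead byte gives the sequence length and the top payload bits.
        if (lead < 0x80)      { cp = lead;        len = 1; }
        else if (lead < 0xE0) { cp = lead & 0x1F; len = 2; }
        else if (lead < 0xF0) { cp = lead & 0x0F; len = 3; }
        else                  { cp = lead & 0x07; len = 4; }
        // Each continuation byte contributes its low six bits.
        for (std::size_t k = 1; k < len && i + k < in.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        }
        out.push_back(cp);
        i += len;
    }
    return out;
}
```

If the data arrives as UTF-8 and is only ever searched or copied, this pass and the extra buffer are pure overhead; if it is edited by code point position, they may be worth paying once.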

Yours,
Graham

