Subject: Re: [boost] [rfc] Unicode GSoC project
From: Scott McMurray (me22.ca+boost_at_[hidden])
Date: 2009-05-16 01:58:07


On Fri, May 15, 2009 at 18:34, Graham <Graham_at_[hidden]> wrote:
>
>>What kind of real-world use do people have for random access, anyways?
>>Even UTF-32 isn't random access for the things I can think of that
>>people would care about, what with combining codepoints and ligatures
>>and other such things.
>
> I wrote a couple of Unicode text editors that would have been a
> nightmare if they had not been operating on UTF-32.
>

What sort of thing? I would have thought that the most
nightmare-inducing stuff would be replacing an "ffi" ligature with an
"ff" ligature if someone hit backspace, figuring out how to edit
combining codepoints, and other such stuff that's not much different
in the various UTFs.
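
To make the combining-codepoint point concrete, here is a minimal sketch in Python (chosen only for brevity; it is not from the original thread). The helper name approx_char_count is hypothetical, and counting code points with combining class 0 is only a rough stand-in for real grapheme-cluster segmentation.

```python
import unicodedata

# "e" followed by U+0301 COMBINING ACUTE ACCENT: one character on screen,
# but two code points, and therefore two UTF-32 code units.
decomposed = "e\u0301"
utf32_units = len(decomposed.encode("utf-32-le")) // 4

# The precomposed form of the same text is a single code point.
composed = unicodedata.normalize("NFC", decomposed)

# U+FB03 LATIN SMALL LIGATURE FFI is one code point that stands for three letters.
ligature = "\ufb03"
expanded = unicodedata.normalize("NFKC", ligature)


def approx_char_count(text):
    """Hypothetical helper: count code points that are not combining marks.

    This is an approximation for illustration only; it ignores the other
    rules a proper grapheme-cluster segmenter would apply.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)
```

Indexing position 1 of the decomposed string lands on the accent rather than on a second character, so an editor still has to walk the text to find character boundaries, whichever UTF it stores.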

>>As an aside, I'd like to see comparisons between compressed UTF-8 and
>>compressed UTF-16, since neither one is random-access anyways, and it
>>seems to me that caring about size of text before compression is about
>>as important as the performance of a program with the optimizer turned
>>off.
>
> Actually in a few cases I have seen it is not the compressed size but
> the conversion performance [memory/CPU] that hurts. It is much better to
> get the correct encoding for the correct use case.
>

Conversion between what?
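
On the compression aside quoted above, the comparison could be run with something like the following. This is a sketch, assuming Python's zlib module as the compressor and a made-up sample string; compressed_sizes is a hypothetical helper, and the numbers will depend heavily on the text and the compressor used.

```python
import zlib


def compressed_sizes(text):
    """Return raw and zlib-compressed byte counts for UTF-8 and UTF-16."""
    utf8 = text.encode("utf-8")
    utf16 = text.encode("utf-16-le")  # little-endian, no byte-order mark
    return {
        "utf-8 raw": len(utf8),
        "utf-16 raw": len(utf16),
        "utf-8 zlib": len(zlib.compress(utf8, 9)),
        "utf-16 zlib": len(zlib.compress(utf16, 9)),
    }


# Made-up ASCII sample; real measurements should use representative text
# in the scripts that matter for the use case.
sample = "What kind of real-world use do people have for random access? " * 50
sizes = compressed_sizes(sample)

for name, size in sizes.items():
    print(name, size)
```

For ASCII-only input the raw UTF-16 size is exactly twice the raw UTF-8 size; whether that gap survives compression is the thing worth measuring on real data.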

