Boost :

From: Peter Bindels (dascandy_at_[hidden])
Date: 2006-09-17 12:10:19

Next message: Roland Schwarz: "Re: [boost] [threads] Permission given to change to Boost.License"
Previous message: Andy Little: "Re: [boost] [fusion] matrix?"
In reply to: Aristid Breitkreuz: "Re: [boost] Work that has been done on Unicode"
Next in thread: Aristid Breitkreuz: "Re: [boost] Work that has been done on Unicode"
Reply: Aristid Breitkreuz: "Re: [boost] Work that has been done on Unicode"
Reply: loufoque: "Re: [boost] Work that has been done on Unicode"

On 17/09/06, Aristid Breitkreuz <aribrei_at_[hidden]> wrote:
> On Saturday, 16.09.2006, at 19:55 +0200, loufoque wrote:
> > Aristid Breitkreuz wrote :
> [snip]
> > > That's fine. Do you have plans on which Unicode encoding to use
> > > internally?
> >
> > UTF-8, UTF-16 and UTF-32 would all be available for implementations, and
> > each one would be able to take or give the other ones for input/output.
>
> I guess that every single supported type is extra complexity, right?
> Would not UTF-8 (for brevity and compatibility) and UTF-32 (because it
> might be better for some algorithms) suffice?

That's not entirely accurate. UTF-8 is Latin-centric: all Latin
texts can be processed in linear time, while everything else takes
longer. UTF-16 is common-centric, in that it works efficiently for
all common texts in all common scripts, except for a few. Choosing
UTF-8 over UTF-16 would make the implementation (and accompanying
software) slow in all parts of the world that aren't solely using
Latin characters. That would be most of Europe, Asia, Africa,
South America and a number of people in North America and Australia.
Forcing them to UTF-32 makes for considerably worse memory use than
could reasonably be expected. I see quite a lot of use for the UTF-16
case, perhaps even more than the UTF-8 one.
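
To make the memory side of this trade-off concrete, here is a minimal
sketch that counts how many bytes a sequence of code points would occupy
in each of the three encoding forms. The names utf8_bytes, utf16_bytes,
utf32_bytes, EncodedSizes and encoded_sizes are made up for illustration;
they are not part of any proposed Boost interface, and the sketch assumes
its input contains only valid Unicode scalar values.

```cpp
#include <cstddef>
#include <string>

// Bytes needed to encode one code point in UTF-8 (1 to 4 bytes).
std::size_t utf8_bytes(char32_t cp) {
    if (cp < 0x80) return 1;      // ASCII
    if (cp < 0x800) return 2;     // e.g. accented Latin, Greek, Cyrillic
    if (cp < 0x10000) return 3;   // rest of the Basic Multilingual Plane
    return 4;                     // supplementary planes
}

// Bytes needed in UTF-16: one 16-bit unit inside the BMP,
// a surrogate pair (two units) outside it.
std::size_t utf16_bytes(char32_t cp) {
    return cp < 0x10000 ? 2 : 4;
}

// UTF-32 always uses one 32-bit unit.
std::size_t utf32_bytes(char32_t) {
    return 4;
}

// Hypothetical helper type: total encoded size per encoding form.
struct EncodedSizes {
    std::size_t utf8;
    std::size_t utf16;
    std::size_t utf32;
};

// Sums the per-code-point sizes over the whole text.
// Assumption: text holds valid scalar values (no lone surrogates).
EncodedSizes encoded_sizes(const std::u32string& text) {
    EncodedSizes total{0, 0, 0};
    for (char32_t cp : text) {
        total.utf8 += utf8_bytes(cp);
        total.utf16 += utf16_bytes(cp);
        total.utf32 += utf32_bytes(cp);
    }
    return total;
}
```

Under these assumptions, a five-letter ASCII word costs 5 bytes in UTF-8,
10 in UTF-16 and 20 in UTF-32, while three CJK ideographs from the Basic
Multilingual Plane cost 9, 6 and 12 bytes respectively. This only
illustrates storage size; it says nothing about processing speed.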

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk