From: Vladimir Prus (ghost_at_[hidden])
Date: 2004-10-21 05:12:02

Next message: Michael Walter: "Re: [boost] [graph] adjacency_list::operator=()"
Previous message: John Torjo: "[boost] FORMAL review of "Output Formatters" - results"
In reply to: Mathew Robertson: "Re: [boost] Re: Any interest in adding unicode support to boost?"
Next in thread: Mathew Robertson: "Re: [boost] Re: Re: Any interest in adding unicode support to boost?"
Reply: Mathew Robertson: "Re: [boost] Re: Re: Any interest in adding unicode support to boost?"

Mathew Robertson wrote:

>> >> - Why would the user want to change the encoding? Especially between
>> >> UTF-16 and UTF-32?
>> >
>> > Well... Different people have different needs. If you are mostly using
>> > ASCII characters, and require small size, UTF-8 would fit your bill. If
>> > you need the best general performance on most operations, use UTF-16.
>> > If you need fast iteration over code points and size doesn't matter,
>> > use UTF-32.
>>
>> Ok, since everybody agreed characters outside 16 bits are very rare,
>> UTF-32 seems to never be needed. As for UTF-8 vs. UTF-16: yes, the need
>> for choice seems present. However, UTF-16 string class would be better
>> than no string class at all, and extra genericity will cost you
>> development time.
>
> <rant>
> Umm... so you're saying that no one will ever need more than 640K of RAM?
> Just because YOU don't need more than 16 bits doesn't mean that I don't need
> more than 16 bits. </rant>
>
> The main question for a Unicode library should _always_ be: can the library
> represent every character that can be drawn? Things like iterators,
> algorithms, etc. are nice-to-haves -> the representation of the written
> language is the first priority; everything else is secondary.
>
> Also, the Unicode standard will evolve over time to include more
> characters from many more character sets that you or I may never use but
> someone else might; who knows, maybe the ASCII character set will get a
> 27th character one day... A library shouldn't preclude the use of these
> new characters just because we thought "no one will ever need more than
> 16 bits"... So, how about we don't make the same mistakes we made in the
> past...

Do you realize that "nobody needs UTF-32" is not the same as "nobody needs
characters that can't be represented in 16 bits"? UTF-16 can represent all
Unicode characters: code points above U+FFFF are encoded as a pair of 16-bit
code units (a surrogate pair).
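
To make that concrete, here is a minimal sketch of how UTF-16 covers the
whole range up to U+10FFFF with 16-bit units. The names encode_utf16 and
decode_utf16 are hypothetical helpers written for illustration only; they
are not part of Boost or of any proposed library, and they assume
well-formed input (no validation of lone surrogates or out-of-range values).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: encode one code point as one or two UTF-16 code units.
// Assumes cp is a valid scalar value (<= 0x10FFFF and not a surrogate).
std::vector<std::uint16_t> encode_utf16(std::uint32_t cp) {
    std::vector<std::uint16_t> out;
    if (cp < 0x10000) {
        // Fits in a single 16-bit code unit.
        out.push_back(static_cast<std::uint16_t>(cp));
    } else {
        // Split the remaining 20 bits across a surrogate pair.
        cp -= 0x10000;
        out.push_back(static_cast<std::uint16_t>(0xD800 + (cp >> 10)));   // high surrogate
        out.push_back(static_cast<std::uint16_t>(0xDC00 + (cp & 0x3FF))); // low surrogate
    }
    return out;
}

// Hypothetical helper: decode the code point that starts at units[i].
// Assumes well-formed input: a high surrogate is always followed by a low one.
std::uint32_t decode_utf16(const std::vector<std::uint16_t>& units, std::size_t i = 0) {
    std::uint32_t hi = units[i];
    if (hi >= 0xD800 && hi <= 0xDBFF) {
        std::uint32_t lo = units[i + 1];
        return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
    }
    return hi;
}
```

So a character outside the 16-bit range costs two code units instead of
one, but nothing is unrepresentable; the price is that indexing by code
unit is no longer the same as indexing by code point.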

> Whatever decision finally gets made will come down to one of two
> choices: a) a variable-length string format, e.g. UTF-8 or something similar;
> b) a fixed-width format with so many bits that humans are unlikely to use all
> the address space at any time in the next 50/100 years, e.g. UTF-32 or
> similar.
>
> FWIW: my personal preference would be to go for a variable-width encoding
> -> so that we never have to solve this problem again... although this
> makes concepts like text-reflow quite a bit harder to implement.

What's "text-reflow", BTW?

- Volodya

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk