Subject: Re: [boost] [unicode] Interest Check / Proof of Concept
From: Kasra (kasra_n500_at_[hidden])
Date: 2008-11-20 11:09:14


Hi guys,

I have been reading this thread with enthusiasm, and I like the idea of a Unicode string class. However, since Unicode is rather different from the old-fashioned ANSI string literals, some interfaces might need to change.

Maybe what I have to say could help you in one form or another:

+ typedef std::basic_string<uint32_t> unicode_string;

This string class will have all of the normal std::string interface, since every Unicode code point fits within a single 32-bit integral value.
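To make that concrete, here is a minimal sketch of the ordinary basic_string members working on 32-bit elements. It assumes a C++11 compiler and uses char32_t in place of uint32_t, because std::char_traits is only guaranteed to be specialized for character types. The helper names count_non_ascii and example_interface are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumption: char32_t stands in for the uint32_t of the proposal, so that
// std::char_traits and U"..." literals work out of the box.
typedef std::basic_string<char32_t> unicode_string;  // same type as std::u32string

// Hypothetical helper: counts the code points that lie outside ASCII.
std::size_t count_non_ascii(const unicode_string& s) {
    std::size_t n = 0;
    for (unicode_string::const_iterator it = s.begin(); it != s.end(); ++it) {
        if (*it > 0x7F) ++n;
    }
    return n;
}

// Exercises the usual basic_string members on 32-bit elements.
void example_interface() {
    unicode_string s = U"na\u00EFve \u20AC5";    // n, a, i-diaeresis, v, e, space, euro sign, 5
    assert(s.size() == 8u);                      // one element per code point
    assert(s.find(U'\u20AC') == 6u);             // positions are code point indices
    assert(s.substr(0, 5) == U"na\u00EFve");     // slicing never splits a code point
    assert(count_non_ascii(s) == 2u);
}
```

Note that one element is one code point, which is not always one user-perceived character (combining marks still take their own element).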

We could then have a class called encoding that provides static conversions from the unicode_string type to utf8, utf16, etc.

For example:

+ unicode_string s = U"Hello World!";
+ utf8_string utf8 = encoding::utf32::to_utf8(s);
// we could also have an implicit conversion, so that a utf8 string
// performs the conversion from a utf32 string itself
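As a rough sketch of what such a conversion might look like (not the interface I mentioned, just an illustration): it assumes C++11, takes unicode_string to be std::u32string so that the U"..." literal fits, and takes utf8_string to be a plain std::string holding the bytes. Replacing invalid values with U+FFFD is one possible policy among several, and example_to_utf8 is a made-up helper.

```cpp
#include <cassert>
#include <string>

typedef std::u32string unicode_string;  // assumption: char32_t elements
typedef std::string utf8_string;        // assumption: UTF-8 bytes in a std::string

namespace encoding { namespace utf32 {

// Encodes each code point as 1 to 4 UTF-8 bytes. Surrogates and values above
// U+10FFFF are not valid scalar values; this sketch replaces them with U+FFFD.
inline utf8_string to_utf8(const unicode_string& in) {
    utf8_string out;
    out.reserve(in.size());  // lower bound: at least one byte per code point
    for (unicode_string::size_type i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) cp = 0xFFFD;
        if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes: 11110xxx then three 10xxxxxx
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

}}  // namespace encoding::utf32

// Mirrors the two lines of the example above.
void example_to_utf8() {
    unicode_string s = U"Hello World!";
    utf8_string utf8 = encoding::utf32::to_utf8(s);
    assert(utf8 == "Hello World!");  // ASCII maps to identical bytes
}
```

The other direction (utf8 to utf32) would need to validate its input, which is where most of the real work is.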

I personally like this approach a lot. When you think about it, the interface does not change: we still have efficient iterators and insertion, and whenever a more memory-compact encoding is needed we convert to it on demand.

So in a sense what I am saying is: keep the in-memory encoding as UTF-32 for performance, and convert to the desired encoding only when we want to export the text (to screen, file, etc.).

If there is any interest in this approach, drop me a line and I will send you the interface I have for it.

With best regards
Kasra

      

