Boost :


From: Miro Jurisic (macdev_at_[hidden])
Date: 2004-10-19 12:58:55

Next message: Robert Ramey: "[boost] Re: serialization - build shared libs"
Previous message: Robert Ramey: "[boost] Re: serialization - build shared libs"
In reply to: Erik Wien: "[boost] Re: Any interest in adding unicode support to boost?"
Next in thread: Erik Wien: "[boost] Re: Any interest in adding unicode support to boost?"
Reply: Erik Wien: "[boost] Re: Any interest in adding unicode support to boost?"

In article <cl3hl9$g4e$1_at_[hidden]>, "Erik Wien" <wien_at_[hidden]> wrote:

> The basic idea I have been working around is to make an encoded_string
> class templated on Unicode encoding types (e.g. UTF-8, UTF-16). This is made
> possible through an encoding_traits class which contains all necessary
> implementation details for working on strings of code units.

I generally agree with this design approach, but I don't think that code point
iterators alone are sufficient. Iteration over encoded characters and abstract
characters would be needed for some algorithms to function sensibly. For
example, the simple task of:

find(begin, end, "ü")

needs to use abstract characters in order to be able to find precomposed and
decomposed versions of ü.
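
To make the ü case concrete, here is a rough sketch, not a proposed
interface. It assumes code points are held in a std::u32string, and the
helpers decompose(), contains_code_points() and contains_canonical() are
hypothetical names invented for this illustration. The decomposition table
has exactly one entry; a real implementation would use the Unicode
Character Database.

```cpp
#include <algorithm>
#include <string>

// Hypothetical toy decomposition: knows only U+00FC (precomposed ü),
// which decomposes to U+0075 (u) followed by U+0308 (combining diaeresis).
std::u32string decompose(const std::u32string& in) {
    std::u32string out;
    for (char32_t cp : in) {
        if (cp == 0x00FC) {
            out += char32_t(0x0075);
            out += char32_t(0x0308);
        } else {
            out += cp;
        }
    }
    return out;
}

// Exact comparison of code point sequences, as a code point iterator
// combined with a standard algorithm would do it.
bool contains_code_points(const std::u32string& hay,
                          const std::u32string& needle) {
    return std::search(hay.begin(), hay.end(),
                       needle.begin(), needle.end()) != hay.end();
}

// Compare after bringing both sides to the same (toy) decomposed form.
// Simplification: this does not check that a match ends on an abstract
// character boundary, so it could match a prefix of a longer combining
// sequence.
bool contains_canonical(const std::u32string& hay,
                        const std::u32string& needle) {
    return contains_code_points(decompose(hay), decompose(needle));
}
```

With these, searching U"fu\u0308r" for U"\u00FC" fails at the code point
level but succeeds once both sides are decomposed.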

> You could use the encoded_string class like this:
>
> // Constructor converts the ASCII string to UTF-16.
> encoded_string<utf16> some_string("Hello World");
> // Run some standard algorithm on the string:
> std::for_each(some_string.begin(), some_string.end(), do_some_operation);

Again, taking this example, let's say that do_some_operation performs
canonicalization to some Unicode canonical form. You can't do this by visiting
one code point at a time, because the result for a code point depends on the
code points around it.
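
As a sketch of why a per-element operation is not enough: composing "u"
followed by U+0308 into U+00FC needs lookahead and changes the length of
the sequence, neither of which a function called once per code point by
std::for_each can do. The compose() function below is a hypothetical toy
that handles only this one pair; it is not a real normalization
implementation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical toy composition: replaces U+0075 U+0308 with U+00FC.
// It has to inspect the next code point and may emit fewer code points
// than it reads, so it must operate on the sequence as a whole.
std::u32string compose(const std::u32string& in) {
    std::u32string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const bool pair = in[i] == 0x0075 &&
                          i + 1 < in.size() &&
                          in[i + 1] == 0x0308;
        if (pair) {
            out += char32_t(0x00FC);
            ++i;  // skip the combining mark that was just consumed
        } else {
            out += in[i];
        }
    }
    return out;
}
```

Here compose(U"fu\u0308r") yields the three code point string
U"f\u00FCr", while input without the combining mark passes through
unchanged.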

> I am aware that this implementation will be less than ideal for integration
> with the current C++ standard, but it's issues like that I would like to get
> deeper into during the development.

You should explain what problems with integration you foresee.

meeroh


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk