
From: Erik Wien (wien_at_[hidden])
Date: 2004-10-19 13:52:40

Next message: Robert Ramey: "[boost] Re: serialization - build shared libs"
Previous message: Neal D. Becker: "[boost] Re: serialization - build shared libs"
In reply to: Miro Jurisic: "[boost] Re: Any interest in adding unicode support to boost?"
Next in thread: Miro Jurisic: "[boost] Re: Any interest in adding unicode support to boost?"
Reply: Miro Jurisic: "[boost] Re: Any interest in adding unicode support to boost?"

Hi. Thanks for the feedback!

"Miro Jurisic" <macdev_at_[hidden]> wrote in message
news:macdev-BACD3C.13585519102004_at_sea.gmane.org...
> I generally agree with this design approach, but I don't think that code
> point
> iterators alone are sufficient.

Neither do I, as a matter of fact, but this is as far as I have come right
now. :) There would probably be different types of iterators (or iterator
wrappers) made available to enable iteration over everything from code
units to code points and abstract characters.
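
To make the code unit / code point distinction concrete, here is a rough
sketch, assuming UTF-8 storage in a std::string. The helper names
next_code_point and code_points are hypothetical and are not part of any
proposed interface; the decoder assumes well-formed input and does no
validation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: decodes the code point that starts at s[i] and
// advances i past its code units. Assumes well-formed UTF-8.
inline char32_t next_code_point(const std::string& s, std::size_t& i) {
    assert(i < s.size());
    unsigned char lead = static_cast<unsigned char>(s[i++]);
    int extra = 0;          // number of continuation bytes that follow
    char32_t cp = lead;     // a single-byte (ASCII) code point by default
    if (lead >= 0xF0)      { extra = 3; cp = lead & 0x07; }
    else if (lead >= 0xE0) { extra = 2; cp = lead & 0x0F; }
    else if (lead >= 0xC0) { extra = 1; cp = lead & 0x1F; }
    while (extra-- > 0 && i < s.size()) {
        // Each continuation byte contributes its low six bits.
        cp = (cp << 6) | (static_cast<unsigned char>(s[i++]) & 0x3F);
    }
    return cp;
}

// Hypothetical helper: the code point view of a UTF-8 string.
inline std::vector<char32_t> code_points(const std::string& utf8) {
    std::vector<char32_t> out;
    for (std::size_t i = 0; i < utf8.size();)
        out.push_back(next_code_point(utf8, i));
    return out;
}
```

With this, the precomposed "\xC3\xBC" is two code units but one code point
(U+00FC), while the decomposed "u\xCC\x88" is three code units and two code
points (U+0075, U+0308). Neither view sees them as the same abstract
character, which is why a further iterator level would be needed.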

> Iteration over encoded characters and abstract
> characters would be needed for some algorithms to function sensibly. For
> example, the simple task of:
>
> find(begin, end, "ü")
>
> needs to use abstract characters in order to be able to find precomposed
> and
> decomposed versions of ü.
>

True... And this is a point where implementation would be less than trivial.
Comparing strings in Unicode is anything BUT trivial, and it's imperative to
find a good way to implement this functionality through the standard
algorithms.
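
A small sketch of why a plain code point comparison is not enough for the
find example. The function names compose_toy and canonically_equal_toy are
made up for illustration, and the composition step handles only the single
pair u + U+0308; a real implementation would need the full Unicode
normalization data.

```cpp
#include <string>

// Toy composition, for illustration only: maps the sequence
// U+0075 (u) + U+0308 (combining diaeresis) to U+00FC and leaves
// every other code point untouched.
inline std::u32string compose_toy(const std::u32string& in) {
    std::u32string out;
    for (char32_t cp : in) {
        if (cp == 0x0308 && !out.empty() && out.back() == U'u')
            out.back() = 0x00FC;   // replace the base letter with the precomposed form
        else
            out.push_back(cp);
    }
    return out;
}

// Compares two strings after bringing both to the same (toy) composed form.
inline bool canonically_equal_toy(const std::u32string& a,
                                  const std::u32string& b) {
    return compose_toy(a) == compose_toy(b);
}
```

Comparing U"\u00FC" with U"u\u0308" using operator== reports a mismatch,
while canonically_equal_toy treats them as equal. A search would have to
apply the same idea to both the needle and the haystack, for example
compose_toy(text).find(compose_toy(needle)).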

> Again, taking this example, let's say that do_some_operation performs
> canonicalization to some Unicode canonical form; you can't do this by
> iterating
> over code points.
>

Nope. A code unit iterator would be needed for things like that.

>> I am aware that this implementation will be less than ideal for
>> integration
>> with the current C++ standard, but it's issues like that I would like to
>> get
>> deeper into during the development.
>
> You should explain what problems with integration you foresee.

I think I was getting a little ahead of myself when I wrote that. :) The
implementation described here would not pose too much of a problem; I was
thinking more of the problems that arise when you take things like collation
and locales into consideration. From what I understand, there is a real issue
in enabling proper Unicode support in the standard classes like locale,
ctype and collate, as they assume things that do not necessarily apply to a
Unicode representation of text. A failure to enable good support in those
classes (at least locale and ctype) would also break iostream support, and
things start to snowball. I could very well be wrong on this
(actually, I hope I am! :) ), as I haven't had the time to read up on all
the issues concerning this. But again, this is one of many problems I hope
running this project will help reveal.
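
One example of the kind of assumption I have in mind, as far as I
understand it: the ctype facet converts case one char at a time, so it
cannot see a character that spans several code units. The sketch below
uses the classic "C" locale and a hypothetical helper named upper_per_char.

```cpp
#include <locale>
#include <string>

// Upper-cases a string one char at a time through the locale's ctype
// facet, which is the only granularity that interface offers.
inline std::string upper_per_char(const std::string& s) {
    const std::locale& loc = std::locale::classic();
    std::string out;
    for (char c : s)
        out.push_back(std::toupper(c, loc));
    return out;
}
```

Given the UTF-8 bytes for "für", the two ASCII letters are upper-cased but
the two bytes that encode ü pass through unchanged, since neither byte is a
letter on its own. A one-to-one char mapping also has no way to express
conversions that change length, such as ß to SS.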

