From: Miro Jurisic (macdev_at_[hidden])
Date: 2004-04-07 03:55:15

Next message: Vladimir Prus: "[boost] Re: Unicode string"
Previous message: Pavol Droba: "Re: [boost] Re: Regex ease-of-use ideas"
In reply to: Vladimir Prus: "[boost] Unicode string"
Next in thread: Vladimir Prus: "[boost] Re: Unicode string"
Reply: Vladimir Prus: "[boost] Re: Unicode string"

In article <200404071157.53018.ghost_at_[hidden]>,
Vladimir Prus <ghost_at_[hidden]> wrote:

> so the point is that when using string-as-code-point-container, even
> searching and removing a character/substring might get invalid string? E.g.
> even looking for string 'foo' you theoretically can find string 'foo'
> followed by composing character, and removing just 'foo' will be invalid?

Yes, and this is true of all Unicode encodings. Essentially, transformations
that select or remove portions of a string require you to be aware of character
boundaries. Searching, substring extraction, and character removal are such
transformations, whereas concatenation isn't: if you have two strings in the
same encoding, you can concatenate them without dealing with character
boundaries, and that's about it.
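
A minimal sketch of the 'foo' case from the question, assuming a
std::u32string used as a plain code-point container. The names
is_combining_mark and find_whole_characters are hypothetical, and the
combining test covers only the Combining Diacritical Marks block
(U+0300..U+036F); a real implementation would need the Unicode character
database.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Simplified assumption: only U+0300..U+036F count as combining marks.
inline bool is_combining_mark(char32_t c) {
    return c >= 0x0300 && c <= 0x036F;
}

// Hypothetical helper: find `needle` in `text`, accepting only matches that
// neither start on a combining mark nor are followed by one.
inline std::size_t find_whole_characters(const std::u32string& text,
                                         const std::u32string& needle) {
    if (needle.empty()) return std::u32string::npos;
    for (std::size_t pos = text.find(needle);
         pos != std::u32string::npos;
         pos = text.find(needle, pos + 1)) {
        const std::size_t end = pos + needle.size();
        const bool starts_on_boundary = !is_combining_mark(text[pos]);
        const bool ends_on_boundary =
            end == text.size() || !is_combining_mark(text[end]);
        if (starts_on_boundary && ends_on_boundary) return pos;
    }
    return std::u32string::npos;
}

// Usage sketch: "foo" + COMBINING ACUTE ACCENT, a space, then a plain "foo".
inline void example() {
    const std::u32string text = U"foo\u0301 foo";
    const std::u32string needle = U"foo";
    // The naive code-point search matches at 0 and would orphan U+0301.
    assert(text.find(needle) == 0u);
    // The boundary-aware search skips that match and finds the plain "foo".
    assert(find_whole_characters(text, needle) == 5u);
}
```

Erasing the three code points found by the naive search would leave the
combining accent at the start of the string, which is the invalid result
described above.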

> > basic_string is not the abstraction you are looking for, but it's also the
> > only one that is readily available in STL/boost today. It may serve as a
> > good starting point (questionable, IMNSHO), but it should most definitely
> > not be treated as the right thing to use for Unicode in the long term.
>
> I wonder what's the right abstraction then? Is it necessary to have a class
> to represent abstract character, with all composing characters?

That's one way to go, yes; note that the moment you utter those words, you put
yourself into the position of designing a Unicode API :-) which you said you
don't want to do at this time.
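
For illustration only, one possible shape for such a class, assuming an
abstract character is a base code point plus the combining marks that follow
it. The names abstract_character and split_into_characters are hypothetical,
and the combining test is again limited to U+0300..U+036F, which is far
narrower than what Unicode actually defines.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified assumption: only U+0300..U+036F count as combining marks.
inline bool is_combining_mark(char32_t c) {
    return c >= 0x0300 && c <= 0x036F;
}

// Hypothetical "abstract character": a base code point together with
// any combining marks that immediately follow it.
struct abstract_character {
    std::u32string code_points;
};

// Group a code-point sequence into abstract characters. A combining mark
// with no preceding base (at the very start) becomes its own element.
inline std::vector<abstract_character>
split_into_characters(const std::u32string& text) {
    std::vector<abstract_character> result;
    for (char32_t c : text) {
        if (result.empty() || !is_combining_mark(c)) {
            result.push_back(abstract_character{std::u32string(1, c)});
        } else {
            result.back().code_points.push_back(c);
        }
    }
    return result;
}
```

Searching and removal over a sequence of such elements cannot split a base
character from its marks, at the cost of committing to a definition of
"character" that the library then has to maintain.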

meeroh

-- 
If this message helped you, consider buying an item
from my wish list: <http://web.meeroh.org/wishlist>

