From: Rogier van Dalen (rogiervd_at_[hidden])
Date: 2004-10-20 12:52:10
On Wed, 20 Oct 2004 12:20:22 -0400, Miro Jurisic <macdev_at_[hidden]> wrote:
> In article <e094f9eb04102006096b92c870_at_[hidden]>,
> Rogier van Dalen <rogiervd_at_[hidden]> wrote:
> > My plan was to decompose all characters in unicode::string. This makes
> > manipulation of diacritics easier. Correct me if I'm wrong, but your
> > example of finding "ü" in a string would come down to finding the
> > codepoint sequence "U+0075 U+0308" and checking whether it is not
> > followed by another combining character, pretty trivial still.
>
> You have to not only decompose them but put them in a canonical decomposed order
> in order for that to work.
Yes, of course. I left it out thinking it was trivial (which it may
be; you'd need a small part of the Unicode Database though).
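
To make the idea above concrete, here is a minimal sketch, not a proposed unicode::string interface. It assumes Python's standard unicodedata module as a stand-in for the "small part of the Unicode Database": normalize("NFD") does the decomposition and canonical reordering, and "combining character" is taken to mean general category M. The function name find_decomposed is hypothetical.

```python
import unicodedata


def is_combining(ch: str) -> bool:
    # Assumption: "combining character" means general category Mn, Mc or Me.
    return unicodedata.category(ch).startswith("M")


def find_decomposed(haystack: str, needle: str) -> int:
    """Return the index, within the NFD form of haystack, of the first
    occurrence of needle that is not followed by a further combining
    character. Return -1 if there is no such occurrence."""
    # NFD decomposes characters and puts combining marks in canonical order.
    h = unicodedata.normalize("NFD", haystack)
    n = unicodedata.normalize("NFD", needle)
    start = h.find(n)
    while start != -1:
        end = start + len(n)
        # Accept only if the match ends the string or is followed by a
        # non-combining code point; otherwise it is a different character.
        if end == len(h) or not is_combining(h[end]):
            return start
        start = h.find(n, start + 1)
    return -1


if __name__ == "__main__":
    # Precomposed and decomposed spellings of "ü" are both found.
    print(find_decomposed("M\u00fcller", "\u00fc"))
    print(find_decomposed("Mu\u0308ller", "\u00fc"))
    # "u" + diaeresis + acute is a different character, so no match.
    print(find_decomposed("u\u0308\u0301", "\u00fc"))
```

Note that the returned index refers to the decomposed string, not the original. The canonical ordering matters here: "u" followed by U+0308 and U+0323 is reordered to U+0075 U+0323 U+0308, so the plain search for U+0075 U+0308 correctly fails on it.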
Regards,
Rogier
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk