Boost Users :

From: John Maddock (john_at_[hidden])
Date: 2003-12-15 08:05:01


> Aside from the unified Han characters (kanji/hanzi), characters of
> the same category generally aren't neatly grouped together in Unicode.
> The 128-character blocks tend to correspond to locales, communities
> of use or specific legacy encodings and not to categories. You need
> a look-up table (or more efficiently two levels of table) to check
> character categories.

I was talking about the Unicode block ranges (defined in Blocks.txt on the
Unicode FTP site), and you are correct that a two-stage table is required
for general categories.
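
For anyone following along, here is a rough sketch of what a two-stage table might look like. It is only an illustration under stated assumptions, not code from Boost or from any existing library: the names (Category, toy_category, TwoStageTable, build_table) and the 256-entry block size are all made up for the example, and toy_category stands in for real per-code-point data such as the general category field of UnicodeData.txt.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Toy categories (hypothetical); a real table would carry the full set of
// Unicode general categories.
enum class Category : std::uint8_t { Other, Letter, Digit };

// Hypothetical stand-in for real per-code-point data.
inline Category toy_category(char32_t c) {
    if (c >= U'0' && c <= U'9') return Category::Digit;
    if ((c >= U'A' && c <= U'Z') || (c >= U'a' && c <= U'z')) return Category::Letter;
    if (c >= 0x4E00 && c <= 0x9FFF) return Category::Letter;  // unified Han range
    return Category::Other;
}

constexpr std::size_t kBlockBits = 8;                             // assumed block size: 256
constexpr std::size_t kBlockSize = std::size_t{1} << kBlockBits;
constexpr std::uint32_t kMaxCodePoint = 0x10FFFF;

struct TwoStageTable {
    std::vector<std::uint16_t> stage1;  // high bits of code point -> block number
    std::vector<Category> stage2;       // distinct blocks, stored back to back

    Category lookup(char32_t c) const {
        if (c > kMaxCodePoint) return Category::Other;  // outside the code space
        std::size_t block = stage1[c >> kBlockBits];
        return stage2[block * kBlockSize + (c & (kBlockSize - 1))];
    }

    std::size_t unique_blocks() const { return stage2.size() / kBlockSize; }
};

// Builds the table by scanning the code space one block at a time and storing
// each distinct block only once; identical blocks share one stage2 entry.
inline TwoStageTable build_table() {
    TwoStageTable t;
    std::map<std::vector<Category>, std::uint16_t> seen;
    for (std::uint32_t base = 0; base <= kMaxCodePoint; base += kBlockSize) {
        std::vector<Category> block(kBlockSize);
        for (std::size_t i = 0; i < kBlockSize; ++i)
            block[i] = toy_category(static_cast<char32_t>(base + i));
        auto it = seen.find(block);
        if (it == seen.end()) {
            auto index = static_cast<std::uint16_t>(seen.size());
            it = seen.emplace(block, index).first;
            t.stage2.insert(t.stage2.end(), block.begin(), block.end());
        }
        t.stage1.push_back(it->second);
    }
    return t;
}
```

A lookup costs two array reads: build_table() once, then table.lookup(c) per character. The saving comes from the many blocks that are identical (unassigned ranges, or ranges that are all one category), which collapse to a single stored block in the second stage.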

John

