From: Damien Fisher (damien_at_[hidden])
Date: 2002-12-04 05:08:22


I know this has come up before, but I'd really like to see some basic
support for Unicode/wide character encodings in Boost.

While this is a really complicated issue (which I confess I understand
very little about), I think a good start would be to worry only about
conversions between various encodings -- I know there are a few facets
doing just this lying around in the files section at Yahoo.

I am definitely a rank amateur in this area (which is why I'd like to
see something in Boost which does the hard work for me :) ), so I can't
offer any substantive comment on what such a library should have.
However, as a programmer I find that I often want to be able to read in
strings in encoding X and spit them out in encoding Y. I believe
Dinkumware sells a set of facets doing this for popular encodings
(http://www.dinkumware.com/libDCorX.html), but I feel that this is an
important enough problem that there should be an equivalent library in
Boost -- I think a lot of free projects would benefit from it.
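
To make the "read in encoding X, spit out encoding Y" idea concrete,
here is a minimal sketch of about the simplest such conversion,
ISO-8859-1 (Latin-1) to UTF-8. The name latin1_to_utf8 is hypothetical
and is not taken from Boost, Dinkumware, or ICU; a real library would
cover many more encodings, handle invalid input, and would presumably
be packaged as codecvt facets rather than as a free function.

```cpp
#include <string>

// Hypothetical helper for illustration only (not an existing library API).
// Converts a string of ISO-8859-1 (Latin-1) bytes to UTF-8.
std::string latin1_to_utf8(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80) {
            // ASCII bytes are encoded identically in UTF-8.
            out += static_cast<char>(c);
        } else {
            // Latin-1 bytes 0x80..0xFF are code points U+0080..U+00FF,
            // which UTF-8 encodes in two bytes: 110xxxxx 10xxxxxx.
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

Every Latin-1 byte is a valid code point, so this direction cannot
fail; going the other way (or between most other pairs of encodings)
needs an error policy, which is part of what a library would have to
decide.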

As far as I can see, this is a far simpler problem than full-blown
support for Unicode (with string comparisons/equality and the like).
For that more general case, I suspect that providing a friendly wrapper
over a library such as ICU
(http://oss.software.ibm.com/developerworks/opensource/icu/project/)
would be a good start...

I am more than willing to help someone else (hopefully someone with more
domain experience) with the work, but I feel I don't know anywhere near
enough in this area to embark on the project by myself. I really don't
see significant difficulty in doing the conversion facets, but every
time anything to do with wide characters comes up on the newsgroups I
get lost in all the little problems people raise, so that impression
may just reflect my inexperience with the area.

Any takers?

Damien

