From: Vladimir Prus (ghost_at_[hidden])
Date: 2004-07-20 06:34:51


Hi Tilman,
> I'm jumping in, because I am interested in Unicode conversion facets...
>
> > is there a reason why both program_options and serialization contain
> > very similar files utf8_codecvt_facet.cpp?
>
> I had a look at the serialization library's converter in
> utf8_codecvt_facet.cpp
> and noticed that utf8_codecvt_facet_wchar_t::do_in() doesn't check for
> non-shortest UTF8-sequences.

Hmmm... I think it's just an omission, and it would be easy to add.
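
For illustration only, here is a rough sketch of what such a check could look like. This is not the code in utf8_codecvt_facet.cpp; the function name decode_utf8_shortest and its signature are made up for this example. It decodes one sequence and rejects it if the resulting value could have been encoded in fewer bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of utf8_codecvt_facet: decodes one UTF-8
// sequence from [p, end) into cp. Returns the number of bytes consumed,
// or 0 if the sequence is truncated, malformed, or not in shortest form.
inline std::size_t decode_utf8_shortest(const unsigned char* p,
                                        const unsigned char* end,
                                        std::uint32_t& cp)
{
    if (p == end) return 0;

    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes
    // (index 0 is unused).
    static const std::uint32_t min_for_len[5] = {0, 0x0, 0x80, 0x800, 0x10000};

    std::size_t len;
    std::uint32_t value;
    const unsigned char lead = *p;
    if (lead < 0x80)                { len = 1; value = lead; }
    else if ((lead & 0xE0) == 0xC0) { len = 2; value = lead & 0x1F; }
    else if ((lead & 0xF0) == 0xE0) { len = 3; value = lead & 0x0F; }
    else if ((lead & 0xF8) == 0xF0) { len = 4; value = lead & 0x07; }
    else return 0;  // stray continuation byte or invalid lead byte

    if (static_cast<std::size_t>(end - p) < len) return 0;  // truncated input

    for (std::size_t i = 1; i < len; ++i) {
        if ((p[i] & 0xC0) != 0x80) return 0;  // not a continuation byte
        value = (value << 6) | (p[i] & 0x3F);
    }

    if (value < min_for_len[len]) return 0;            // non-shortest form
    if (value > 0x10FFFF) return 0;                    // beyond Unicode range
    if (value >= 0xD800 && value <= 0xDFFF) return 0;  // surrogate code point

    cp = value;
    return len;
}
```

The decoded value is kept in a 32-bit integer on purpose, so whether it fits into the platform's wchar_t is a separate check left to the caller.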

> There might also be some issues on
> platforms with 16-bit wchar_t (possible overflow).
>
> I suggest using (parts of) the UTF library in the Boost files area to solve
> those problems. This could also be another step towards an officially
> supported Unicode library... ;-)
>
> http://groups.yahoo.com/group/boost/files/utf/

While I think that library is OK, and its author, Alberto Barbati, clearly
knew much more about Unicode than I do the last time he posted on this, I
don't think it's a good idea to take that library and add it to details now.
Simply put, it would take another week until the regression tests turn green
again. I also don't think there's any particular difference between the
various UTF-8 implementations...

- Volodya
 

