Subject: [boost] [locale] UTF Iteration and conversion
From: Artyom Beilis (artyomtnk_at_[hidden])
Date: 2011-07-25 08:05:09
Hello All,
Following this discussion:
[regex] How robust are the <boost/regex/pending/unicode_iterator.hpp> adapters?
http://thread.gmane.org/gmane.comp.lib.boost.devel/221643
Boost.Locale will ship **header-only** utf_to_utf conversion functions
and provide operations that would allow implementing iteration over UTF-XX sequences.
This was one of the review requests, and it will be integrated into trunk
in a few days.
Functions:
namespace boost { namespace locale { namespace conv {
template<typename CharOut,typename CharIn>
std::basic_string<CharOut> utf_to_utf(CharIn const *begin,CharIn const *end,method_type how=skip);
template<typename CharOut,typename CharIn>
std::basic_string<CharOut> utf_to_utf(CharIn const *str,method_type how=skip);
template<typename CharOut,typename CharIn>
std::basic_string<CharOut> utf_to_utf(std::basic_string<CharIn> const &str,method_type how=skip);
}}} // boost::locale::conv
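To illustrate what the "skip" method means for such a conversion, here is a minimal standalone sketch for one direction only (UTF-16 to UTF-32). It is not the Boost.Locale implementation; the namespace sketch and the function name utf16_to_utf32 are made up for this example, and it assumes that "skip" means invalid input units are silently dropped. The string overload delegates to the pointer-range one, in the same spirit as the overload set above.

```cpp
#include <cassert>
#include <string>

namespace sketch { // hypothetical namespace, not part of Boost.Locale

// Convert a UTF-16 range to UTF-32. Unpaired surrogates are dropped,
// which is the assumed meaning of the "skip" method.
inline std::u32string utf16_to_utf32(char16_t const *begin, char16_t const *end)
{
    assert(begin <= end);
    std::u32string out;
    while (begin != end) {
        char32_t unit = *begin++;
        if (unit < 0xD800 || unit > 0xDFFF) {
            // Not a surrogate: a single unit is a whole code point.
            out += unit;
        } else if (unit <= 0xDBFF && begin != end
                   && *begin >= 0xDC00 && *begin <= 0xDFFF) {
            // High surrogate followed by a low surrogate: combine the pair.
            char32_t low = *begin++;
            out += 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00);
        }
        // Otherwise: unpaired surrogate, skipped without output.
    }
    return out;
}

// String overload, forwarding to the pointer-range version.
inline std::u32string utf16_to_utf32(std::u16string const &str)
{
    return utf16_to_utf32(str.data(), str.data() + str.size());
}

} // namespace sketch
```

Feeding it "A", a valid surrogate pair, a stray low surrogate and "B" yields three code points: the stray unit disappears and the rest is converted.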
It will also provide basic UTF encoding/decoding operations that make it possible
to build iterators over generic streams:
namespace boost { namespace locale { namespace utf {
typedef uint32_t code_point;
static const code_point illegal = ... ;
static const code_point incomplete = ... ;
// Get one code point from UTF stream
template<typename InputIterator>
code_point decode(InputIterator &p,InputIterator e);
// Write one code point to UTF stream
template<typename OutputIterator>
OutputIterator encode(code_point u,OutputIterator out);
// Get the size of a code point in code units
template<typename CharType>
int units_in_code_point(code_point u);
}}} // boost::locale::utf
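As a rough idea of how these three operations could behave, here is a standalone sketch for UTF-8 only. It is not the Boost.Locale code: the namespace sketch, the function names decode_utf8, encode_utf8 and units_in_utf8, and the two sentinel values are assumptions made for the example (the real values of illegal and incomplete are elided above). The decoder advances the iterator past what it consumed and reports truncated input separately from malformed input.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <string>

namespace sketch { // hypothetical namespace, not part of Boost.Locale

typedef std::uint32_t code_point;
// Assumed sentinel values, chosen outside the Unicode range.
static const code_point illegal    = 0xFFFFFFFFu;
static const code_point incomplete = 0xFFFFFFFEu;

// Read one code point from a UTF-8 sequence and advance p past it.
template<typename InputIterator>
code_point decode_utf8(InputIterator &p, InputIterator e)
{
    if (p == e) return incomplete;
    unsigned char lead = static_cast<unsigned char>(*p++);
    if (lead < 0x80) return lead;                      // ASCII
    int trail;
    code_point c;
    if (lead >= 0xC2 && lead <= 0xDF)      { trail = 1; c = lead & 0x1F; }
    else if (lead >= 0xE0 && lead <= 0xEF) { trail = 2; c = lead & 0x0F; }
    else if (lead >= 0xF0 && lead <= 0xF4) { trail = 3; c = lead & 0x07; }
    else return illegal;        // stray continuation byte or invalid lead
    for (int i = 0; i < trail; ++i) {
        if (p == e) return incomplete;                 // input ended early
        unsigned char b = static_cast<unsigned char>(*p);
        if ((b & 0xC0) != 0x80) return illegal;        // b is left unconsumed
        ++p;
        c = (c << 6) | (b & 0x3F);
    }
    // Reject overlong forms, surrogates and values above U+10FFFF.
    static const code_point min_value[] = { 0, 0x80, 0x800, 0x10000 };
    if (c < min_value[trail] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return illegal;
    return c;
}

// Number of UTF-8 code units needed for a valid code point.
inline int units_in_utf8(code_point u)
{
    return u < 0x80 ? 1 : u < 0x800 ? 2 : u < 0x10000 ? 3 : 4;
}

// Write one valid code point as UTF-8 and return the advanced iterator.
template<typename OutputIterator>
OutputIterator encode_utf8(code_point u, OutputIterator out)
{
    assert(u <= 0x10FFFF && !(u >= 0xD800 && u <= 0xDFFF));
    int n = units_in_utf8(u);
    static const unsigned char lead_mark[] = { 0, 0x00, 0xC0, 0xE0, 0xF0 };
    *out++ = static_cast<char>(lead_mark[n] | (u >> (6 * (n - 1))));
    for (int i = n - 2; i >= 0; --i)
        *out++ = static_cast<char>(0x80 | ((u >> (6 * i)) & 0x3F));
    return out;
}

} // namespace sketch
```

An iterator over a generic stream can then be built as a loop that calls decode_utf8 until the begin iterator reaches the end, treating illegal and incomplete according to the chosen error policy.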
It should be ready very soon...
Artyom Beilis
--------------
CppCMS - C++ Web Framework: http://cppcms.sf.net/
CppDB - C++ SQL Connectivity: http://cppcms.sf.net/sql/cppdb/