[Boost-bugs] [Boost C++ Libraries] #9435: Erroneous character set conversions of strings with more than int32 bytes

From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2013-11-26 14:18:34


#9435: Erroneous character set conversions of strings with more than int32 bytes
-----------------------------------------+---------------------
 Reporter: Martin Korp <martin.korp@…> | Owner: artyom
     Type: Bugs | Status: new
Milestone: To Be Determined | Component: locale
  Version: Boost 1.54.0 | Severity: Problem
 Keywords: character set conversion |
-----------------------------------------+---------------------
 To internationalize our software, we use Boost.Locale together with ICU
 for character set conversions. During our tests we found that strings
 whose size exceeds the maximum value of int32_t (2^31 - 1) cannot be
 converted. The functions boost::locale::conv::to_utf and
 boost::locale::conv::from_utf perform the conversion through
 icu::UnicodeString, which stores its length as an int32_t. Because
 Boost.Locale does not check whether the size of the given string exceeds
 that limit, the behavior of both functions is undefined for large
 strings.
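
 The missing size check could look roughly like the sketch below. It is
 only an illustration: fits_in_int32 and checked_length are hypothetical
 helper names, not part of Boost.Locale or ICU, and the sketch assumes
 that rejecting oversized input with an exception is acceptable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a Boost.Locale or ICU function).
// Returns true when a length of n units is representable as int32_t.
inline bool fits_in_int32(std::size_t n) {
    return n <= static_cast<std::size_t>(
                    std::numeric_limits<std::int32_t>::max());
}

// Hypothetical helper: narrows a length to int32_t, or throws
// std::length_error instead of silently truncating it.
inline std::int32_t checked_length(std::size_t n) {
    if (!fits_in_int32(n)) {
        throw std::length_error("string too large for an int32_t length");
    }
    return static_cast<std::int32_t>(n);
}
```

 A caller would pass text.size() through checked_length before handing
 the buffer to any API that takes an int32_t length, so an oversized
 input fails loudly rather than producing a wrong result.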

 PS: We have already contacted the ICU support mailing list. They told us
 that the UText API (http://icu-project.org/apiref/icu4c/utext_8h.html)
 might be able to handle strings larger than the int32_t maximum. Another
 possibility, according to the ICU support mailing list, would be to use
 ICU's lower-level conversion API (ucnv).
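
 Whichever API is used, one way to stay below the limit is to feed the
 input in pieces. The sketch below is an assumption-laden illustration,
 not ICU or Boost.Locale code: split_utf8 is a hypothetical helper that
 cuts valid UTF-8 input into pieces of at most max_bytes bytes without
 splitting a multi-byte sequence. Converting the pieces independently
 and concatenating the results is only safe for stateless target
 encodings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True for UTF-8 continuation bytes (bit pattern 10xxxxxx).
inline bool is_continuation(unsigned char c) {
    return (c & 0xC0) == 0x80;
}

// Hypothetical helper: split s into pieces of at most max_bytes bytes,
// moving each cut back so it never lands inside a multi-byte sequence.
// Assumes s is valid UTF-8 and max_bytes >= 4 (the longest sequence).
inline std::vector<std::string> split_utf8(const std::string& s,
                                           std::size_t max_bytes) {
    assert(max_bytes >= 4);
    std::vector<std::string> pieces;
    std::size_t pos = 0;
    while (pos < s.size()) {
        const std::size_t remaining = s.size() - pos;
        std::size_t len = remaining;
        if (remaining > max_bytes) {
            len = max_bytes;
            // The byte at pos + len would start the next piece; back up
            // until it is not a continuation byte.
            while (len > 0 &&
                   is_continuation(static_cast<unsigned char>(s[pos + len]))) {
                --len;
            }
            if (len == 0) {
                len = max_bytes;  // malformed input: cut anyway
            }
        }
        pieces.push_back(s.substr(pos, len));
        pos += len;
    }
    return pieces;
}
```

 A streaming converter that keeps state between calls can accept
 arbitrary cuts, so the boundary handling here only matters when each
 piece is converted by a separate, independent call.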

-- 
Ticket URL: <https://svn.boost.org/trac/boost/ticket/9435>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.
