Subject: Re: [boost] [string] Realistic API proposal
From: Anders Dalvander (boost_at_[hidden])
Date: 2011-01-29 06:05:35

Next message: Daniel Pfeifer: "[boost] Distributed Boost with CMake: proposal and volunteering"
Previous message: Ivan Le Lann: "Re: [boost] [string] proposal"
Maybe in reply to: Artyom: "[boost] [string] Realistic API proposal"
Next in thread: Joe Mucchiello: "Re: [boost] [string] Realistic API proposal"

On 2011-01-28 20:12, Joe Mucchiello <jmucchiello_at_[hidden]> wrote:
> // conversion for Windows API
> std::vector<wchar_t> vec;
> vec.resize(count_codepoints<utf8>(mystring.begin(), mystring.end()));
> convert<utf8,utf16>(mystring.begin(), mystring.end(), vec.begin());

I spy with my little eye a potential crash waiting to happen.
Code-points != code-units.
vec has room for N code-units (one per code-point), but up to 2*N
code-units may be written to it if mystring contains non-BMP characters,
since each of those needs a surrogate pair in UTF-16.
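
To make the mismatch concrete, here is a minimal sketch, assuming
mystring holds valid UTF-8. The names count_code_points and
count_utf16_code_units are made up for illustration and are not part of
the proposed API.

```cpp
#include <cstddef>
#include <string>

// Number of code points in a UTF-8 string, assuming valid input:
// every byte that is not a continuation byte (10xxxxxx) starts one.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Number of UTF-16 code units needed for the same text, assuming valid
// input: a 4-byte UTF-8 sequence (lead byte 11110xxx) encodes a non-BMP
// code point, which takes a surrogate pair, i.e. two code units.
std::size_t count_utf16_code_units(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) != 0x80) ++n;  // one unit per code point...
        if ((c & 0xF8) == 0xF0) ++n;  // ...plus one more if it is non-BMP
    }
    return n;
}
```

For U+1D11E (UTF-8 bytes F0 9D 84 9E) the first function returns 1 and
the second returns 2, which is exactly the gap the quoted code overruns.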

"Corrected" code:

    std::vector<wchar_t> vec;
    vec.resize(count_codeunits<wchar_encoding>(mystring.begin(), mystring.end()));
    convert<wchar_encoding>(mystring.begin(), mystring.end(), vec.begin());
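
Since count_codeunits and convert only exist in this thread, here is a
rough self-contained sketch of the same idea in plain C++11. It assumes
valid UTF-8 input and uses char16_t instead of wchar_t so the result does
not depend on the platform's wchar_t width. utf8_to_utf16 is a
hypothetical helper name, not a proposed API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: converts valid UTF-8 to UTF-16, sizing the
// output in code units (not code points) before writing anything.
std::vector<char16_t> utf8_to_utf16(const std::string& in) {
    // Pass 1: count the UTF-16 code units the output will need.
    std::size_t units = 0;
    for (unsigned char c : in) {
        if ((c & 0xC0) != 0x80) ++units;  // start of a code point
        if ((c & 0xF8) == 0xF0) ++units;  // non-BMP: surrogate pair
    }
    std::vector<char16_t> out;
    out.reserve(units);

    // Pass 2: decode each code point and append one or two code units.
    for (std::size_t i = 0; i < in.size();) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else                            { cp = lead & 0x07; len = 4; }
        // Fold in the 6 payload bits of each continuation byte.
        for (std::size_t k = 1; k < len && i + k < in.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3Fu);
        i += len;

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    // Holds for valid UTF-8: the count from pass 1 was exact.
    assert(out.size() == units);
    return out;
}
```

Appending with push_back after reserve keeps the sketch memory-safe even
if the count were wrong; writing through vec.begin() as in the quoted
code is only safe when the count is in code units.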

I think a lot of these potential crashes could be prevented if the
iterator of the new string type (chain, text, tier, yarn) exposed only
(const) code-points. The actual code-units would be hidden and accessed
only through a facade/adapter view or iterator.

    auto u8v = make_view<utf8_encoding>(mystring);
    auto u16v = make_view<utf16_encoding>(mystring);

    for (auto codepoint : mystring) {...}
    for (auto u8codeunit : u8v) {...}
    for (auto u16codeunit : u16v) {...}
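
A rough sketch of what the code-point side of this could look like over
a plain std::string, assuming the string holds valid UTF-8.
code_point_view is a hypothetical name for illustration only. It does
not own the string, so the string must outlive the view.

```cpp
#include <cstddef>
#include <string>

// Hypothetical read-only view that yields code points (char32_t)
// from a UTF-8 encoded std::string. Assumes valid UTF-8.
class code_point_view {
public:
    class iterator {
    public:
        iterator(const std::string* s, std::size_t pos) : s_(s), pos_(pos) {}

        // Decode the code point that starts at the current position.
        char32_t operator*() const {
            unsigned char lead = static_cast<unsigned char>((*s_)[pos_]);
            std::size_t len = length(lead);
            // Keep the payload bits of the lead byte (7, 5, 4 or 3 bits).
            char32_t cp = (len == 1) ? lead : (lead & (0xFFu >> (len + 1)));
            for (std::size_t k = 1; k < len && pos_ + k < s_->size(); ++k)
                cp = (cp << 6)
                   | (static_cast<unsigned char>((*s_)[pos_ + k]) & 0x3Fu);
            return cp;
        }

        // Advance by one whole code point, never past the end.
        iterator& operator++() {
            pos_ += length(static_cast<unsigned char>((*s_)[pos_]));
            if (pos_ > s_->size()) pos_ = s_->size();
            return *this;
        }

        bool operator!=(const iterator& other) const {
            return pos_ != other.pos_;
        }

    private:
        // Sequence length in bytes, derived from the lead byte.
        static std::size_t length(unsigned char lead) {
            if (lead < 0x80) return 1;
            if ((lead & 0xE0) == 0xC0) return 2;
            if ((lead & 0xF0) == 0xE0) return 3;
            return 4;
        }

        const std::string* s_;
        std::size_t pos_;
    };

    explicit code_point_view(const std::string& s) : s_(&s) {}
    iterator begin() const { return iterator(s_, 0); }
    iterator end() const { return iterator(s_, s_->size()); }

private:
    const std::string* s_;
};
```

With this, for (char32_t cp : code_point_view(mystring)) {...} never
hands out a partial sequence, so code that only sees this interface
cannot confuse a code-unit count with a code-point count.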

I also think there is no reason the new string type *has* to be UTF-8
internally. It could be UTF-16, UTF-32, SCSU, or CESU-8 internally for
that matter. Making a view from the internal encoding to an external
encoding should be a no-op when both encodings are the same.

Regards,
Anders Dalvander

-- 
WWFSMD?

