Subject: Re: [boost] [gsoc] Request Feedback for Boost.Ustr Unicode String Adapter
From: Stewart, Robert (Robert.Stewart_at_[hidden])
Date: 2011-08-15 09:19:26


Yakov Galka wrote:
> On Fri, Aug 12, 2011 at 15:04, Matus Chochlik
> <chochlik_at_[hidden]> wrote:
> > On Fri, Aug 12, 2011 at 1:08 PM, Yakov Galka
> > <ybungalobill_at_[hidden]> wrote:
> > > On Fri, Aug 12, 2011 at 12:00, Matus Chochlik
> > > <chochlik_at_[hidden]> wrote:
> > >> On Fri, Aug 12, 2011 at 9:57 AM, Daniel James
> > >> <dnljms_at_[hidden]> wrote:
> > >> > On 11 August 2011 12:57, Artyom Beilis
> > >> > <artyomtnk_at_[hidden]> wrote:
>
> > >> // by default expect UTF8
> > >> text(const std::string& str)
> > >> {
> > >>     assert(is_utf8(str.begin(), str.end()));
> > >>     store(str);
> > >> }
> > >
> > > What you are doing is, in fact, forcing the assumed
> > > encoding of std::string to UTF-8. You just said you
> > > think it's a bad idea.
> >
> > No, I'm proposing to implement a *new* class that
> > will store the text in UTF8 encoding and if during
> > the construction no encoding is specified, then it
> > is assumed that the particular std::string is already
> > in UTF8.
>
> > This is *very* different from imposing
> > an encoding on std::string which is already
> > used in many situations with other encodings.
> > i.e. my approach does not break any existing code.
>
> Sorry, your arguments start to look non-constructive to me.
> Correct me where I'm wrong in the following reasoning.
>
> (1) You object to UTF-8 strings in boost interface because
> someone may pass something other than UTF-8 there and it's
> going to be undetected at compile time:
>
> namespace boost { void func(const std::string& a); } // UTF-8
> boost::func(non_utf_string); //oops
>
> You're proposing a `text` class that is meant to somehow
> overcome this problem. So you change the boost interface
> to accept `text` but user code is left unchanged...:
>
> namespace boost { void func(const text& a); }
> boost::func(non_utf_string); //oops, the implicit constructor
> taking std::string is called.
>
> Yes, you can make this constructor explicit, so the above code
> stops compiling and the user must write explicitly:
> boost::func(text(non_utf_string));
>
> But then there is nothing in your proposal that makes
> std::string utf-8 encoded by 'default'. Default == implicit.

As soon as the client wrote that cast, the client claimed that non_utf_string meets the requirements of the text class's constructor. The problem is then one of the client misusing the class through an ill-advised cast. What's more, I think Soares indicated a debug-build validation that the argument is indeed UTF-8.

I don't see a problem in that design, once the constructor is explicit.
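
A minimal sketch of that design, under stated assumptions: the `text` class, the `is_utf8` helper, and `lib::func` below are hypothetical stand-ins written for illustration, not the interface of Boost.Ustr or of any proposal in this thread. It shows an explicit constructor that rejects implicit conversion from std::string at compile time and checks the UTF-8 precondition only in debug builds.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical helper: returns true if s is well-formed UTF-8
// (rejects stray continuation bytes, truncated sequences, overlong
// encodings, surrogates, and code points above U+10FFFF).
inline bool is_utf8(const std::string& s)
{
    static const unsigned long min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
    std::size_t i = 0;
    const std::size_t n = s.size();
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t len;
        unsigned long cp;
        if (c < 0x80)                { len = 1; cp = c; }
        else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
        else return false;                      // invalid lead byte
        if (i + len > n) return false;          // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min_cp[len]) return false;     // overlong encoding
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return false;
        i += len;
    }
    return true;
}

// Hypothetical text class: stores UTF-8 and assumes its argument is
// already UTF-8 when no other encoding is given.
class text
{
public:
    // explicit: the caller must spell out text(s), which is the claim
    // that s satisfies the UTF-8 precondition.
    explicit text(const std::string& utf8) : data_(utf8)
    {
        assert(is_utf8(data_)); // debug-build validation; gone under NDEBUG
    }
    const std::string& str() const { return data_; }
private:
    std::string data_;
};

namespace lib {
// Hypothetical library function that accepts only text.
inline std::size_t func(const text& t) { return t.str().size(); }
}

// A std::string can be turned into text on request, but never implicitly,
// so lib::func(some_std_string) does not compile.
static_assert(std::is_constructible<text, std::string>::value,
              "explicit construction is allowed");
static_assert(!std::is_convertible<std::string, text>::value,
              "implicit conversion is rejected");
```

With this, `lib::func(text(s))` compiles and `lib::func(s)` does not. Passing a non-UTF-8 string through `text(...)` trips the assertion in a debug build and goes unchecked in a release build, which is the trade-off described above.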

> > Besides, it does not harm you in any way
>
> It does. I already use UTF-8 for all my strings, even on
> windows, and I don't want the code-bloat of all these
> conversions (even if they're no-ops).

What code bloat do you get from NOPs? Sure, the compiler needs more time to parse the text code and the optimizer needs more time to streamline it into a NOP, but even that is very likely negligible.

_____
Rob Stewart robert.stewart_at_[hidden]
Software Engineer using std::disclaimer;
Dev Tools & Components
Susquehanna International Group, LLP http://www.sig.com