Boost logo

Boost :

Subject: [boost] [review] string convert
From: Gordon Woodhull (gordon_at_[hidden])
Date: 2011-05-02 11:26:41


Hi Vladimir, Ed, all,

It is a curiosity of Boost reviews that a vote of conditional acceptance is actually more powerful than a No vote, because if you vote No, the author might try to convince you to change your vote but otherwise has no incentive to listen to you any further.

Therefore I vote to accept this library with the strong condition that a simpler syntax be adopted, doing away with the odd static ::from() function, and completely getting rid of the confusing, error-prone lazy functions. I am aware that the author thinks the present syntax is necessary, but I will argue against it in each case.

(As another meta-review comment, I've noticed a few authors saying they won't consider any changes unless their library is accepted. This sets up a deadlock situation: the reviewer will not accept the library without the changes, and the author won't make the changes without acceptance. Everyone digs in their heels and nothing gets done.

I think any author who really wants their library to be in Boost should be willing to submit it for a second review. And I think reviewers should tend toward conditions on acceptance rather than no votes. My two cents.)

> - What is your evaluation of the design?

That's the focus of my review, specifically, the design of the interface.

Supplying a default value, a don't-throw option, and applying manipulators/locale, are the major features missing from lexical_cast. So I think there is no doubt that a library like this is needed, if lexical_cast can't be upgraded. And according to its author, lexical_cast can't be upgraded because it must look like a cast and all these features require more parameters (template or actual). So, fine, let's have a replacement that's not .*_cast.

It is incredibly clear what lexical_cast does because it uses the standard meaning of the operators, function calls, and return values. There is a lot of value in that.

I don't think it's always a bad idea to have lazily evaluated objects or creative uses of operators, but IMHO if it is possible to do something more clearly with the standard order of evaluation and standard use of operators and function calls, then that should be preferred.

The interface of the proposed Convert library is not simple enough. To start out with the most annoying thing, I don't like "::from" at all. The replacement should look as close to lexical_cast as possible.

So, why the extra '::from'? Even though I disagree with the lazy evaluation (see below), even that doesn't seem to require this weird static member syntax. Is '::from' just so that '::result' will not feel lonely?

For the rest of this review, I'm going to refer to convert<T>::from(s), but please keep in mind that I really think it should be something like string_convert<T>(s).

So, next '::result'. It's some form of optional return. So why not just return optional? It is not much less to type
convert<int>::result maybe = convert<int>::from("not an int");
versus
optional<int> maybe = convert_cast<int, use_opt>("not an int");

(Where use_opt might also be spelled dont_throw, opt_out, .. I don't really care. Yes there may be problems because of the necessary deduced template argument for the source type. But isn't there a way around that with Boost.Parameter?)

Why should the LHS modify the behavior of the proxy object returned by ::from? It's far too much to think about for such a simple operation. I'd rather the function clearly did or didn't throw on its own, not be influenced by its context.

Lazy evaluation in expression templates is typically used to offer some kind of declarative syntax. So let's ask ourselves if that is needed here. The main innovation here is a lazily evaluated object which can then
- throw or not throw based on its context
convert<int>::result i = convert<int>::from("not a string");
- take iostream manipulators, which are applied before the conversion takes place.
- take additional options through a second application of function call operator (!)

All of these uses defy the way we ordinarily read function evaluation. What, the conversion is now feeding some data into std::hex? And then where does it go after that? It gets returned? Weird.

The right shift operator was chosen in the standard because it depicts an arrow; otherwise there is nothing that says I/O about it. Thus the only way this syntax would make sense to me is if it were
int i = std::hex >> convert<int>::from("deadbeef");
and then unfortunately it would be impossible to have more than one manipulator without some horrible parentheses, because of the left associativity.

I don't like the look of streaming operators that lead nowhere.

No, why not just
  int i = convert<int>::from(s, std::hex);
? Or
  from(s, _manip = std::hex)
Or, at worst (and I don't see why this is necessary)
  from(s)(_manip = std__hex);

If you want the operator syntax, just do
int i;
stringstream(s) >> std::hex >> i;

If it's really got to be one line, a syntax like
  int i = stringstream_(s) >> std::hex >> ret_;
would at least make sense and I'm sure that there's a way to do that almost as succinctly using Phoenix.

On a similar nice-feature-but-weird-syntax note, let's look at the Using Boost.Convert with Standard Algorithms section, which offers a few ways that you can use convert<T>::from() to create a functor:

std::transform(
    strings.begin(),
    strings.end(),
    std::back_inserter(integers),
    convert<int>::from(string(), -1) >> std::hex);
std::transform(
    strings.begin(),
    strings.end(),
    std::back_inserter(integers),
    convert<int>::from<string>()(fallback_ = -1) >> std::hex);

Apparently if you pass an empty string or omit the parameter to from(), it creates a function object which takes a string as a parameter. Is this correct? This seems like it would have unwanted consequences, since we now have a function object that can take either extra options or the string input. Again, I'm not against clever syntax, but this seems promiscuous and confusing.

I think it would be much more appropriate to provide a Phoenix function object for the purpose, something like
std::transform(
    strings.begin(),
    strings.end(),
    std::back_inserter(integers),
    convert_<int>(_1, -1, std::hex));
Sigh of relief. I see what that means immediately, and it doesn't make me worry about what convert's going to do if I forget a parameter.

I agree with Jeroen's request for a default-constructed version. I don't see why no syntactic sugar could be doled out for this, a _defcon (template or real) parameter or whatever.

I also want to mention type-to-type conversion, which Vicente Botet has thought out much more thoroughly in his Conversion library. This library offers type-to-type conversion seemingly as an afterthought, just because the syntax allows it.

I definitely see the value of a conversion mechanism that first looks for a direct type-to-type conversion and then defaults to using type-to-string-to-type. I just don't think conversion should always do that, and I don't think such a mechanism should be called plain convert. It does make any more sense to call such a thing plain convert than it does to start optimizing lexical_cast with special cases that don't use strings (sorry!). It would be called best_conversion or something that conveys that it's doing the best it can based on what converters have been "installed" at compile time. Again my objection is not to the functionality but to an obfuscating interface where you're not sure what behavior you're going to get.

I feel the same about the string-to-string conversion: it seems to be in there just because the syntax supports it, not because it actually makes sense here. I guess this is because library is called Convert instead of String Convert or Via Stream Convert, which is its main focus (and rightly so).

Apologies for repeating this: I still don't get why the extra options go into a second function call operator. I found some earlier messages where you simply added those to the first function call. Simplify, simplify, simplify the interface. Please.

To summarize, I don't get why we can't just have this simple clear syntax:

optional<int> i = string_convert<int, dont_throw>(s); // or string_convert<int>(s, _throw=false) if that's impossible
int i = string_convert<int>(s, manip_ = std::hex);

It's (plain) functional. There's no question when things get evaluated or thrown. I don't care about any of the particular names used, just the syntax.

> - What is your evaluation of the implementation?

I read the code enough to answer my own questions about what the library actually does. As noted, I disagree with many design choices, but the implementation seems fine.

The only thing that caught my eye as a little strange is that there's the string_sfinae namespace and header which in fact contains a bunch of metafunctions which are sometimes used in sfinae and sometimes just used as metafunctions. Obviously not a big deal, just some metonymy or historical artifact there.

To return to convert<>::result
On Apr 27, 2011, at 7:07 PM, Vladimir Batov wrote:
> They, indeed, seem to provide *similar* functionality. In fact, convert::result *uses* boost::optional. If you have a look at convert::result (it's quite small), then everything else in addition to boost::optional is what convert::result adds. In words, IMHO convert::result is better than boost::optional tailored for its domain (conversions).

So you made me look. I conclude that the only functionality it adds is the ability to delay throwing on error until the result::value() is called, am I wrong? I don't understand why you'd want this, because I thought the whole point of using ::result was when you didn't want to throw. But maybe I'm missing something. Again I think this could all be SO much simpler.

> - What is your evaluation of the documentation?

It's well written and explains the subject well.

It seems a bit flippant in places ("I've included this and that because why not?" kinds of statements) and IMO that's because the focus of this library is a little too fuzzy. Stick to string/stream conversion, fix the problems with lexical cast. Don't try to build a general conversion framework unless you're really willing to think that through.

I looked for a Rationale section that would explain why this functionality could not be included in lexical_cast, and could not find it. Vladimir, I know you said

On Apr 26, 2011, at 6:47 PM, Vladimir Batov wrote:
> I again urge you to please dig the archives to see why that "extra
> functionality should be added to boost::lexical_cast" view is not exactly original and most importantly why it was decided *not* to proceed as you insist.

But I don't think that's a substitute for explaining to new users why they can't get what they want with modest improvements on the lexical_cast interface.

Why is it that they really want some fancy lazy function object / "declarative" interface?

Likewise with this non-argument:

On Apr 26, 2011, at 6:47 PM, Vladimir Batov wrote:
> I've come to a firm conclusion that that additional
> functionality/configurability is *not* implementable within the boundaries of the lexical_cast interface. In particular, the TypeIn and TypeOut types must be discriminated, i.e. one type has to be specialized first. Without going into details I'll just say that otherwise is just does not work. So, it has to be convert<TypeOut>::from<TypeIn>. You can take my word for it or you could walk that road yourself.

Sorry, but I find this response either lazy or rude. You wrote the library: you explain to us why the way it is! I'd really like to know why you think it's necessary to do convert<T>::from(s) rather than the much more familiar and easy-to-grasp string_convert<T>(s). I don't want to try to write it myself to find out why it does or doesn't need to be that way.

You entered a fairly contentious arena with a majorly different interface. Write a rationale and don't just tell us to search the list.

> - What is your evaluation of the potential usefulness of the
> library?

Very, very useful. I use lexical_cast and am frustrated by the limitations.

> - Did you try to use the library? With what compiler? Did you
> have any problems?

No, I did not - I see how it works and trust that it does. I was not tempted to start looking for ways to make the library produce surprising results because it was immediately clear to me from the interface design that those misbehaviors would exist. I'm not always a Keep It Simple person, but in this case, please do. Otherwise you're destroying one of the best features of lexical_cast.

> - How much effort did you put into your evaluation? A glance? A
> quick reading? In-depth study?

I spent about 5 hours reading the documentation and code, and 5 hours reading/rereading the current reviews and writing mine.

I wish I'd had time to go back to the earlier discussions, although I searched a little bit; again I think everything that could be learned there should have been in the Rationale.

> - Are you knowledgeable about the problem domain?

Moderately. I like to know how every construct I'm using works, and I gulp at the inefficiency of lexical_cast every time I use it, even though I never use it where efficiency matters.

To further prove my point that conditional acceptance is stronger than voting no, note that I've basically made my yes vote conditional on fixing most of the things Thomas Heller complained about, notably

On Apr 26, 2011, at 2:58 PM, Thomas Heller wrote:
> Moreover, I don't get how the design really is superior to boost::lexical_cast, and why it couldn't been implemented as an extension to that library.
(It can't take a .*_cast name, but it can use the simple syntax.)

> The use of this converter function object looks like a nice concept at first sight but, as mentioned above, in the end, probably adds more confusion and problems than anything else.
(It confused me and didn't look nice.)

> I don't think that having everything hidden behind this static from function is wise and should be rethought. Different functions, for different purposes would be better suited.
On Apr 27, 2011, at 5:14 AM, Thomas Heller wrote:
> Even though you tried to be minimal and
> simplistic, the function is overloaded with too different (in concept)
> behavior that its really confusing to get what the function is really
> doing. So my end result is that its not really minimal and
> minimalistic, but bloated ... sometimes, more is less.
(I think more template or function parameters could also do the trick.)

I realize that my conditions mean some serious redesign, but this is an 834-line library and it would probably be even shorter without all the unnecessary lazy functions and weird out-of-place operators.

I want to contrast this with my No vote in the XInt review, where I also had issues with the design. I would have voted for conditional acceptance of XInt if it were possible, but the changes there seemed like they would need additional review. String Convert is much simpler and I think that a compromise could be reached to (IMO) simplify the interface and answer most of the objections that Thomas, Jeroen, and I have made.

I apologize to Ed for making a complicated vote. I'd be glad to clarify my position if needed.

Cheers to Vladimir - you've identified an important problem! Now just give it a clear interface and we'll have something good.

Gordon


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk