Boost logo

Boost :

Subject: Re: [boost] [review] string convert
From: Vladimir Batov (vbatov_at_[hidden])
Date: 2011-05-02 22:35:43


> Gordon Woodhull <gordon <at> woodhull.com> writes:
> Hi Vladimir, Ed, all,
> ...
> I am aware that the author thinks the present syntax is necessary, but I will
> argue against it in each case.

If the library is accepted (which seem unlikely so far and that's truly OK with
me), then I am prepared to discuss and try out alternative approaches (if they
are in some concrete "try-able" form). The reason I've "come up" with the API as
it is now is because during discussions some 2-3 years ago I in fact tried
various proposed interfaces as we were discussing them. The results of those
trial-n-errors is this current API.

> It is incredibly clear what lexical_cast does because it uses the standard
> meaning of the operators, function calls, and return values.

int i = lexical_cast<int>(str);
int i = convert<int>::from(str);

When I look at the above where lexical_cast and 'convert' provide the same
functionality, I am not sure if from the user perspectives I immediately see
how 'convert' differs from lexical_cast complexity-wise. I am certainly biased.

> There is a lot of value in that.

Yes and no. From the design perspectives I agree -- simple and easy to grasp.
>From the user perspectives I have to disagree as that simple design results in
an implementation which has very little value for my deployment.

> I don't think it's always a bad idea to have lazily evaluated objects or
> creative uses of operators, but IMHO if it is possible to do something more
> clearly with the standard order of evaluation and standard use of
> operators and function calls, then that should be preferred.

I hear your dissatisfaction but I am not sure I have enough information to act
on addressing it. We can discuss it further if/when the time comes.

> The interface of the proposed Convert library is not simple enough. To start
> out with the most annoying thing, I don't like "::from" at all. The
> replacement should look as close to lexical_cast as possible.

I'd try distinguishing "not simple enough" and "annoying". The *user* API is
one (!) function -- convert::from -- on one side and *optionally*
convert::result on the receiving side. I really do not know how to simplify it
any further. I personally do not consider changing that "annoying" from() to
something else to be a simplification. So, by "simplification" you probably had
something else in mind and I am all prepared to listen.

As for "I don't like "::from" at all", then it's a personal choice and we'll
never get a consensus here. The practical reason for "::from" is two-fold --
1) it clearly shows the direction of the conversion (which's been deemed
important during original discussions); 2) the syntax has to be
convert<TypeOut>::from<TypeIn> (replace 'convert' and 'from' with the
identifiers of your choice), i.e. two types need to be evaluated separately.
Otherwise, various specializations (if there are such) conflict.
That's why the lexical_cast-like API (where both TypeIn and TypeOut evaluated at
the same time) does not work (at least I did not manage it to work). For
example, one might have a string-to-type specialization. Then if one wants to
write string-to-bool, it won't compile as both specializations will match.

> So, why the extra '::from'?

See the (brief) explanation above.

> ... seem to require this weird static member syntax.

I am not sure I am following. What's exactly weird about it (apart from your
personal syntax preferences)? It's the standard C++ syntax, right?

> Is '::from' just so that '::result' will not feel lonely?

:-)

> I really think it should be something like string_convert<T>(s).

We had these discussions when I was writing 'convert'. string_convert<T>(s) is
not good (as far as *I* tried) because it does not indicate direction of the
conversion; it is limited to strings (which is again open for
(mis)interpretation); it has TypeIn and TypeOut evaluated at the same time which
did not work when I tried.

> So, next '::result'. It's some form of optional return.
> So why not just return optional? It is not much less
> to type
> convert<int>::result maybe = convert<int>::from("not an int");
> versus
> optional<int> maybe = convert_cast<int, use_opt>("not an int");

Yes, convert::result does not offer *much* beyond boost::optional. However, it
*does* offer more. That's the reason it exists. Namely, convert::result provides
two pieces of information -- if the conversion failed/succeeded *and* the
fallback value (if provided). boost::optional provides one *or* the other. The
difference might be considered subtle but it's there.

> Why should the LHS modify the behavior of the proxy object returned by ::from?

Because usages of 'convert' differ -- namely, throwing and non-throwing
behavior. I personally do not use throwing behavior at all but I feel it was
needed to provide/support the behavior (and resulting mess) for lexical_cast
backward compatibility and to cater for usages different from mine.

> It's far too much to think about for such a simple operation.

Not really. Well, I do not see it that way anyway.

convert<int>::result res = convert<int>::from("blah", -1);
if (!res) message("conversion failed");
int value = res.value(); // proceed with the fallback anyway.

The locality of 'convert' application (IMHO of course) does not leave much to
think how 'convert' is to behave.

> I'd rather the function clearly did or didn't throw on its own, not be
> influenced by its context.

Understood. I feel the behavior complexity (as perceived by you anyway) was
driven by the complexity/variability of usages/applications. If it can be done
easier, I'd like to hear that (if we proceed accepting 'convert' -- for *me* it
already does all I want it to do).

> convert<int>::result i = convert<int>::from("not a string");
> - take iostream manipulators, which are applied before the conversion takes
> place.
> - take additional options through a second application of function call
> operator (!)
>
> All of these uses defy the way we ordinarily read function evaluation. What,
> the conversion is now feeding some data into std::hex? And then where does it
> go after that? It gets returned? Weird.

"Weird" is a subjective thing. If you look under the hood (of this library,
other libraries or, say, as a washing machine), then it might indeed look
"weird". From the user perspectives though I feel it's quite straightforward.
That's my view, of course and we can discuss it later if needs be.

> The right shift operator was chosen in the standard because it depicts an
> arrow; otherwise there is nothing that says I/O about it. Thus the only way
> this syntax would make sense to me is if it were
> int i = std::hex >> convert<int>::from("deadbeef");
> and then unfortunately it would be impossible to have more than one
> manipulator without some horrible parentheses, because of the left
> associativity.

Well, again it's an endless debate of one's API look-n-feel preferences over
somebody else's. Beauty is in the eyes of beholder. I do not see parentheses as
horrible. In fact, I love them (even though I am not a Lisp guy) :-).

> I don't like the look of streaming operators that lead nowhere.
>
> No, why not just
> int i = convert<int>::from(s, std::hex);
> ? Or
> from(s, _manip = std::hex)
> Or, at worst (and I don't see why this is necessary)
> from(s)(_manip = std__hex);

The huge *practical* advantage of operator() is chaining. I personally do not
see one or the other of the below simpler or harder and both are of equal
length:

   from(s, _manip = std::hex);
   from(s)(_manip = std::hex);

However, the first one has the limit on the number of args. Second does not.
Second is much easier/cleaner to implement (therefore, to maintain which is even
more important) as #1 requires from() overloading or def. parameters. Thirdly,
second looks more natural as a progression of increasing complexity as the user
gains additional knowledge:

   int i = from(s); // simplest
   int i = from(s)(_manip = std::hex);
   int i = from(s)(_manip = std::hex)(locale_ = ...);

Implementation for the above is considerably cleaner as from() does its own
thing. op(_manip) does its own and again op(locale_) does its own. Each piece of
respective functionality is well encapsulated in the relevant function.
Beautiful. :-) IMHO of course.

I wrote it that way. You'd probably write it differently and,
then, I'd probably criticize your API. :-) So far, personal preferences aside
the op()-based API seems to be the winner due its practicality and
extensibility.

> On a similar nice-feature-but-weird-syntax note, let's look at the Using
> Boost.Convert with Standard Algorithms section, which offers a few ways that
> you can use convert<T>::from() to create a functor:
>
> std::transform(
> strings.begin(),
> strings.end(),
> std::back_inserter(integers),
> convert<int>::from(string(), -1) >> std::hex);
> ...
> I think it would be much more appropriate to provide a Phoenix function object
> for the purpose, something like
> std::transform(
> strings.begin(),
> strings.end(),
> std::back_inserter(integers),
> convert_<int>(_1, -1, std::hex));
> Sigh of relief. I see what that means immediately, ...

Apologies for sounding as a broken record but again that preference seems quite
personal. As a user not familiar with Phoenix your variant would not look
familiar to me. In order to use it I'd need to learn/remember/know additional
API -- convert_, _1, args order. My deployment seems simpler (not
surprisingly some bias here :-) ) as the user uses the same API and does not
need to learn more to use 'convert' with algorithms.

Mentioning that not for the sake of arguing tooth and nail for *my* API. I am
happy to consider alternatives. I just do not see your suggested API as a clear
winner (not yet anyway).

> I also want to mention type-to-type conversion, which Vicente Botet has
> thought out much more thoroughly in his Conversion library. This library
> offers type-to-type conversion seemingly as an afterthought, just because the
> syntax allows it.

I have to disagree. Again I consider 'convert' to be more of a conversion
framework rather than a library. Indeed, 'convert' does not provide much more
beyond string-to-type conversions because I personally did/do not need more.
However, it does not mean I/we should deny that possibility to others as their
needs may differ. Therefore, 'convert' was designed with that in mind. If
Vicente needs type-to-type conversion functionality, I think he might consider
implementing that functionality within the 'convert' framework by extending the
'convert' library for everyone's benefit.

I do not think just one person can provide every possible conversion. I cannot
anyway. And with 'convert' I am not saying I do.

> I feel the same about the string-to-string conversion: it seems to be in there
> just because the syntax supports it, not because it actually makes sense here.

No, I mention string-to-string conversion because other people might find it
interesting even though I personally do not have use for such a deployment. That
said, I wish I had that available to me quite some time back when I was dealing
with password encryption/decryption.

> I guess this is because library is called Convert
> instead of String Convert or Via Stream Convert, which is its main focus
> (and rightly so).

> Apologies for repeating this: I still don't get why the extra options go into
> a second function call operator. I found some earlier messages where you
> simply added those to the first function call. Simplify, simplify, simplify
> the interface. Please.

Again, I personally do not see one or the other of the below simpler or harder.

   from(s, _manip = std::hex);
   from(s)(_manip = std::hex);

So I suspect that "simplify, simplify, simplify" cry might have a
personal-preference under-tones. ;-) Due to some advantages described before I
currently prefer #2 for purely practical/utilitarian reasons.

> To summarize, I don't get why we can't just have this simple clear syntax:
> ... for brevity snipped alternative syntax suggestions.

The reason as I see it is fairly simple. The API grew as different uses/needs
needed to be addressed. What you consider to be a simpler syntax might have been
considered but rejected as it was not fitting some other usages the library
needed to address. Or it might be that I did not manage it to work and together
we might get it to work... or I simply missed that possibility. What I am trying
to say, is that I personally do not consider 'convert' to be perfect in every
respect and even less so complete. I believe I did implement the functionality
that I immediately needed. More so, I tried to address concerns/suggestions
brought up while the library was being written. So, I feel it might be a good
basis from which we could move *forward* improving it -- adding conversions
others needed, optimizing existing conversions. The alternative of rejecting it
is truly fine by me (less hassle for me) but it will take us where we've been
for close to 10 years. Then, someone else will try it all *again*... with I
suspect very similar results. Namely, he'll try extending lexical_cast interface
just to realize it's too restrictive. Then, he'll propose some
other-than-lexical_cast interface just to cop all
why-or-why-you-do-not-use-lexical_cast flames. Then, he'll try extending all
other-than-his usages/deployments suggested during discussions that would lead
to (I suspect) a similar design/implementation which would result in subsequent
(much later) criticism that the design/implementation was unnecessarily complex.
I am not complaining but I'd really like us to at long last start moving
forward. Maybe an alternative might be to delay reviewing this library for a
year (?) and let guys try extending lexical_cast as there were numerous
suggestions. Then, we'll re-evaluate the situation. Although my suspicion is
that in 1-2 years that lexical_cast-extension push will fizzle out once again,
new people will come in asking the same questions and we won't move an inch.

> ... some sensible comments snipped for brevity only
> Again I think this could all be SO much simpler.

I hear you. I really do. When I just started replacing lexical_cast in *my* work
it was as straightforward as aux::string::from() and aux::string::to() (to/from
and 'string' names-related arguments aside). That was what I initially
suggested after on the list we decided we could not proceed with lexical_cast.
The 'convert' is the result. I can only say that every "complication" was added
for a reason after discussions on this list. It was not just my fancy. You
and/or I might consider a particular reason not legitimate enough. Some might
disagree. Difficult choices. Still no effort can fix everything -- no
design/implementation can satisfy everyone.

> On Apr 26, 2011, at 6:47 PM, Vladimir Batov wrote:
> > I've come to a firm conclusion that that additional
> > functionality/configurability is *not* implementable within the boundaries
> of the lexical_cast interface. In particular, the TypeIn and TypeOut types
> must be discriminated, i.e. one type has to be specialized first. Without
> going into details I'll just say that otherwise is just does not work. So, it
> has to be convert<TypeOut>::from<TypeIn>. You can take my word for it or you
> could walk that road yourself.
>
> Sorry, but I find this response either lazy or rude. You wrote the library:
> you explain to us why the way it is!

Well, being rude has certainly never been my intention. If I came across as
such, then it is quite unfortunate and please accept my humble apologies. The
problem in such situations is that it's hard/impossible to figure out how much
information the person *actually* wants to know; if the respective question was
indeed a genuine interest or a rhetoric question given that the person had
already rejected the library as a whole.

> I'd really like to know why you think it's necessary to do convert<T>::from(s)
> rather than the much more familiar and easy-to-grasp string_convert<T>(s).

I am not sure if I managed to explain that above. The reason why TypeIn and
TypeOut need to be separated is that if we have specialization for, say,
string-to-type enabled with

template<class TypeOut, class StringIn>
typename boost::enable_if<convert_detail::is_any_string<StringIn>, TypeOut>::type
convert_to<TypeOut>(StringIn const&)
{...}

then, say, string-to-bool further specialization/optimization won't be possible

template<class StringIn>
bool
convert_to<bool>(StringIn const&)
{...}

It won't compile as both "specializations" will match. It did not work for me.
If you can get it to work by some other means, I'll be only happy.

> ...
> I realize that my conditions mean some serious redesign, ...

Again, it is not to say I am against a re-design and re-doing the library
altogether. However, as I indicated the existing functionality/API are in the
lib. for a reason. Taking that functionality out or changing API will affect
those use-cases that functionality/API were for. So, I'd approach that re-design
with caution. I understand that might change your vote and I am OK with that.

> Cheers to Vladimir - you've identified an important problem! Now just give it
> a clear interface and we'll have something good.

Thank you. Although I suspect I cannot take the credit for identifying the
problem -- that's been with us for long-long time. What I was seemingly naive
about was my hope that, if we had something, then we at long last could get it
moving beyond just talk. I am more than happy to scrap my implementation in
favor of some other alternative. For a user (and primarily I am a user)
ending up with nothing is not a good outcome.

Apologies if I have not addressed all questions/notes/suggestions. I am happy
to work with anyone willing to stick with this library (or suggest his own
alternative library) long enough to know and to take into consideration
different deployment patterns to produce something useful. I am happy to step
aside knowing there is someone prepared to take this (or his own) library to
fruition. No irony or sarcasm here. I think users
need that functionality and I'd really like to see that functionality available
in one form or another.

Best,
V.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk