Boost logo

Boost :

From: Zach Laine (whatwasthataddress_at_[hidden])
Date: 2022-08-22 14:53:47


On Mon, Aug 22, 2022 at 9:41 AM Zach Laine <whatwasthataddress_at_[hidden]> wrote:
>
> On Sun, Aug 21, 2022 at 6:47 PM Vinnie Falco <vinnie.falco_at_[hidden]> wrote:
> >
> > On Sun, Aug 21, 2022 at 4:03 PM Zach Laine <whatwasthataddress_at_[hidden]> wrote:
> > > ...
> >
> > It wouldn't be a proper review without at least one person forgetting
> > to use "Reply All" :) So here it is, from Zach. I have quoted his
> > message:
> >
> > > This makes sense. The larger issue I'm raising is that a user is
> > > going to have a harder time remembering this, if the names don't
> > > match. A way to mitigate that is to make this naming rationale
> > > explicit, so that user
> > > expectations will adjust. I didn't get that this was the pattern from
> > > the documentation, though I think you did say it. A call-out
> > > specifically noting this would porbably be enough.
> >
> > Yeah we can do that.
> >
> > > Hm. By lazy flat_map, do you mean that the positions are computed one
> > > step at a time during iteration? is that why they have
> > > forward_iterators?
> >
> > Yes. operator++ and operator-- intelligently find the next param or
> > segment using a non-validating parse algorithm (since we know that the
> > underlying character buffer has already been validated).
> >
> > > If so, please document that. It might make a
> > > pretty big difference whether I decided to keep using a params object
> > > that I need to update a lot, or copy everything over to a proper
> > > flat_map or vector to do my work.
> >
> > Yes I agree, but I thought we did document that. Here, no?
> >
> > <https://master.url.cpp.al/url/ref/boost__urls__params_encoded_view.html#url.ref.boost__urls__params_encoded_view.complexity>
> >
> > We could probably add something here too:
> >
> > <https://master.url.cpp.al/url/ref/boost__urls__params_encoded_view/iterator.html>
>
> Right, and I understood how those worked. However, I had no idea that
> the owning containers were based on the same lazy approach (unless I
> misunderstand what you said in this thread...).
>
> > > This works for an immutable container. It is *very surprising* for a
> > > mutable one. The rest of my notes about the containers are predicated
> > > on a false assumption that mutability implied non-laziness in the
> > > container itself. I think all of these container would be better off
> > > referred to as views, and made immutable. All you really need to
> > > support is begin/end, plus maybe key-lookup. I suggest you drop the
> > > mutability entirely. With mutability, these containers are trying to
> > > do two things at once.
> >
> > Well, that's another way to do it and it certainly would remove
> > complexity and simplify the names. Right now we have 8 containers:
> >
> > segments
> > segments_view
> > segments_encoded
> > segments_encoded_view
> > params
> > params_view
> > params_encoded
> > params_encoded_view
> >
> > All of these containers reference the underlying character buffer of
> > the URL (although you can also construct the views from separately
> > parsed string views but that's basically the same thing in terms of
> > the ownership model being reference-like).
> >
> > If we remove mutability as you suggest, then we can cut the number of
> > containers down to 4:
> >
> > params_view
> > segments_view
> > params_encoded_view
> > segments_encoded_view
> >
> > But this leaves us with a problem - how do you modify individual path
> > segments and query params?
>
> Easy -- you just copy them into a flat_map, vector, or whatever. The
> 90% use case is just to look at the result of the parse. Also making
> the parsing library responsible for mutations you do to the data later
> seems unnecessary. If you provide parsing -- including the lazy views
> -- and then writing of URLs, and the writing can be done via a generic
> container, that's all your library needs, IMO. To be clear:
>
> Parsing -> lazy_containers -> user mutations outside of Boost.URL ->
> writing URLs
>
> That last step only needs to understand the different kinds of data it
> might write, abstractly. That abstraction might be a params_view, a
> flat_map, or a std::vector<std::pair<strd::string, std::string>>.
>
> > How do you do position-based insert and
> > erase of elements using an iterator? These features seem kind of
> > useful, I'm not sure I want to lose them.
>
> You won't lose them, because the STL exists.
>
> > > > Did you mean that it "doesn't" have proper error reporting? Where do
> > > > you see this lack of reporting (if that's what you meant)?
> > >
> > > By "proper" I mean that if the parse fails it should provide an
> > > end-user-friendly diagnostic, and optionally quote the line and put a
> > > caret under the position that the parse failed (a la the Clang
> > > diagnostics). Your non-end-user error reporting is fine.
> >
> > Oh... hmm.. well, each rule does leave the iterator out-param at the
> > spot where an error occurred. But I'm not quite sure how to emit the
> > diagnostic - surely you aren't suggesting we write to std::cout or
> > std::cerr? Are we using curses or something? Or maybe just plain ASCII
> > art? How exactly would this look?
>
> The diagnostics go to some optional output channel, which might be a
> stream or file or logger. If you don't set the output, the diagnostic
> is never generated. You set this output in the call to parse*(), as
> an optional param -- usually a std::function<void(std::string const
> &)>.
>
> > Thanks to Peter's work on boost::system::error_code, when a parse
> > error occurs the error is augmented with file name, line, and function
> > name information. And I have worked on adding visualizers so these are
> > nicely presented in the debugger. So at least we have that.
>
> That's really nice for programmers, but it leaves the programmer with
> a lot of work to do to present the error to the user.
>
> Zach

Forgot to reply-all *again*.

Zach


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk