Boost logo

Boost :

Subject: Re: [boost] [graph] dijkstra pull request ping
From: Marcin Zalewski (marcin.zalewski_at_[hidden])
Date: 2015-10-23 11:05:59


On Fri, Oct 23, 2015 at 10:47 AM alex <alexhighviz_at_[hidden]> wrote:

>
>
> >-----Original Message-----
> >From: Boost [mailto:boost-bounces_at_[hidden]] On Behalf Of Marcin
> >Zalewski
> >Sent: 23 October 2015 14:33
> >To: boost_at_[hidden]
> >Subject: Re: [boost] [graph] dijkstra pull request ping
> >
> >On Thu, Oct 22, 2015 at 4:56 PM alex <alexhighviz_at_[hidden]> wrote:
> >
> >> > > > What do you mean by not evaluating decreased? Then we use
> >decreased
> >> > > > to check which visitor function we should call, so I am not sure
> if
> >> > > > one
> >> > > could
> >> > > get rid
> >> > > > of that without having two versions of the code (one for the
> >> "classical"
> >> > > version,
> >> > > > and one for an arbitrary distance map).
> >> > > >
> >> > >
> >> > > Sorry I am not clear. Yes, I do think that you should get rid of the
> >> > > redundant check. And, yes that would mean that if you want the
> >> > > arbitrary distance map, you should have two versions.
> >> > >
> >> >
> >> > I would only agree with that if we could demonstrate that there is a
> >> statistically
> >> > significant performance advantage of doing so. Looking at the code, I
> am
> >> > skeptical that this check has much or any impact on performance on
> most
> >> > modern platforms. Do you have any evidence that shows there would be a
> >> > benefit to having two versions of the code?
> >> >
> >>
> >> There are the following advantages:
> >>
> >> 1. you avoid one unnecessary check for each vertex
> >> 2. you do not need to initialize values in the distance map
> >> 3. you do not need to specify an "inf" value for distance
> >>
> >> (1) and (2) should bring some performance gain, but it is so little that
> I
> >> did not notice it and did not feel the urge to measure it. I don't think
> >> performance is a good reason to take out the redundant check.
> >>
> >
> >The check only becomes redundant if there are two versions of the code,
> one
> >for dijkstra_shortest_paths_no_init and one for dijkstra_shortest_paths.
> >The code that covers dijkstra_shortest_paths_no_init automatically covers
> >dijkstra_shortest_paths.
>
> Now you lost me. I thought the dilemma is to have either one version of
> dijkstra_shortest_paths_no_init based on the patch by Piotr, or two have
> two
> versions, one as in the patch by Piotr and one as currently implemented. In
> either case, dijkstra_shortest_paths can remain just as it is.
>
> The patch by Piotr stands in the way of some other optimizations that were
> proposed 3 years ago. The benefit of this optimizations in terms of
> performance is only marginal, but they do make the dijkstra_shortest_path
> simpler and cleaner. Also, to me it seems his patch is no longer really
> Dijkstra's but some extension of it. To me it is natural to make that a
> separate function and give it another name.
>
> However since Piotr's patch doesn't break anything, but just extends the
> function, I can see that it is really just an issue of preference.
>

The current implementation works both for the case of no_init where the
distance property map has values in it already and for the "plain" case
where we initialize the values to inf. Piotr's patch does not change that.
He removes the visitor code and adds a direct loop code, but there is still
only one version of the actual traversal. His patch makes the code
consistent with the pseudo code here:

http://www.boost.org/doc/libs/1_59_0/libs/graph/doc/dijkstra_shortest_paths.html

Note that in the pseudo code discovery of a vertex happens only if the new
distance is less than the old distance.

If we would go with the ideas you describe, the code would not work for
arbitrary distance map. There are 2 cases:

1) All values in the distance map are inf (or are implicitly inf).
2) User provides a distance map with arbitrary values and they must be
respected per pseudo code.

As far as I understand, you want to do something specific for case 1) where
we do not need to do the check in the pseudo code because we know it will
always be true, right? If we did so, we need to have code that also works
for case 2). Right now we just have the code for case 2) and it also works
for case 1). Let me know if I misunderstood something.

> >There is an advantage to it, and if one would provide separate code for
> the
> two, there should be a compelling reason. In
> >my opinion, in this case the reason should be performance. Otherwise, why
> >do this?
> >
> >> (2) and (3) however make the algorithm simpler and cleaner to use. It
> >> becomes easier / possible to apply the method when there is no good
> value
> >> for "inf". I do think that is a good reason to take out the redundant
> >> check.
> >>
> >
> >Yes, I agree with you that it makes *one version* of the algorithm
> cleaner,
> >but we have to make a decision that we want to provide that version in the
> >first place. I think that providing an elegant version at the cost of two
> >different implementations is actually additional complexity.
>
> In my opinion it would not really be two versions of the same algorithm,
> but
> two functions for two similar algorithms. You can see that they are two
> different algorithms because in the application that Piotr references
> (http://paal.mimuw.edu.pl/docs/do_vv_2kminus1.html) it is used as a means
> of
> computing approximate shortest paths rather than exact shortest paths (if I
> understand correctly, because the precalculated distances supplied to the
> algorithm are approximate rather than exact).
>

I think it is the same function. It implements pseudo code in:

http://www.boost.org/doc/libs/1_59_0/libs/graph/doc/dijkstra_shortest_paths.html

The only difference in the two algorithms you describe is whether the
distance map is provided by the user or whether it is initialized to inf
for all vertices. We could write different code for the user-initialized
and the inf-initialized maps, but that would give us more code to maintain.
Piotr's implementation actually implements the check condition that can be
seen in pseudo code before checking for WHITE.

> >
> >> If you take out the redundant checks in the relax function, you no
> longer
> >> need to use boost::closed_plus and can do with std::plus. Again, to me
> that
> >> makes the function simpler and cleaner. But that is a separate issue,
> >> because you can do that whether you make two separate functions or not.
> >>
> >
> >True, agreed. Still, the same as above: do we want to have two
> >implementations instead of one?
> >
>
> In this case there is not really that dilemma. All that is required here is
> an additional variation of the relax function (relax_target_only), but that
> is an implementation detail. This works with and without Piotr's patch.
>

I am not sure about that, but it could be the case. Perhaps a patch could
demonstrate that.

> >_______________________________________________
> >Unsubscribe & other changes:
> http://lists.boost.org/mailman/listinfo.cgi/boost
>
>
> _______________________________________________
> Unsubscribe & other changes:
> http://lists.boost.org/mailman/listinfo.cgi/boost
>


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk