Boost Users :

Date view	Thread view	Subject view	Author view

Subject: Re: [Boost-users] performance of dijkstra_shortest_paths
From: Bhaskara Marthi (bhaskara_at_[hidden])
Date: 2009-05-01 16:41:18

Next message: Roman Shmelev: "Re: [Boost-users] [asio] async write handler is not called until the socket is not closed"
Previous message: Bhaskara Marthi: "Re: [Boost-users] performance of dijkstra_shortest_paths"
In reply to: Bhaskara Marthi: "Re: [Boost-users] performance of dijkstra_shortest_paths"

On Fri, May 1, 2009 at 12:34 PM, Bhaskara Marthi <bhaskara_at_[hidden]> wrote:

>
>
> On Fri, May 1, 2009 at 11:34 AM, Thomas Klimpel <
> Thomas.Klimpel_at_[hidden]> wrote:
>
>> Bhaskara Marthi wrote:
>> > > Try building in release mode? The debugging instrumentation in the BGL
>> can
>> > > kill performance.
>> >
>> > I'm compiling my source files with -O3 using gcc on ubuntu.
>> > That should be sufficient, right?
>>
>> For release mode, NDEBUG must be defined. However, a superficial glance at
>> the sources doesn't reveal any debugging instrumentation depending on
>> NDEBUG. Andrew, is there any?
>>
>> Another question for Andrew: Why was the relaxed heap replaced with a
>> 4-ary heap? In my experience, a normal array based binary heap is a
>> nightmare for the cache, and I wonder whether the 4-ary heap will be much
>> better in this respect. However, I don't know whether the cache behavior of
>> a relaxed heap is much better.
>>
>> Regards,
>> Thomas
>> _______________________________________________
>> Boost-users mailing list
>> Boost-users_at_[hidden]
>> http://lists.boost.org/mailman/listinfo.cgi/boost-users
>>
>
> I verified that I'm compiling with NDEBUG. Compiler flags are shown below:
>
> /usr/bin/c++ -Dtopological_graph_EXPORTS -O3 -DNDEBUG -fPIC
> -I/u/bhaskara/ros/ros-pkg/world_models/topological_map/include
> -I/u/bhaskara/ros/ros-pkg/deprecated/deprecated_msgs/msg/cpp
> -I/u/bhaskara/ros/ros/core/rosconsole/include -I/opt/ros/include/boost-1_37
> -I/u/bhaskara/ros/ros-pkg/common/tinyxml/include
> -I/u/bhaskara/ros/ros-pkg/motion_planning/sbpl/src
> -I/u/bhaskara/ros/ros-pkg/motion_planning/navfn/include
> -I/u/bhaskara/ros/ros-pkg/motion_planning/navfn/srv/cpp
> -I/u/bhaskara/ros/ros-pkg/motion_planning/navfn/msg/cpp
> -I/u/bhaskara/ros/ros-pkg/world_models/costmap_2d/include
> -I/u/bhaskara/ros/ros-pkg/common/laser_scan/include
> -I/u/bhaskara/ros/ros-pkg/common/laser_scan/msg/cpp
> -I/u/bhaskara/ros/ros-pkg/common/filters/include
> -I/u/bhaskara/ros/ros-pkg/common/loki/build/loki-0.1.7/include
> -I/u/bhaskara/ros/ros-pkg/common/angles/include
> -I/u/bhaskara/ros/ros-pkg/common/robot_srvs/srv/cpp
> -I/u/bhaskara/ros/ros-pkg/common/tf/include
> -I/u/bhaskara/ros/ros-pkg/common/tf/msg/cpp
> -I/u/bhaskara/ros/ros-pkg/common/tf/srv/cpp
> -I/u/bhaskara/ros/ros/core/roscpp/include
> -I/u/bhaskara/ros/ros/3rdparty/xmlrpc++/src
> -I/u/bhaskara/ros/ros-pkg/common/bullet/build/include
> -I/u/bhaskara/ros/ros-pkg/common/bullet/swig
> -I/u/bhaskara/ros/ros-pkg/visualization_core/visualization_msgs/msg/cpp
> -I/u/bhaskara/ros/ros-pkg/common/robot_msgs/msg/cpp
> -I/u/bhaskara/ros/ros/std_msgs/msg/cpp
> -I/u/bhaskara/ros/ros/core/roslib/msg/cpp
> -I/u/bhaskara/ros/ros/core/roslib/include
> -I/u/bhaskara/ros/ros-pkg/3rdparty/eigen/build/eigen-2.0.0
> -I/opt/ros/include -I/u/bhaskara/ros/ros/3rdparty/gtest/gtest/include
> -I/u/bhaskara/ros/ros-pkg/world_models/topological_map/srv/cpp
> -DROS_PACKAGE_NAME=\"topological_map\" -O3 -DBT_USE_DOUBLE_PRECISION
> -DBT_EULER_DEFAULT_ZYX -DBT_USE_DOUBLE_PRECISION -DBT_EULER_DEFAULT_ZYX -W
> -Wall -Wno-unused-parameter -fno-strict-aliasing -o
> CMakeFiles/topological_graph.dir/src/grid_graph.o -c
> /u/bhaskara/ros/ros-pkg/world_models/topological_map/src/grid_graph.cpp
>
>
> I also tried using my own priority-queue based version shown below. On a
> graph with about 10^4 nodes and degree 4, my version takes about .05
> seconds, while dijkstra_shortest_path takes ten times as long.
> - Bhaskara
>

I came up with a potential explanation. So my graph is actually about 10^6
nodes in size, but the initial node is in a connected component of size
10^4. Now, my implementation only scales with the size of this connected
component: nodes are added to the distance map only as and when they are
visited, so all those nodes outside the connected component don't matter.
But, I suspect that bgl's dijkstra_shortest_paths is using data structures
that scale with the graph size rather than the connected component. If heap
performance scales logarithmically, the 100-fold increase in size would make
the algorithm several times slower.

Bgl experts: does this seem like the right explanation? If so, what should
I do: are there algorithms that take advantage of the smaller connected
component, or am I stuck rolling my own?

- Bhaskara

Also, I just checked what the time is being spent on in dijkstra.
Apparently, about half the time is spent on the initialization phase of
marking every node's distance as infinite. My algorithm avoids this. The
graph I'm trying it on is actually a million-node graph, but the start node
is in a connected component of size 10,000. So that explains part of the
performance difference --- not including the initialization, my
implementation is just four times faster than dijkstra_shortest_paths. Once
the initialization is complete, though, are there other parts of the
algorithm that scale with the overall graph size?

text/html attachment: attachment

Next message: Roman Shmelev: "Re: [Boost-users] [asio] async write handler is not called until the socket is not closed"
Previous message: Bhaskara Marthi: "Re: [Boost-users] performance of dijkstra_shortest_paths"
In reply to: Bhaskara Marthi: "Re: [Boost-users] performance of dijkstra_shortest_paths"

Date view	Thread view	Subject view	Author view

Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net