Boost logo

Boost Users :

From: Jens Thiele (karme_at_[hidden])
Date: 2021-06-14 10:26:31


Adam Wulkiewicz via Boost-users <boost-users_at_[hidden]> writes:

> W dniu 14.06.2021 o 11:33, Jens Thiele via Boost-users pisze:
>> Adam Wulkiewicz via Boost-users <boost-users_at_[hidden]> writes:
>>
>>> W dniu 23.05.2021 o 17:33, Adam Wulkiewicz pisze:
>>>> W dniu 18.05.2021 o 09:34, Jens Thiele via Boost-users pisze:
>>>>> I have lots of linestrings and want to find the k nearest linestrings to
>>>>> some point.
>>>>>
>>>>> Looking at the example
>>>>> /libs/geometry/doc/index/src/examples/rtree/polygons_shared_ptr.cpp
>>>>> I first thought this should be close to the solution and I just could
>>>>> replace the polygons with linestrings. But now I think the nearest query
>>>>> in that example only finds the nearest polygon bounding boxes and not
>>>>> the nearest polygons. Am I correct?
>>>>>
>>>>> If yes, how would one extend that example to find the nearest polygons?
>>>> Hi Jens,
>> Hi,
>>
>>>> Yes, your understanding is correct. Bounding boxes of polygons
>>>> together with pointers to polygons are stored in the rtree. This is
>>>> also what is returned by the query. So you need to calculate the
>>>> distances to actual linestrings by yourself.
>>>>
>>>> I propose you to use query iterators instead of query function. Then
>>>> you can iterate over nearest boxes (passing the number of values
>>>> stored in the rtree into the nearest predicate). In the loop
>>>> calculate distances to linestrings and break when you have enough of
>>>> them. You should probably break when the number of linestrings you
>>>> have is equal to your K and the distance to the furthest linestring
>>>> is lesser than the distance to the current box returned by the rtree
>>>> query (because then you know that you will not get any closer
>>>> linestring). To track the furthest linestring you can use
>>>> std::priority_queue.
>>>>
>>>> Adam
>>> Correction: "the current box returned by the rtree query"
>>>
>>> I of course had in mind: "the current box returned by the query iterator"
>> I followed that route and the results look correct but performance is
>> really bad.
>> Nearly all time (0.5s with a really small test) is consumed by qbegin:
>> rtree_t::const_query_iterator it = t->qbegin(idx::nearest(pt,
>> tree_size));
>>
>> Any ideas how to improve performance?
>
> Are you compiling with optimizations enabled?

yes, with -O2

> Without knowing your code I can't esstimate if this is long time or
> not, I can only guess. In general qbegin() loop gathering some number
> of nearest elements should take similar time to query() call doing the
> same. So you can try getting some number of knn boxes from the R-tree
> and checking if this is the case.

it looks like I have some fundamental misunderstanding here:
t->qbegin(idx::nearest(pt, 100)) is really fast (<1ms) but
t->qbegin(idx::nearest(pt, tree_size)) with tree_size=37133 is really
slow (~500ms)

shouldn't those calls take approximately the same time? I thought the
search is done incrementally?

> The speed of the R-tree is determined by the tree internal structure
> which is created during balancing or packing (packing is the fastest),
> minimum and maximum number of elements in nodes and the data itself
> (sparsity vs overlap). How is your R-tree created and what are the
> parameters?

I am modifying existing code there and unfortunately don't have a
minimalistic test to share. Let me see:

namespace b = boost;
namespace bg = b::geometry;
namespace bgm = bg::model;
namespace idx = bg::index;
namespace ipc = b::interprocess;
typedef bgm::point <double, 2, bg::cs::cartesian> point_t;
typedef bgm::box<point_t> box_t;
typedef std::pair<box_t, way_info_t> value_t;
typedef idx::linear<51, 25> params_t;
typedef idx::indexable<value_t> indexable_t;
typedef idx::equal_to<value_t> equal_to_t;
typedef ipc::allocator<value_t, ipc::managed_mapped_file::segment_manager> \
allocator_t;
typedef idx::rtree<value_t,
                   params_t,
                   indexable_t,
                   equal_to_t,
                   allocator_t> rtree_t;

> Some users confuse maximum number of parameters in node with
> capacity/size of the R-tree. The result is the R-tree with very big
> root node which is effectively a vector storing all of the
> elements. So the query has to check all of the elements of this node
> and by extension the whole R-tree  during a query.

ok

Jens


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net