Boost logo

Geometry :

Subject: [ggl] rtree query interface
From: feverzsj
Date: 2011-08-10 08:06:35


hi:

1. IIRC, ggl still miss functor interfaces for predicates to use with
boost.range.
something like:

template<class G1>
struct Predicator
{
    G1 const& g1;

    Predicator(G1 const& g) : g1(g) {}

    template<class G2>
    bool operator()(G2 const& g2) const
    {
        return predicate(g1, g2);
    }
};

such functor may be used with both boost.range and rtee;

2. rtree may be consider as either an index structure or a boost.range:
while using with some predicates(e.g. equals/intersects/overlaps/within),
rtree works faster as an index structure;
while using with other predicates, rtree may be just taken as a boost.range.
so: "t | index::filtered(within(g))" and "t | filtered(within(g))" give the
same result but the previous one could be much more faster.

as for chained case,
     t | pred1() | pred2()| ... | predN()

Will it return a new rtree for index::filtered()?
Well, I think this could be less efficient for most use case, and not work
well with those predicates that rtree not good at.
I'd prefer simply return a sequential range at the first place.
so :
   only the first "t | index::filtered(pred(g))" makes use of index
facility. The followed adaptors will just deal with normal sequential range;

Adam Wulkiewicz wrote:
> Barend Gehrels wrote:
>
>>> t | within(g) - get values within some geometry
>>> t | intersects(g) - get values intersecting some geometry
>>
>>
>> Basically I like the clean syntax. These range adaptors, having the same
>> name, should be in another namespace and if users hoist both namespaces
>> with the using-clause, it will be ambiguous. So namespace will be
>> necessary here (or in the normal function), making it a bit longer.
>
> Of course you're right. I assumed that everything is in the
> geometry::index namespace. So some algorithms from geometry corresponds to
> the filters in geometry::index.
>
>> What if we don't do this and just use the "filtered" adaptor defined by
>> Boost.Range?
>> t | filtered(within(g))
>
> It's ok for me. There also may be some standard filters implemented:
> index::within_filtered, index::intersects_filtered,
> index::nearest_filtered. intersects_filtered is implemented already as
> spatially_filtered.
>
> Btw note that geometry::index filters should be the first ones in the
> sequence of filters aplied to the container.
>
> t | some_not_index_filter() | index::filtered(...)
>
> won't work.
>
>>
>> The advantage is that AND, OR and NOT are much easier to handle (if
>> people write their own predicate - but maybe Phoenix or Lambda can be of
>> help here, didn't check this now).
>
> Or Proto if it couldn't be done with Phoenix.
>
>> However, the basic requirement is that queries like this are using the
>> spatial index. So I don't know how the filtered would use that index. I
>> assume that "within" would do that (did you already did some work
>> here?). So we indeed need "something which is a Boost.Range adaptor,
>> uses the spatial index, and is capable to do user-defined queries at
>> compile-time".
>>
>> We might consider adding a "spatially_filtered" adaptor which acts like
>> this.
>
> spatial_filter<...> is implemented as mentioned before (intersects). It's
> just generated by
>
> operator|(index::rtree<...> const&,
> index::filters::spatially_filtered<...> const&)
>
> Currently the querry is performed in the spatial_filter<...>'s constructor
> by calling rtree<...>::find(...) method.
>
>>> Note that some filters may refer to the geometry which is passed or to
>>> the values stored in the rtree. Someone may want to get values within
>>> some geometry or to get values containing some geometry. He may just
>>> use previously described filters and connect them (described further)
>>> but then e.g. 2 operations must be done instead of 1. And for some
>>> queries this couldn't be possible (with operator| only). So there
>>> could be:
>>>
>>> - additional filters: t | outside(g)
>>
>>
>> Outside is actually not (always) the same as ! within, because if a
>> geometry partly intersects another, it is ! within and also ! disjoint.
>> Outside might be renamed to "disjoint" here.
>
> Got it. I forgot about disjoint. Any other algorithms which could be used?
>
>>> - filters generated by some operator: t | ~within(g) - confusing
>>> - object passed before in the sequence: t | inv | within(g) - even
>>> more confusing
>>> - object passed with filter: t | inv(within(g))
>>>
>>> In addition to this every filter can have it's negation generated by
>>> operator!
>>>
>>> !within(g) == not_within(g)
>>> !intersects(g) == not_intersects(g)
>>>
>>> Eventually there could be additional filters e.g. "get objects with 0
>>> coordinate greather than some other object" coords<0, more>(g) etc. It
>>> would be nice to have e.g. "get points on the right of the linestring"
>>
>>
>> Basically I like the idea of ! within(g), it is very readable.
>>
>> But it is, if I'm right, different from Boost.Range adaptors which
>> cannot be negated like this. The Boost.Range adaptor overloads operator|
>> to get its effect, you additionally need to overload operator! for a
>> range adaptor as well. I didn't try but don't know if this is possible
>> (probably it is).
>>
>> Besides that we have to be careful what is exactly the negation, see
>> comment above.
>
> In the case of using ranges, just another type of range would be generated
> by operator! e.g. index::ranges::not_within. And the meaning should be
> exactly as the meaning of !geometry::within(). Just like in the case of
> using the loop, checking all elements and returning values meeting
> !within(some_geometry) requirement.
>
>>>
>>> 2. K nearest neighbour filters
>>>
>>> nearest(g) - get nearest value to the geometry g
>>> nearest(g, 5) - get nearest 5 values to the geometry g
>>
>> What will the first line return? All values in order of distance?
>
> One value nearest to the geometry. If we have container of points and the
> geometry is a point it would be the closest point. Internally there would
> be algorithms from geometry used so if there was
> comparable_distance()/within()/intersects()/etc. implemented for Indexable
> and Geometry the query would compile, otherwise it wouldn't.
>
>>> Folowing filters describes geometrical relationship but they'd
>>> probably be used with NN filter.
>>>
>>> further(g, comp_dist1)
>>> closer(g, comp_dist2)
>>> range(g, comp_dist1, comp_dist2)
>>>
>>> In addition there could be k furthest neighbor filter:
>>> furthest(g) == !nearest(g)
>>
>> The last line I don't understand... What if we have only 4 elements and
>> we want the nearest(g, 5). It will return 4 values. The furthest will
>> also return 4 values, so it is not the negation.
>
> operator! was a suggestion. If we want to return sorted ranges for
> nearest() and furthest() then nearest() will generate range sorted from
> closest object and furthest() from furthest.
>
>>> 3. Filters connection
>>>
>>> 3.1. operator|
>>>
>>> "get values within geometry g1 AND not intersecting geometry g2"
>>> t | within(g1) | !intersects(g2)
>>
>> Concatenation is possible in Boost.Range adaptors and I like it. So it
>> will work automatically (besides the ! syntax). It might not be the most
>> efficient in all cases because they are filtered twice, so the case you
>> describe (g1 AND not g2) is implemented differently (though the effect
>> is the same). You mention that also below. And how do both make use of
>> the spatial index?
>> Therefore, the filtered predicate might be better, being able to do this
>> in one step.
>
> As you noticed, in order to work efficiently, filters from g::index
> shouldn't be processed exactly like filters in Boost.Range -
> transparently. They should be used only with the rtree object or other
> filter from g::index. operators | should generate one compound object (or
> an expression if you like) which would be passed to the rtree::find() e.g.
> when the begin() method of this object (which is also a range) is called.
>
>>>
>>> "get 5 nearest to g2 values overlapping g1" which means
>>
>> I don't understand this example, because something overlapping something
>> else does have a distance of 0.0, so nearest is more or less
>> undertermined here. With the centroid below it is more clear.
>
> I assumed that nearest just uses comparable_distance() so it will work if
> it is implemented for geometries. Moreover I assumed that
> comparable_distance() returns closest distance between e.g. geometries'
> perimeters or perimeter and some point etc.
>
> If we have a river(linestring) in a country (polygon) and rtree containing
> some indexables(e.g. POI) on the entire continent this query would be
> something like: "find 5 indexables nearest to the river inside some
> country".
>
> Btw distances between boxes and other geometries could be useful in the
> future.
>
>>> "get values overlapping g1 AND 5 nearest to g2"
>>> t | overlaps(g1) | nearest(g2, 5)
>>>
>>> Using this operator filters may be processed sequentially from left to
>>> right. Like in streams there may be manipulators used e.g. "get
>>> objects which centroids are nearest to the geometry g1" could look
>>> like this:
>>>
>>> t | centroid | nearest(g1, 5)
>>
>> Do I understand here that the centroid is a range adaptor, which returns
>> the centroids from all geometries present in the spatial index? In that
>> case I agree with this syntax.
>
> It would be a part of the expression but this syntax probably won't be
> used.
>
>>> instead of e.g.
>>>
>>> t | nearest<centroid_tag>(g1, 5) or
>>> t | nearest(g1, 5, centroid) or
>>> t | nearest(g1, 5, some_strategy)
>>>
>>> but
>>>
>>> t | centroid::nearest(g1, 5)
>>>
>>> looks nice.
>>
>> Also possible but might be confusing - it mixes namespace with
>> behaviour. In this case I would prefer centroid_nearest (which is IMO
>> looking good)
>
> Ok.
>
>>>
>>> But, queries with logical alternatives can't be described this way
>>> e.g. "get objects within g1 OR intersecting g2". This kind of query
>>> performed by one tree traversal could be faster than two separate
>>> queries. Moreover one probably will have to check if there are
>>> repeated values in both results.
>>
>> So here the spatially_filtered adaptor will be of great help.
>>
>
> The expression may be passed to some function or operator
>
> t | index::filtered(Expr) | some_filter()
> t | some_filter() | index::filtered(Expr) // Error
>
> t | (Expr) | some_filter()
> t | some_filter() | (Expr) // Error
>
> t[Expr] | some_filter()
>
> index::filter(t, Expr) | some_filter()
> index::find(t, Expr) | some_filter()
>
>>>
>>> Btw pipe operator| is used here as a conjunction and this is c++
>>> binary alternative.
>>>
>>> 3.2. expressions
>>>
>>> By using more operators we lose sequential nature of the querry but we
>>> get more powerful expressions which may be passed to the query e.g.
>>>
>>> "get objects within g1 AND not intersecting g2"
>>> t[within(g1) & !intersects(g2)]
>>>
>>> "get objects within g1 OR intersecting g2"
>>> t[within(g1) | intersects(g2)]
>>>
>>> "get points closest to the linestring in the areas intersecting
>>> geometry g1 or within geometry g2"
>>> t[(intersects(g1) | within(g2)) & nearest(ls, 10)]
>>>
>>> "get boxes which centroids are closest to the g3 in the areas
>>> intersecting geometry g1 or within geometry g2" e.g.
>>> t[(intersects(g1) | within(g2)) & centroid(nearest(g3, 10))] or
>>> t[(intersects(g1) | within(g2)) & centroid::nearest(g3, 10)]
>>
>> Yes, it is looking nice, and I would suggest to verify if we can use
>> Boost.Range filtered in combination with Phoenix to get this effect.
>> Further the & should be &&, and the | should be ||, IMO.
>>
>> So something like t | filtered(within(_1) && ! intersects(_1))
>> (fictitious)
>
> Have you had in mind something like this
>
> t | filtered(within(_1, g1) && ! intersects(_1, g2)) ?
>
> It would be nice because one might write within(_1, g1) or within(g1, _1).
> There would be no additional names.
>
> Nodes and values must be handled differently while traversing. There would
> be different checks performed for each but it should be doable.
>
>>>
>>> But I don't know if all future expresions would be easy to implement
>>> e.g. expressions with some number of KNN filters alternatives would
>>> probably gather results in appropriate number of containers because
>>> objects must be sorted by distance for each KNN filter. Then results
>>> would probably be connected (checks for repeating objects). But this
>>> is probably possible.
>>>
>>> Or there could be only one KNN search per query but this isn't nice.
>>
>> So, in general, I like the idea of querying spatial indexes. I would
>> suggest to see how Phoenix helps us by defining spatial query
>> combinations, while making use of the index as well. It is very
>> interesting and it would be great if it would all work clear and
>> efficient like that.
>
> I agree.
>
> Do you think that some simple queries are needed too? E.g.
>
> t | intersects_filtered(g)
> t | within_filtered(g)
> t | nearest_filtered(p, 5)
>
> What other types of queries besides geometry-related and knn would you
> like to have?
>
> Regards,
> Adam
> _______________________________________________
> ggl mailing list
> ggl_at_[hidden]
> http://lists.osgeo.org/mailman/listinfo/ggl
>
>

Regards, zsj


Geometry list run by mateusz at loskot.net