
Geometry : 
Subject: [ggl] rtree query interface
From: Adam Wulkiewicz (adam.wulkiewicz)
Date: 20110807 14:53:30
Barend Gehrels wrote:
>> t  within(g)  get values within some geometry
>> t  intersects(g)  get values intersecting some geometry
>
>
> Basically I like the clean syntax. These range adaptors, having the same
> name, should be in another namespace and if users hoist both namespaces
> with the usingclause, it will be ambiguous. So namespace will be
> necessary here (or in the normal function), making it a bit longer.
Of course you're right. I assumed that everything is in the
geometry::index namespace. So some algorithms from geometry corresponds
to the filters in geometry::index.
> What if we don't do this and just use the "filtered" adaptor defined by
> Boost.Range?
> t  filtered(within(g))
It's ok for me. There also may be some standard filters implemented:
index::within_filtered, index::intersects_filtered,
index::nearest_filtered. intersects_filtered is implemented already as
spatially_filtered.
Btw note that geometry::index filters should be the first ones in the
sequence of filters aplied to the container.
t  some_not_index_filter()  index::filtered(...)
won't work.
>
> The advantage is that AND, OR and NOT are much easier to handle (if
> people write their own predicate  but maybe Phoenix or Lambda can be of
> help here, didn't check this now).
Or Proto if it couldn't be done with Phoenix.
> However, the basic requirement is that queries like this are using the
> spatial index. So I don't know how the filtered would use that index. I
> assume that "within" would do that (did you already did some work
> here?). So we indeed need "something which is a Boost.Range adaptor,
> uses the spatial index, and is capable to do userdefined queries at
> compiletime".
>
> We might consider adding a "spatially_filtered" adaptor which acts like
> this.
spatial_filter<...> is implemented as mentioned before (intersects).
It's just generated by
operator(index::rtree<...> const&,
index::filters::spatially_filtered<...> const&)
Currently the querry is performed in the spatial_filter<...>'s
constructor by calling rtree<...>::find(...) method.
>> Note that some filters may refer to the geometry which is passed or to
>> the values stored in the rtree. Someone may want to get values within
>> some geometry or to get values containing some geometry. He may just
>> use previously described filters and connect them (described further)
>> but then e.g. 2 operations must be done instead of 1. And for some
>> queries this couldn't be possible (with operator only). So there
>> could be:
>>
>>  additional filters: t  outside(g)
>
>
> Outside is actually not (always) the same as ! within, because if a
> geometry partly intersects another, it is ! within and also ! disjoint.
> Outside might be renamed to "disjoint" here.
Got it. I forgot about disjoint. Any other algorithms which could be used?
>>  filters generated by some operator: t  ~within(g)  confusing
>>  object passed before in the sequence: t  inv  within(g)  even
>> more confusing
>>  object passed with filter: t  inv(within(g))
>>
>> In addition to this every filter can have it's negation generated by
>> operator!
>>
>> !within(g) == not_within(g)
>> !intersects(g) == not_intersects(g)
>>
>> Eventually there could be additional filters e.g. "get objects with 0
>> coordinate greather than some other object" coords<0, more>(g) etc. It
>> would be nice to have e.g. "get points on the right of the linestring"
>
>
> Basically I like the idea of ! within(g), it is very readable.
>
> But it is, if I'm right, different from Boost.Range adaptors which
> cannot be negated like this. The Boost.Range adaptor overloads operator
> to get its effect, you additionally need to overload operator! for a
> range adaptor as well. I didn't try but don't know if this is possible
> (probably it is).
>
> Besides that we have to be careful what is exactly the negation, see
> comment above.
In the case of using ranges, just another type of range would be
generated by operator! e.g. index::ranges::not_within. And the meaning
should be exactly as the meaning of !geometry::within(). Just like in
the case of using the loop, checking all elements and returning values
meeting !within(some_geometry) requirement.
>>
>> 2. K nearest neighbour filters
>>
>> nearest(g)  get nearest value to the geometry g
>> nearest(g, 5)  get nearest 5 values to the geometry g
>
> What will the first line return? All values in order of distance?
One value nearest to the geometry. If we have container of points and
the geometry is a point it would be the closest point. Internally there
would be algorithms from geometry used so if there was
comparable_distance()/within()/intersects()/etc. implemented for
Indexable and Geometry the query would compile, otherwise it wouldn't.
>> Folowing filters describes geometrical relationship but they'd
>> probably be used with NN filter.
>>
>> further(g, comp_dist1)
>> closer(g, comp_dist2)
>> range(g, comp_dist1, comp_dist2)
>>
>> In addition there could be k furthest neighbor filter:
>> furthest(g) == !nearest(g)
>
> The last line I don't understand... What if we have only 4 elements and
> we want the nearest(g, 5). It will return 4 values. The furthest will
> also return 4 values, so it is not the negation.
operator! was a suggestion. If we want to return sorted ranges for
nearest() and furthest() then nearest() will generate range sorted from
closest object and furthest() from furthest.
>> 3. Filters connection
>>
>> 3.1. operator
>>
>> "get values within geometry g1 AND not intersecting geometry g2"
>> t  within(g1)  !intersects(g2)
>
> Concatenation is possible in Boost.Range adaptors and I like it. So it
> will work automatically (besides the ! syntax). It might not be the most
> efficient in all cases because they are filtered twice, so the case you
> describe (g1 AND not g2) is implemented differently (though the effect
> is the same). You mention that also below. And how do both make use of
> the spatial index?
> Therefore, the filtered predicate might be better, being able to do this
> in one step.
As you noticed, in order to work efficiently, filters from g::index
shouldn't be processed exactly like filters in Boost.Range 
transparently. They should be used only with the rtree object or other
filter from g::index. operators  should generate one compound object
(or an expression if you like) which would be passed to the
rtree::find() e.g. when the begin() method of this object (which is also
a range) is called.
>>
>> "get 5 nearest to g2 values overlapping g1" which means
>
> I don't understand this example, because something overlapping something
> else does have a distance of 0.0, so nearest is more or less
> undertermined here. With the centroid below it is more clear.
I assumed that nearest just uses comparable_distance() so it will work
if it is implemented for geometries. Moreover I assumed that
comparable_distance() returns closest distance between e.g. geometries'
perimeters or perimeter and some point etc.
If we have a river(linestring) in a country (polygon) and rtree
containing some indexables(e.g. POI) on the entire continent this query
would be something like: "find 5 indexables nearest to the river inside
some country".
Btw distances between boxes and other geometries could be useful in the
future.
>> "get values overlapping g1 AND 5 nearest to g2"
>> t  overlaps(g1)  nearest(g2, 5)
>>
>> Using this operator filters may be processed sequentially from left to
>> right. Like in streams there may be manipulators used e.g. "get
>> objects which centroids are nearest to the geometry g1" could look
>> like this:
>>
>> t  centroid  nearest(g1, 5)
>
> Do I understand here that the centroid is a range adaptor, which returns
> the centroids from all geometries present in the spatial index? In that
> case I agree with this syntax.
It would be a part of the expression but this syntax probably won't be used.
>> instead of e.g.
>>
>> t  nearest<centroid_tag>(g1, 5) or
>> t  nearest(g1, 5, centroid) or
>> t  nearest(g1, 5, some_strategy)
>>
>> but
>>
>> t  centroid::nearest(g1, 5)
>>
>> looks nice.
>
> Also possible but might be confusing  it mixes namespace with
> behaviour. In this case I would prefer centroid_nearest (which is IMO
> looking good)
Ok.
>>
>> But, queries with logical alternatives can't be described this way
>> e.g. "get objects within g1 OR intersecting g2". This kind of query
>> performed by one tree traversal could be faster than two separate
>> queries. Moreover one probably will have to check if there are
>> repeated values in both results.
>
> So here the spatially_filtered adaptor will be of great help.
>
The expression may be passed to some function or operator
t  index::filtered(Expr)  some_filter()
t  some_filter()  index::filtered(Expr) // Error
t  (Expr)  some_filter()
t  some_filter()  (Expr) // Error
t[Expr]  some_filter()
index::filter(t, Expr)  some_filter()
index::find(t, Expr)  some_filter()
>>
>> Btw pipe operator is used here as a conjunction and this is c++
>> binary alternative.
>>
>> 3.2. expressions
>>
>> By using more operators we lose sequential nature of the querry but we
>> get more powerful expressions which may be passed to the query e.g.
>>
>> "get objects within g1 AND not intersecting g2"
>> t[within(g1) & !intersects(g2)]
>>
>> "get objects within g1 OR intersecting g2"
>> t[within(g1)  intersects(g2)]
>>
>> "get points closest to the linestring in the areas intersecting
>> geometry g1 or within geometry g2"
>> t[(intersects(g1)  within(g2)) & nearest(ls, 10)]
>>
>> "get boxes which centroids are closest to the g3 in the areas
>> intersecting geometry g1 or within geometry g2" e.g.
>> t[(intersects(g1)  within(g2)) & centroid(nearest(g3, 10))] or
>> t[(intersects(g1)  within(g2)) & centroid::nearest(g3, 10)]
>
> Yes, it is looking nice, and I would suggest to verify if we can use
> Boost.Range filtered in combination with Phoenix to get this effect.
> Further the & should be &&, and the  should be , IMO.
>
> So something like t  filtered(within(_1) && ! intersects(_1))
> (fictitious)
Have you had in mind something like this
t  filtered(within(_1, g1) && ! intersects(_1, g2)) ?
It would be nice because one might write within(_1, g1) or within(g1,
_1). There would be no additional names.
Nodes and values must be handled differently while traversing. There
would be different checks performed for each but it should be doable.
>>
>> But I don't know if all future expresions would be easy to implement
>> e.g. expressions with some number of KNN filters alternatives would
>> probably gather results in appropriate number of containers because
>> objects must be sorted by distance for each KNN filter. Then results
>> would probably be connected (checks for repeating objects). But this
>> is probably possible.
>>
>> Or there could be only one KNN search per query but this isn't nice.
>
> So, in general, I like the idea of querying spatial indexes. I would
> suggest to see how Phoenix helps us by defining spatial query
> combinations, while making use of the index as well. It is very
> interesting and it would be great if it would all work clear and
> efficient like that.
I agree.
Do you think that some simple queries are needed too? E.g.
t  intersects_filtered(g)
t  within_filtered(g)
t  nearest_filtered(p, 5)
What other types of queries besides geometryrelated and knn would you
like to have?
Regards,
Adam
Geometry list run by mateusz at loskot.net