Hi Tinko,

With respect to hashing, that's the reason (comparing doubles) why I asked in my first email whether the hashing could be somehow carried out with an invariant field that potentially two spatially equal linestrings could have. I am not fully knowledgeable whether such a field exists/stands. Probably, hashing experts (from boost::hash) could have an idea. I guess it is like someone wanted to store two-dimensional arrays in an unordered_set (arrays with inverse order of rows from one another should be equal). I am thinking that there might be a mathematical way to produce a quantified invariant by iterating over the rows of the 2D matrix.  In any case, I will explore further based on your suggestions on hashing, as well.

Now, as far as the tolerance is concerned, I have tried in the past to define  BOOST_GEOMETRY_NO_ROBUSTNESS macro; however, I didn't see any difference. I can't use integers for coordinate types, I am afraid. I ve been using boost::geometry ( I really like Boost::geometry, its design and flexibility ) for generating multi-dimensional design spaces with scaled coordinates (0 - 1) or unscaled (real parameter ranges). I am also tracking paths for particular solution points that result from genetic algorithms. As such, comparing spatially equal paths/linestrings (boost::geometry::equals) along with calculating intersections (boost::geometry::intersection) and linestring differences (boost::geometry::difference) with some sort of pre-defined tolerance would be really nice.

In any case, many thanks once again :)

BW

On Tue, Jun 16, 2020 at 10:14 PM  Tinko Bartels <tinkobartels@gmail.com> wrote:
Hi Giorgino,

I am happy to hear that the mask works.

With regard to hashing: As long as two linestrings share the exact same points and differ in no way other than maybe having reverse order, I think the idea of normalizing the order of points should work. I adapted your sample code to put together an example in godbolt, you can observe the output on the right side. shash is the sorting hash and it is the same for both linestrings in this example: https://godbolt.org/z/QRbTnR .

The crucial point here is "exactly the same", because hashing the points like this is in essence similar to comparing doubles or floats like "a == b" which one usually wants to avoid. Usually one checks something like "std::abs(a - b) < 0.0001". Of course your hash only looks at individual linestrings so I don't see how that could be done here. If two linestrings are generated from different calculations they may differ, even if you would expect the results to be the same if the computer could use real numbers. Unfortunately there are subtle ways in which floating-point (i.e. float, double or long double) arithmetics can mess with your results. For example, it is well-known that  a + b + c can be different from a + c + b and but I have also seen a compiler optimization that can cause  a * b + c * d to be different from c * d + a * b.

There may be ways around this with floating point types (such as double, float, long double) for your hashing functions but I can't think of any right now, maybe somebody else has an idea.

Here are two ideas for alternatives to hashing geometries with floating-point coordinates:

1) Consider the use of integer coordinates (i.e. int, long or long long). That is the solution that was chosen in some projects that I participated in. Of course, this is only an option if you can work with the limited range of integers and if integers are suitable for all calculations that you do in your application and there is no other source of roundoff error.

2) Consider the use of a map/set instead of an unordered_map/unordered_set. Of course that would change the time complexity of some operations from O(1) to O(log(n)) but it would allow you to consider two linestrings at the same time and you could use some tolerance threshold, such as one usually does with floating point types. Here is a quickly put together illustration (with std::set) of what I mean by that: https://godbolt.org/z/DCkXdY . There are probably pathological cases that can break this as well, but maybe this works with your intended application.

I don't know how one sets tolerances with the rescaling in Boost.Geometry. But I have seen examples in which it was helpful to just disable it by defining BOOST_GEOMETRY_NO_ROBUSTNESS, so you could try that and see whether that helps with your application.

Kind regards
Tinko Bartels