Hi Giorgino,

I am happy to hear that the mask works.

With regard to hashing: As long as two linestrings share the exact same points and differ in no way other than maybe having reverse order, I think the idea of normalizing the order of points should work. I adapted your sample code to put together an example in godbolt, you can observe the output on the right side. shash is the sorting hash and it is the same for both linestrings in this example: https://godbolt.org/z/QRbTnR .

The crucial point here is "exactly the same", because hashing the points like this is in essence similar to comparing doubles or floats like "a == b" which one usually wants to avoid. Usually one checks something like "std::abs(a - b) < 0.0001". Of course your hash only looks at individual linestrings so I don't see how that could be done here. If two linestrings are generated from different calculations they may differ, even if you would expect the results to be the same if the computer could use real numbers. Unfortunately there are subtle ways in which floating-point (i.e. float, double or long double) arithmetics can mess with your results. For example, it is well-known thatĀ  a + b + c can be different from a + c + b and but I have also seen a compiler optimization that can causeĀ  a * b + c * d to be different from c * d + a * b.

There may be ways around this with floating point types (such as double, float, long double) for your hashing functions but I can't think of any right now, maybe somebody else has an idea.

Here are two ideas for alternatives to hashing geometries with floating-point coordinates:

1) Consider the use of integer coordinates (i.e. int, long or long long). That is the solution that was chosen in some projects that I participated in. Of course, this is only an option if you can work with the limited range of integers and if integers are suitable for all calculations that you do in your application and there is no other source of roundoff error.

2) Consider the use of a map/set instead of an unordered_map/unordered_set. Of course that would change the time complexity of some operations from O(1) to O(log(n)) but it would allow you to consider two linestrings at the same time and you could use some tolerance threshold, such as one usually does with floating point types. Here is a quickly put together illustration (with std::set) of what I mean by that: https://godbolt.org/z/DCkXdY . There are probably pathological cases that can break this as well, but maybe this works with your intended application.

I don't know how one sets tolerances with the rescaling in Boost.Geometry. But I have seen examples in which it was helpful to just disable it by defining BOOST_GEOMETRY_NO_ROBUSTNESS, so you could try that and see whether that helps with your application.

Kind regards
Tinko Bartels