Subject: Re: [boost] Formal Review: Boost.Polygon starts today August 24, 2009
From: Simonson, Lucanus J (lucanus.j.simonson_at_[hidden])
Date: 2009-08-29 19:54:04
Thomas Klimpel wrote:
>> - What is your evaluation of the documentation?
>
> I actually like it, especially now that I have read through it once (at
> least now I have the impression that I understood everything; however,
> I know from experience that such an impression is deceptive). I noticed
> that "gtl_point_3d_concept.htm" is not linked to the rest of the
> documentation; could this be intentional? (It
> seems like a point_3d concept is not required for a polygon library.)
Yes, it is intentional. I narrowed the scope of the library to an explicitly 2d polygon focus. The point 3d concept and related code, such as the 3d axis-transformation, are now undocumented features. They are coupled pretty strongly to the 2d transformation system, so I left them in rather than removing the code. They may prove useful, and they demonstrate how the type system can be extended to 3d.
> I gave up writing down all typos explicitly. Instead, I fixed my
> local version and created a patch (-> see attachment).
Great. That's a big help.
> The Polygon Set Concept documentation (and others) says "Will scan
> and eliminate overlaps" from time to time, but doesn't describe how
> overlaps are resolved. Common options are the even-odd rule and the
> non-zero winding number rule, but I guess there exist even more
> sensible options than that. Is there a way to control how overlaps
> are resolved and which resolution strategy is used by default?
I should rephrase that to state that it will scan to merge overlaps in an implicit OR operation. The OR operation is implemented as an accumulate and is as fast as appending the right-hand-side data to a vector on the left-hand side. The data structure is then in a "dirty" state where there may be regions of overlap between input polygons.
The self_xor and self_intersect operations only produce useful output when the data structure is in this "dirty" state, because they pertain to these regions of self overlap within a polygon set. When you get polygons out of a polygon set I need to merge the data to produce output polygons. If you were to call self_intersect after getting those polygons out, you would get the null set. This is potentially confusing.
The accumulate behavior of the += operator is important because it is very easy for a person to write a loop like:
    // input_polygons stands in for the polygons read from the input stream
    for (std::size_t i = 0; i < input_polygons.size(); ++i)
        polygon_set += input_polygons[i];  // constant-time accumulate
which would be O(n^2 log n) without the optimization, but is O(n log n) with it. On the other hand, people should be aware that calling get_polygons is a potentially expensive operation if the polygon set is in a "dirty" state.
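To make the lifecycle concrete, here is a minimal sketch of the accumulate/merge behavior described above. It assumes the += operator syntax and the self_intersect and get entry points mentioned in this thread; the rectangle inputs and exact signatures are illustrative assumptions rather than documented API.

#include <boost/polygon/polygon.hpp>
#include <vector>
using namespace boost::polygon;
using namespace boost::polygon::operators;

int main() {
  polygon_set_data<int> ps;
  ps += rectangle_data<int>(0, 0, 10, 10);  // cheap append
  ps += rectangle_data<int>(5, 5, 15, 15);  // overlaps the first; ps is now "dirty"

  // While dirty, the regions of self overlap are observable.
  polygon_set_data<int> overlap = ps;
  self_intersect(overlap);                  // keeps only the shared 5x5 region

  // Extracting polygons forces the implicit OR; the merge scan runs here,
  // and a self_intersect on the merged data would then yield the null set.
  std::vector<polygon_data<int> > merged;
  ps.get(merged);                           // one merged polygon comes out
  return 0;
}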
I'll probably have to improve the documentation with regard to these specific issues substantially.
> The "Property Merge" seems to describe a substitute for a "reduce"
> operation (in the sense of the "map-reduce paradigm"). The result
> type "std::map<std::set<property_type>,
> polygon_set_data<coordinate_type> >" is probably the substitute for
> commutative "reduce" operations and
> "std::map<std::vector<property_type>,
> polygon_set_data<coordinate_type> >" the substitute for
> non-commutative "reduce" operations. I see that I can implement any
> desired "reduce" operation, by first running the "Property Merge"
> algorithm, and then applying the reduce operations to the resulting
> sets, and then just putting all result polygons from sets with
> identical "reduce" result in the same set (which gives the correct
> result since the polygons sets don't overlap each other). But I
> wonder whether it would be possible to directly offer "reduce"
> operations, and whether this could potentially have performance
> benefits.
You are right that it is possible. You could do exactly what you suggest by implementing the appropriate output functor for the sweepline algorithm to evaluate the boolean logic you want, producing the result as a single polygon set directly rather than collecting all the pieces up in the output from property merge and merging them with another pass of sweepline.
I've thought about this a fair amount, and while it is very interesting, I think there is not much performance benefit to be had. Even assuming an optimal sweepline, we would not be better off doing it in one pass instead of one pass per boolean op, because an intersect or an AND NOT operation usually reduces data volume, and we would prefer to do those first if possible. I would expect (A & B) | (C - D) to be slower executed as one sweep over all the edges in A, B, C, and D than as three sweeps, since we can expect A & B and C - D to reduce the data volume that goes into the OR operation. Also, the peak memory required for sweeping all of them at once is greater than for separate passes per op.
One final consideration is that it is fairly rare for people to form multi-op expressions of simple booleans in a single line (such that expression templates could kick in), so even if there were a performance win we would be optimizing for the uncommon case. There is plenty left to do to make a single op run faster; I have a list, actually.
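For concreteness, here is a hedged sketch of the post-processing approach discussed above, working from the property merge result map type Thomas quotes. The reduce predicate, the typedefs, and the function name are illustrative assumptions; only the result map type and the += accumulate semantics come from this discussion.

#include <boost/polygon/polygon.hpp>
#include <map>
#include <set>
using namespace boost::polygon;
using namespace boost::polygon::operators;

typedef int property_type;
typedef std::map<std::set<property_type>,
                 polygon_set_data<int> > merge_result;  // property merge output

// Example commutative "reduce": keep regions covered by at least two inputs.
// The keyed polygon sets are pairwise disjoint, so accumulating them with
// += yields the exact union without any further overlap resolution.
polygon_set_data<int> reduce_at_least_two(const merge_result& pm) {
  polygon_set_data<int> out;
  for (merge_result::const_iterator it = pm.begin(); it != pm.end(); ++it)
    if (it->first.size() >= 2)
      out += it->second;
  return out;
}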
Thanks Thomas, I'm glad to hear you liked the documentation,
Luke