# Boost Users :

Subject: Re: [Boost-users] [Polygon] Test among 3 libraries
From: Simonson, Lucanus J (lucanus.j.simonson_at_[hidden])
Date: 2010-09-09 13:35:01

Angus Johnson wrote:
>> That's a worthwhile goal to be sure. When I discussed scalability I
>> meant algorithmic complexity and ability to process large data
>> volumes in input geometry in a single test case.
> Ahh, thanks for the clarification. I haven't yet done any significant
> testing of processing very large data volumes, nor am I competent to
> determine an O value, so I'll defer to your estimation there.

What you have already done is significantly harder than determining O values. Don't sell yourself short. If you plot number of vertices in input plus output on the x axis and runtime on the y axis in a excel spreadsheet you can have excel create an exponential trend line for the data and display the equation on the graph. The equation will look like 2.79235 * x^1.34984. Your empirical scaling factor (empirical O value) is then 1.35. In this hypothetical example 1.35 is faster than O(n * sqrt n) but slower than O(n log n). You should be able to estimate your empirical scaling factor very quickly by generating progressively large random test cases with this method. Read the experiment section of my boostcon paper
http://www.boost.org/doc/libs/1_44_0/libs/polygon/doc/GTL_boostcon2009.pdf
to get an idea how to structure the experiment and what the graph should look like (page 9.) I can share the code for the test harness with you if you like. The tricky part is making sure that the geometry doesn't qualitatively change as you make successively larger test cases because this can introduce a systematic bias in you empirical scaling factor. I'd be interested in seeing this kind of performance analysis on your clipper implementation.

If you are using a linked list as your sweepline data structure your worst case runtime complexity is O(n^2) wrt. input plus intersection vertices and your expected runtime complexity is O(n^1.5) in the common case. In the worst case there is O(n) line segments in your sweepline at each sweepline "stop" (unique x coordinate) while in the common case there is O(n^0.5) line segments in your sweepline at each sweepline stop and O(n) stops.

>> Also, I believe GPC does handle high order intersections because I'm
>> quite certain I was generating large numbers of them in the tests I
>> was running that GPC seemed to handle just fine. I could be wrong of
>> course, but I suggest you double check that GPC doesn't properly deal
>> with complex intersection. I know it doesn't deal with overlapping or
>> self intersecting polygons at all, but that is a separate issue from
>> polygons that have vertices in common or polygons that abut along
>> edges but do not overlap. GPC is used quite extensively and paid for
>> by many people, we should be responsible about making claims about
>> its short comings.
>
> I was careful to qualify my statements about GPC with "I don't think"
> and "I believe". However, I accept that this still isn't fair
> treatment and I should be sure before making these claims. Let me
> simply state that under my stress-testing, GPC fails about 4% of the
> time (and with extreme stress testing - rounding random vertices to
> nearest 100 instead of nearest 10 - it fails far more frequently). The