
Subject: Re: [boost] Formal Review: Boost.Polygon starts today August 24, 2009
From: Fernando Cacciola (fernando.cacciola_at_[hidden])
Date: 2009-09-09 01:27:57


Hi People,

Here is my own review of the library (NOT as a review manager but as a reviewer)

I won't state now whether I think it should or should not be accepted, to avoid
the appearance of a conflict of interest between the two roles. Naturally, what
I think in that regard will weigh into the final decision, so I'll say so
when I post the verdict.

Below is a list of points that have mostly not been mentioned before, so keep in
mind that I am implicitly agreeing with many of the issues already raised.

[it's 2:30 AM so this goes unedited, hope it makes sense]

> - What is your evaluation of the design?

Overall, it is good, with a more or less reasonable concept-oriented API.

Here are some specific points I dislike:

(*)
I really don't like the use of overloading-on-concept as the main API element,
especially implemented via SFINAE as it is.

IMO, a much better design would be to have basic traits that provide the
fundamental building blocks, as the library is doing, for example via
"point_traits<>", but then also provide the derived functions as static
members of a class template, say, point_concept<>.

This is much better for a number of reasons:

A user can specialize the functions for their own data types in a snap.
Currently it is next to impossible. For instance, getting the compiler to call
my own assign() for my own point type, when it happens to be a model of a
mutable point, can be very, very tricky.

A design based on many layers of specializable classes is WAY more compiler
friendly, and hence makes it much easier for the programmer to spot mistakes
such as constraint violations (wrong type) without concept checking.

Then, at the front end of the API, I would provide the overloaded versions as
they are now, as a top layer. This is in fact what the library is doing for
those fundamental functions that are given directly by the traits class, but IMO
it needs to do the same with all free functions, dispatching the actual
implementation to the right specialization of the concept (i.e. free functions
would be merely forwarding).

Finally, the library itself wouldn't use the overloaded free functions but the
"real" member functions. This would make the code much more robust in the
presence of ADL. Right now, there are countless places where unqualified calls
to, say, assign(), set(), even "abs()", WILL cause ambiguities in many real
world contexts.

(*)
The library needs to distinguish lvalue-ness from write-ness: it is one thing to
"reset" the X coordinate of a point and a totally different thing to get a
non-const reference (lvalue) to it.

This general lack of "mutation" as opposed to "assignment" causes for example
gross inefficiency in the operations that modify the vertices of a polygon (such
as convolve, scale, transform, etc):

The functions that implement these generate a new container with the modified
points and then reset the point sequence within the input/output polygon.
IMO this is nonsense (and a showstopper from a review POV), yet it is really
just the consequence of lacking "lvalue" access to the vertices WITHIN a polygon.

Generating a new container would only make sense if the operation would return a
new polygon instead of modifying the one in place.

(*)
In fact, that most operations modify their arguments instead of producing new
ones is odd, and in some cases just plain wrong (when the reference parameter is
not even an input value).

(*)
The library is specifically 2D, yet that fact is encoded neither in the
namespace nor in the identifiers. This should be fixed, as otherwise it will
conflict with future 3D libraries.

(*)
The library names something "orientation". IMNSHO, in the context of
polygons, orientation means something else (that which the library calls
direction instead). So, I would strongly suggest renaming it to something like:
axis_alignment (which is what it is).

(*)
Some enums are named "BLABLA_enum" and some just "BLABLA".
That should be consistent... I would remove _enum.

(*)
There is a lowercase "winding_direction" enum with values like
"clockwise_winding" and "counterclock_winding" (plus unknown), but then there is
the "direction_1d_enum" with values "CLOCKWISE" and "COUNTERCLOCKWISE".
Not only is that confusing, but it also boils down to inefficient code, because
the "winding" function returns CLOCKWISE|COUNTERCLOCKWISE yet the
"polygon_traits<>::winding" code that is the actual implementation returns
winding_direction. Since the actual numeric values mismatch, there is an
explicit conditional conversion:

return w == clockwise_winding ? CLOCKWISE : COUNTERCLOCKWISE;

(*)
In fact, the problem above comes from the design of the function winding(): if
the actual winding cannot be computed, it computes the area instead and
uses its sign to second-guess the winding.

IMO, that alternative calculation should be done by the user if and when it
makes sense.

(*)
In fact again, the problem above comes from the design of the function area():
there is a "point_sequence_area" that does the actual calculation and returns a
signed value. But then area() returns the ABS of that, so a user is not able
to call area() as a robust way to compute the winding of a polygon, as,
well, winding() internally does.

IMO this is a mistake: the area() method should return the signed value and let
the user do the abs() WHEN and IF it makes sense.

(*)

I don't think it makes sense to name "point_data<>" with the "_data" suffix. I
understand "_traits" etc., but point should be just point IMO.

(*)

I dislike that scale and move are not merely convenience wrappers of
transform() (also there).

And I *really* dislike the existence of scale_up() + scale_down(), which are
BOTH there simply because they take "integer" factors instead of a fraction
(whether a rational number or a floating-point one).
Also, scale_down() *secretly* uses a "rounding_policy", instead of taking it as
an argument.

IMO, there should be just scale(), which, to add to the confusion, already
exists, but is really an "applicator" that takes a scaling function object doing
whatever, and applies the point's coordinates to it. Which is the same as what
transform() does.

But anyway, IMO there should be just one "scale" taking a floating-point
factor, or if you prefer, a rational number (expressed as separate numerator and
denominator integer arguments, for instance). The rounding policy would be given
as an optional argument.

There is also the transformation<> class (along with the transform() function),
and the isotropic_scale_factor<>, which are naturally related to these scaling
operators. IMO, this is overengineered and leaves the user with too many choices.

> - What is your evaluation of the implementation?

Most implementation problems, such as unqualified calls to ubiquitous functions,
are a consequence of the design.

Some others are just details, like the implementation of polygon_data<>::operator==
NOT using the point concepts for comparison (it uses operator== on the point type).

There are nitpicking details all over that I won't enumerate, but some are
really important:

(*)
isotropy.hpp should be renamed to something like "common.hpp", since it is
actually a super header defining lots of stuff totally unrelated to isotropy.

(*)
Likewise, polygon.hpp should be renamed, since it is really the mother header of
the library, including all others. I imagine it is called "polygon.hpp"
because that's the name of the library, but I expected to find in it stuff
specifically related to a polygon data type or concept or such.

(*)

Many classes have "inline" on the method definitions. This is completely
unnecessary since in-class definitions are inline already. IMO this needs to be
removed.

(*)

euclidean_distance() casts to double just because it needs to call sqrt().
This is just wrong, and IMO is a showstopper that totally gets in the way of
reparameterizing the library with a user-defined number type.

At the very least, the cast should be removed to let ADL + implicit conversion
do the right thing (there is no sqrt() for integer types, so).
But even this would be wrong, though somewhat acceptable. The only real solution
IMO is to ask the user to provide the right kind of "numeric field type" to
correlate with the "numeric ring type" (which the integer coordinate is) AND ask
the user to provide the right sqrt for that number type. All that via
coordinate_traits.

But I wouldn't do any of that. Instead, I would have the library provide ONLY
squared_euclidean_distance(), which would return an integer value, and let the
user deal with the sqrt(). This would mean that "perimeter" would need to go, of
course, which I'm totally OK with.

Of course, this pushes the "right way" (IMNSHO) onto the user, which is a
different game. At the very least, remove the (double) cast.

(*)

There are some structs wrapping enums to make them type safe. That's useful, but
then the structs store the enum value as an int (or unsigned int, depending on
where you look), while a char would be enough.

(*)

In polygon_traits.hpp there are #includes "in the middle" of the header, after
some initial code. IMO this is wrong and should be fixed. If that preceding code
was needed before the includes, then it should be factored out into a separate
header.

> - What is your evaluation of the documentation?

As many others have said, it really needs a good introduction that a newcomer
can read to see what the library is, and perhaps more importantly, is NOT, about.

Also some tutorials, both on using the library out of the box with its own
models and on using it with user-defined types.

And as some reviewers said, the documentation is way too focused on concepts,
and it needs to document models as well. In fact, I think most "junior" users
won't be interested at all in adapting their own types, but just in using the
library with the types provided, so the documentation for the models should come
first, and the concepts should go after, in the chapter discussing adaptation.

> - What is your evaluation of the potential usefulness of the library?

I think it is decently useful even if restricted to polygon set operations in 2D
and with integral coordinates.

I personally don't find the lack of floating-point support particularly
disturbing, but of course, I really think the name should be adjusted.
The same goes for being 2D only. In the big picture this is important because
there are points and polygons in 3D and this library is specifically 2D (it
CANNOT just be extended to cover 3D).

> - Did you try to use the library? With what compiler? Did you have any
> problems?

No, I haven't attempted to use it.

> - How much effort did you put into your evaluation? A glance? A quick
> reading? In-depth study?

I made a very careful, in-depth study of the code.

> - Are you knowledgeable about the problem domain?

Yes.

--
Fernando Cacciola
SciSoft Consulting, Founder
http://www.scisoft-consulting.com

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk