Boost logo

Geometry :

Subject: Re: [geometry] accessing uninitialized memory is UB/boost.geometry currently requires well-defined default initialization
From: Barend Gehrels (barend_at_[hidden])
Date: 2016-04-10 12:54:16


Hi Volker,

Op 5-4-2016 om 11:52 schreef Volker Schöch:
>
> Hi,
>
> As of boost 1.59 (and presumably boost 1.60 is the same),
> boost.geometry assumes that a default-constructed point_type is
> well-defined. For instance, struct segment_intersection_points defined
> in boost\geometry\strategies\intersection_result.hpp may be copied
> while count < 2, i.e., when some points have not yet been assigned a
> well-defined value. As we all know, with uninitialized memory, any
> access including copy or move is undefined behavior.
>
> At think-cell, we believe in tight life cycles and undefined behavior.
> For all purposes in our own code, default-constructed variables of
> point_type are uninitialized rather than default-initalized, meaning
> not only that any value stored in there must not be used, but also
> that the contents of those variables must not be copied or moved.
>

Aren't these different things?

To answer your main question (I think), the point type, supplied as a
default point type by Boost.Geometry (but it is possible to define your
own types), is not initialized, indeed on purpose, so I think we agree here.

However, it does not keep track that it is initialized or not, so it can
be copied (uninitialized) without any problem, so a copy will contain
the same garbage as the input does.

The library user is responsible to initialize the point type, exactly
the same way as he is responsible to initialize any variable like int,
double, etc.

I don't think behaviour is changed at 1.59

This is all about the point type supplied by the library
(geometry::model::point), libraries are free to use their own point
types and the might or might not be initialized.

> This works very well for our purpose and helps to discover bugs that
> otherwise are still there but go unnoticed. In theory, this paradigm
> also facilitates additional compile-time optimization which may or may
> not be leveraged in current compilers. In our opinion,
> default-initialization to, e.g., zero, is quite arbitrary and
> meaningless. If you want to construct an object of point_type with
> some specific value, it is entirely adequate to require that this
> specific value be explicitly stated upon construction.
>
> In this regard, we follow the gist of the C++ standard: “8.5.12: If no
> initializer is specified for an object, the object is
> default-initialized. When storage for an object with automatic or
> dynamic storage duration is obtained, the object has an indeterminate
> value, and if no initialization is performed for the object, that
> object retains an indeterminate value until that value is replaced
> (5.18). […] If an indeterminate value is produced by an evaluation,
> the behavior is undefined except in the following cases: [… all cases
> only apply to unsigned char …]”
>
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/n4582.pdf
>
> That said, I have some questions to the boost.geometry team:
>
> -Can you tell if you actually use the values that happen to be found
> in a default-constructed object? Does your code rely on specific
> values as the result of default construction, e.g., zero?
>

No

> -Do you think you could avoid access to default-constructed objects,
> e.g., by using boost::optional, or variant, or dynamically sized
> containers, or some other obvious tools?
>

The library is not designed like this. It is designed to be
concept-based, any point might be used (even a plain c-array, though
that can not be used in polygons for example). Because of that, we
cannot make any assumption about the initialization of the contents. It
can be initialized, or not, and in the last case also the output will be
garbage.

> Currently, we use some work-around and provide a special point_type
> class for boost.geometry for the sole purpose of
> default-initialization to keep your code from entering UB.
> Perspectively, we’d love to remove that workaround and simply use our
> own point_class which after default-construction contains
> uninitialized memory.
>

So you can remove that workaround, however, it is (of course...) better
to not supply uninitialized points to the library, as its behaviour is
then undefined...

> It would be very helpful to know where you are heading: Do you think
> that our approach is a good idea, or do you see some benefits to
> having default-constructed objects initialized with some arbitrary but
> specific value? If so, would you amend your documentation to
> explicitly state that default-constructed objects must be
> well-defined, and which specific value (zero?) you expect?
>

The library already states that points are not initialized in the doc:
http://www.boost.org/doc/libs/1_60_0/libs/geometry/doc/html/geometry/reference/models/model_point.html

We can add something that it is the users responsibility to supply an
initialized point here:
http://www.boost.org/doc/libs/1_60_0/libs/geometry/doc/html/geometry/reference/concepts/concept_point.html

Regards, Barend



Geometry list run by mateusz at loskot.net