Subject: Re: [boost] [interest] underlying type library
From: Sebastian Redl (sebastian.redl_at_[hidden])
Date: 2011-08-22 08:55:13


On 22.08.2011 13:04, Julian Gonggrijp wrote:
> Nevin Liber wrote:
>> Even examining the implementation for all your member variables isn't
>> enough. The boost::function which holds a boost::bind(..., this, ...) where
>> the function object is stored inside the boost::function object itself may
>> now exhibit different semantics than when the function object it is holding
>> is in the heap. Ugh.
> Users should simply be warned that if their type has member data that
> were implemented by a third party and which are initialized with a
> pointer to (a part of) the main object, they should assume their type
> to depend on the object's memory location for its validity.
There is nothing simple about this warning. It's a pretty complicated
condition.
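
To make the condition concrete, here is a minimal sketch of a type whose validity depends on its own address. InlineBuffer and its helper functions are hypothetical names invented for this illustration; they are not from the proposal or from Boost. The type is deliberately kept trivially copyable so that the memcpy itself is well-defined, and the only thing that goes wrong is the invariant.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical type: `cursor` is meant to point into this object's own storage.
struct InlineBuffer {
    char  storage[16];
    char* cursor;
};

// Trivially copyable, so memcpy on it is defined behavior.
static_assert(std::is_trivially_copyable<InlineBuffer>::value,
              "this sketch relies on InlineBuffer being trivially copyable");

// Establish the invariant: cursor points at the start of our own storage.
inline void init(InlineBuffer& b) {
    std::memset(b.storage, 0, sizeof b.storage);
    b.cursor = b.storage;
}

// True while the invariant still holds for this particular object.
inline bool points_into_own_storage(const InlineBuffer& b) {
    return b.cursor == b.storage;
}

// Returns true if a bitwise copy leaves the copy pointing into the original.
inline bool bitwise_copy_breaks_invariant() {
    InlineBuffer original;
    init(original);
    assert(points_into_own_storage(original));

    InlineBuffer copy;
    std::memcpy(&copy, &original, sizeof copy);

    // The bytes are identical, but the copy now refers to the original's storage.
    return !points_into_own_storage(copy) && copy.cursor == original.storage;
}
```

A user would have to recognise this pattern not only in their own members but in every third-party member as well, which is what makes the warning hard to follow in practice.
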
>> Way back when, C++ did bitwise copying for the compiler generated copy
>> operations, and was changed to member wise copying for good reason. I
>> really don't want to go back to that world.
> Let me reassure you that move_raw is not inherently about bitwise
> copying; you can do exactly the same with memberwise copying but that
> requires the author of the type to provide their own implementation.
As far as I can tell, in fact, in Stepanov's paper move_raw has
absolutely nothing to do with bitwise copying.

> Sebastian Redl wrote:
>> On 21.08.2011 21:23, Julian Gonggrijp wrote:
>>> Dear all,
>>>
>>> I think the set of types with which bitwise move_raw will yield
>>> undefined behaviour can be sharply defined:
>> The standard already does. Undefined behavior is a notion of the
>> standard, not of a particular implementation. The standard says that
>> this technique works only for PODs (trivially copyable types in
>> C++0x), nothing else.
> Doesn't the standard say that bitwise copying of non-PODs is undefined
> behaviour /because there are cases where bitwise copying will give you
> an invalid result/?
No, the standard doesn't give a reason.
> Have we not identified the cases for which an
> invalid result will be obtained? Can we therefore not maintain --
> making use of the semantical definition of move_raw and restricting
> ourselves to the set of unproblematic types -- that bitwise copying is
> a safe implementation of move_raw even though in a very strict
> juridical sense it may be undefined behaviour?
Absolutely not.
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
Undefined behavior isn't just about "a very strict juridical sense" of
things. Compilers are allowed to assume that UB doesn't happen. If you
memcpy over a non-POD, the compiler is allowed to assume the whole
branch containing the memcpy is dead code - it cannot ever be reached,
because reaching it would invoke UB. Let's say you have this:

if (x != 42) {
   memcpy(&nonpod, &source, size);  // undefined behavior: nonpod is not a POD
} else {
   other_code();
}
std::cout << x << std::endl;        // may print 42 whatever x actually holds

The behavior of the memcpy is undefined. As such, the compiler can
generate any code it wants for this branch - like the code of the second
branch, thus eliminating the branching entirely.
But not only that. In fact, because the compiler will assume that the
program is valid, and entering the memcpy branch would invoke UB, it can
deduce that x cannot possibly be anything but 42! That is, the std::cout
could output "42" even if you set x to something other than 42, because
the optimizer replaced all occurrences of x with the constant 42.

Now, I don't know of any compiler that is actually that strict,
especially with memcpy, but my point is that *there is no such thing as
benign undefined behavior*.
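
One way to stay on the defined side of this line is to let the type system decide when a bitwise copy is permitted. The following is a sketch only, assuming a C++0x compiler that provides std::is_trivially_copyable; copy_object, Point and example are hypothetical names made up for the illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

// Chosen for trivially copyable types: a bitwise copy is well-defined here.
template <typename T>
typename std::enable_if<std::is_trivially_copyable<T>::value>::type
copy_object(T& destination, const T& source) {
    std::memcpy(&destination, &source, sizeof(T));
}

// Chosen for everything else: fall back to the type's own copy assignment.
template <typename T>
typename std::enable_if<!std::is_trivially_copyable<T>::value>::type
copy_object(T& destination, const T& source) {
    destination = source;
}

// Hypothetical POD used to exercise the memcpy path.
struct Point {
    int x;
    int y;
};

inline void example() {
    Point a = {1, 2};
    Point b = {0, 0};
    copy_object(b, a);              // memcpy path
    assert(b.x == 1 && b.y == 2);

    std::string s = "hello";
    std::string t;
    copy_object(t, s);              // copy-assignment path
    assert(t == "hello");
}
```

Note that this gains nothing for move_raw: the memcpy path is taken only for types where ordinary copying is already cheap.
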

>> Conceptually, if you think of classes as "aggregates and then some",
>> there is the underlying aggregate.
> I'm not thinking of classes in that way. The underlying type is
> something different from an "underlying aggregate" (although I realise
> I have given that impression in my first email; my apologies).
> Instead, the underlying type is defined by its semantic relation to
> the 'overlying type' as I have also stated in my reply to Mathias
> Gaunard (http://lists.boost.org/Archives/boost/2011/08/184913.php):
>
> The underlying type of T is a POD which can store the state of an
> instance of T.
>
>
> So far, it seems that those who have read Stepanov's paper are more
> positive about the possible value of my proposal than those who didn't
> read it. This seems to confirm that Stepanov is still better at
> explaining the value of move_raw than I am.
OK, here's the problem I see.
Stepanov's approach requires quite a bit of manual work from the user.
Not only does it require the user to define underlying_type for types
where the copy constructor isn't good enough, it also requires three
separate overloads of move_raw.
IIUC, your library attempts to solve this task generically, by providing
an underlying type and move_raw implementations that work with all or at
least most types.
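
For readers who have not seen the paper, the manual work might look roughly like the sketch below. This is an approximation written for this discussion, not code from Stepanov's notes or from the proposed library: the names underlying_type and move_raw are taken from the thread, while their exact shape here, as well as IntArray, IntArrayState and swap_via_raw, are assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Assumed default: a type is its own underlying type.
template <typename T>
struct underlying_type {
    typedef T type;
};

// Assumed default move_raw: plain assignment is good enough.
template <typename T>
void move_raw(T& source, T& destination) {
    destination = source;
}

// Hypothetical POD that can hold the state of an IntArray.
struct IntArrayState {
    int*        data;
    std::size_t size;
};

// Hypothetical user type that owns heap memory.
class IntArray {
public:
    explicit IntArray(std::size_t n) : state_() {
        state_.data = new int[n]();
        state_.size = n;
    }
    ~IntArray() { delete[] state_.data; }

    std::size_t size() const { return state_.size; }
    int& operator[](std::size_t i) {
        assert(i < state_.size);
        return state_.data[i];
    }

    // The three overloads the user has to write by hand. Each one overwrites
    // the destination without destroying it and leaves the source untouched,
    // so invariants are broken until a full cycle of moves has completed.
    friend void move_raw(IntArray& source, IntArrayState& destination) {
        destination = source.state_;
    }
    friend void move_raw(IntArrayState& source, IntArray& destination) {
        destination.state_ = source;
    }
    friend void move_raw(IntArray& source, IntArray& destination) {
        destination.state_ = source.state_;
    }

private:
    IntArray(const IntArray&);             // not copyable in this sketch
    IntArray& operator=(const IntArray&);
    IntArrayState state_;
};

template <>
struct underlying_type<IntArray> {
    typedef IntArrayState type;
};

// A swap that never copies or frees the owned memory.
template <typename T>
void swap_via_raw(T& a, T& b) {
    typename underlying_type<T>::type tmp;
    move_raw(a, tmp);
    move_raw(b, a);
    move_raw(tmp, b);   // every object is valid again after this step
}
```

Everything specific to IntArray in this sketch had to be written by the type's author, which is the part a generic library would have to take over.
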
C++03 provides two generic ways of transferring data from one place to
another. One is the copy constructor (and the whole point of move_raw is
that copying is a waste of time). The other is memcpy, which is
undefined for non-PODs. (Note that for PODs, the copy constructor is
always good enough.)
Therefore, what services can your library provide?
- It cannot provide a perfect underlying type: it would require
introspection, and such mechanisms are basically non-existent in C++. In
fact, I do not see how you could provide any approximation beyond what
Mathias proposed without the user effectively doing the work, but I'm
interested in your approach.
- It cannot provide a reliable, generic move_raw. I pointed out problems
with memcpy(), and others did as well. Some people on this list might
see these as minor or theoretical, but I don't.
What, then, does your library still do? I can see it offering a standard
way of implementing underlying_type<>. That's nice, but I'd rather
implement proper move construction/assignment and fix my type if it
doesn't have a cheap default constructor. Boost.Move makes this possible
in C++03. I can see it offering a memcpy()-based move_raw as an opt-in,
but I don't think it's a good idea to do that. What's left?

In my opinion, C++0x's move support has obsoleted the move_raw concept.
Yes, there may be some types for which move_raw is more efficient,
because it's expensive to bring a moved-from object into a valid state.
But such types will, I believe, die out.
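
As an illustration of why the gap is small, here is a sketch of a type with C++0x move operations whose moved-from state is cheap to establish. Buffer is a hypothetical class written for this example, not taken from any library.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical owning type with move operations.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new int[n]()), size_(n) {}
    ~Buffer() { delete[] data_; }

    // Move construction: steal the pointer, leave the source empty but valid.
    Buffer(Buffer&& other) noexcept : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;
        other.size_ = 0;
    }

    // Move assignment: release our memory, then steal the source's.
    Buffer& operator=(Buffer&& other) noexcept {
        if (this != &other) {
            delete[] data_;
            data_ = other.data_;
            size_ = other.size_;
            other.data_ = nullptr;
            other.size_ = 0;
        }
        return *this;
    }

    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    std::size_t size() const { return size_; }
    int& operator[](std::size_t i) {
        assert(i < size_);
        return data_[i];
    }

private:
    int*        data_;
    std::size_t size_;
};
```

Bringing the moved-from object into a valid state costs two stores here, so a raw move would save very little for a type like this.
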

> Therefore I invite those who haven't read the paper yet to still do
> so. Only lectures 4 and 5 are required; it's only 11 pages with some
> trailing source code that you can skip. It's a very enjoyable read and
> I almost dare to promise you that you'll want to read the rest of the
> paper as well. :-)
>
> The URL: http://www.stepanovpapers.com/notes.pdf
>
This is a very interesting paper, thank you.

Sebastian

