Boost logo

Boost :

From: Howard Hinnant (hinnant_at_[hidden])
Date: 2007-04-23 15:16:21


My apologies. It was not my intent to start a mini-firestorm and then
leave town for the next 10 days. ;-)

On Apr 16, 2007, at 4:32 PM, Ion Gaztañaga wrote:

> I'm terrible explaining my points.

I can relate. :-)

There exists a small gap between what was voted into the WP last week
regarding move semantics, and what my intent is. Below I attempt to
explain my intent, and where that differs from what is in the WP, I
will attempt to fix with defect reports.

---
In general, std::types, when dealing with rvalue references which are  
overloaded with lvalue references, the std::code is allowed to assume  
that it is dealing with an actual temporary, and not a moved-from  
lvalue.  Reference:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1858.html#23.1%20-%20Container%20requirements
paragraph 13:
> -13- Those container member functions with rvalue-reference  
> parameters may assume that the reference truly refers to an rvalue,  
> and specifically is not a reference into the same container.
Example:
template > class vector {
...
    iterator insert(iterator position, const T& x);
    iterator insert(iterator position, T&& x);
...
};
Inside the insert overload taking a T&&, the code is allowed to assume  
that x has bound to a temporary.  This means that the code does not  
have to check for the self-referencing case, and is thus faster.  If a  
client moves an lvalue to the insert signature, the onus is on the  
client to make sure that all existing references to that lvalue  
promise not to notice (or care) that the lvalue has been moved from.
void foo()
{
     std::vector<A> v(10);
     v.insert(v.end(), std::move(v.front()));
}
The above is a run time error similar to:
void foo()
{
     std::vector<A> v(10);
     v.insert(v.end(), v.begin(), v.begin()+1);
}
(the latter is currently forbidden by 23.1.1, p4)
---
Now let's look at the other side:
void foo()
{
     A a;
     std::vector<A> v;
     v.insert(v.end(), std::move(a));
     // Here "a" is required to be in a valid state, but the value of  
"a" is unknown
}
The definition of "valid" is up to the author of A.  At a minimum,  
generic code will need to destruct and/or assign "a" a new value.   
However the author of A is free to define as part of the class  
invariant:
     After an A is moved from, you can't call A::bar()
I don't think that is a very good design myself.  But I also don't  
think the standard should prohibit it.
For std::types I do not want to see any such restrictions on moved- 
from values.  For example if a vector gets moved from, I expect it to  
still be fully functional, just with an unknown state (but you can  
test the state with the existing interface, e.g. size(), capacity(),  
empty(), etc.), and set the state however you like.
With perhaps a few exceptions, I do not want to see the value of a  
moved-from std::type specified.  For example:
std::string s1("123");
std::string s2(std::move(s1));
// value of s1 not reliable here.  "123" and "" are two good guesses
---
Originally I preferred vector move assignment as just swap().  However  
I've recently become convinced that this is not the best definition.   
I now prefer the semantics of clear(); swap();.  This means that move  
assignment is O(N) instead of O(1).  Ok Ion, take a deep breath and  
stay with me for just a minute longer. :-)
The semantics of ~thread() is going to be cancel(); detach();.  That  
is, if thread A is holding a std::thread referencing thread B, and  
thread C cancels thread A, as thread A's stack unwinds, it will  
execute b.~thread() which cancels B, but does not wait around for B to  
respond to the cancel.  Thus canceling A will not hang waiting on B to  
finish up.
Now consider vector<thread>:
vector<thread> v1;
...
vector<thread> v2;
...
v1 = std::move(v2);
After the move assign, I think it best that whatever threads v1  
happened to previously refer to, are now busy canceling themselves,  
instead of continuing to run under v2's ownership.  If you want the  
latter, you've got swap.
Cost analysis:  It isn't nearly as bad as it sounds.  Promise!!! :-)
First, for vector<type with trivial destructor>, clear() is O(1) in  
practice (or it should be).  And clear() does not dump capacity, so  
the capacity gets reused via the swap.  So we only have to worry about  
vector<type with non-trivial destructor>.
Consider vector<vector<thread>>::insert.  And we're going to insert a  
new vector<thread> at begin().  The case where there is insufficient  
capacity only uses move construction, and not move assignment.  So  
assume there is sufficient capacity.
In this case we first move construct *--end() into *end().  This move  
construction leaves a zero capacity vector<thread> at *--end() (though  
that value should not be guaranteed, there is really little else  
vector can do for move construction).
The next step is:
    *(end()-1) = std::move(*(end()-2));
Now we already said that *(end()-1) is a zero capacity vector<thread>  
prior to this move assignment.  Therefore when the move assignment  
clears it, that clear is a no-op.  Then the states are swapped, making  
*(end()-2) a zero capacity vector (again, that state should not be  
guaranteed, but it is practical).
Repeat:
    *(end()-2) = std::move(*(end()-3));
Again we are move assigning into a zero-capacity vector, so  
clear();swap(); and just swap(); are virtually identical sequences of  
instructions.  This repeats all the way down to:
   *(begin()+1) = std::move(*begin());
leaving *begin() as a zero capacity vector<thread>.  Then the outside  
vector<thread> is moved assigned into *begin().  The outside temporary  
vector<thread> is then a zero capacity vector and presumably destructs.
Summary:  For vector<vector<thread>>::insert, defining vector move  
assignment as clear()+swap() instead of just swap() has a nearly  
identical cost.  There are no extra destructions.
I've gone through the same analysis with vector::erase, and all  
sequence modifying algorithms in <algorithm> which make use of move  
semantics.  During all of these generic algorithms, move assignment is  
always assigning to a source that has already been moved from.  Thus  
"clearing" the source is a no-op.  And I consider the  
container::insert/erase and <algorithm>'s as one of the biggest  
consumers of move semantics, and even more importantly, as typical use  
cases of move semantics.
In general, if you are move assigning *to* something.  You've probably  
already move assigned or move constructed *from* it.  That being said,  
if you haven't already moved from the target, and the target is a  
std::container or other std::type which owns objects, I think a  
"clear" is prudent so that move assignment has the same semantics as  
copy assignment.
Getting back to shared_ptr move assignment, I would like to see the  
target's reference count decremented if it is not already at 0 prior  
to the move assignment (just as in copy assignment - this is the  
"clear" part of the algorithm).  I would also like to not see any  
*new* constraints placed on the source as a result of being moved  
from.  And finally I do not see the need to specify the value of the  
source after the move.  I would like vendors to have the freedom to  
implement shared_ptr move assignment exactly as they do shared_ptr  
copy assignment.  If atomic operations can be avoided (for example) by  
assuming that the source is a temporary (under move assignment) then  
so much the better.
-Howard

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk