|
Boost : |
From: Howard Hinnant (hinnant_at_[hidden])
Date: 2007-04-23 15:16:21
My apologies. It was not my intent to start a mini-firestorm and then
leave town for the next 10 days. ;-)
On Apr 16, 2007, at 4:32 PM, Ion Gaztañaga wrote:
> I'm terrible explaining my points.
I can relate. :-)
There exists a small gap between what was voted into the WP last week
regarding move semantics, and what my intent is. Below I attempt to
explain my intent, and where that differs from what is in the WP, I
will attempt to fix with defect reports.
--- In general, std::types, when dealing with rvalue references which are overloaded with lvalue references, the std::code is allowed to assume that it is dealing with an actual temporary, and not a moved-from lvalue. Reference: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1858.html#23.1%20-%20Container%20requirements paragraph 13: > -13- Those container member functions with rvalue-reference > parameters may assume that the reference truly refers to an rvalue, > and specifically is not a reference into the same container. Example: template > class vector { ... iterator insert(iterator position, const T& x); iterator insert(iterator position, T&& x); ... }; Inside the insert overload taking a T&&, the code is allowed to assume that x has bound to a temporary. This means that the code does not have to check for the self-referencing case, and is thus faster. If a client moves an lvalue to the insert signature, the onus is on the client to make sure that all existing references to that lvalue promise not to notice (or care) that the lvalue has been moved from. void foo() { std::vector<A> v(10); v.insert(v.end(), std::move(v.front())); } The above is a run time error similar to: void foo() { std::vector<A> v(10); v.insert(v.end(), v.begin(), v.begin()+1); } (the latter is currently forbidden by 23.1.1, p4) --- Now let's look at the other side: void foo() { A a; std::vector<A> v; v.insert(v.end(), std::move(a)); // Here "a" is required to be in a valid state, but the value of "a" is unknown } The definition of "valid" is up to the author of A. At a minimum, generic code will need to destruct and/or assign "a" a new value. However the author of A is free to define as part of the class invariant: After an A is moved from, you can't call A::bar() I don't think that is a very good design myself. But I also don't think the standard should prohibit it. For std::types I do not want to see any such restrictions on moved- from values. For example if a vector gets moved from, I expect it to still be fully functional, just with an unknown state (but you can test the state with the existing interface, e.g. size(), capacity(), empty(), etc.), and set the state however you like. With perhaps a few exceptions, I do not want to see the value of a moved-from std::type specified. For example: std::string s1("123"); std::string s2(std::move(s1)); // value of s1 not reliable here. "123" and "" are two good guesses --- Originally I preferred vector move assignment as just swap(). However I've recently become convinced that this is not the best definition. I now prefer the semantics of clear(); swap();. This means that move assignment is O(N) instead of O(1). Ok Ion, take a deep breath and stay with me for just a minute longer. :-) The semantics of ~thread() is going to be cancel(); detach();. That is, if thread A is holding a std::thread referencing thread B, and thread C cancels thread A, as thread A's stack unwinds, it will execute b.~thread() which cancels B, but does not wait around for B to respond to the cancel. Thus canceling A will not hang waiting on B to finish up. Now consider vector<thread>: vector<thread> v1; ... vector<thread> v2; ... v1 = std::move(v2); After the move assign, I think it best that whatever threads v1 happened to previously refer to, are now busy canceling themselves, instead of continuing to run under v2's ownership. If you want the latter, you've got swap. Cost analysis: It isn't nearly as bad as it sounds. Promise!!! :-) First, for vector<type with trivial destructor>, clear() is O(1) in practice (or it should be). And clear() does not dump capacity, so the capacity gets reused via the swap. So we only have to worry about vector<type with non-trivial destructor>. Consider vector<vector<thread>>::insert. And we're going to insert a new vector<thread> at begin(). The case where there is insufficient capacity only uses move construction, and not move assignment. So assume there is sufficient capacity. In this case we first move construct *--end() into *end(). This move construction leaves a zero capacity vector<thread> at *--end() (though that value should not be guaranteed, there is really little else vector can do for move construction). The next step is: *(end()-1) = std::move(*(end()-2)); Now we already said that *(end()-1) is a zero capacity vector<thread> prior to this move assignment. Therefore when the move assignment clears it, that clear is a no-op. Then the states are swapped, making *(end()-2) a zero capacity vector (again, that state should not be guaranteed, but it is practical). Repeat: *(end()-2) = std::move(*(end()-3)); Again we are move assigning into a zero-capacity vector, so clear();swap(); and just swap(); are virtually identical sequences of instructions. This repeats all the way down to: *(begin()+1) = std::move(*begin()); leaving *begin() as a zero capacity vector<thread>. Then the outside vector<thread> is moved assigned into *begin(). The outside temporary vector<thread> is then a zero capacity vector and presumably destructs. Summary: For vector<vector<thread>>::insert, defining vector move assignment as clear()+swap() instead of just swap() has a nearly identical cost. There are no extra destructions. I've gone through the same analysis with vector::erase, and all sequence modifying algorithms in <algorithm> which make use of move semantics. During all of these generic algorithms, move assignment is always assigning to a source that has already been moved from. Thus "clearing" the source is a no-op. And I consider the container::insert/erase and <algorithm>'s as one of the biggest consumers of move semantics, and even more importantly, as typical use cases of move semantics. In general, if you are move assigning *to* something. You've probably already move assigned or move constructed *from* it. That being said, if you haven't already moved from the target, and the target is a std::container or other std::type which owns objects, I think a "clear" is prudent so that move assignment has the same semantics as copy assignment. Getting back to shared_ptr move assignment, I would like to see the target's reference count decremented if it is not already at 0 prior to the move assignment (just as in copy assignment - this is the "clear" part of the algorithm). I would also like to not see any *new* constraints placed on the source as a result of being moved from. And finally I do not see the need to specify the value of the source after the move. I would like vendors to have the freedom to implement shared_ptr move assignment exactly as they do shared_ptr copy assignment. If atomic operations can be avoided (for example) by assuming that the source is a temporary (under move assignment) then so much the better. -Howard
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk