Subject: Re: [boost] [smart ptr] Any interest in copy-on-write pointer for C++11?
From: Ralph Tandetzky (ralph.tandetzky_at_[hidden])
Date: 2013-02-13 10:46:37
SUMMARY
=======
I would like to summarize the discussion about copy-on-write so far:
* I wrote a copy-on-write-pointer implementation
<https://github.com/ralphtandetzky/cow_ptr.git> with the following use
cases:
1. A const-correct pimpl pointer.
2. Helper class for implementing copy-on-write for higher level
structures, where copying is expensive. (E.g. matrix or image classes)
3. cow_ptr<Base> wraps polymorphic classes giving them genuine value
semantics, so they can be put into standard containers, even if Base
is abstract.
4. It can be used to add cloning to a class hierarchy non-intrusively.
* Thread-safety was discussed. (brought up by Alexey)
- The reference counter is atomic.
- All constant operations on cow_ptr and its pointee are thread-safe
as long as const operations on the pointee are thread-safe.
- If const operations on the pointee are thread-safe, and if it is
safe to write to the pointee from one thread as long as no other
thread is reading or writing, then the same holds for each individual
cow_ptr pointing to that object and for all access through these cow_ptrs.
* Is cow still necessary? (brought up by Mathias)
- Since in C++11 you can move objects cheaply instead of copying
them, an important use case of copy-on-write is gone. Before
move semantics, returning objects by value was sometimes a serious
performance issue. Copy-on-write solved that.
- Cow is still useful today for matrix classes or image classes or
even trees that share state under the hood, but should not influence
each other when writing.
- Example: If you want to implement a property tree, you can use the
approach
    class PropertyTree : public AbstractProperty
    {
    public:
        /* implementation of public interface. */
    private:
        // Children are shared between copies until one of them is modified.
        std::list<cow_ptr<AbstractProperty>> properties;
    };
- Having this you can keep a history of a big property tree in
memory easily.
    std::vector<PropertyTree> history;
    auto current = history.back();
    current.modify();
    history.push_back( current );
* Is COW unsafe? (brought up by Mathias)
- COW is sometimes considered unsafe. That's why the C++11 standard
no longer allows COW implementations of std::string.
- The code
    std::string a("Hello world!");
    char * p = &a[11];
    std::string b( a );
    *p = '.'; // modifies a and b, if std::string was
              // implemented using COW.
does not work correctly for COW implementations of std::string.
- The reason this does not work is the escaped pointer. When
escaping pointers are strictly avoided, this effect cannot happen.
Therefore cow_ptr does not provide a non-const version of the get()
member function, but a member function modify() (formerly known as
apply()) which can be used in the following way:
    cow_ptr<MyType> p( new MyType );
    auto q = p;
    p.modify( [&]( MyType * p ){ p->doSomething();
                                 p->doSomethingElse(); } );
    COW_MODIFY(p) { p->doSomething(); p->doSomethingElse(); };
    // equivalent to the modify() call above
- It is still possible for a pointer to escape, but the interface is
such that it is easy to use it correctly and hard to use incorrectly.
- The interface design of std::string makes it impossible to
implement it correctly with COW. Hence COW must be considered during
the interface design phase of a class.
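
To make the modify() idea concrete, here is a minimal sketch. It is not
the cow_ptr implementation from the repository: simple_cow_ptr, Counter
and shares_with() are hypothetical names, the sketch sits on top of
std::shared_ptr, it supports neither polymorphic cloning nor custom
deleters, and its use_count() check assumes single-threaded use.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical, simplified stand-in for cow_ptr (assumption: T is
// copy-constructible and no other thread touches the same object).
template <typename T>
class simple_cow_ptr {
public:
    explicit simple_cow_ptr(T value)
        : ptr_(std::make_shared<T>(std::move(value))) {}

    // Read access never copies and only hands out const access.
    const T* get() const { return ptr_.get(); }
    const T* operator->() const { return ptr_.get(); }
    const T& operator*() const { return *ptr_; }

    // Write access: detach from other owners first, then run the
    // callback. No non-const pointer is returned to the caller.
    template <typename F>
    void modify(F&& f) {
        if (ptr_.use_count() > 1)
            ptr_ = std::make_shared<T>(*ptr_);
        std::forward<F>(f)(ptr_.get());
    }

    // Helper for the demonstration only: do both share one pointee?
    bool shares_with(const simple_cow_ptr& other) const {
        return ptr_ == other.ptr_;
    }

private:
    std::shared_ptr<T> ptr_;
};

struct Counter { int value = 0; };

// Returns true if modifying p left the earlier copy q untouched.
inline bool cow_modify_demo() {
    simple_cow_ptr<Counter> p(Counter{});
    simple_cow_ptr<Counter> q = p;  // cheap: both share one Counter
    p.modify([](Counter* c) { c->value = 42; });  // p detaches here
    return p->value == 42 && q->value == 0;
}
```

Because the non-const pointer only exists inside the callback, it takes
deliberate effort to let it escape, which is the property argued for above.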
* Alternatives to COW (brought up by Mathias)
- C++11 move and cloning.
- Most often, unnecessary copies can be avoided using C++11
move-semantics and cloning where necessary.
- Flyweight factory.
- Objects are accessed by a hash value. There's always only one
copy of identical objects. For complex objects that are modified
often, recalculating the hash and synchronizing the hash table
thread-safely can be a bad performance bottleneck.
- shared_ptr<T const>
- Even with shared_ptr<T const> you never know whether there's a
shared_ptr<T> object (non-const) through which the pointee is
modified. shared_ptrs are really shared. Using shared_ptr to
implement COW is likely more error-prone. If T is a polymorphic
class but does not have a clone() member function, then cloning
will not work properly because of slicing. shared_ptr is useful
for many things, but it's probably not the best tool to
implement COW.
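
Both shared_ptr pitfalls mentioned above can be reproduced with standard
library calls only. The types Shape and Triangle and the two function
names are made up for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Pitfall 1: a shared_ptr<T const> does not guarantee that nobody else
// holds a non-const shared_ptr<T> to the same object.
inline std::string shared_const_demo() {
    std::shared_ptr<std::string> writer =
        std::make_shared<std::string>("original");
    std::shared_ptr<const std::string> reader = writer;  // adds const only
    *writer = "changed";  // the "const" reader observes this write
    return *reader;
}

struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const { return 0; }
};
struct Triangle : Shape {
    int sides() const override { return 3; }
};

// Pitfall 2: copying through the static type slices a derived pointee.
inline int sliced_copy_demo() {
    std::shared_ptr<const Shape> original = std::make_shared<Triangle>();
    // Only the Shape part is copied; the result is no longer a Triangle.
    std::shared_ptr<Shape> copy = std::make_shared<Shape>(*original);
    return copy->sides();
}
```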
* The name (brought up by Peter)
- cow is an acronym and lower case. It's a farm animal ... enticing
me to write member function names like "moo". The name does not
reflect the ability to contain polymorphic value pointers. (Peter)
- clone_on_write<T> would be a suggestion of mine. It might be
useful to drop the _ptr suffix completely, since the class has value
semantics.
- Others have suggested splitting cow_ptr<T> into read_ptr<T> and
write_ptr<T> classes.
* Slicing problems (brought up by Vincente)
- The constructor taking a Y * pointer might lead to slicing
problems, if the pointee is not a Y object, but something derived.
- The default_copier will make a runtime-check assert( typeid(*p) ==
typeid(Y) ).
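
A sketch of what such a run-time check could look like. checked_copier
and is_exact_type are hypothetical names, not the actual default_copier
from the repository. Note that typeid only inspects the dynamic type if
Y is polymorphic; for non-polymorphic Y the check is always true, and
the assert vanishes when NDEBUG is defined.

```cpp
#include <cassert>
#include <memory>
#include <typeinfo>

// True if *p is exactly a Y and not an object of a class derived from Y.
template <typename Y>
bool is_exact_type(const Y* p) {
    return typeid(*p) == typeid(Y);
}

// Hypothetical copier: copies through Y's copy constructor and asserts
// in debug builds that this does not slice a derived object.
template <typename Y>
struct checked_copier {
    std::unique_ptr<Y> operator()(const Y* p) const {
        assert(is_exact_type<Y>(p) && "copy would slice a derived object");
        return std::unique_ptr<Y>(new Y(*p));
    }
};

struct Base { virtual ~Base() = default; };
struct Derived : Base { int extra = 7; };
```

With these types, is_exact_type<Base>() accepts a pointer to a Base
object and rejects a pointer to a Derived object passed as Base *.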
* Comparison to adobe::copy_on_write<T>
<http://cppnow.org/session/value-semantics-and-concepts-based-polymorphism/>
(brought up by Andreas)
- This class is constructed by moving a T object into itself.
Copying is implemented as cheap copy of a pointer with reference
counting.
- Other than constructors, destructors and assignment operators
there are only the public member functions read() and write().
read() returns a const reference to the contained object. write()
makes an internal copy if the reference count is greater than 1,
and then returns a non-const reference to the contained object.
- The class does not support cloning for polymorphic T, but always
uses the copy-constructor of T in order to copy.
- Hence the class interface is extremely simple.
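
The read()/write() interface described above can be sketched in a few
lines. This is an illustration built on std::shared_ptr, not Adobe's
implementation; copy_on_write_sketch and escaped_reference_demo are
hypothetical names, and the use_count() check assumes single-threaded
use.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

template <typename T>
class copy_on_write_sketch {
public:
    // Constructed by moving a T into the wrapper.
    explicit copy_on_write_sketch(T value)
        : p_(std::make_shared<T>(std::move(value))) {}

    // Copying the wrapper copies only the pointer (implicit copy ctor).

    const T& read() const { return *p_; }

    // Copies the pointee with T's copy constructor if it is shared.
    T& write() {
        if (p_.use_count() > 1)
            p_ = std::make_shared<T>(*p_);
        return *p_;
    }

private:
    std::shared_ptr<T> p_;
};

// Shows the price of returning a reference from write(): a reference
// kept across a later copy bypasses the use_count() check.
inline bool escaped_reference_demo() {
    copy_on_write_sketch<std::vector<int>> a(std::vector<int>{1, 2, 3});
    std::vector<int>& ref = a.write();  // unique owner, so no copy is made
    copy_on_write_sketch<std::vector<int>> b = a;  // b now shares with a
    ref[0] = 99;                        // writes to the shared pointee
    return b.read()[0] == 99;           // b was modified as well
}
```

In normal use (calling write() afresh for every modification) copies stay
independent; the second function only illustrates why a callback-style
modify() may be preferable to handing out references.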
* Comparison to value_ptr<T>
<http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3339.pdf>
in N3339 (open-std) (brought up by Vincente)
- Basic properties: A value_ptr<T> mimics the value semantics of its
pointee. Hence the pointee lifetime is the pointer lifetime, and the
pointee is copied whenever the pointer is copied. Internally the
pointee can be of a derived class of T. In this case the object is
cloned properly.
- Hence value_ptr<T> has the use-cases 1, 3 and 4 of cow_ptr<T>, but
does not implement copy-on-write (use case 2).
- value_ptr has the cloner and the deleter as template arguments of
the class. The current implementation of cow_ptr only has the
pointee type as template parameter. The cloner and deleter are
stored dynamically.
- value_ptr does not have a reference counter.
- Other than that, value_ptr<T> and cow_ptr<T> are extremely similar
in their public interface.
- In conjunction with copy_on_write<T> this can be used to do the
same stuff as cow_ptr<T> does. The way to use it would be
copy_on_write<value_ptr<T>>.
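
The following sketch shows how a deep-copying pointer can clone a
polymorphic pointee without a clone() member function (use cases 3 and
4). It stores the cloner dynamically as a function pointer, as the
summary says cow_ptr does, whereas N3339's value_ptr takes the cloner as
a template argument. cloning_ptr, Animal and Dog are hypothetical names;
the sketch assumes that T has a virtual destructor and that Y derives
from T non-virtually.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

template <typename T>
class cloning_ptr {
public:
    // The dynamic type Y is known here, so a cloner for Y is recorded.
    template <typename Y>
    explicit cloning_ptr(std::unique_ptr<Y> p)
        : ptr_(std::move(p)),
          clone_([](const T* src) -> T* {
              return new Y(*static_cast<const Y*>(src));
          }) {}

    // Copying the pointer deep-copies the pointee with the stored cloner.
    cloning_ptr(const cloning_ptr& other)
        : ptr_(other.ptr_ ? other.clone_(other.ptr_.get()) : nullptr),
          clone_(other.clone_) {}
    cloning_ptr(cloning_ptr&&) noexcept = default;
    cloning_ptr& operator=(cloning_ptr other) noexcept {
        ptr_ = std::move(other.ptr_);
        clone_ = other.clone_;
        return *this;
    }

    T* operator->() { return ptr_.get(); }
    const T* operator->() const { return ptr_.get(); }

private:
    std::unique_ptr<T> ptr_;
    T* (*clone_)(const T*);
};

// Abstract base without any clone() member.
struct Animal {
    virtual ~Animal() = default;
    virtual std::string sound() const = 0;
    std::string label;
};
struct Dog : Animal {
    std::string sound() const override { return "woof"; }
};

// Copies into a standard container keep their dynamic type.
inline std::string container_demo() {
    cloning_ptr<Animal> a(std::unique_ptr<Dog>(new Dog));
    std::vector<cloning_ptr<Animal>> zoo;
    zoo.push_back(a);  // deep copy; the element is still a Dog
    return zoo[0]->sound();
}
```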
* pointer-semantics or value-semantics and nullptr (brought up by Vincente)
- Should the COW-class be nullable? If not, then it should probably
not be called cow_ptr.
- This question has not been discussed to the end yet. Personally, I
don't think that null-cow_ptrs are very useful.
* Different member and non-member functions (brought up by Vincente)
- relational operators (brought up by Vincente)
- It is not clear, whether operator==() on cow_ptrs should only
compare pointers or also pointees. This would depend on whether
the COW-class is considered a pointer or a genuine value.
- release()
- Should not exist, because the caller would not know what
deleter to call. (similar to shared_ptr)
- reset()
- Will be implemented in order to provide the performance benefits.
* The write_ptr<T> and read_ptr<T> solution (brought up by Peter)
- read_ptr<T> would be similar to shared_ptr<T const> and
write_ptr<T> would be a unique_ptr<T> equivalent. read_ptr<T> has a
member function which returns a write_ptr<T> through which the
pointee can be modified. Afterwards the write_ptr<T> can be moved
back into the read_ptr<T>:
    read_ptr<T> pr;
    if ( write_ptr<T> pw = pr.write() )
    {
        pw->modify();
        pr = std::move( pw );
    }
- This possibly provides a better separation of concerns (safer,
clearer, more flexible).
- However, the above code is not exception-safe if pr becomes a
nullptr when the write() function is called: if pw->modify() throws,
the pointee is destroyed together with pw and pr stays null. This
makes exception-safe code harder to write.
- In case the use_count is greater than 1: Should pr.write() make
the copy? Or should pw.operator->() make the copy? This is not
sufficiently discussed yet.
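
Since these questions are still open, the following is only one possible
reading, sketched with standard smart pointers. read_ptr_sketch,
write_ptr_sketch, begin_write and edit_demo are hypothetical names. The
sketch assumes that starting a write always copies the pointee and
leaves the read pointer untouched. That costs a copy even when the
pointee is not shared, but the read pointer keeps its old value if the
modification throws.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

template <typename T>
using read_ptr_sketch = std::shared_ptr<const T>;
template <typename T>
using write_ptr_sketch = std::unique_ptr<T>;

// Assumption: the copy is made here and pr itself is left untouched.
template <typename T>
write_ptr_sketch<T> begin_write(const read_ptr_sketch<T>& pr) {
    if (!pr)
        return write_ptr_sketch<T>();
    return write_ptr_sketch<T>(new T(*pr));
}

// Edits a string through a write pointer; optionally fails midway.
inline std::string edit_demo(bool fail) {
    read_ptr_sketch<std::string> pr = std::make_shared<std::string>("stable");
    try {
        if (write_ptr_sketch<std::string> pw = begin_write(pr)) {
            pw->append(", edited");
            if (fail)
                throw std::runtime_error("edit failed");
            pr = std::move(pw);  // commit: shared_ptr takes over ownership
        }
    } catch (const std::runtime_error&) {
        // pw and its half-finished edit are gone; pr still holds "stable".
    }
    return *pr;
}
```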
Thank you for all your constructive feedback!
Ralph