Subject: Re: [boost] Review of a safer memory management approach for C++?
From: Bartlett, Roscoe A (rabartl_at_[hidden])
Date: 2010-06-04 15:46:53


David,

I think we are converging ...

> -----Original Message-----
> From: David Abrahams [mailto:dave_at_[hidden]]
> Sent: Friday, June 04, 2010 12:06 PM
> To: Bartlett, Roscoe A
> Cc: boost_at_[hidden]
> Subject: Re: Review of a safer memory management approach for C++?
>
> ...
>
> > I am not trying to put words in your mouth. I am just trying to pin
> > down your argument/position. You can say that I have a "bias" for
> > good software if you would like :-)
>
> I'm not sure what that is supposed to mean. Isn't it safe to assume
> that everyone here has that bias?

[Bartlett, Roscoe A]

It was a joke (a bad one I guess).

> > As argued in the Teuchos MM document
> >
> > http://www.cs.sandia.gov/~rabartl/TeuchosMemoryManagementSAND.pdf
> >
> > these boost classes are a good start but they are *not* sufficient
> > to build a safe and effective memory management system that
> > encapsulated *all* raw pointers. What is missing for single
> > objects, as described in Section 5.14 and other sections, is the
> > Teuchos::Ptr class. Without integrating a class like Ptr in with
> > the reference-counted pointer classes, you can't build a safe and
> > effective memory management system in C++ that comes closer to
> > eliminating undefined behavior.
>
> Yes, wrapping every pointer and adding debug-mode-only checks for
> invalid usage is one way to help programmers debug their code. In a
> strict sense, though, eliminating undefined behavior requires keeping
> the checks on in optimized code.

[Bartlett, Roscoe A]

We are getting at some core issues here. In order to eliminate bad undefined behavior in production code using the Teuchos MM approach, we have to assume the following:

1) Very good unit, integration, and other verification tests exist such that if they pass with flying colors in a debug-mode build, then the behavior of the software is shown to be likely free from undefined memory usage behavior.

2) Validation of user input is left in the production code at major system boundaries to ensure that preconditions are not being violated. Therefore, if the code gets past accepting the user input, there is a very high probability that it will run free of undefined behavior (i.e., behavior that can cause segfaults if we are lucky but much worse if we are unlucky).

If #1 and #2 are satisfied, then there is a very high probability that the non-debug optimized version of the code will also be free of undefined memory usage behavior. The Teuchos MM classes are designed such that valid programs without debug-mode checking turned on will also be valid programs when debug-mode checking is turned on. This is key.
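
To make the idea concrete, here is a minimal sketch (purely illustrative, and not the actual Teuchos implementation) of how debug-mode-only checking can be layered onto a pointer wrapper so that an optimized build pays nothing while a debug build catches invalid usage; the TEUCHOS_DEBUG macro is used here only as a stand-in for whatever debug-mode switch is configured. A program that never trips the checks behaves identically in both builds, which is the property described above.

  // Purely illustrative sketch (not the actual Teuchos implementation):
  // dereference is checked only in a debug build and compiles down to a
  // raw pointer access in an optimized build.
  #include <cassert>
  #include <cstddef>

  template<typename T>
  class checked_ptr {
  public:
    explicit checked_ptr(T* p = NULL) : p_(p) {}
    T& operator*() const
    {
  #ifdef TEUCHOS_DEBUG   // stand-in for the configured debug-mode switch
      assert(p_ != NULL && "dereference of a null pointer");
  #endif
      return *p_;        // valid programs never reach here with p_ == NULL
    }
    T* operator->() const { return &(operator*()); }
  private:
    T* p_;
  };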

> By the way, I actually like undefined behavior. It creates a clear
> boundary between valid and invalid usage. If all usages have defined
> behaviors, then neither the compiler nor runtime tools are allowed to
> interfere with such usages to help the user detect bugs. So I'm not
> in favor of eliminating it, at least not without some other measure of
> what's illegal.

[Bartlett, Roscoe A]

I am fine with the language allowing for undefined behavior. Clearly if you want the highest performance, you have to turn off array-bounds checking, for instance, which allows for undefined behavior. What I am not okay with is people writing programs that expose that undefined behavior, especially w.r.t. usage of memory. Every computational scientist has had the experience of writing code that appeared to work just fine on their main development platform, but when they took it over to another machine for a "production" run on a large (expensive) MPP with 1000 processors, it segfaulted after having run for 45 minutes and lost everything. This happens all the time with current CSE software written in C, C++, and even Fortran. This is bad on many levels.
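
For example, here is the classic kind of latent error I am talking about: the off-by-one write below is undefined behavior that may appear to work just fine on one platform (silently scribbling on adjacent memory) and corrupt the stack, crash, or worse on another.

  #include <iostream>

  int main()
  {
    double x[10];
    for (int i = 0; i <= 10; ++i)   // bug: should be i < 10
      x[i] = 0.0;                   // i == 10 writes past the end (UB)
    std::cout << x[0] << "\n";
    return 0;
  }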

Are you suggesting that people should write programs that rely on undefined behavior or are you just saying that you don't think it should be eliminated from the language?

> > Also, as described in Section
> > 5.9.2, separating weak_ptr from shared_ptr is not ideal from a
> > design perspective since it reduces the generality and reusability
> > of software (see the arguments for this in the document).
>
> I see the arguments. They boil down to, “there are complex cases
> where weakness needs to be determined at runtime rather than compile
> time.” IMO any design that uses reference counting but can't
> statically establish a hierarchy at ownership is broken. Even if the
> program works today, it will, eventually, acquire a cycle that
> consists entirely of non-weak pointers, and leak. So that *shouldn't*
> be convenient to write, so it's a problem with Teuchos that it
> encourages such designs. The fact that shared_ptr and weak_ptr are
> different types is a feature, not a bug. Furthermore if you
> really-really-really-really need to do that with boost, it's easy
> enough to build a type that acts as you wish
> (e.g. boost::variant<shared_ptr<T>, weak_ptr<T> >).

[Bartlett, Roscoe A]

I have concrete use cases where it is not obvious whether an RCP should be strong or weak w.r.t. a single class. One use case is where a Thyra::VectorBase object points to its Thyra::VectorSpaceBase object, but where the scalar product of the Thyra::VectorSpaceBase object can be defined to be a diagonal positive-definite matrix whose diagonal is another Thyra::VectorBase object. This creates a circular dependency between the diagonal VectorBase and the VectorSpaceBase objects that can only be resolved by making the RCP pointing to the VectorSpaceBase object weak. This is an unusual use case that the developers of the VectorBase and VectorSpaceBase subclasses should not have to worry about or change their designs to handle. There are other examples too that I can present in other Trilinos packages.

This point is debatable, but in having to choose between shared_ptr and weak_ptr you are injecting notions of memory management into the design of a class that in many cases are orthogonal to the purpose of the class. This technically violates the Single Responsibility Principle (SRP). By using RCP consistently, you don't violate SRP or the Open-Closed Principle (OCP): external collaborators make the final decision about strong or weak ownership.

We can agree to disagree, but I can show other examples where the ability of an RCP to be weak or strong (decided at runtime) solved a circular-reference problem without damaging the cohesion of the classes involved or requiring a single line of code in those classes to be changed. You can't do that with shared_ptr and weak_ptr without something like boost::variant<shared_ptr<T>, weak_ptr<T> > (which would require changing the code that currently uses shared_ptr or weak_ptr).
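
As a rough sketch of what I mean (using simplified stand-in classes, not the real Thyra interfaces, and assuming the Teuchos::RCP member function create_weak(), which returns a weak RCP to the same underlying object), the external collaborator that wires the two objects together is the one that decides which handle is weak:

  #include "Teuchos_RCP.hpp"

  struct VectorSpace;                           // hypothetical stand-ins
  struct Vector {
    Teuchos::RCP<const VectorSpace> space;      // vector -> its space
  };
  struct VectorSpace {
    Teuchos::RCP<const Vector> scalarProdDiag;  // space -> diagonal vector
  };

  void wireUp(const Teuchos::RCP<Vector> &v,
              const Teuchos::RCP<VectorSpace> &vs)
  {
    vs->scalarProdDiag = v;        // strong: the space owns the diagonal
    v->space = vs.create_weak();   // weak: the cycle is broken here, by
                                   // the collaborator, not by changing
                                   // either class's design
  }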

> > Did you read at least the sections called out in the reduced table
> > of contents given in the Preface? I think these sections should
> > address all of these questions (and more).
>
> I am afraid I haven't had time, and probably won't until the end of
> next week.

[Bartlett, Roscoe A]

I would be grateful to get feedback on the material.

> > > > When runtime performance or other issues related to dynamic
> > > > allocations and classic OO are not a problem, classic OO C++
> > > > programs using runtime polymorphism are typically superior to
> > > > highly templated C++ programs using static polymorphism
> > >
> > > Straw man; I'm not arguing for “templated C++ programs using static
> > > polymorphism.”
> >
> > [Bartlett, Roscoe A]
> >
> > Okay, the runtime vs. compile-time polymorphism debate is another
> > issue (but a very important one). Are you instead arguing for a
> > static and stack-based approach to programming?
>
> No, I'm arguing for a value-based approach. If you're strict about
> it, you even turn references into values using type erasure… although
> doing so is often unfortunately so tedious that I wouldn't insist on
> it.

[Bartlett, Roscoe A]

Value types, by definition, involve deep copy, which is a problem with large objects. If a type uses shallow-copy semantics to avoid the deep copy, then it is really no different from using shared_ptr or RCP. If you store raw references, you are just doing the same thing as using RCP and shared_ptr except that you have no safety mechanism to catch dangling references (which you do with weak RCPs, but not with always-strong shared_ptrs). Raw (or otherwise dumb) references are *not* the solution for addressing sharing of large objects in persisting associations (value types or not).
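
To illustrate with the Boost classes being discussed: a weak_ptr can at least detect that the shared object has gone away, whereas a stored raw pointer or reference simply dangles with no way to detect it.

  #include <boost/shared_ptr.hpp>
  #include <boost/weak_ptr.hpp>
  #include <iostream>
  #include <vector>

  int main()
  {
    boost::weak_ptr<std::vector<double> > weak;
    double *raw = 0;
    {
      boost::shared_ptr<std::vector<double> > big(
        new std::vector<double>(1000000, 1.0));
      weak = big;
      raw = &(*big)[0];
    } // 'big' (and the vector it owns) is destroyed here

    if (boost::shared_ptr<std::vector<double> > locked = weak.lock())
      std::cout << (*locked)[0] << "\n";
    else
      std::cout << "shared object is gone (detected)\n";  // this runs

    // Dereferencing 'raw' here would be undefined behavior; nothing can
    // detect that it dangles.
    return 0;
  }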

> See http://stlab.adobe.com/wiki/index.php/Papers_and_Presentations,
> particularly
>
> * Classes that work video: http://my.adobe.acrobat.com/p53888531/
>
> * Thread Safety pdf (expanded version of slides in “Classes that
> work”):
>
> http://stlab.adobe.com/wiki/index.php/Image:2008_09_11_thread_safety.pdf
>
> * A Possible Future of Software Development video:
> http://www.youtube.com/watch?v=4moyKUHApq4

[Bartlett, Roscoe A]

I will take a look at these references.

Clearly sharing mutable objects is a problem in multi-threaded programs, but most OO designs would push threading into lower-level functionality, and therefore threading would not typically affect top-level software architecture (where OO is typically the best approach to glue everything together). Clearly stack-based approaches have fewer problems in threaded programs where the object is created, used, and destroyed within a single thread.

Cheers,

- Ross

