|
Boost : |
From: Larry Evans (jcampbell3_at_[hidden])
Date: 2003-05-28 12:53:35
Chuck Messenger wrote:
[snip]
> collections, put them in object heirarchies, etc). This freedom should
> ideally apply both internally (within library L code) and most
> importantly, externally (in the code of users of library L). Crucially,
Would you require the users to use a smart pointer instead of a raw?
If not, then you're only option, that I can tell, is use a conservative
collector like Boehm's.
> {C} is recursive -- circular references are normal, in particular
> externally (which we can't control). I want to ensure the destruction
> of any {C} objects which are not ultimately reachable externally.
> Periodically invoking a garbage-collector is fine.
>
> -------------------------------------------------
>
> boost/libs/smart_ptr/src/sp_collector.cpp:
>
> There is no sample program to compile/run, so I have to guess somewhat
> at how to use this one.
>
> The basic idea is to intercept all memory allocations -- p = new X; --
> saving the info with "map[p] = sizeof(X);". To find the
> interconnections between objects, you do this:
>
> for (map_type::iterator it = map.begin(); it != map.end(); ++it) {
> char *p = it->first;
> unsigned size = it->second;
> for (char *v = p; v + 4 <= p + size; v += 4) {
> char *q = *(reinterpret_cast<char **>(v));
> if (map.count(q)) {
> // connection detected from object p to object q
> }
> }
> }
This is similar to Mirek Fidler's code:
http://groups.google.com/groups?q=OGC2+-+major+improvement+group:comp.lang.c%2B%2B.moderated&hl=en&lr=&ie=UTF-8&group=comp.lang.c%2B%2B.moderated&selm=3DCD36C3.4090502%40nospam_prodigy.net&rnum=1
>
> That is, for each known object p, you scan its allocated memory,
> examining each 4-byte chunk, q. If map[q] exists, then you assume q
> refers to an object, and therefore, that a connection exists from p to
> q. Given this map, you can detect stranded pools of objects (using
> dynamic programming, presumably, although I don't see any in
> sp_collector.cpp). You then run this algorithm periodically to detect &
> destroy stale objects.
>
> You might object that you'll sometimes mis-identify q as an object, when
> in fact it is just some random data. True, although tricks can be (and
> are, in sp_collector.cpp) used to minimize this to a vanishingly small
> likelihood (i.e. by marking legit pointers with a signature byte
> sequence). The consequence of mis-identification is that you may fail
> to destroy some unused objects. If this leads to nothing worse than
> some leaked memory, then it's not a real problem -- it will be a
> vanishingly small amount of memory.
This is the justification for Boehm's conservative collector.
http://www.hpl.hp.com/personal/Hans_Boehm/gc/
>
> Another problem is, how do you track external references to your
> objects? Indeed, that's the whole goal -- to find which objects are
> ultimately referenced from the "outside world" (and to destroy the
> rest). I don't see how sp_collector.cpp does this. Sample code would
> be nice -- then I could quickly watch in the debugger and see. Perhaps
> it depends on the fact that only boost::shared_ptr's are supported, and
> these store their info in allocated memory. I don't know - just
> speculating.
>
> One big problem with this approach is that you end up having to scan all
> of your memory. This could (and for me, would) be an outrageous
> proposition, as only a tiny portion of memory relates to my object set.
> Most of it will be raw data (e.g. images, etc).
This is what Christopher's (alias= global rc mark-scan) algorithm does.
It's also what mark-sweep algorithms do, i.e. they have to mark all reachable
objects, and then scan thru the whole heap sweeping the unmarked into the
garbage collector. However, since this global scan is done, usually,
infrequently, the time isn't that big a factor, at least compared with the
updating of reference counts during each pointer copy.
That's why Boehm's gc should work faster than refcounting.
>
> ------------------------------------------------------------------------
>
> My own semi-uber pointer implementation (concept):
>
[snip]
> Still, that's something. It would be necessary to implement your own
> containers for {C} objects, for use within I:C objects -- these
> containers would themselves be {C} objects. A very simple container you
> get "for free" is an array, which might be enough for some problem
> spaces. Note that externally, you could use regular STL containers, or
> anything you want.
This is the solution implemented (by simply deriving from the stl container)
by an earlier upload to boost...files/shared_cyclic_ptr. I've just
uploaded them again in file, shared_cyclic_ptr.zip. Explanations
on output of test are in iplimits.out. The solution for STL containers
is in template scoped_cyclic_container in file shared_cyclic_ptr.hpp.
[snip]
> ---
>
> Note: any {C} objects on the stack will be considered to be external
> (hence, not garbage-collected), even if they are on the stack in
> "internal" (i.e. library) code. That is exactly as it should be. After
> all, we can't go deleting objects which are being actively referenced!
This is the reason for the "global rc mark-scan" (the one used by cyclic_ptr)
algorithm working.
>
> ----------------------------------------------------------------------
>
> cyclic_ptr.hpp in Boost contributions
>
> I studied this a while, but must confess I couldn't wrap my head around
> it. I did find one interesting thing: the way cyclic_ptr "traces"
> pointer references is by doing the following:
>
> T *p = reinterpret_cast<T*>(raw memory pointer);
> *p = *p;
>
> that is, by copying an object onto itself. This causes copy
> constructors to be called on all contained objects. The copy
> constructors are the "hook" by which any embedded cyclic_ptr's are
> detected.
Don't you mean the operator=(T const&)? At least that's the
way I understand it.
[snip]
> -----------------------------------------------------------------------
>
> NoPtr lib -- noptrlib.sourceforge.net
>
> From what I can tell, this library is not at all suitable for what I
> want. It appears to be a set of traditional smart pointers -- with
> components which are somewhere between auto_ptr and shared_ptr. I saw
> no mention of detecting dead cyclic object pools.
>
The users of such a "sole-owner" gc method believe this happens rarely:
http://suif.stanford.edu/pipermail/suif-talk/2000-January/000491.html
However, I agree that it can happen, and if it can, it will (that's my
experience).
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk