Boost logo

Boost :

From: Larry Evans (cppljevans_at_[hidden])
Date: 2007-11-21 10:50:42


On 11/21/07 02:41, Mingnan Guo wrote:
>
> 11/21/07 014:03 "Larry Evans" <cppljevans_at_[hidden]> wrote:
> > On 11/20/07 07:21, Mingnan Guo wrote:
>>> The HnxGC Library for C++ is released at
>>>
>>> http://hnxgc.harnixworld.com
[snip]

> So, based on the HnxGC algorithm, I am answering your questions as
> follows:

> (1)
> > which contains:
> >
> > More specifically, we use reference counting technique to
count the
> > number of references from outside of the managed heap. When the
> >
> > suggests a similarity to the three-phase cyclic refcounting garbage
> > collectors mentioned on p. 62 of the 1996 edition of:

> I did *NOT* suggest such similarity.

I apologize. I used the wrong word. I should have said "it sounds
like it may be similar to the three-phase ...".

> In HnxGC, the counts reflecting external pointers to an object is
> directly maintained by CLockedPtr<> smart pointer throughout the
> execution of application. At any point of time, we maintain a
> count for every object reflecting the number of reference from
> root set (external pointers). So, when performing tracing
> collection, the root objects have already identified.

> In your quotation of three-phase cyclic refcounting algorithm, the
> count only reflect external pointers to nodes at the end of the
> first traversal, which I think it means the counts are calculated
> when performing tracing collection

Yes. That's also one reason for what I think maybe the slowness,
although I hadn't benchmarked it vs. Boehm's collector. Actually I
did have, at one time, a benchmark of Lin's method which did show the
quadratic behavior mentioned in:

http://www.research.ibm.com/people/d/dfb/papers/Bacon01Concurrent.pdf

> , as described in Peter Dimov's paper

> http://www.pdimov.com/cpp/shared_ptr_0x.html#cycles.

> The following is from the page: [Note: The reachability analysis
> may be performed using Christopher's algorithm. Using the instance
> graph, compute the use counts the objects would have if there were
> no external shared_ptr instances. Compare the computed use counts
> with the actual use counts. For every mismatch, mark the object as
> reachable from the outside. Propagate reachability throughout the
> graph. --end note]

> (2)
> >However, after briefly looking at the code:
> >
> > http://www.pdimov.com/cpp/enable_sp_collect.cpp
> >
> > I can't tell where the trial decrements for each external
reference is
> > done, which is what both [Chritopher,1984] and [Lins,1992a] methods
> > do,

> I agree with you. The enable_sp_collect.cpp seems use another
> algorithm to determine the root objects that traversal will begin
> with. I guess, it registers every external pointers by sp_track()
> function as:

> int main() {
> boost::shared_ptr<X> root( new X );
> sp_track( root );
>
> }

> This design will introduce some costs, especially when
> function-local variables are frequently accessed. In HnxGC, there
> is not such registration requirement.

I think there's also the memory cost of the two maps. I think one of
the maps is a map from refcounted-object to external refs, but that
conclusion is based on my long-ago analysis of the code. I've
forgotten what the other map is for.

[snip]

> (4) One major difference between HnxGC and Peter Dimov's
> algorithm is: in HnxGC, we use different type of smart pointers in
> different area, such as CLockedPtr<> in root area (external
> pointers in pdimov's words), and CMemberPtr<> in managed heap
> (internal pointer?), while Peter only uses one type of smart
> pointer shared_ptr<>. Our approach does introduce some
> inconveniences but I think it is acceptable. This type of design
> also helps to significantly reduce RC cost by using "move"
> assignment of CLockedPtr<>.

Ah! That's the key. So, the programmer has to assure that each
CLockedPtr<>'s address is in the root area? If so, then what happens
if the programmar violates this restriction? Does the CLockedPtr CTOR
test it's "this" pointer to assure it's *not* in the managed heap? I
assume the CLockedPtr CTOR and operator=, when accessing a managed
pointer, increments the external pointer count of the managed object.
Of course, this raises the question of whether this is more error
prone than just using weak pointers to break the cycles. Then, you'd
have strong_ptr's and weak_ptr's vs. CLockedPtr's and CMemberPtr's.
OTOH, if my assumption about CLockedPtr CTOR knowing where it's
located (in root area or not), then CMemberPtr could do same, and then
there wouldn't be any need for two different smart-ptrs. What am I
missing?

> (5) As concerning tracing collection, HnxGC uses a mark & sweep
> algorithm similar to the Peter's choice. We needs a Traverse()
> routine defined by application program like Peter's
> sp_enumerate() except we do not require user-class to derives from
> a base class, e.g. enable_sp_collect, and disallow that collector
> asynchronously changes pointers of managed objects.

Does Peter's collector asynchronously change pointers of managed
object? Maybe he does that to break a cycle and then use delete on a
pointer to reclaim memory, but I don't see that as a problem.

> HnxGC provides reclamation ordering control to help application to
> use RAII design pattern.

OK, I guess this is needed in case of cycles, and I guess this is used
to break a cycle before HnxGC uses delete on a pointer to reclaim
memory, just as described above, only the ordering control is used to
determine where to break the cycle. However, this raises again the
question of why not use weak_ptr's instead. These, in effect, provide
reclamation ordering control, AFAICT.

> > How is the HnxGC method for precisely collecting cycles better
than
> > the enable_sp_collect.cpp method?

> In comparison with enable_sp_collect.cpp method, HnxGC has more
> considerations by design for a multi-threading environment. It
> provide pause-less concurrent garbage collection (lock-free,
> no-suspension, no memory ordering for SMP); it removes significant
> portion of the cost of regular reference counting; it provides
> reclamation ordering control for RAII design pattern; it provides
> rich programming features, such as interior pointer, resurrection,
> etc.

I'm know very little about threading, but this stuff about RAII sounds
good. I'm wondering if you could provide both multi-threading and
single-threading versions so that those not needing multi-threading
wouldn't have to pay for the synchronization costs.

-regards,
Larry


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk