From: jsiek_at_[hidden]
Date: 2000-01-29 02:44:14

Next message: John Maddock: "[boost] type_traits - almost there?"
Previous message: Greg Colvin: "[boost] Re: new approach to smart ptrs -- no new()s is good news"
In reply to: Greg Colvin: "[boost] Re: new approach to smart ptrs -- no new()s is goodnews"
Next in thread: Dave Abrahams: "[boost] Re: new approach to smart ptrs -- no new()s is goodnews"
Reply: Dave Abrahams: "[boost] Re: new approach to smart ptrs -- no new()s is goodnews"

Greg Colvin writes:
>
> This looks much better, and may indeed be much faster. I'll leave it
> to Mark to tweak his code to get this expansion.
>
> I still count 6 loads/stores versus 12 loads/stores.

But those 12 loads/stores are to the stack instead of the heap, and
in many situations they will go away via register allocation. This is
what happens in Mark's example code. Also, alias analysis handles
stack-based objects very well, while with most compilers stuff on the
heap is a complete mystery. This makes it difficult for the compiler
to optimize the instructions around the loads and stores. For instance,
with Mark's example, the SGI compiler doesn't attempt any loop
optimizations in the shared ptr case, while for the linked ptr it does
some (though the presence of a possible exception prevents it from
unrolling). The end result is that on an Origin2000 a tight loop that
just copies pointers (Mark's example) runs 6X faster with linked over
shared.
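
To make the stack-versus-heap point concrete, here is a rough sketch of
the two designs. This is not Mark's code and not the Boost
implementation; the names counted_ptr, linked_ptr and copy_loop are
made up for illustration, and assignment is left out to keep it short.
The counted version touches a heap-allocated count on every copy, while
the linked version only touches fields of the pointer objects
themselves, which live on the stack in a loop like this one.

```cpp
#include <cassert>

// Hypothetical counted pointer: the use count lives in its own heap cell,
// so every copy and destroy loads and stores through count_.
template <class T>
class counted_ptr {
public:
    explicit counted_ptr(T* p) : ptr_(p), count_(new long(1)) {}
    counted_ptr(const counted_ptr& o) : ptr_(o.ptr_), count_(o.count_) {
        ++*count_;                         // heap load + store
    }
    ~counted_ptr() {
        assert(*count_ > 0);
        if (--*count_ == 0) {              // heap load + store
            delete ptr_;
            delete count_;
        }
    }
    counted_ptr& operator=(const counted_ptr&) = delete;  // omitted in this sketch
    long use_count() const { return *count_; }
    T* get() const { return ptr_; }
private:
    T* ptr_;
    long* count_;
};

// Hypothetical linked pointer: all owners of one object form a circular
// doubly linked list threaded through the pointer objects themselves.
// No separate allocation is needed for bookkeeping.
template <class T>
class linked_ptr {
public:
    explicit linked_ptr(T* p) : ptr_(p), prev_(this), next_(this) {}
    linked_ptr(const linked_ptr& o)
        : ptr_(o.ptr_), prev_(&o), next_(o.next_) {
        o.next_->prev_ = this;             // splice in right after o
        o.next_ = this;
    }
    ~linked_ptr() {
        if (next_ == this) {               // last owner left in the ring
            delete ptr_;
        } else {                           // unlink from the ring
            prev_->next_ = next_;
            next_->prev_ = prev_;
        }
    }
    linked_ptr& operator=(const linked_ptr&) = delete;    // omitted in this sketch
    bool unique() const { return next_ == this; }
    T* get() const { return ptr_; }
private:
    T* ptr_;
    // mutable so that copying from a const source can relink the ring
    mutable const linked_ptr* prev_;
    mutable const linked_ptr* next_;
};

// A tight loop that does little besides copy a smart pointer, in the
// spirit of the benchmark discussed above (the actual benchmark code is
// not reproduced here).
template <class Ptr>
int copy_loop(const Ptr& src, int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        Ptr tmp(src);                      // copy, then destroy at end of scope
        sum += *tmp.get();
    }
    return sum;
}
```

Calling copy_loop with a counted_ptr<int> and with a linked_ptr<int>
exercises the same loop against both designs; in the linked case tmp's
prev_/next_ fields are candidates for register allocation, which is the
effect described above.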

Here's the summary of the loop for linked_ptr.

#<loop> Loop body line 33, nesting depth: 1, estimated iterations: 100
#<loop> Not unrolled: in exception region or handler
#<sched>
#<sched> Loop schedule length: 45 cycles (ignoring nested loops)
#<sched>
#<sched> 31 mem refs ( 68% of peak)
#<sched> 20 integer ops ( 22% of peak)
#<sched> 49 instructions ( 27% of peak)
#<sched>
#<freq>
#<freq> BB:13 frequency = 92.79869 (heuristic)
#<freq> BB:13 => BB:14 probability = 0.81116
#<freq> BB:13 => BB:15 probability = 0.18884
#<freq>

For the shared pointer the compiler didn't print a loop summary,
because it didn't recognize the code as a loop and so didn't schedule
the instructions as one. If you look at the assembly code, there's a
huge difference.

Ciao,

Jeremy

----------------------------------------------------------------------
 Jeremy Siek
 Ph.D. Candidate                  email: jsiek_at_[hidden]
 Univ. of Notre Dame         work phone: (650) 933-8724
 and                         cell phone: (415) 377-5814
 C++ Library & Compiler Group       fax: (650) 932-0127
 SGI                                www: http://www.lsc.nd.edu/~jsiek/
----------------------------------------------------------------------

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk