Boost Users :

cdr_at_[hidden]

---
You hit on a key point - you do have some knowledge of your graph's
semantics and the algorithms you're running so you it's likely that you
_can_ make some assertions about what the memory access patterns are likely
to be. I'll let Dave step in elaborate further on exactly what he was
talking about but I think that generally speaking, if your runtime heap
requirements exceed physical memory, then ultimately you need to worry about
paging virtual memory to/from physical memory. This brings up an interesting
question: how best to minimize the paging? Several general truisms: when
you're executing your graph algorithm (regardless of if using BGL or your
own hand-coded graph abstraction) you want to work hard to get the memory
footprint of the algorithm aligned with a page boundary and hopefully small
enough to remain resident in the CPU's cache memory. Similarly, as your
algorithm processes data addressed in process virtual memory space, you
would like to minimize the frequency of cross paging boundaries. Now this
depends on your graph topology and algorithm - but I guess you have some
data about this. I think Dave was eluding to the fact that you might be able
to control the memory layout of the graph data more deterministically if you
hand-code your graph abstraction to explicitly address this memory layout
issue. I can't speak for other operating systems, but at least on Windows
it's possible for a user mode application to manually allocate contiguous
virtual memory regions with calls through the Windows API to the kernel. You
can then use this huge chunk of contiguous virtual memory to store your data
using the knowledge you have about the order that things will be accessed in
order to minimize the number of page crossings during execution of your
algorithm. Tools like Intel's VTune are invaluable for evaluating your
runtime performance as it will give you cache and paging metrics. VTune is
available for Windows and now Linux as well (as is Intel's C++ compiler
(what I'm using) and high-performance SIMD instruction support libraries).
And when you're satisfied you can do no better, then if it's still not fast
enough (and it never is) then you can decide if shelling out the $$$ for
things like striped fiber channel drive arrays controlled by 64-bit PCI
controllers is worth it. They certainly make a difference - we used this
monsters a lot back when I was working on video editing stuff. Hope this
helps and good luck! What are you crunching out of curiosity (if you're at
liberty to say?)
- Regards
cdr
"Erik Arner" <yg-boost-users_at_[hidden]> wrote in message
news:3E3E428D.90107_at_cgb.ki.se...
>
> David Abrahams wrote:
> > Erik Arner <yg-boost-users_at_[hidden]> writes:
> >
> > Modelling Sequence or RandomAccessContainer correctly for data that is
> > not always in-memory is notoriously difficult.  Typically it requires
> > iterators which store their value_type internally, so dereferencing
> > returns an internal reference to the iterator itself.
>
> OK, as my knowledge regarding this is fairly limited, do you have any
> pointers to books or articles (on paper or on the web) that discuss this
> technique?
>
> > The Boost
> > Iterator Adaptors library can be a help with constructing conforming
> > iterators (also notoriously difficult).
>
> Point taken.
>
> > Of course, the BGL's use of
> > descriptors instead of heavy data may help get you around this problem.
>
> If you have the time, please elaborate!
>
> > You might consider whether the problem gets any easier if you don't
> > try to use adjacency_list, but instead model the Graph concept
> > yourself.  I've done that several times; it's not as hard as it might
> > seem.
>
> This certainly seems like a better solution than my original idea. Thanks!
>
> /Erik
>
>
>
>
> Info: <http://www.boost.org>
> Wiki: <http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl>
> Unsubscribe: <mailto:boost-users-unsubscribe_at_[hidden]>
>
>
> Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
>
>
>