On Thu, 7 August 2008 21:09, Andrea Denzler wrote:<br
/>&gt; Working with indexes is definitely easier and more comfortable.
And<br />&gt; probably in many situations the overhead is not worth
optimizing at<br />&gt; all.<br /><br />That's what my experiments have shown
for months.<br /><br />&gt; I was only surprised when you said there is a
performance gain using<br />&gt; indexes.<br />Again, my hands were faster
than my thoughts. Apologies for the misunderstanding.<br /><br />&gt; You
are losing performance unless the compiler is intelligent enough to<br />&gt;
avoid the multiplication at each step.<br />I have yet to find a compiler
that is not that intelligent. From VC6 to ICC and g++ since v3.x, I have never had to
complain about the compiler output. Then again, most of the index computation is
done up front during the array allocation (see the nrc_alloc_array
function in my source).<br /><br />&gt; When performance is really an
issue then I first re-think the algorithm<br />Algorithmic optimisations
are always best anyway.<br /><br />&gt; after I check the produced assembler
listing to see what the compiler did.<br />I used to do this, but I
really think it's no longer necessary with modern compilers.<br /><br
/>&gt; If you are lucky the compiler can produce code that does this in
one<br />&gt; instruction set, if not you will get an overhead. Again if h
and w are low<br />&gt;  values you will not notice it.<br />And that's
exactly what happens when you do loop tiling: the inner h and w are rather
small (a tile often occupies less than half of the L1 cache).<br /><br
/>&gt; This example is of course much faster, and yes, it is not elegant
nor clear.<br />&gt; int *p=array,*pend=&amp;array[h][w];<br />&gt; while
(p &lt; pend) *p++ = uni() ;<br /><br />Well, I copy/pasted this into my
array.cpp in place of the loop nest.<br />My 2D array takes 9.938 s to
iterate 10000 times over a 512*512 image of floats, while your &quot;while
loop&quot; took 9.891 s. I only lose ~0.5% by using NRC allocation +
indexing. It is indeed faster, but not by much, and it is far less
elegant.<br /><br />If needed, I can try to implement a multi_array-like class
using my method and compare it to the original one in terms of performance.
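<br /><br />For readers who want to see the two traversals side by side, here is a minimal sketch of the idea. It is not the nrc_alloc_array from my source: alloc_2d, free_2d, fill_indexed and fill_linear are hypothetical names made up for this illustration, and it assumes h and w are both non-zero. The row pointers are set up once at allocation, so a[i][j] costs two loads and no multiplication.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper (not the nrc_alloc_array from the post): allocates an
// h x w float image as one contiguous block plus a table of row pointers.
float** alloc_2d(std::size_t h, std::size_t w) {
    assert(h > 0 && w > 0);  // assumption of this sketch
    float** rows = static_cast<float**>(std::malloc(h * sizeof(float*)));
    if (!rows) return nullptr;
    float* data = static_cast<float*>(std::malloc(h * w * sizeof(float)));
    if (!data) { std::free(rows); return nullptr; }
    // Row offsets are computed once here, so a[i][j] needs no multiply later.
    for (std::size_t i = 0; i < h; ++i) rows[i] = data + i * w;
    return rows;
}

void free_2d(float** a) {
    if (a) { std::free(a[0]); std::free(a); }  // a[0] is the data block
}

// Traversal with 2D indexing through the row-pointer table.
void fill_indexed(float** a, std::size_t h, std::size_t w, float v) {
    for (std::size_t i = 0; i < h; ++i)
        for (std::size_t j = 0; j < w; ++j)
            a[i][j] = v;
}

// Traversal with a single running pointer over the contiguous block.
void fill_linear(float** a, std::size_t h, std::size_t w, float v) {
    float* p = a[0];
    float* pend = p + h * w;  // one past the last element
    while (p < pend) *p++ = v;
}
```

Both functions touch the same memory in the same order, which is why I would expect an optimizing compiler to bring them close to each other, as in the timings above.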