Subject: Re: [Boost-users] [GIL] Some questions about 2D iteration
From: Tim Day (timday_at_[hidden])
Date: 2010-04-11 19:24:36
On Fri, 2010-04-09 at 00:56 -0400, Brett Gmoser wrote:
> The documentation clearly says that v(x, y) is slower than iteration.
Interestingly, I've just been benchmarking the various GIL image
traversal methods, as I find the coordinate access method very
convenient, but I wanted to get some idea of how much improvement I
should realistically expect from converting code to use iterators.
Results were obtained on a standard Debian Lenny install on an Intel
Core i7, compiled with -march=core2 -mfpmath=sse -msse4.1 -O3 -DNDEBUG;
the main bits of code are appended to this email.
Method                       Throughput (Megapixels/s)
GIL coord access             1120.9697
GIL row iterator access      1293.0218
GIL image iterator access      77.9477
I was pretty surprised just how efficient the v(x,y) accessor is,
and even more surprised by how inefficient the whole-image iterator is!
Inspecting the assembler, the inner loop of v(x,y) looks like:
.L768:
        xorb    (%rcx,%rdx), %sil
        movq    %rax, %rdx
        incq    %rax
        cmpq    %r8, %rax
        jne     .L768
which is very lean, but not quite as good as the inner loop of the row
iterator:
.L734:
        xorb    (%rdx,%rax), %cl
        incq    %rax
        cmpq    %rax, %r8
        jg      .L734
However, the inner loop of the all-image iterator is:
.L696:
        movzbl  (%rcx), %eax
        incq    %rdx
        leaq    1(%rcx,%rbp), %rcx
        xorl    %eax, %r10d
.L708:
        testq   %rdx, %rdx
        jne     .L696
        cmpq    %rcx, %rbx
        jne     .L696
which is a bit more complicated, although it still seems remarkable
that it runs ~15 times slower than the other methods (about 78 versus
1121 to 1293 Megapixels/s).
What I took away from this:
- Avoid the all-image iterator like the plague (although I don't really
understand how it manages to be quite so spectacularly slow).
- You need to be pretty desperate for performance to convert working
and basically fast enough coordinate-access based code to iterators.
- Compilers can do a pretty nice job with GIL classes. I've used other
image classes which leave far more to run-time (e.g. virtual function
calls), and you basically have to "unload" the class information into
pointers and ints and do it all yourself to get performant inner loops.
-----
BOOST_AUTO_TEST_CASE(coord_access_benchmark)
{
    unsigned char hash=0;
    scoped_timer t("GIL coord access",images().size(),"Megapixels");
    for (images_t::const_iterator it=images().begin();it!=images().end();++it)
    {
        const boost::gil::gray8c_view_t v=boost::gil::const_view(**it);
        for (int y=0;y<v.height();++y)
            for (int x=0;x<v.width();++x)
                hash=(hash^v(x,y));
    }
    force_result=hash;
}

BOOST_AUTO_TEST_CASE(row_iterator_access_benchmark)
{
    unsigned char hash=0;
    scoped_timer t("GIL row iterator access",images().size(),"Megapixels");
    for (images_t::const_iterator it=images().begin();it!=images().end();++it)
    {
        const boost::gil::gray8c_view_t v=boost::gil::const_view(**it);
        for (int y=0;y<v.height();++y)
        {
            boost::gil::gray8c_view_t::x_iterator p=v.row_begin(y);
            for (int x=0;x<v.width();++x,++p)
                hash=(hash^*p);
        }
    }
    force_result=hash;
}

BOOST_AUTO_TEST_CASE(image_iterator_access_benchmark)
{
    unsigned char hash=0;
    scoped_timer t("GIL image iterator access",images().size(),"Megapixels");
    for (images_t::const_iterator it=images().begin();it!=images().end();++it)
    {
        const boost::gil::gray8c_view_t v=boost::gil::const_view(**it);
        for (boost::gil::gray8c_view_t::iterator p=v.begin();p!=v.end();++p)
        {
            hash=(hash^*p);
        }
    }
    force_result=hash;
}
-----
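
The test cases above rely on two helpers that are not shown in the
email, scoped_timer and force_result. For anyone wanting to rerun the
benchmarks, something along the following lines might serve. This is a
guess at their shape, not the original code: it assumes the second
constructor argument is the amount of work in the named unit, that the
timer reports units per second when it goes out of scope, and that
force_result is a volatile global used to stop the optimiser discarding
the loops. The elapsed_seconds() member is an addition for illustration.

```cpp
#include <chrono>
#include <cstdio>
#include <string>
#include <utility>

// Hypothetical sink: writing the hash to a volatile global makes the
// result observable, so the compiler cannot drop the benchmark loop.
volatile unsigned char force_result = 0;

// Hypothetical RAII timer: starts on construction and prints the rate
// (work / elapsed seconds) on destruction.
class scoped_timer {
public:
    scoped_timer(std::string name, double work, std::string units)
        : name_(std::move(name)),
          work_(work),
          units_(std::move(units)),
          start_(std::chrono::steady_clock::now()) {}

    scoped_timer(const scoped_timer&) = delete;
    scoped_timer& operator=(const scoped_timer&) = delete;

    ~scoped_timer() {
        const double s = elapsed_seconds();
        // Guard against a zero interval on very short runs.
        const double rate = s > 0.0 ? work_ / s : 0.0;
        std::printf("%s %.4f %s/s\n", name_.c_str(), rate, units_.c_str());
    }

    // Seconds since construction, measured on a monotonic clock.
    double elapsed_seconds() const {
        const std::chrono::duration<double> d =
            std::chrono::steady_clock::now() - start_;
        return d.count();
    }

private:
    std::string name_;
    double work_;
    std::string units_;
    std::chrono::steady_clock::time_point start_;
};
```

Used as in the test cases, the timer is declared at the top of the scope
being measured and prints one line when that scope exits.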
Hope that's of interest to some
Tim