
From: Lubomir Bourdev (lbourdev_at_[hidden])
Date: 2006-10-18 01:09:21

Hi Ulli,

Thanks for spending time to review GIL.

> 1. I like the concept of views to adapt images to algorithms'
> needs. Users can understand this quite easily. But there is
> also a danger of overuse (this is pointed out in the
> tutorial): doing all transformations on the fly may be very
> inefficient. For example, conversion between RGB and CMYK in
> an adaptor is OK, but RGB to LAB is not. For the latter
> cases, the GIL authors propose an idiom
> copy_pixels(some_expensive_view(source),
> some_other_view(destination));
> A traditional STL style solution would probably be easier:
> transform(source.begin(), source.end(), destination.begin(),
> SomeFunctor());
> (since implementing a view is more involved than implementing
> a functor, at least in terms of reading documentation).

They are not quite equivalent. Views can transform both the domain and
the range of images. For example, a view may change the way its
horizontal and/or vertical iterators advance, and similarly may perform
a transformation upon dereferencing (which your SomeFunctor does). We
find it appealing to represent this as a single object (view) rather
than splitting it into multiple parts, such as ranges and accessors.
Yes, I know these are conceptually different transformations, and they
are implemented in different classes; they are just combined in the
View. It is appealing because conceptually it just represents an
abstract image. And you can pipe views nicely together.
Another difference is in performance: copy_pixels(src,dst) is usually
faster than the STL equivalent std::copy(src.begin(), src.end(),
dst.begin()). This is because the one-dimensional iterators are slower,
as they have to deal with potential padding at the end of rows, and also
because we provide performance specializations.

> On the other hand, a PixelDereferenceAdapter is semantically
> also a view, so the use of the view idea could be extended.

PixelDereferenceAdapter is not a view. It is a function object to be
invoked upon pixel access, similar to your Pixel Accessor. You can
easily attach it to any iterator, locator, or view.

> The interpretation of pixel adaptors as views is more
> apparent when data accessors are used (they work essentially
> like the image views). The difference to the
> PixelDereferenceAdapter is mainly syntactic (using operator*
> rather than get/set functions), and it was already pointed
> out that this is not without problems (reference proxies are
> required for operator*, which are more difficult to implement
> than data accessors and may sometimes exhibit unexpected behavior).

I understand your motivation for using pixel adaptors, but in my opinion
the benefits of using PixelDereferenceAdapter outweigh its
disadvantages. I will elaborate, but I am first waiting to hear from
others on the thread we started on this subject.

> 2. I like the idea of handling unknown input formats by means
> of a variant pixel. However, it remains to be shown whether
> this solution is indeed practical for algorithm
> implementation. For example, when a binary operation is
> called for a variant with 4 possible types, the compiler is
> forced to generate 16 algorithm instantiations (for all
> pairs) for every result type. There are a number of design
> alternatives (for example coercion to a common type) which
> are probably better.

GIL incorporates a mechanism for actively reducing the code bloat due to
variant instantiations. I will present this approach at the LCSD
workshop at the upcoming OOPSLA conference.
> 3. I don't like that the input/output image format (jpeg, png
> etc.) must be hardcoded in the program (by using
> jpeg_read_image() etc.).

We initially had combined I/O methods like:

template <typename Image>
void read_image(const char* file, Image& img) {
    switch (image_type_from_file_name(file)) {
        case kJPEG: jpeg_read_image(file, img); break;
        case kTIFF: tiff_read_image(file, img); break;
        // ... one case per supported format ...
        default:    throw std::ios_base::failure("unsupported file format");
    }
}

We decided to remove them for three reasons - first, the set of file
formats is open and listing them in a switch statement is not very
flexible. Second, using the above would require people to link against
all libraries while they may only need/have one. And third, there are
plans to put in ASL a more advanced GIL I/O that allows for dynamic
registration of file format modules.
Of course, the above methods are trivial to implement and people might
want to just do them for convenience. On the client side it is ok to be
less generic...

> 4. GIL has no (public) test suite whatsoever.

We have a regression suite, though not on the web site. You will get it
if you install GIL through ASL.

> 5. A GIL image can only be used with POD pixel types, because
> the allocated memory is never initialized (by
> uninitialized_fill() or equivalent). Neither is the
> destructor for individual pixels called before deallocation.

This was an item on our TODO list, but I don't know how it fell through
the cracks. We will put it back on the list. Thanks! Good catch!

> 6. GIL may have alignment problems. For example, the variant
> template allocates internal memory as "char[MAX_SIZE]" but
> then reinterpret_casts it to an image type.

MAX_SIZE is determined by taking the maximum of the sizeof of each
element in the MPL vector of image types with which the variant is
instantiated. So it is different for every variant, and the appropriate
size is evaluated at compile time.

> Casts from
> "signed short *" to "pixel<int16s, rgb_t> *" and similar (see
> the tutorial) are also problematic

This cast is necessary to go from the outside world (pointer to the
beginning of the image) into GIL.
It is not obvious to me how to avoid it. How does VIGRA deal with the
case where the user constructs an 8-bit RGB image by providing VIGRA
with a raw pointer to the first pixel?

> 7. Is transform_pixel_position() able to handle image borders
> properly and efficiently (page 7 of the tutorial)? Otherwise,
> it is wrongly used (may cause segfaults).

No, it doesn't handle border conditions. Actually, you found a typo in the
tutorial. The function that contains transform_pixel_position should be
called x_gradient_unguarded to indicate this. As the tutorial shows,
x_gradient_unguarded is called from x_gradient using subimage_view that
excludes the first and last column.

> 8. The cached_location_t is a nice, elegant optimization.
> 9. Generic algorithms that work efficiently for both planar
> and interleaved multi-channel images are not as easy as the
> authors imply.

Why not? (see below)

> On page 5 of the tutorial, the authors
> recommend transform_channels() in order to unroll the inner
> loop over the channels. But this optimization is useless for
> planar images - in this case, one wants to work on one
> channel at a time, i.e. the channel loop must be the outer
> rather than the inner loop.

I wouldn't call it useless for planar images. Just the opposite - this
code illustrates one of the nice abstractions in GIL; that you can write
the algorithm once and have it work for both planar and interleaved
images. There is typically no loss in efficiency even though the three
planes are separated, because modern architectures can support many
simultaneous cache lines.

Of course, this is not always the case. The optimal performance depends
on the exact algorithm, the hardware, and the image size. There may be
contexts for which processing the planes one at a time is noticeably
faster. In this case a planar-only specialization of the algorithm can
be provided. The point is that by using GIL you are not forced to
provide a planar version for the algorithm.

We have many versions of the same imaging algorithms at Adobe - ones
that work for planar images, and ones for interleaved. Our hope is that
GIL will allow these to be consolidated.


We have been discussing the accessor/DereferenceAdaptor and promotion-
traits issues in separate threads. Aside from these differences with
VIGRA, do you see any issues with the GIL interfaces that would prevent
porting the VIGRA algorithms to GIL, or that would affect their
performance? If yes, what interfaces are missing and which algorithms
need them?

Thank you again for your review!

