Boost logo

Boost :

From: Lubomir Bourdev (lbourdev_at_[hidden])
Date: 2006-11-01 21:48:59


Hi Ulli,

Thanks for your review!

Ulli wrote:
> I vote to accept GIL into boost after revision. In addition,
> I consider
> addition of an algorithm library in the not-too-distant
> future mandatory
> in order to realize the full potential of the image classes.

As I already mentioned, we also would like very much to see image
processing algorithms added to GIL, whether or not it is accepted in
Boost.
In particular, the Vigra algorithms provide a good starting point.
Although they could be invoked through a compatibility layer, as you and
I demonstrated, it would be great if they are rewritten in native GIL,
because we don't believe a compatibility layer can handle all possible
image abstrations.
If you and/or others would like to volunteer to do that, you can expect
and extra level of support from us.

We also are looking for volunteers to implement other algorithms, such
as the ones in OpenCV. It would be great to also have a GIL-layer over
Intel's IPP library. Also, providing interfaces to AGG would be
terrific.
Providing parallel versions of image processing algorithms would be
great. GPU support too.
More I/O formats too.

>
> > - What is your evaluation of the design?
>
> I like the design of the image classes (including the
> dynamic_image) and
> the heavy use of views as a major algorithm interface. The various
> iteration abstractions are also quite powerful. The proof-of-concept
> implementation of the VIGRA interoperability layer was quite
> encouraging.
>
> I'd like to see the following additions to the image class:
> - constructors that allow copy construction of the pixels values

You mean deep-copy the images? The image class has a copy constructor
which deep-copy's the images.

> - resize() functions with default construction and copy
> construction of
> the newly allocated pixel values (without resize(), the default
> constructor image() is pointless).

There is a function called resize_clobber_image. Is that what you are
looking for?
If the images are of different dimensions, it constructs an image of the
target size and swaps them.
It does not copy the pixels. This could be done manually with
copy_pixels if needed.
We are looking for a better name, but resize_image implies the pixels
are copied (and truncated) as with vector resize. We didn't add copying
of the pixels because doing so is often not necessary, and it is unclear
what the copy policy should be exactly. (copy into top left? Truncate?)

> I'm less convinced of the pixel classes (which, as I argued
> in another
> post, should be renamed into pixel_value or just value).
> IMHO, a pixel
> class without a set of operations specific to this particular
> pixel type
> is not very useful. As far as I can tell, there are not even
> conversion
> functions between RGB, LAB and HSB (and, unlike the RGB-CMYK
> conversion,
> these are too expensive to be implemented as pixel adapters).
>
> I recommend to either
> - drop the specific color classes for the time being and use
> a general
> static_vector<3, T> together with accordingly changed
> pixel_value_tag
> concepts, or
> - add a set of functions justifying the existence of the
> color classes.

Why does VIGRA have a separate class for RGB Value then? What is special
about RGB?

There are lots of functions that take the color information into
account. Color information goes down to the very basics. For example,
any per-channel operation that takes two or more pixels operates on
semantic channels, not physical channels. These will do the right thing:

bool eq=equal_pixels(rgb8_view, bgr8_view);
copy_pixels(rgb8_view, bgr8_view);
rgb8_pix = bgr8_pix;

rgb8_image_t img(bgr8_img);

Color information also is useful for compile-time type safety. GIL
constructs that don't have the same base color space are incompatible
and cannot be copy-constructed, assigned and compared to each other.
These will generate a compile error:

copy_pixels(lab8_view, rgb8_view); // error
rgb8_pix = lab8_pix; // error

This level of compile-time safety is not possible if you only consider
the type and number of channels.

Of course, there are a variety of color conversion functions:

copy_and_convert_pixels(rgb8_view, cmyk16_view);

cmyk16_pixel_t c16;
color_convert(rgb8_pixel_t(255,0,0), c16);

You are right that some color conversion combinations are not yet
provided. This has nothing to do with the performance to do color
conversion, but more with the fact that there is a combinatorial
explosion in the number of conversions - every color space to every
other.
Color conversion between RGB and LAB could be implemented just like it
is done between, say, RGB and CMYK.
Is there any performance advantage to implementing it differently?

I agree that sometimes it doesn't make sense to have a specific color
space. Sometimes you really want to think of the data as image of pixel
with N channels of type T. GIL provides models of anonymous N-channel
homogeneous pixels, which have "device-N" color space. Here are the
types of 5-channel anonymous 16-bit unsigned integral pixels and image
views:

dev5n16_pixel_t p(0,1,3,2,3);
dev5n16_view_t view5;

Their color space is devicen_t<N> where N is the number of channels.
(We are open to suggestions for a better name)

> Regarding pixel type conversion and mixed-valued functions:
> It was already
> pointed out that the resulting code bloat can easily get out
> of control,
> especially in the context of the dynamic_image. I read
> Lubomir's OOPSLA

Which, by the way, I noticed is posted here:
http://sms.cs.chalmers.se/bibliography/proceedings/2006-LCSD.pdf

> paper and think that it only offers a partial solution - it
> only covers
> the case were different type combinations actually create the
> same binary
> code. IMHO, this is not the most problematic case.
>
> Consider just the scalar pixel types 'unsigned byte', 'signed short',
> 'unsigned short', 'int', and 'float'. All of them are
> regularly occuring
> in VIGRA algorithms, and no combination will be binary equal
> to any other.

As described in my paper, the types can be equivalent in the context of
the algorithm.
For example, for bitwise-copy, copying "signed short to signed short"
and copying "unsigned short to unsigned short" can be treated as
equivalent.

> What is needed is a consistent set of coercion rules. For example,
> 'unsigned char' x 'unsigned char' might be implemented natively (for
> speed), while 'unsigned char' x 'float' will be implemented
> by a coercion
> of the first pixel to 'float', followed by a native 'float' x 'float'
> operation.
>
> These rules, together with appropriate result_of types, can be easily
> formulated in a set a traits classes, and should also be
> customizable on a
> per-function-call basis. Coercion may not be a prerequisite for the
> usefulness of the core part of GIL, but it is critical for the
> dynamic_image and the future image algorithm library
> (including the color
> conversion function suggested above).

I can see how automatically converting one type to another can be
useful, but I am not sure how this relates to the topic of
dynamic_image.
Whether or not the type of the image is specified at run time, and
whether or not you want to convert one type to another are two
orthogonal things... unless I am misunderstanding you.

> > - What is your evaluation of the implementation?
>
> It is mostly OK. I would like the following improvements:
>
> - there are several comments saying 'potential performance
> problem' (or
> equivalent) in the code. The code should be changed to
> eliminate these
> problems because the affected functions are performance critical.

I agree, on a case by case basis. We are trying to get what we believe
are the bottlenecks as efficient as possible. It would be quite a job to
make all parts of GIL completely optimal. Very often extra efficiency
comes at the cost of code size, extra stuff in data structures, or
decreasing the performance in other places of the code... Sometimes the
tradeoff is not worth it.

For example, currently view.end() is not very efficient as it is
implemented as "return begin()+size()".
If you say:
for (it=view.begin(); it!=view.end(); ++it) {...}

You may suffer a performance hit there. But how do we fix this? Store
the end iterator inside the view? Then the view class will grow and
contain less than minimum info... Unlike std::vector implementations, we
need to keep the dimensions explicitly as they cannot be inferred simply
by end() - begin(), which is 1D.

>
> - C-style casts and reinterpret_casts should be eliminated
> (if this is
> impossible in the latter case, the casts should be guarded by
> static_assert to do the right thing).

I would love to eliminate them but I don't see how. Static asserts are
always a good idea.

>
> - proper initialization of the pixel values should be added (e.g. by
> uninitialized_fill).

Already on our todo list.

> - I didn't check in detail whether all functions (especially
> constructors
> and destructors) are fully exception safe. The authors should make a
> statement about this.

Unless there are bugs, everything should be exception-safe.

> - code readability should be improved (more line-breaks).
>
> - the test suite should be made part of the distribution.

It is posted on our download page.

Thanks again for your review.

Lubomir


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk