From: Ullrich Koethe (koethe_at_[hidden])
Date: 2006-11-02 16:47:22
Lubomir Bourdev wrote:
> Of course, we could have done the same by keeping resize_clobber inside
> the image class and provided one for the dynamic_image too... I don't
> have a strong opinion. Do people think we should move all global
> functions like this to be part of the image class? Currently they are:
> get_height(img or view)
> get_width(img or view)
> get_dimensions(img or view)
> get_num_channels(img or view)
> Except for the first one, all these global functions have member method
> equivalents. We could:
> 1. Make image::resize_clobber a method to be consistent
> 2. or get rid of the global functions
> 3. or make the member functions private, so people only use the global
I like to think of images in terms of Abstract Data Types. My rules for
member functions vs. free functions are therefore:
- functions that retrieve the internal state of a single image or change
this state belong to the class (i.e. resize(), get_height() etc.)
- functions that invlove several images and create new images (or rather
write the values of a pre-allocated destination) belong outside the class,
e.g. transform() and convolve()
- functions that have potentially open-ended semantics belong outside the
class (e.g. view creation).
There are some border cases. For example, norm() or the most basic views.
These might be provided either way or even both ways.
> Also any suggestions for a better name than resize_clobber_image (just
> not resize, not to be confused with vector resize behavior)
Yes, please. IMHO image::resize() is ok, the semantic difference to
std::vector is no problem for me, but others' opinions may be different.
> Color models are an open class, so we cannot represent all of them even
> if we wanted.
> This applies to just about any other part of GIL. Our goal is to provide
> a comprehensive set of the most frequently used models.
But this doesn't answer the question which ones warrant their own class,
and which ones are just instances of static_vector.
>>2. If a color model does not have specific functions (in the current
>>version of the code base), a specific class is definitely not
>>This applies especially to the lab and hsb classes in the
>>current GIL version.
> We could get rid of them.
> Or we could provide color conversion for them.
The first possibility migth be ok for the first release, but eventually
the second will be required. Still, the question is: is a color conversion
function sufficient to warrant an own class? (This is a general question,
not targeted against GIL).
> It is not just type safety. My rule is to push complexity down and
> resolve it down at the building blocks. In my experience that results in
> simpler and more flexible high level design.
Yes, but it may create complexity of its own (e.g. code bloat, endless
compiles, unguarded reinterpret_casts). It is probably necessary to decide
this on a per-case base.
> For example, GIL pixels are smart enough to properly handle channel
> rgb8_pix = bgr8_pix;
So does VIGRA.
> A similar case is for dealing with the planar vs interleaved
> organization. We could have treated planar images as separate 1-channel
> images and leave the burden on each higher level algorithm to handle
> them explicitly (or, what is worse, not support them at all!). Instead,
> GIL's approach is to resolve this down at the fundamental pixel and
> pixel iterator models. As a result, you can write an open set of higher
> level GIL algorithms without having to deal with planar/interleaved
This is a good thing, and VIGRA tries the same (e.g. interleaved vs.
planar images can be abstracted by means of accessors). But the problem is
also open-ended. For example, what about tiled images (necessary for very
large files that don't fit into memory, and for better chache locality)?
> (If this reminds you of the iterator adaptor vs data accessor
> discussion, it is not an accident - this is the same discussion)
I don't agree with this - the distinction between iterator adaptor and
accessor is mostly syntactic, provided the proxy idiom works as expected
(which it didn't 5 years back :-).
> The same principles could apply down to the channel levels.
> The channel could be made smart enough to know its range, which allows
> us to write channel-level algorithms that can convert one channel to
> another. Having smart channels allows us to write the color conversion
> routines that just deal with color, and delegate the channel conversion
> down to the channels:
This is actually an idea we were contemplating in VIGRA as well. A general
solution would be nice.
> 1. We want to use simple built-in types to represent channels. Wrapping
> the channel in a class may result in potential abstraction penalty
> (although we haven't verified it does)
In my experience, it doesn't cost much (if anything), at least for
integral underlying types. As an analogous example: a pure pointer and an
iterator class which contains just a pure pointer behave identically on
all systems I tried.
> 2. At the same time, we want the channels to be smart, i.e. to associate
> traits types, such as min value and max value, to the channel type, so
> that we can implement channel-level operations like channel_convert.
No problem with a wrapper class.
> 3. The same built-in types may be used for semantically different
> channels. For example, float often has an operating range of 0..1, but
> if you want to use high dynamic range images you may want to use floats
> with different ranges.
No problem with a wrapper class.
> Right now GIL hard-codes the commonly used types to their commonly used
> ranges - float is 0..1.
Really? I didn't realize this and don't like it at all as a default
setting. IMHO, the pleasant property of floats is that one (mostly)
doesn't need to take care of ranges and loss of precision.
In fact, whenever time and memory constraints allow, I try to do _all_
image processing in float. Thanks to modern CPUs, this is not a big
performance penalty anymore, results are far superior, and algorithms
become much simpler. Speed is mainly limited by the fact that only a
quarter as many pixels fit into the cache (compared to unsigned char).
(It's a different story when MMX/SSE optimization comes into play).
What's really a performance killer is mixed expressions between integer
and float, and the integer/float conversion. This makes good coercion
rules all the more important.
> You mean, writing conversion to/from a common color space? Although this
> is appealing, the two problems I see with this are performance (it is
> faster to convert A to B than A to C and C to B) and loss of precision
> (there is no common color space that includes the gamut of all others)
Some color conversions are defined by means of the rgb or xyz color space
as intermediate spaces anyway. Others are concatenations of linear
transforms, which are easily optimized at compile-time or run-time. Loss
of precision is a non-issue if the right temporary type is used for
> True, but at the same time you can write a view pipeline that ends up
> executing faster than if you were to copy to an intermediate buffer.
> Consider this:
> color_converted_view<lab8_pixel_t>(subsampled_view(my_view, 2,2)));
> This performs color conversion only for the even pixels. And it is
> faster than having to copy to an intermediate buffer.
Do you have objective numbers for this claim? Unfortunately I don't, but
I've sometimes seen the intermediate buffer version run faster, because
the processor was able to make more efficient use of caches, internal
pipelining, and register allocation. So it is probably very difficult to
come up with general rules.
> I am in favor of letting the programmer choose what is best rather than
> imposing one strategy, which in my opinion is the style of C++ and STL
> as compared to other languages.
Definitely, but the library should also avoid unpleasant surprises.
> Now I understand what you mean.
> But I still think whether you convert one type to another, and whether
> you apply that operation on statically-specified or
> dynamically-specified types are two orthogonal things.
Yes, but when types are statically specified, you usually know what to
expect, so the number of supported types is fairly limited. In contrast,
the dynamic type must (statically) support any type combination that could
happen. Thus, the two things are linked in practice.
> First of all, in GIL we rarely allow direct mixing images of
> incompatible types together.
What are the criteria for 'incompatible'?
> For compatible images there is no need for type coercion and no
> loss of precision.
Loss of precision and out-of-range values become important issues as soon
as arithmetic operations enter the picture -- you only want to round once,
at the end of the computation, and not in every intermediate step.
> For example copy_pixels(rgb8, cmyk16) will not compile.
What's incompatible: rgb vs cmyk or 8 bit vs 16 bit?
> If you want to use incompatible views in algorithms like copy_pixels,
> you have to state explicitly how you want the mapping to be done:
> copy_pixels(rgb8_v, color_converted_view<rgb8_pixel_t>(cmyk16));
> copy_pixels(nth_channel_view(rgb8_v,2), gray8_v);
Fair enough, but why not make the most common case the default? It worked
well for us.
-- ________________________________________________________________ | | | Ullrich Koethe Universitaet Hamburg / University of Hamburg | | FB Informatik / Dept. of Informatics | | AB Kognitive Systeme / Cognitive Systems Group | | | | Phone: +49 (0)40 42883-2573 Vogt-Koelln-Str. 30 | | Fax: +49 (0)40 42883-2572 D - 22527 Hamburg | | Email: u.koethe_at_[hidden] Germany | | koethe_at_[hidden] | | WWW: http://kogs-www.informatik.uni-hamburg.de/~koethe/ | |________________________________________________________________|
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk