From: Ullrich Koethe (koethe_at_[hidden])
Date: 2006-11-15 12:43:08

Hi Lubomir,

Lubomir Bourdev wrote:
>>>First of all, in GIL we rarely allow direct mixing images of
>>>incompatible types together.
>>What are the criteria for 'incompatible'?
> Compatible images must allow for lossless conversion back and forth. All
> others are incompatible.

That's a nice rule, but it's not the rule of the underlying C/C++
language, where compatible means more or less that an implicit conversion
between two types is defined. Our default behavior in VIGRA resembles this
as far as possible, with the exception that the default conversions
include clamping and rounding where appropriate. In practice, both
definitions of compatibility will probably work equally well as long as
conversions can be customized. I still think that a default implicit
conversion makes life easier in many situations, without causing
unpleasant surprises.
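
To make "clamping and rounding where appropriate" concrete, here is a minimal sketch of such a default conversion from double to an 8-bit channel. The function name to_uint8_clamped and the NaN policy are assumptions made for illustration; this is not VIGRA's actual API.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper (not a VIGRA function): convert a double to an 8-bit
// channel value by clamping to [0, 255] and rounding to the nearest integer.
inline std::uint8_t to_uint8_clamped(double v) {
    if (std::isnan(v)) return 0;   // assumption: NaN maps to 0
    if (v <= 0.0)   return 0;      // clamp below
    if (v >= 255.0) return 255;    // clamp above
    // std::lround rounds halfway cases away from zero (127.5 -> 128).
    return static_cast<std::uint8_t>(std::lround(v));
}
```

By contrast, a plain static_cast<std::uint8_t>(300.0) has undefined behavior in C++, because the truncated value does not fit the destination type. That is one reason to put the clamping into the default conversion.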

The customization problem is especially hard if the conversion happens
deep inside a nested operation (which has possibly been created by some
automatic function/functor composition mechanism) when an intermediate
type must be converted back to some fixed type, e.g. to the result type
(recall that intermediate types usually differ from the result types in
VIGRA, because that allows us to round/clamp only once).

> I agree - in cases you need arithmetic operations you need to worry
> about specifying intermediate types and there is loss of precision.

Arithmetic operations are the bread and butter of image processing as I
know it. The same applies to filters, edge detectors, interest operators
etc. Loss of precision occurs only when the intermediate or result types
are chosen badly.
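
A small sketch of what "chosen badly" can mean, assuming 8-bit pixels and C++17 (for std::clamp). All names are hypothetical. The first function forces the intermediate sum back into 8 bits, the second keeps a double intermediate and rounds/clamps only once at the end.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper: clamp to [0, 255], then round to nearest.
inline std::uint8_t clamp_round(double v) {
    return static_cast<std::uint8_t>(std::lround(std::clamp(v, 0.0, 255.0)));
}

// Badly chosen intermediate type: the sum is squeezed into 8 bits first,
// so 200 + 100 saturates to 255 before the division.
inline std::uint8_t mean_8bit_intermediate(std::uint8_t a, std::uint8_t b) {
    std::uint8_t sum = clamp_round(double(a) + double(b));
    return clamp_round(sum / 2.0);
}

// Wider intermediate type: compute in double, round and clamp once.
inline std::uint8_t mean_double_intermediate(std::uint8_t a, std::uint8_t b) {
    double sum = double(a) + double(b);
    return clamp_round(sum / 2.0);
}
```

For the inputs 200 and 100 the first variant returns 128 and the second returns 150. For small inputs such as 10 and 20 both agree.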

> How about image::recreate(width,height)?

I like this name -- it says exactly what's happening. Please remember that

image::recreate(width,height, initial_pixel_value)
image::recreate(size_object, initial_pixel_value)

should also be defined.
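
As a rough sketch of the overload set I have in mind: the class below is a hypothetical stand-in, not GIL's image class, and Size2D is an assumed name for the "size_object".

```cpp
#include <cstddef>
#include <vector>

// Hypothetical size object; any type carrying width and height would do.
struct Size2D {
    std::size_t width;
    std::size_t height;
};

// Hypothetical minimal image, only to illustrate the recreate() overloads.
template <class Pixel>
class BasicImage {
public:
    // Discards the old contents and allocates width x height pixels,
    // each set to 'init' (value-initialized pixel by default).
    void recreate(std::size_t width, std::size_t height,
                  const Pixel& init = Pixel()) {
        width_ = width;
        height_ = height;
        data_.assign(width * height, init);
    }

    // Same operation, taking the dimensions as one size object.
    void recreate(Size2D size, const Pixel& init = Pixel()) {
        recreate(size.width, size.height, init);
    }

    std::size_t width() const { return width_; }
    std::size_t height() const { return height_; }
    const Pixel& operator()(std::size_t x, std::size_t y) const {
        return data_[y * width_ + x];
    }

private:
    std::size_t width_ = 0;
    std::size_t height_ = 0;
    std::vector<Pixel> data_;
};
```

With this, img.recreate(3, 2, 7) and img.recreate(Size2D{3, 2}, 7) are equivalent, and omitting the last argument gives default-initialized pixels.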

> It comes down to some basic principles:
> 1. We want equality comparison to be an equivalence relation.
> 2. We also want equality comparison and copying to be defined on the
> same set of types.
> 3. The two operations should work as expected. In particular, after a=b,
> you can assert that a==b.
> I hope you will agree that these are fundamental rules that hold in
> mathematics and violating them can lead to unintuitive and bug-prone
> systems.

Unfortunately, the CPU itself violates rule 3: a double in a register and
the same double written to memory and read back need no longer compare
equal - a behavior that is really hard to debug :-(

> But the first principle requires that the types be compatible (i.e.
> there must be one-to-one correspondence between them). To see why, let's
> assume we can define operator== between int and float, and define it to
> round to the nearest int when comparing an int to a float.

No, never do that. Mixed type expressions are always coerced to the
highest of the types involved or to an even higher type. Otherwise, you
will really get unpleasant surprises. (OK, your code is only an
illustration, I know).

> You may define operator== to not round but "promote" the int to a float.
> That will make your equality comparison an equivalence relation. The
> problem then will shift to rule 3 because you cannot do the same
> promotion when copying. Consider this:
> float a=5.1;
> int b=a;
> assert(b==a); // fails!

I don't see this as a surprise -- after all, we have performed a lossy
assignment in between. Type promotion is a well understood operation.
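
Spelled out as a small sketch (the function name is made up for illustration), the example behaves as follows: the assignment truncates, and the comparison then promotes the int back to float.

```cpp
// Returns whether a float still compares equal to itself after a round trip
// through int. The conversion to int truncates toward zero; in 'b == a' the
// usual arithmetic conversions turn b into a float before comparing.
// Assumes 'a' lies within the range of int (otherwise the cast is undefined).
inline bool equal_after_int_roundtrip(float a) {
    int b = static_cast<int>(a);  // lossy assignment: 5.1f becomes 5
    return b == a;                // compares 5.0f with 5.1f
}
```

The result is false for 5.1f and true for 5.0f, so the failing assert is explained entirely by the lossy assignment, not by the promotion in the comparison.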

> That is, "a=b" should be defined for exactly the same types for which
> "a==b" is defined. Therefore, copy should only be defined between
> compatible images.

I like this rule. So, the fundamental question is: should a = b
imply a == b?

What are the language gurus saying about this? Niklaus Wirth and Bertrand
Meyer are certainly in favour of the implication, whereas the C/C++
inventors opted against. In practice, it amounts to two questions:

1. Will a default implicit conversion be useful enough to tolerate its
potential for surprises?
2. Can one design the system so that customized conversions can be
conveniently configured, even if deep inside a nested call?

I'd like to hear the opinion of others about this.

> Regardless of what we do, there is always a notion of the operating
> range of a channel, whether it is implicit or explicit.

I don't agree. For example, when you use Gaussian filters to compute
derivatives or the structure tensor etc. the result has no obvious range -
it depends on the image data and operator in a non-trivial way. For
example, the minimal and maximal possible values of a Gaussian first
derivative are proportional to 1/sigma, but sigma (the scale of the
Gaussian filter) is usually only known at runtime. It is the whole point
of floating point (as opposed to fixed point) to get rid of these range
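
To illustrate the 1/sigma dependence with a sketch: in the continuous idealization, the first-derivative-of-Gaussian response to an ideal step edge of height h is h times the Gaussian itself, so its peak is h / (sigma * sqrt(2*pi)). The function below just evaluates that bound; its name is hypothetical, and discrete kernels will deviate somewhat for small sigma.

```cpp
#include <cmath>

// Peak magnitude of the Gaussian first-derivative response to an ideal step
// edge of the given height (continuous model, an assumption of this sketch).
// The bound can only be evaluated once sigma is known, i.e. at runtime.
inline double max_gaussian_derivative_response(double height, double sigma) {
    const double pi = 3.14159265358979323846;
    return height / (sigma * std::sqrt(2.0 * pi));
}
```

Doubling sigma halves the bound, so a fixed compile-time range for the result channel would have to be chosen for the smallest sigma anyone might ever pass.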

> You are essentially suggesting that the range of a channel be defined at
> run time, rather than at compile time, right?

More precisely, we avoid explicitly defined ranges, until some operation
(e.g. display) requires an explicit range. Remember that it is no longer
slow to do all image processing in float.

> But you still need to know the range for many operations.
> For example, converting between additive and subtractive color spaces
> requires inverting the channel, which requires knowing its range.

Not necessarily. In many cases, one can simply negate the channel value(s)
and worry about range remapping later, when the required output format is
known.

Likewise, if you don't require fixed ranges, you may perform out-of-gamut
computations without loss of precision. You can't display these values,
but they are mostly intermediate results anyway.

Thus, I strongly argue that the range 0...1 for floating point values
should be dropped as the default behavior.

> GIL's dynamic image doesn't do anything fancy. It simply instantiates
> the algorithm with all possible types and selects the right one at run
> time.
> If your source runtime image can be of type 1 or 2, and your destination
> can be A,B or C, when you invoke copy_pixels, it will instantiate:
> copy_pixels(1,A)
> ...
> copy_pixels(2,C)
> and switch to the correct one at run-time.

That's all well and good. But in practice, the dynamic image might well
support more than 3 types, and the system will need operations with more
than two arguments. Then a combinatorial explosion occurs unless a
powerful coercion mechanism is provided.
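
To make the growth concrete, here is a sketch of this kind of dispatch using C++17 std::variant rather than GIL's own dynamic image machinery. All names are hypothetical, and the static_cast is only a placeholder for a real pixel conversion.

```cpp
#include <cstdint>
#include <variant>
#include <vector>

// Hypothetical runtime-typed "images": 2 source types, 3 destination types.
using AnySrc = std::variant<std::vector<std::uint8_t>,
                            std::vector<std::uint16_t>>;
using AnyDst = std::variant<std::vector<std::uint8_t>,
                            std::vector<std::uint16_t>,
                            std::vector<float>>;

// Statically typed algorithm: one instantiation per (Src, Dst) pair.
template <class Src, class Dst>
void copy_typed(const Src& src, Dst& dst) {
    using Out = typename Dst::value_type;
    dst.clear();
    for (auto v : src) dst.push_back(static_cast<Out>(v)); // placeholder cast
}

// Runtime dispatch: std::visit instantiates copy_typed for all 2 x 3 = 6
// combinations and selects the matching one when called.
inline void copy_dynamic(const AnySrc& src, AnyDst& dst) {
    std::visit([](const auto& s, auto& d) { copy_typed(s, d); }, src, dst);
}
```

With N supported types and k image arguments the number of instantiations is N^k: ten types and a three-argument operation already give 1000. Coercing the inputs to a few common working types before dispatch is one way to keep that number down.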

> In some contexts specific color
> models are widely used and others are not needed. But each color model
> was invented because it was needed somewhere.
> I see little harm in providing an extensive set of color models.

That's not what I was arguing against. I was arguing against _pretending_
support for a color space when just an accordingly named class is
provided, but no operations.

> In my opinion tiled images are a different story, they cannot be just
> abstracted out and hidden under the rug the way planar/interleaved images
> can.

I'm not so pessimistic. I have some ideas about how algorithms could be
easily prepared for handling tiled storage formats.

Best regards

|                                                                |
| Ullrich Koethe  Universitaet Hamburg / University of Hamburg   |
|                 FB Informatik        / Dept. of Informatics    |
|                 AB Kognitive Systeme / Cognitive Systems Group |
|                                                                |
| Phone: +49 (0)40 42883-2573                Vogt-Koelln-Str. 30 |
| Fax:   +49 (0)40 42883-2572                D - 22527 Hamburg   |
| Email: u.koethe_at_[hidden]               Germany             |
|        koethe_at_[hidden]                        |
| WWW:      |
