Boost:
From: Lubomir Bourdev (lbourdev_at_[hidden])
Date: 2006-10-09 16:08:38
> > - What is your evaluation of the documentation?
>
> Not good, really.
> It consists of two parts: the tutorial and the design guide.
> the first is too superficial and the second is both superficial and too
> detailed in some places. It is easier to say what the documentation of
> this or any other generic library should be:
> 1) A description of the domain covered (that is, the underlying abstract
> mathematical model)
The domain covered (if I understand you correctly) is abstracting image
representations from the algorithms applied to images, which allows us to
write an imaging algorithm once and have it work for images of any
representation (see the sketch after the list below).
It is specified:
- in the abstract
- in the first paragraph of opensource.adobe.com/gil
- slide 4 of the presentation video
- section 1 of the design guide
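To make this concrete, here is a minimal sketch of what "write once, run
on any representation" means in code. (I am spelling it with the
present-day Boost.GIL umbrella header and namespace, which may differ
from the reviewed sources; halve_brightness is just an illustrative name.)

    #include <boost/gil.hpp>
    #include <cstddef>

    // Written once against the image View concept; works for any
    // channel depth or color space whose view models the concept.
    template <typename View>
    void halve_brightness(const View& v) {
        for (std::ptrdiff_t y = 0; y < v.height(); ++y)
            for (std::ptrdiff_t x = 0; x < v.width(); ++x)
                boost::gil::static_for_each(v(x, y),
                    [](auto& c) { c /= 2; });  // visit each channel
    }

    int main() {
        boost::gil::rgb8_image_t   rgb(32, 32);   // 8-bit RGB
        boost::gil::gray16_image_t gray(32, 32);  // 16-bit grayscale
        halve_brightness(boost::gil::view(rgb));  // the same algorithm
        halve_brightness(boost::gil::view(gray)); // runs on both layouts
    }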
> 2) introductory examples of simple library usage
Simple examples of use of the library are provided:
- throughout the tutorial
- throughout the video presentation
- section 15 of the design document
- gil_sample_code.cpp
- numeric_example.cpp
> 3) definitions of concepts _with_ examples of their implementations and
> adaptation of existing types to model the concepts
Examples of implementations of each concept are listed after the concept
is introduced. See the "implementation" subsections in the design guide.
> This section covers the important question of how one can extend the
> library. It is too hard to harvest this information from the current
> concept definitions in the GIL docs.
We have an entire section 16 of the design guide to help with extending
GIL.
In addition, the Virtual Image Views section of the tutorial shows you
an example of how to create arbitrary virtual images.
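For reference, here is a minimal sketch of such a virtual view, in the
spirit of the tutorial's Mandelbrot example. (The virtual_2d_locator
spelling is the present-day Boost.GIL one; gradient_fn and
make_gradient_view are illustrative names.) Each pixel is computed from
its coordinates on demand, and no pixel memory is allocated:

    #include <boost/gil.hpp>
    #include <cstddef>

    using namespace boost::gil;

    // Dereference adaptor: computes a pixel from its 2D coordinates.
    struct gradient_fn {
        using const_t         = gradient_fn;
        using value_type      = gray8_pixel_t;
        using reference       = value_type;
        using const_reference = value_type;
        using argument_type   = point_t;
        using result_type     = value_type;
        static constexpr bool is_mutable = false;

        result_type operator()(const point_t& p) const {
            // a simple diagonal gradient, computed on the fly
            return gray8_pixel_t(
                static_cast<unsigned char>((p.x + p.y) & 0xFF));
        }
    };

    using locator_t       = virtual_2d_locator<gradient_fn, false>;
    using gradient_view_t = image_view<locator_t>;

    gradient_view_t make_gradient_view(std::ptrdiff_t w, std::ptrdiff_t h) {
        return gradient_view_t(point_t(w, h),
                               locator_t(point_t(0, 0), point_t(1, 1),
                                         gradient_fn()));
    }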
>
> Mathematically speaking, an image is a vector function from an
> N-dimensional space to an M-dimensional space.
> In other words, we have a set of points in one N-dimensional space (let's
> call it S(N)) and a set of points in another M-dimensional space (let's
> call it S'(M), where ' denotes the target space) that corresponds to
> the first set.
> In short form the same can be formulated as:
>
> Image := S(N) -> S'(M)
>
> where ":=" means "by definition"
>
> For example, for the 2D RGB image N == 2 and M == 3.
>
Your definition is an abstract definition of a function. It applies
equally well to images as to any other function, so you could use it to
describe any other library that deals in some way with mathematics.
While our concepts allow for images of arbitrary dimension, GIL provides
models exclusively for images whose domain is two-dimensional (N=2) and
whose range can have arbitrary dimension (M).
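For instance, here are three concrete models (spelled with the
present-day Boost.GIL image typedefs); the domain is 2D in every case,
and only the number of channels M varies:

    #include <boost/gil.hpp>
    using namespace boost::gil;

    int main() {
        gray8_image_t  g(64, 64);  // N = 2, M = 1: one 8-bit channel
        rgb8_image_t   c(64, 64);  // N = 2, M = 3: 8-bit red, green, blue
        cmyk16_image_t p(64, 64);  // N = 2, M = 4: 16-bit CMYK
    }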
> To complete this definition, we need to add that each dimension in both
> spaces is discretized and finite in some interval.
> For example, it is [0,255] for a one-byte channel in GIL terms.
Setting aside the fact that the machine representation of any number is
inherently discrete, discretizing the domains makes the concept too
narrow. Floating point channels are not discrete. You could also define
virtual image views whose coordinates are not integral types.
>
> After defining the abstract object we can play with - the Image - we can
> define Transformations on such objects, and different kinds of
> transformations.
>
> The most general transformation can be expressed like this:
>
> Transform := {Img1} -> {Img2}
>
> where Img1 := S(N1) -> S'(M1) and
> Img2 := S(N2) -> S'(M2) and
> {} - defines a set in the mathematical sense, so {Img1} is a set of
> images
>
> in other words, all attributes can change:
> the dimensions of both the source and target spaces (S and S'),
> and the number of images (let's call it |{Img}|) in the source and
> destination image sets:
This is just an abstract definition of an arbitrary transformation. Yes,
I believe everything GIL does could be described as a subset of this
formulation, but so could any other math library.
GIL provides transformations that change:
- the domain (subsampled_view, transposed_view, ...)
- the range (color_converted_view, nth_channel_view, ...)
Of course, everything stays in the context of 2D images.
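A short sketch of both kinds (again with present-day Boost.GIL
spellings). All of these are lazy views; no pixel is computed until the
view is read:

    #include <boost/gil.hpp>
    using namespace boost::gil;

    int main() {
        rgb8_image_t img(640, 480);

        // Domain transformations: change which points are sampled.
        auto half  = subsampled_view(view(img), 2, 2);  // every 2nd pixel
        auto trans = transposed_view(view(img));        // swap x and y

        // Range transformations: change what each point maps to.
        auto gray  = color_converted_view<gray8_pixel_t>(view(img));
        auto green = nth_channel_view(view(img), 1);    // channel 1 only
    }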
>
> The reference to the STL is not by chance. The part of STL that defines
> and operates on Random Access Sequences has the same abstraction behind
> it that is formulated here, but only for the case when N is identical
> to 1.
> Wait a minute! STL operates on objects of arbitrary type T, not on
> linear spaces of dimension M as in our formulation here. Yes, but there
> is a one-to-one relation between the two. (Each class object can be seen
> as the set of values that it holds, and this set of values can be seen
> as a set of coordinates in our M-dimensional space.)
>
> Note that even for the case when N == 1 STL doesn't cover all kinds of
> transformations possible, and that is really sad; GIL should be better
> in this respect. Moreover, GIL can (and maybe should) supersede STL in
> the Random Access Sequence Transformations business.
> I want to claim that GIL should provide primitives to express any kind
> of Transformation possible.
Oleg - these are pretty ambitious goals and thank you for your trust
that GIL should be the library to provide them. Our goals are much more
humble and we would like to stay within the image processing domain.
If you have concrete ways to improve STL that you can put into code, I
encourage you to provide your own library. Once you have the library we
will consider the option of using your transformations for GIL, and I am
sure other library developers will be interested too.
I am curious: what do you think are the primitives that STL lacks and
that would allow expressing any kind of transformation possible?
> I claim that GIL has to remove the algorithm concept completely, and
> leave only View with a new meaning - it is not only a lightweight Image
> representation; from now on it is a representation of the Transformation
> abstraction.
> A View would be evaluated when, and only when, it is assigned to an
> Image:
> Image img = some_view; // all computation made here
>
You are obviously a fan of functional programming. While functional
programming has its appeal, providing functional programs that match the
efficiency of imperative programs in every context is out of the
question.
GIL allows image transformations to be expressed as a combination of the
functional style (by creating image views) and the imperative style (via
GIL algorithms). While we suggest where to use each (by providing a set
of both), we leave the ultimate choice to the user's judgment.
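For example (present-day Boost.GIL spellings assumed): the views below
compose lazily, and a single imperative algorithm then traverses the
composition once and materializes the result:

    #include <boost/gil.hpp>
    using namespace boost::gil;

    int main() {
        rgb8_image_t  src(640, 480);
        gray8_image_t dst(320, 240);

        // Functional part: a lazy pipeline; nothing is computed yet.
        auto pipeline = subsampled_view(
            color_converted_view<gray8_pixel_t>(view(src)), 2, 2);

        // Imperative part: one eager pass evaluates the pipeline.
        copy_pixels(pipeline, view(dst));
    }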
> Here you should say that a View already is a Transformation in GIL. Yes,
> but from the docs I've realized that its potential is not clearly
> understood by the GIL authors. I claim that the TransformationView
> should become the central part of GIL; almost all GIL programming and
> using/extending should be centered around the View concept.
Again, it is a matter of your preference and your tolerance for the
associated performance overhead.
>
> Next, a very important issue: laziness produces overhead without some
> kind of memoization technique. I claim that GIL should have one
> built-in.
There is no magical memoization technique that could match the
performance of imperative programming, because what you call the
TransformationView is not in control of the algorithm. It doesn't know
the access pattern of the pixels. That pattern is determined by the type
of algorithm and the transformation views that precede it. One example
that Dave points out is whether to memoize the value at all: if it will
be used again, memoizing it makes sense; otherwise it could be a huge
waste of memory and time.
> Now the GIL docs have examples of suboptimal algorithms that are
> suboptimal because of this absence of memoization support.
> I mean examples where one needs to apply non-trivial views on each step
> in order to increase performance, but the result:
> 1) is suboptimal
> 2) looks ugly
> 3) wastes memory
>
> The same examples could be rewritten by hand, without GIL, in a much
> more efficient way. That is not a good conclusion for a library that
> claims:
> "Performance. Speed has been instrumental to the design of the library.
> The generic algorithms provided in the library are comparable in speed
> to hand-coding the algorithm for a specific image type."
Please be specific. What example do you have in mind that could be
written by hand without GIL in a much more efficient way?
>
> My final claim is about the representativeness of the implementations
> of concepts in GIL. I claim that they are insufficient. For example,
> sub-byte channels should be implemented to ensure that GIL's concepts
> are good enough to handle them. In particular, I'm very interested to
> see an implementation for a 1bpp color space, which is very common in
> OCR development.
I can understand that. But we have provided models for the vast majority
of cases, even for image views that represent an arbitrary function.
Certainly it would be nice to have models for everything that people
might ever need, but at some point we have to draw the line.
>
> I hope that my comments will help the authors make GIL an order of
> magnitude better than STL is now.
>
Again, thanks for the ambitious goals you have for GIL.
Regarding the concept definitions that you propose, please be concrete
about what specific changes to the GIL code, if any, you suggest.
Regarding your ideas for improving STL and providing a fully optimized
way to do functional programming, we encourage you to put your ideas
into code - this is the best way of specifying what they are. Once you
do that we will have something more concrete to discuss.
Thanks for your review.
Lubomir & Hailin