
From: Mateusz Loskot (mateusz_at_[hidden])
Date: 2019-05-29 11:15:33


Hi,

Miral and I have been brainstorming on what would be a clear yet
optimal interface for thresholding, on Gitter and in the comments at
https://github.com/BoostGSoC19/gil-miral/pull/1

This made me think that the issue of interface design for new algorithms is more
general, not specific to thresholding. I think it may be worth setting some common
guidelines, to align ourselves w.r.t. interface design in the direction that may be
preferred in GIL.

### Problem Example

For instance, OpenCV takes the approach of a single function to 'kill 'em all', i.e.
cv::threshold, which takes multiple parameters where the combination of values and
flags is significant:
https://docs.opencv.org/4.1.0/d7/d1b/group__imgproc__misc.html#gae8a4a146d1ca78c626a53577199e9c57

In this approach, documentation is crucial to explain which combinations are valid,
when `maxval` is used (e.g. with THRESH_BINARY) and when it is ignored.

I have quite strong reservations about this approach.
Why? Read the documentation above and try to tell me:

- What does "special values THRESH_OTSU or THRESH_TRIANGLE may be
  combined with one of the above values" mean?
  Does it mean combined as bit flags, or combined by arithmetic addition?
- What is the `thresh` value, depending on the thresholding `type`?

If the user is motivated enough, she may search deeper and find
"flag, use Otsu algorithm to choose the optimal threshold value"
https://docs.opencv.org/4.1.0/d7/d1b/group__imgproc__misc.html#gaa9e58d2860d4afa658ef70a9b1115576
Again, is it clear enough?

If the user is still motivated, she continues searching and finds, for instance, this page
https://docs.opencv.org/3.4/d7/d4d/tutorial_py_thresholding.html
with this example (among many):

```
cv.threshold(img,0,255,cv.THRESH_BINARY+cv.THRESH_OTSU)
```

It now shows that the flags are added with `+` and that for `THRESH_OTSU` (which also
implies `THRESH_TRIANGLE`), one has to pass zero as the threshold value argument.
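
To make the ambiguity concrete, here is a minimal sketch of how plain integer flags behave. The constants below are local stand-ins and their values are an assumption (they mirror what OpenCV's `cv::ThresholdTypes` appears to define; check the headers before relying on them). The helpers `base_type` and `uses_otsu` are hypothetical, written only for this illustration.

```cpp
// Local stand-ins, NOT OpenCV itself. The numeric values are assumed to
// match cv::ThresholdTypes; verify against the OpenCV headers.
namespace sketch
{
constexpr int thresh_binary     = 0;
constexpr int thresh_binary_inv = 1;
constexpr int thresh_trunc      = 2;
constexpr int thresh_tozero     = 3;
constexpr int thresh_tozero_inv = 4;
constexpr int thresh_mask       = 7;  // low three bits select the base type
constexpr int thresh_otsu       = 8;  // separate bit: compute the value with Otsu
constexpr int thresh_triangle   = 16; // separate bit: compute it with triangle

// Hypothetical helpers that decode a combined `type` argument.
constexpr int base_type(int type) { return type & thresh_mask; }
constexpr bool uses_otsu(int type) { return (type & thresh_otsu) != 0; }
} // namespace sketch

// Under these assumed values, `+` and `|` agree whenever one base type is
// combined with one special flag, because the bits do not overlap.
static_assert((sketch::thresh_binary_inv + sketch::thresh_otsu) ==
              (sketch::thresh_binary_inv | sketch::thresh_otsu),
              "addition and bitwise OR give the same combined value");

// Nothing stops a caller from adding two base types, though. The result (5)
// names no base type at all, and the compiler accepts it silently.
static_assert(sketch::base_type(sketch::thresh_trunc + sketch::thresh_tozero) == 5,
              "meaningless combination still compiles");
```

If the assumed values hold, both readings of "combined" happen to give the same number, but only the documentation can say which combinations make sense.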

To me, the interface of `cv::threshold` is not self-descriptive enough and is
cluttered; it tries to be a single hammer for nails of all sizes.

### Alternative Proposal

Functions for simple global thresholding:

```
enum class threshold_direction { regular, inverse };
enum class threshold_optimal_value { otsu, triangle };
enum class threshold_truncate_mode { threshold, zero };

threshold_binary(src, dst, threshold_value, max_value, threshold_direction);
threshold_binary(src, dst, threshold_optimal_value, max_value,
                 threshold_direction);

threshold_truncate(src, dst, threshold_value, threshold_truncate_mode mode,
                   threshold_direction);
threshold_truncate(src, dst, threshold_optimal_value,
                   threshold_truncate_mode mode, threshold_direction);
```
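
As a rough illustration of how the two `threshold_binary` overloads could coexist, here is a minimal sketch. It is not GIL code: `gray_pixels` is a hypothetical flat 8-bit buffer standing in for an image view, `otsu_threshold` is a helper invented for this example, the default for `threshold_direction` is an assumption, and only the Otsu method is implemented.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

enum class threshold_direction { regular, inverse };
enum class threshold_optimal_value { otsu, triangle };

// Hypothetical stand-in for an image view: a flat buffer of 8-bit gray pixels.
using gray_pixels = std::vector<std::uint8_t>;

// Overload 1: the caller supplies the threshold value explicitly.
void threshold_binary(gray_pixels const& src, gray_pixels& dst,
                      std::uint8_t threshold_value, std::uint8_t max_value,
                      threshold_direction direction = threshold_direction::regular)
{
    assert(src.size() == dst.size());
    for (std::size_t i = 0; i < src.size(); ++i)
    {
        bool const above = src[i] > threshold_value;
        bool const set = (direction == threshold_direction::regular) ? above : !above;
        dst[i] = set ? max_value : std::uint8_t{0};
    }
}

// Hypothetical helper: Otsu's method on a 256-bin histogram. It returns the
// first value that maximises the between-class variance.
std::uint8_t otsu_threshold(gray_pixels const& src)
{
    std::array<std::size_t, 256> histogram{};
    for (std::uint8_t p : src)
        ++histogram[p];

    double const total = static_cast<double>(src.size());
    double sum_all = 0;
    for (int t = 0; t < 256; ++t)
        sum_all += t * static_cast<double>(histogram[t]);

    double sum_bg = 0, weight_bg = 0, best_variance = -1;
    std::uint8_t best = 0;
    for (int t = 0; t < 256; ++t)
    {
        weight_bg += static_cast<double>(histogram[t]);
        if (weight_bg == 0) continue;      // no background pixels yet
        double const weight_fg = total - weight_bg;
        if (weight_fg == 0) break;         // no foreground pixels left
        sum_bg += t * static_cast<double>(histogram[t]);
        double const mean_bg = sum_bg / weight_bg;
        double const mean_fg = (sum_all - sum_bg) / weight_fg;
        double const variance =
            weight_bg * weight_fg * (mean_bg - mean_fg) * (mean_bg - mean_fg);
        if (variance > best_variance)
        {
            best_variance = variance;
            best = static_cast<std::uint8_t>(t);
        }
    }
    return best;
}

// Overload 2: the threshold value is computed from the image.
void threshold_binary(gray_pixels const& src, gray_pixels& dst,
                      threshold_optimal_value method, std::uint8_t max_value,
                      threshold_direction direction = threshold_direction::regular)
{
    if (method != threshold_optimal_value::otsu)
        throw std::invalid_argument("only otsu is sketched here");
    threshold_binary(src, dst, otsu_threshold(src), max_value, direction);
}
```

With this shape, `threshold_binary(src, dst, 100, 255)` and `threshold_binary(src, dst, threshold_optimal_value::otsu, 255)` read differently at the call site, and since a scoped enum does not convert implicitly to or from an integer, passing the wrong kind of third argument fails to compile instead of being silently ignored.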

The overall set of modes (the enums) was inspired by
https://docs.opencv.org/4.1.0/d7/d1b/group__imgproc__misc.html#gaa9e58d2860d4afa658ef70a9b1115576

Functions for adaptive thresholding:

```
enum class threshold_adaptive_method { mean, gaussian };
threshold_binary_adaptive(src, dst, max_value, threshold_adaptive_method,
                          threshold_direction, kernel_size);
```
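
A minimal sketch of what the adaptive variant might do for the `mean` method follows. Again, this is an illustration under assumptions and not GIL code: `gray_image` is a hypothetical row-major 8-bit image, the window is simply clipped at the borders, each pixel is compared against the plain mean of its neighbourhood, and `gaussian` is left unimplemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

enum class threshold_direction { regular, inverse };
enum class threshold_adaptive_method { mean, gaussian };

// Hypothetical stand-in for a gray image: row-major 8-bit pixels.
struct gray_image
{
    std::ptrdiff_t width;
    std::ptrdiff_t height;
    std::vector<std::uint8_t> pixels; // size == width * height
};

void threshold_binary_adaptive(gray_image const& src, gray_image& dst,
                               std::uint8_t max_value,
                               threshold_adaptive_method method,
                               threshold_direction direction,
                               std::ptrdiff_t kernel_size)
{
    if (method != threshold_adaptive_method::mean)
        throw std::invalid_argument("only the mean method is sketched here");
    assert(kernel_size > 0 && kernel_size % 2 == 1);
    assert(src.width == dst.width && src.height == dst.height);
    assert(src.pixels.size() == dst.pixels.size());

    std::ptrdiff_t const radius = kernel_size / 2;
    for (std::ptrdiff_t y = 0; y < src.height; ++y)
    {
        for (std::ptrdiff_t x = 0; x < src.width; ++x)
        {
            // Mean over the kernel window, clipped to the image borders.
            double sum = 0;
            std::ptrdiff_t count = 0;
            std::ptrdiff_t const y0 = std::max<std::ptrdiff_t>(0, y - radius);
            std::ptrdiff_t const y1 = std::min(src.height - 1, y + radius);
            std::ptrdiff_t const x0 = std::max<std::ptrdiff_t>(0, x - radius);
            std::ptrdiff_t const x1 = std::min(src.width - 1, x + radius);
            for (std::ptrdiff_t ny = y0; ny <= y1; ++ny)
                for (std::ptrdiff_t nx = x0; nx <= x1; ++nx)
                {
                    sum += src.pixels[static_cast<std::size_t>(ny * src.width + nx)];
                    ++count;
                }
            double const local_mean = sum / static_cast<double>(count);

            auto const index = static_cast<std::size_t>(y * src.width + x);
            bool const above = src.pixels[index] > local_mean;
            bool const set = (direction == threshold_direction::regular) ? above : !above;
            dst.pixels[index] = set ? max_value : std::uint8_t{0};
        }
    }
}
```

Every argument here has a single meaning, so a call such as `threshold_binary_adaptive(src, dst, 255, threshold_adaptive_method::mean, threshold_direction::regular, 3)` can be read without consulting a table of valid flag combinations.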

The `threshold_adaptive_method` set was inspired by
https://docs.opencv.org/4.1.0/d7/d1b/group__imgproc__misc.html#gaa42a3e6ef26247da787bf34030ed772c

### Conclusions

What do you think about the alternative?

- Does it make the interface simpler?
- Does it make the interface self-descriptive, requiring less documentation?

I would like to ask our students, Miral and Olzhas, to give some thought to this
and to the overall issue of interface design.

Stefan, what do you think about it?

(I find it very hard to search the Gitter archives, so I prefer to discuss
important issues here on the boost-gil list.)

Best regards,

-- 
Mateusz Loskot, http://mateusz.loskot.net
Fingerprint=C081 EA1B 4AFB 7C19 38BA  9C88 928D 7C2A BB2A C1F2
