From: Mateusz Loskot (mateusz_at_[hidden])
Date: 2019-06-28 17:39:21
Hi Miral,
Reviewing the Otsu implementation made me realise that, my bad,
I missed something during the discussions about binary thresholding.
The issue is general (whole-image) binarization
vs multi-channel (channel-wise) binarization.
## Global binarization
Let's consider this a canonical (common) case of thresholding.
This kind of thresholding binarizes an image: it outputs
a binary (black & white) image.
Technically, the output does not have to be a 1-bit channel image;
it can be an 8-bit grayscale image whose pixels take only two values:
- Zero
- Maximum
I think the general binarization should be the default behaviour.
Regardless of the type of `src_view`, it should enforce that `dst_view`
is a grayscale view (or a binary view with 1-bit values).
For example:
    threshold_binary(gray8_src_view, gray8_dst_view, 128, 255);
    threshold_binary(rgb8_src_view, gray8_dst_view, 128, 255);
    threshold_binary(cmyk8_src_view, gray8_dst_view, 128, 255);
This behaviour should be the default for all `threshold_*` functions, I think.
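To make the intended behaviour concrete, here is a minimal sketch of global binarization on plain arrays rather than on Boost.GIL views. Everything in it is hypothetical: `pixel8`, `intensity` and `binarize_global` are illustration-only names, and reducing a multi-channel pixel to one intensity by averaging its channels is an assumption, since the message does not say how that reduction should be done. The comparison is assumed to be strictly greater than the threshold.

```cpp
#include <array>
#include <cassert>   // only needed for quick checks when experimenting
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an N-channel 8-bit pixel; not a Boost.GIL type.
template <std::size_t N>
using pixel8 = std::array<std::uint8_t, N>;

// Assumption: a pixel is reduced to one intensity by averaging its channels.
template <std::size_t N>
std::uint8_t intensity(pixel8<N> const& p)
{
    unsigned sum = 0;
    for (auto c : p)
        sum += c;
    return static_cast<std::uint8_t>(sum / N);
}

// Global binarization: any N-channel source, single-channel destination.
// Pixels whose intensity is above `threshold` become `max_value`, others zero.
template <std::size_t N>
std::vector<std::uint8_t> binarize_global(std::vector<pixel8<N>> const& src,
                                          std::uint8_t threshold,
                                          std::uint8_t max_value)
{
    std::vector<std::uint8_t> dst;
    dst.reserve(src.size());
    for (auto const& p : src)
        dst.push_back(intensity(p) > threshold ? max_value : std::uint8_t{0});
    return dst;
}
```

Whatever the number of source channels, the destination holds one value per pixel, either zero or `max_value`.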
## Multi-channel binarization
This is a sophisticated case of thresholding,
after https://homepages.inf.ed.ac.uk/rbf/HIPR2/threshld.htm
> For color or multi-spectral images, it may be possible to set different
> thresholds for each color channel, and so select just those pixels within
> a specified cuboid in RGB space.
In this case, thresholding should enforce that both `src_view` and
`dst_view` are N-channel images. In fact, we could simplify and
enforce that `dst_view` is of exactly the same type as `src_view`.
For example:
    threshold_binary(gray8_src_view, gray8_dst_view, 128, 255);
    threshold_binary(rgb8_src_view, rgb8_dst_view, 128, 255);
    threshold_binary(cmyk8_src_view, cmyk8_dst_view, 128, 255);
This mode is performed only on explicit request, via a combination of parameters.
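A minimal sketch of channel-wise binarization, again on plain arrays rather than Boost.GIL views. The names `pixel8` and `binarize_channels` are hypothetical. Taking one threshold per channel follows the HIPR2 quote and is an assumption about the interface; passing the same value for every channel corresponds to the single-threshold calls shown above.

```cpp
#include <array>
#include <cassert>   // only needed for quick checks when experimenting
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an N-channel 8-bit pixel; not a Boost.GIL type.
template <std::size_t N>
using pixel8 = std::array<std::uint8_t, N>;

// Channel-wise binarization: source and destination have the same pixel type.
// Each channel is compared against its own threshold, independently of the
// other channels, and set to `max_value` or zero.
template <std::size_t N>
std::vector<pixel8<N>> binarize_channels(std::vector<pixel8<N>> const& src,
                                         pixel8<N> const& thresholds,
                                         std::uint8_t max_value)
{
    std::vector<pixel8<N>> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i)
        for (std::size_t c = 0; c < N; ++c)
            dst[i][c] = src[i][c] > thresholds[c] ? max_value : std::uint8_t{0};
    return dst;
}
```

Unlike the global case, an output pixel here can mix zero and `max_value` across its channels, so the result is generally not a black & white image.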
## Solution #1: control via dst_view
If `dst_view` is single-channel, then global image binarization is performed.
If `dst_view` is multi-channel, then channel-wise binarization is performed.
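The rule above can be decided entirely at compile time from the channel counts of the two views. The following is only a sketch under that assumption: `binarization` and `select_binarization` are hypothetical names, not part of Boost.GIL, and a real implementation would derive the counts from the view types.

```cpp
#include <cstddef>

// Hypothetical names, for illustration only.
enum class binarization { global, channel_wise };

// Picks the mode from the destination's channel count.
// A destination that is neither single-channel nor of the same channel
// count as the source (e.g. gray8 source, rgb8 destination) fails to compile.
template <std::size_t SrcChannels, std::size_t DstChannels>
constexpr binarization select_binarization()
{
    static_assert(DstChannels == 1 || DstChannels == SrcChannels,
                  "dst_view must be single-channel or match src_view");
    return DstChannels == 1 ? binarization::global
                            : binarization::channel_wise;
}

// rgb8 -> gray8 selects global binarization.
static_assert(select_binarization<3, 1>() == binarization::global, "");
// rgb8 -> rgb8 selects channel-wise binarization.
static_assert(select_binarization<3, 3>() == binarization::channel_wise, "");
```

For a gray8 source with a gray8 destination the sketch reports `global`; both modes give the same output in that case, so the choice does not matter.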
## Solution #2: additional parameter
The two different behaviours could be captured by an additional
parameter of all `threshold_*` functions, placed just after the `direction`
parameter, as I expect it would be used the least.

    enum class threshold_binarization_mode { image, channel };
1. These two would produce the same output:

    threshold_binary(gray8_src_view, gray8_dst_view, 128, 255,
        threshold_direction::regular,
        threshold_binarization_mode::image);
    threshold_binary(gray8_src_view, gray8_dst_view, 128, 255,
        threshold_direction::regular,
        threshold_binarization_mode::channel);
2. Multi-channel image to binary image

    threshold_binary(rgb8_src_view, gray8_dst_view, 128, 255,
        threshold_direction::regular,
        threshold_binarization_mode::image);

3. Multi-channel image to multi-channel image

    threshold_binary(rgb8_src_view, rgb8_dst_view, 128, 255,
        threshold_direction::regular,
        threshold_binarization_mode::channel);
If the combination of `src_view` and `dst_view` does not make sense for the
given `threshold_binarization_mode`, then an exception is thrown.
## Conclusion
I prefer solution #1 because it is simpler, with fewer parameters to control.
AFAIK, #1 is our current approach, and channel-wise
binarization is controlled by the combination of views.
Correct?
Related design and behaviours should be clearly stated and documented.
Tests of the implementation should do better, I think.
By the way, this issue shows that tests should cover more cases :-)
Otherwise, it is too easy to overlook what behaviours are by design
and what are accidental, and what are missing.
For `threshold_*` functions, I think the bare minimum is to cover
thresholding with tests for these three pairs of input/output:
- gray8_image_t / gray8_image_t
- rgb8_image_t / gray8_image_t
- rgb8_image_t / rgb8_image_t
If there are any invalid combinations of parameters, at least some
should be tested. For example, what if a user calls:

    threshold_binary(gray8_src_view, rgb8_dst_view, 128, 255);
The current implementation (incl. Otsu) needs to be reviewed against
this overall issue.
I'd start with adding test cases as explained above and see where
the implementation blows up :-)
Please, could you comment on that? Is my understanding correct?
Best regards,
-- Mateusz Loskot, http://mateusz.loskot.net Fingerprint=C081 EA1B 4AFB 7C19 38BA 9C88 928D 7C2A BB2A C1F2