Boost logo

Boost :

Subject: Re: [boost] [outcome] user's experience
From: Vicente J. Botet Escriba (vicente.botet_at_[hidden])
Date: 2017-05-27 09:19:44


Le 27/05/2017 à 01:02, Andrzej Krzemienski via Boost a écrit :
> Hi Everyone,
> In order to test Boost.Outcome, I applied it to the problem I had to solve
> at work a while ago. For people who do not know what Boost.Outcome is for,
> this could be an introduction. For me it can be an opportunity to obtain
> feedback from Niall and Vicente, If I am using `expected` correctly, and
> they can get some feedback from me.

Thanks Andrzej for working on this. Having someone else that use the
library in concrete cases is always a good thing.
>
> Here is the task. I get an std::string as input: it is a data structure
> encoded in ASCII text:
>
> "1-10|11-20|21-99;1-10|10-50|51-99"
>
> Semicolons separate "distributions". Within distribution, vertical bars
> separate "ranges". Within ranges, dashes separarate integer numbers. There
> are some additional constrains. like ranges within distribution cannot
> overlap, no gaps between ranges are allowed, and a couple more.
>
> I need to validate if the input matches the required format, and build a
> corresponding data structure. But the vaidation cannot be just binary
> valid-invalied answer: I want to give user a description of what exactly is
> wrong with her input, as detailed as possible.
>
> So, I have function parse, that internally calls parse_distribution, which
> in turn calls parse_range. Each of them can return different parsing error
> that needs to be propagated up.
>
> My original implementation used "out" function parameters for signalling
> errors. Code was unnecessarily long and it was easy to forget to check
> results from functions, but I didn't want to use exceptions to indicate
> validation failures, because there is nothin exceptional about it. You
> expect input validation to fail -- quite often.
This is a typical use case for the Parser Monad.
> Now, this is a more philosophic question: when to use (or not) exceptions.
> One citerion I sometimes use is, "do I want my debugger to automatically
> break if this function reports faiure?"
>
> So I didn't use exceptions for input validation, but the ugliness of the
> code forced me to go back to rewrite it with exceptions. The code was
> shorter, and "saer" in the sense that I could not forget ot pass the
> failure up the stack.
I understand your concern.
>
> Now with Boost.Outcome I have a third option. It is almost as concise as
> exceptions, but does not use exceptions.
>
> Here is my implementation:
>
> https://github.com/akrzemi1/__sandbox__/blob/master/outcome_practical_example.md

>
> Note a couple of things. I am creating a number of temporay strings under
> way, they all can throw bad_alloc. I am fine with that. I am not affraid of
> throwing an exception, it is only the validation failures that I want to
> report with `expected`. For other failures I want exceptions.
I like the separation of concerns here. You expects to have some input
errors, but not be out of memory.
>
> This is input validation. It does not have to be super fast.
>
> In the public API of this file, I will not be exposing the information
> about parsing failues. `process` will consume the error information and
> just output something. So, the choice of `E` in `expected<T, E>` is
> internal to a file, and I can choose something specific to my use case. I
> provid my own type, which encodes different erroneous situations in an
> enum, and two ints for additional details.
>
> ```
> struct BadInput
> {
> enum {
> no_distrib = 1,
> range_count_not_3,
> distrib_starts_with_no_1,
> distrib_ends_with_no_99,
> gap_between_ranges,
> overlap_between_ranges,
> inverted_range,
> no_2_numbers_in_range,
> not_a_number,
> } type;
> int val1;
> int val2;
> };
> ```
> It is quite similar to Outcome's error_code_extended.
>
> Now, I also wanted to have an std::string member inside, but `expected`
> forbids it: It says as per LWG requirements `E` in `expected<T, E>` must
> not kave a throwing copy constructor. Is that right? In my use case it
> wouldn't harm anyone: if copying of `expected` throws, entire function
> `process` is exited, with a bad_alloc, which is fine. No `expected` with
> partial state will be ever observed. Do we need this restriction?
The question is what exception safety do you want. If we want that
original is not changed after a exception while doing an assignment, we
have no choice. We could get it with double-buffering, but I don't want
to go this way now.

How many would like an additional expected that use double_buffering to
support all the operations and the never-empty warranties?

However, we could support types that throw but we will not support
assignment and emplace. I don't see any problem with copy/move constructors.
I'll check the wording so that we restrict only when we use the
operations that need the restriction.
Niall, could you check from your side what operations need the nothrow
contraints?

Note that we can ensure almost the never-empty-warranty for
optional_expected (if we consider empty_t as an alternative).
>
> Also, at some point I will be returning an `expected<vector<T>, E>`, which
> makes me uneasy. Returning vectors wrapped in something: will I get runtime
> penalty for it? So, I naturally arrived at a pattern that Niall described
> in the other thread:
>
> ```
> out::expected<vector<T>, E> parse (...)
> {
> out::expected<vector<T>, E> ans {out::value};
> vector<T> & vec = ans.value(); // will if statement be ellided?}
>
> // fill vec
>
> return ans;
> }
> ```
>
> (BTW, this constant out::value is not documented in the docs, unlike
> out::value_t.)
>
> It has been mentioned a number of times that an optimizing compiler can
> remove a redundant if-statement. I know that the if-statement inside
> `value()` is redundant, but in this particular case, will compiler know it
> to? Nobody has called `has_value()` before.
You don't need. Use `*ans`. You know hat you have a value.

Is this

        out::expected<std::vector<Distrib>, BadInput> ans {out::value};

initializing the expected with a default vector?

Is this the equivalent to

        stde::expected<std::vector<Distrib>, BadInput> ans {std::in_place};

>
> Also, the following piece of code illustrates how concise the
> implementation of a single function can be:
>
> ```
> // I use a shortet alias for macro BOOST_OUTCOME_TRYX.
> // It is not portable, though. It only works on Clang and GCC.
> #define TRY(ex) BOOST_OUTCOME_TRYX(ex)
>
> out::expected<Range, BadInput> parse_range (const std::string& input)
> {
> tokenizer numbers_s {input, sep_3};
> auto it = numbers_s.begin(), itEnd = numbers_s.end();
> int count = std::distance(it, itEnd);
>
> if (count != 2)
> return out::make_unexpected(BadInput{BadInput::no_2_numbers_in_range,
> count});
>
> Range r {TRY(parse_int(*it++)), TRY(parse_int(*it++))};
Humm, side effects inside an expression :(. UB?
Better to define local variables and ensure sequential parsing.
>
> if (r.lo > r.hi)
> return out::make_unexpected(BadInput{BadInput::inverted_range, r.lo,
> r.hi});
>
> return r;
> }
> ```
>
> It is almost as short as if I used exceptions, except for the TRY macro.
> Note in particular this line:
>
> ```
> Range r {TRY(parse_int(*it++)), TRY(parse_int(*it++))};
> ```
>
> `parse_int()` returns `out::expected<int, BadInput>`.Now,
> `TRY(parse_int(...))` is an expression and means, "if parse_int() failed
> immedaitely return function `parse_range` with the same BadInput,
> otherwise use its `value()` to initialize `r`.
>
> What do you think. Am I using the library correctly?

With some minor exception I've signaled, yes, I believe it is a good
example.

It will be worth comparing with the code that use a parser monad approach.
But this is out of the scope of this thread. I will see with you in private.

Best,
Vicente


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk