Boost logo

Boost :

From: Marcelo Zimbres Silva (mzimbres_at_[hidden])
Date: 2023-01-23 21:43:53


On Mon, 23 Jan 2023 at 14:43, Ruben Perez via Boost
<boost_at_[hidden]> wrote:
>
> > To fix this properly I would need a tool similar to what protobuff is to
> > grpc: generate requests and responses stub from a schema.
> >
> > Aedis however is not trying to do that, it provides the infrastructure with
> > which such a tool could be created. Perhaps there is no place in Boost for
> > such a library, but it is still necessary.
>
> I do think there is a place in Boost for a library like this. I don't think
> we need neither a 100% type-safe solution, nor to completely hide the general
> runtime interface that request::push() gives us. But I think we do have some
> usability barrier here, and you need to have a plan on how to evolve it to
> make it better, before the library gets into Boost and you loose the ability
> to introduce big breaking changes.

My plan is to leave the generic resp3::request untouched and if any
good idea about safe requests comes along I will add it to e.g.
aedis::redis::request.

This review triggered many discussions about type-safe requests but
they are too vague and at an early stage to be worth mentioning them
here.

> I don't think the type-safe solution should span *all* Redis commands, but
> only the most common ones. If the command the user wants does not have
> type-safety implemented, she can always use what it's in place now. Simple
> interfaces for simple uses, complex ones for advanced use cases (that's
> something I learnt in MySQL the bad way).

That is a good pragmatic solution, I think other libraries go this
way. Notice however there are many hundreds of commands so the
majority will be left out and maintenance effort will grow unbounded.
I am still hoping it can be automated with generators.

> > > As an example of this, the library's own example cpp20_containers, in the
> > > hgetall function, uses the type std::map<std::string, std::string> for
> > > the hgetall response, which will be okay unless the key is not set, in
> > > which case you will get an error. I think this has an enormous potential
> > > for causing hidden bugs (and even vulnerabilities). I've also noticed
> > > that if you mismatch the sizes of the pipeline request and response, you
> > > get an assertion (so UB).
> >
> > To avoid this you can wrap the response type in std::optional, for example
> >
> > std::optional<std::map<std::string, std::string>>
> >
> > I don't lock users in this behaviour because they might not want it.
> > Perhaps it is desirable to have an error instead of having to check
> > whether the optional has a value. The user must know what he needs.
>
> Hand't think of that. Although I don't have data to speak, I get the impression
> that this feature will be used unintentionally more than intentionally.
> But it might just be an opinion.

Yes, you can shoot yourself in the foot. Better use safe defaults with
std::optional.

> > As an alternative I could lock-in users on resp3 types: maps, sets,
> > etc. But then how to represent data structures that have no
> > corresponding C++ types. It would also be inefficient as it would
> > require first copying and then converting.
>
> That's something I didn't take into account when I wrote the review, and that
> gets solving this harder. Is there a way, given a type T, to know whether
> the response to a given command (say "GET") is compatible with T? Without
> performing the actual parsing?

I did not find any way of checking this statically. Most docs still
refer to resp2, the old protocol. I think we will have to live with
the fact that command return types are only known dynamically at parse
time by the application, if the user wants to treat them statically,
he will have to read the docs.

> I can think of the following solutions (excuse me if any of these
> have flaws I haven't taken into account):
>
<snip>
>
> Do you find any of these solutions appealing? Knowing that a strategy is
> in-place to evolve the library towards type-safety would alleviate many
> of my concerns.

Please, give me some time. I have to think more about them.

> > I am trying to understand how you avoid these risks in Boost.MySql but
> > it seems it suffers from the same problems?
> >
> > I mean this: https://github.com/boostorg/mysql/blob/ac7285c62163c0de40a466ecaeca13f7de5af0c4/example/text_queries.cpp#L108
> >
> > conn.query("SELECT salary FROM employee WHERE first_name =
> > 'Underpaid'", result);
> > double salary = result.rows().at(0).at(0).as_double();
> >
> > The query is not type safe [1] and the result can mismatch the the
> > query. Why would it be considered ok for Boost.MySql and not
> > Boost.Aedis?
> >
> > Disclaimer: I am not trying to make a case or open a precedent, it is
> > purely technical. I might be wrong but I would like to clarify anyway.
>
> So we have two parts here. The "client -> server" interface has two flavours:
> connection::query(string_view), and statement::execute(const ParamsTuple&).
> query() should only be used for queries that don't have parameters provided
> by the user (I should change that example, BTW). statement::execute takes a
> tuple of parameters. The client will check for compatibility in terms of number
> of parameters. Types are not checked in the client because the server is
> *very* flexible with type conversions.
>
> The second part is the "server -> client" interface (mysql::resultset), which
> is the one I think you're referring to here. It's a purely variant-like
> interface. Still, there is code to runtime-check that what the server sent
> matches the query metadata, which is provided by the server.
> I view this variant-like interface similar to Boost.JSON parsing into a
> json::value.
>
> I'd say the differences I see from the Aedis case are:
>
> * There is no way for the MySQL client to know the resulting data structure
> from a query given the contents of a query. "SELECT product FROM products"
> can yield 10 different data types, and without the table definition, you
> can't know.
> I got the impression this is not the case in Redis - "HGETALL" always
> returns the same data structure (please correct me if I'm wrong).

Yes, more specifically

* resp3::type::null: If the key is not set.
* resp3::type::simple_error: If there is a type mismatch
* resp3::type::map: On success.

But commands like SET might return different types (quoting from
https://redis.io/commands/set/)

   - Simple string reply: OK if SET was executed correctly.

   - Null reply: (nil) if the SET operation was not performed because
the user specified the NX or XX option but the condition was not met.

   - If the command is issued with the GET option, the above does not
apply. It will instead reply as follows, regardless if the SET was
actually performed:

   - Bulk string reply: the old string value stored at key.

   - Null reply: (nil) if the key did not exist.

> * From a user perspective, when I use a variant-like interface (like the one in
> Boost.JSON or in Python database access), my head is set to "type checking is
> my responsibility". I know I'm in runtime. When I use a compile-time
> interface, I get much more relaxed. I got the "I'm at compile-time safety"
> with Aedis, but that impression is not right. Again, this is just based on
> what I felt while using the library for the first time, other users might
> feel differently. I think if I had encountered an aedis::value, similar to
> json::value, I would have switched my mind to "I'm at runtime".

I believe the equivalent of a variant in Aedis is the infamous
std::vector<resp3::node<std::string>> for which I can to provide more
facilities like

   using response = std::vector<node<std::string>>;

   request req;
   response resp;
   co_await conn.async_exec(req, adapt(resp));

   // Throws if null or error was received, error message is discarded.
   auto set = as<std::set<std::string_view>>(resp, index)

   // Null accepted, but throws if error is received error message is discarded.
   auto set = as<std::optional<std::set<std::string_view>>>(resp, index)

   // Does not throw, error must be manually checked and error message
is discarded.
   auto set = as<std::tuple<std::optional<std::set<std::string_view>>,
error_code>>(resp, index)

   // Does not throw, does not discard messages.
   auto set = as<std::tuple<std::optional<std::set<std::string_view>>,
error_code, error_info>>(resp, index)

and other overloads for deserialization.

> > This is to make it safe by default and in part because Aedis is
> > centered around the adapter. In the same sense that you can write an
> > adapter to receive responses in the data-structure that makes most
> > sense to the application e.g. std::map, std::unordered_map,
> > boost::flat_map etc. You can also customize the way errors are
> > handled. For example,
> >
> > * Errors like type mismatch are usually unrecoverable and you might
> > want to fail fast and let the async_ operation itself fail instead of
> > proceeding in an unknown state.
> >
> > * If you can tolerate errors like that you can write an adapter that
> > does not communicate an error to the parser. I did not make this the
> > default because it looks unsafe, people will usually expect
> > async_exec(req, adapt(resp)) to throw when the response can't be
> > stored in resp.
> >
> > I am perhaps at fault here for not documenting this feature. I thought
> > however not many people would want this behaviour.
>
> I think you're right and that this is the most common case. I had a look at
> the Python redis driver, and they do the following:
>
> * By default, fail the operation if any of the pipeline requests failed.
> * If you pass a certain config option, individual errors are reported as
> error response objects. So if you sent [C1, C2, C3] and C2 failed, you
> will get [C1_response, C2_error_info, C3_response].
>
> I think your default is right. Maybe you could provide the option to report
> errors individually by having the user provide
> a) std::tuple<leaf::result<T1>, leaf::result<T2>>
> Instead of
> b) std::tuple<T1, T2>
> If the user did b), then do the same you're currently doing, otherwise report
> errors individually.

Yes I need something like this to complement my adapter module. It is
something I will work on in the future.

Thanks for the engagement,
Marcelo


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk