
From: Marcelo Zimbres Silva (mzimbres_at_[hidden])
Date: 2023-01-19 21:10:32


On Thu, 19 Jan 2023 at 18:32, Ruben Perez via Boost
<boost_at_[hidden]> wrote:
>
> I have questions too
>
> 1. From what I gathered from docs and source code, connection
> implements a queue where requests are placed before sending.
> async_exec puts/pops from the queue,

Correct.

> and async_run does the actual I/O.

No. The IO (reading) is done by each individual async_exec call, so it
can parse RESP3 directly into its final storage without copies (i.e.
the data structure passed by the user to the adapt function). When the
first async_exec is done reading its part of the response, it passes
IO control to the next async_exec in the queue, and so on, until the
whole pipeline has been processed; only then is control returned to
async_run.

That means async_run coordinates IO between async_execs and async_receive.

> The benefits being you can call async_exec without the usual
> restriction of "single outstanding operation" that Beast and MySQL
> impose. Is that right?

Correct. This model makes better use of command pipelining and keeps
the number of connections to the database low (one is enough).

> 2. If there are write errors in a request, it is async_run, and not
> the individual async_exec calls that queued the requests, that gets
> the error. Right?

Correct.

> 3. If a server push is received but no async_receive is outstanding,
> according to docs, "the connection will hang". Does that mean that
> any responses to other requests I wrote with async_exec will not
> complete until I call async_receive (so I can expect a deadlock
> there)?

Correct. Data sent from the server is not consumed on the user's
behalf; the user must do it.

> Also, I can see there is a heuristic rule to determine what is a
> server push response and what is not, how does this interact with my
> comment above?

A push has a fixed RESP3 type (the RESP3 message starts with a >
character). The heuristic you are referring to handles corner cases.
Say I send a command that expects no response, but I make an error in
its syntax: the server will communicate an error as a response. In
this case there is no alternative but to pass it to async_receive,
since it shouldn't cause the connection to be closed; the connection
can still be used afterwards (questionable, but possible).

> 4. From what I've gathered, the protocol can be made full duplex
> (you may write further requests while reading responses to
> previously written requests),

That is not how I understand it: I send a request and wait for the
response. That is why pipelining is so important:
https://redis.io/docs/manual/pipelining/. Otherwise, why would it
need pipelining?

> but the queue acts as half-duplex (it won't write a batch until the
> response to the previous batch has been completely read).

Even a full-duplex implementation would require a queue, as requests
must wait for any ongoing write to complete, i.e. to prevent
concurrent writes. We also need to keep the requests in their
incoming order so we can demultiplex the responses when they arrive.

> This can be circumvented using the low-level API.

The low-level API is there as a reference. I can't name any reason to
use it instead of the connection; there are small exceptions, which I
will fix over time.

> Am I getting the picture right? Or are there further protocol
> limitations I'm not aware of?

AFAIK, RESP3 is request/response plus pushes, which means you might be
sending a request while the server is sending a push.

> 5. Merging async_run into async_exec and async_receive is possible
> but we lack the implementation tools required (namely Klemens' asem
> async synchronization primitives).

No. async_exec can't be merged with anything. There is perhaps a way
to merge async_run and async_receive, but it looks complex to me, so I
have to investigate more. There are advantages and disadvantages; the
main point is to guarantee there are no concurrent writes.

> You mention that there would be a performance loss, why?

Putting it in simple terms, I have observed situations where a push
arrives in the middle of a response. That means async_exec must be
suspended and IO control passed to async_receive, and then back to the
suspended async_exec. This ping-pong between async operations is
complex to implement without async_run. I think there is a way,
though; we can talk more about that later.

> 6. From the implementation, I've seen every request allocates a new
> asio timer, I'd say for communication between the async_exec and
> async_receive tasks. Is allocating those I/O objects cheap?

The timer is used to suspend async_exec while it waits for its
responses. It is allocated with the completion handler's associated
allocator, so memory allocation can be controlled by the user.

> 7. There is no sync counterpart to async_receive, async_exec and
> async_run, which is unusual in Asio-based libraries - is that
> intentional?

Yes. I don't think a sync implementation is possible or meaningful.
You can, however, see the sync example for an alternative.

> 8. Docs state that async_run "will trigger the write of all requests
> i.e. calls to async_exec that happened prior to this call". I
> assume it will also write requests that are queued while async_run
> is in progress, right?

Correct.

> 9. What is the recommended way of using async_run, async_exec and
> async_receive together with multi-threading? Would using a
> basic_stream_socket<ip::tcp, strand> as AsyncReadWriteStream be
> enough?

The idiomatic thing to do is to pass a strand as the connection executor.

> 10. From what I've seen, all tests are integration tests, in that
> they test with an actual Redis instance.

low_level.cpp and request.cpp are not integration tests. The others
use a locally installed Redis instance, with the exception of the TLS
test, which uses db.occase.de:6380.

Very good questions btw.

Regards,
Marcelo

