Hi,

Some months ago I started implementing new benchmarks for Boost.Redis to compare it to the current state of other popular clients. While this was underway, Boost.Redis gained Corosio support, something that was recently announced by Ruben Perez on this mailing list. In this email I would like to share the results I obtained, for appreciation by the Capy/Corosio review audience. The code is public and available at https://github.com/mzimbres/redis-cli-comp.

The new benchmarks simulate the scenario in which Redis is mostly used, namely internet-facing servers (usually HTTP) that serve connections concurrently while receiving server pushes, e.g. pubsub events. Each benchmark consists of

1. starting multiple independent sessions that issue commands in a loop.
2. subscribing to a channel to receive pubsub events.

## Runtime Performance

The metric used to assess runtime performance was the wall-clock time multiplied by the %CPU used by the client. This metric takes into account that clients might use a different number of threads.
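As a rough illustration of the metric described above, here is a minimal C++ sketch. It assumes that %CPU is expressed as a percentage (150.0 meaning one and a half cores busy) and that normalization means dividing by the smallest cost so the best client scores 1.00. The `Sample` struct and both function names are hypothetical and are not taken from the benchmark repository.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// One measurement per client. Field names are illustrative only.
struct Sample {
    std::string client;
    double wall_seconds;  // elapsed wall-clock time
    double cpu_percent;   // average %CPU, e.g. 150.0 for 1.5 cores busy
};

// Cost = wall-clock time x CPU fraction, i.e. roughly the CPU-seconds consumed.
double cost(const Sample& s) {
    return s.wall_seconds * (s.cpu_percent / 100.0);
}

// Divide every cost by the smallest one so the best client scores 1.00.
// (Assumed normalization; the original post does not spell it out.)
std::vector<double> normalized_costs(const std::vector<Sample>& samples) {
    assert(!samples.empty());
    std::vector<double> out;
    out.reserve(samples.size());
    for (const Sample& s : samples) out.push_back(cost(s));
    const double best = *std::min_element(out.begin(), out.end());
    assert(best > 0.0);
    for (double& c : out) c /= best;
    return out;
}
```

With made-up inputs, a client that runs for 2 s at 100% CPU and another that runs for 1 s at 300% CPU get costs of 2.0 and 3.0, so the second one is charged for its extra threads even though it finishes sooner.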
The following results were obtained (lower is better)

Client                 Time x %CPU (normalized)
-----------------------------------------------
boost_redis_corosio     1.00
boost_redis_asio_cb     1.35
boost_redis_asio_co     1.73
redis_rs (Rust)         4.62
go_redis (Go)          22.33

The thread and file-descriptor usage by each client was

Client                 threads   fd-nr
--------------------------------------
boost-redis-corosio     3           7
boost-redis-asio-co     2           7
boost-redis-asio-cb     2           7
redis-rs                1          10
go-redis               24        1006

## Application build time

This is the time taken to build the benchmark program, excluding the time taken to build the client library

Library                Time (s)
-------------------------------
boost-redis-corosio    1.68
boost-redis-asio-co    3.60
boost-redis-asio-cb    8.62
redis-rs               3.10
go-redis               0.28

## Client build time

Here we see the time taken to build only the client

Library                    Time (s)
-----------------------------------
boost-redis-lib-corosio     5.3
boost-redis-lib-asio       21.7
redis-rs                    9.6

Note: I haven't found a way to measure this for Go clients.

## Total build time

The times below include the build time of all dependencies (which had been downloaded beforehand)

Client                 Time (s)
-------------------------------
boost-redis-corosio    33.6
boost-redis-asio-co    35.5
boost-redis-asio-cb    40.5
redis-rs               42.6
go-redis                9.0

## Executable size

This is the size of the release build of each benchmark program.

Executable             Size
---------------------------
boost_redis_corosio    1.5M
boost_redis_asio_cb    1.9M
boost_redis_asio_co    1.3M
redis_rs               2.0M
go_redis               8.2M

## Summary

It is not the point of this report to analyse the results in detail or to issue any verdict. However, Corosio has come out with really impressive results, IMO.

Marcelo
On 29 Jun 2026 01:01, Marcelo Zimbres Silva via Boost wrote:
## Runtime Performance
The thread and file-descriptor usage by each client was
Client                 threads   fd-nr
--------------------------------------
boost-redis-corosio     3           7
boost-redis-asio-co     2           7
boost-redis-asio-cb     2           7
redis-rs                1          10
go-redis               24        1006
I wonder why Corosio has one more thread than the ASIO variants. Does Corosio start an internal thread for some purpose? I'm assuming the client code was equivalent for each library, or at least for the different variants of Boost.Redis.

Also worth noting: according to the GitHub page you referenced, the number of context switches was the highest for Corosio, and the amount of system time was a bit higher than for ASIO. This may be related to the additional thread being used.
The thread and file-descriptor usage by each client was
Client                 threads   fd-nr
--------------------------------------
boost-redis-corosio     3           7
boost-redis-asio-co     2           7
boost-redis-asio-cb     2           7
redis-rs                1          10
go-redis               24        1006
I wonder why Corosio has one more thread than the ASIO variants. Does Corosio start an internal thread for some purpose? I'm assuming the client code was equivalent for each library, or at least for the different variants of Boost.Redis.
Also worth noting: according to the GitHub page you referenced, the number of context switches was the highest for Corosio, and the amount of system time was a bit higher than for ASIO. This may be related to the additional thread being used.
The extra thread is allocated by the timer service in Capy, underlying capy::delay() and capy::timeout(). Capy doesn't know about Corosio, so I don't think it can use io_context threads to run timers.
On Mon, Jun 29, 2026 at 7:40 AM Ruben Perez via Boost <boost@lists.boost.org> wrote:
The extra thread is allocated by the timer service in Capy, underlying capy::delay() and capy::timeout().
Huh? How can Capy implement delay() and timeout() without a reactor? I don't think those functions belong in Capy.

Thanks
On Mon, 29 Jun 2026 at 16:41, Vinnie Falco <vinnie.falco@gmail.com> wrote:
On Mon, Jun 29, 2026 at 7:40 AM Ruben Perez via Boost <boost@lists.boost.org> wrote:
The extra thread is allocated by the timer service in Capy, underlying capy::delay() and capy::timeout().
Huh? How can Capy implement delay() and timeout() without a reactor? I don't think those functions belong in Capy.
Thanks
It looks like it can. These functions seem to have been there for a while:

https://master.capy.cpp.al/capy/reference/boost/capy/timeout.html
https://master.capy.cpp.al/capy/reference/boost/capy/delay.html

I asked myself the same question the first time that I saw them :)

Looking into the code, Capy spawns a thread and waits on a std::condition_variable to implement this: https://github.com/cppalliance/capy/blob/develop/src/ex/detail/timer_service...

I do agree it's odd, especially considering that Corosio has timers, and cancel_at/cancel_after members.

I find the ergonomics of delay() and timeout() much better than those of timers, but that's a different concern to what's being discussed here.

This makes me wonder if it is safe to use delay() and timeout() with an io_context with a concurrency_hint of one, or not.
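For readers unfamiliar with the technique, a thread that sleeps on a std::condition_variable until the next deadline can be sketched roughly as follows. This is a generic illustration written for this thread, not Capy's timer service: the class name `ThreadedTimerQueue` and its `schedule_after` member are hypothetical, and the real implementation may differ in every detail.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <map>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical class for illustration only; not Capy's timer service.
class ThreadedTimerQueue {
public:
    using Clock = std::chrono::steady_clock;

    // The constructor starts the internal thread: this is the "extra" thread.
    ThreadedTimerQueue() : worker_([this] { run(); }) {}

    ~ThreadedTimerQueue() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_one();
        worker_.join();  // pending timers that have not fired are dropped
    }

    // Run `fn` on the internal thread once `delay` has elapsed.
    void schedule_after(std::chrono::milliseconds delay, std::function<void()> fn) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            timers_.emplace(Clock::now() + delay, std::move(fn));
        }
        cv_.notify_one();  // the new deadline may be earlier than the current wait
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (!stopping_) {
            if (timers_.empty()) {
                cv_.wait(lock);  // nothing scheduled: sleep until notified
            } else if (Clock::now() >= timers_.begin()->first) {
                auto fn = std::move(timers_.begin()->second);
                timers_.erase(timers_.begin());
                lock.unlock();  // never call user code while holding the lock
                fn();
                lock.lock();
            } else {
                const auto deadline = timers_.begin()->first;
                cv_.wait_until(lock, deadline);  // sleep until the earliest deadline
            }
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::multimap<Clock::time_point, std::function<void()>> timers_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the other members exist before it starts
};
```

Note that in this sketch the callback runs on the internal thread, not on a thread that is running the io_context, so any handoff back to the caller's executor would have to be done inside the callback.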
On Mon, Jun 29, 2026 at 7:50 AM Ruben Perez <rubenperez038@gmail.com> wrote:
It looks like it can. These functions seem to have been there for a while:
https://master.capy.cpp.al/capy/reference/boost/capy/timeout.html https://master.capy.cpp.al/capy/reference/boost/capy/delay.html
Hmm... no, I don't think this is a good idea at all.
I find the ergonomics of delay() and timeout() much better than those of timers, but that's a different concern to what's being discussed here.
Well, of course the ergonomics are better, because the Capy timer operations hide a memory allocation through use of std::function:

https://github.com/cppalliance/capy/blob/9144290189fa149b27617c7d9a476c8fbff...

Corosio makes this explicit by requiring the user to manage the timer object's lifetime.
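The general effect can be observed with a small experiment, sketched below. It is not Capy code: it replaces the global operator new with a counting version and stores a lambda with a 256-byte capture in a std::function. The assumption is that 256 bytes exceeds the small-object buffer of the standard library in use, which holds for the mainstream implementations but is not guaranteed by the standard. All names (`g_allocs`, `g_callback`, `allocations_when_storing_big_capture`) are made up for the example.

```cpp
#include <array>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>

// Counts calls to the replaced global operator new (single-threaded use only).
static std::size_t g_allocs = 0;

void* operator new(std::size_t n) {
    ++g_allocs;
    if (void* p = std::malloc(n ? n : 1)) return p;
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Global so the stored callable outlives the function call below.
std::function<void()> g_callback;

// Returns how many heap allocations happened while type-erasing the lambda.
std::size_t allocations_when_storing_big_capture() {
    std::array<char, 256> payload{};  // assumed to be too large for the small buffer
    const std::size_t before = g_allocs;
    g_callback = [payload] { (void)payload; };
    return g_allocs - before;
}
```

Calling `allocations_when_storing_big_capture()` is expected to report at least one allocation on common implementations, whereas a caller-owned timer object makes that storage visible at the call site.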
This makes me wonder if it is safe to use delay() and timeout() with an io_context with a concurrency_hint of one, or not.
I think these two functions and the timer service should be removed as a condition of acceptance. They can be moved to the examples.

Thanks
On 29 Jun 2026 17:49, Ruben Perez via Boost wrote:
On Mon, 29 Jun 2026 at 16:41, Vinnie Falco <vinnie.falco@gmail.com> wrote:
On Mon, Jun 29, 2026 at 7:40 AM Ruben Perez via Boost <boost@lists.boost.org> wrote:
The extra thread is allocated by the timer service in Capy, underlying capy::delay() and capy::timeout().
Huh? How can Capy implement delay() and timeout() without a reactor? I don't think those functions belong in Capy.
Thanks
It looks like it can. These functions seem to have been there for a while:
https://master.capy.cpp.al/capy/reference/boost/capy/timeout.html https://master.capy.cpp.al/capy/reference/boost/capy/delay.html
I asked myself the same question the first time that I saw them :)
Looking into the code, Capy spawns a thread and waits on a std::condition_variable to implement this: https://github.com/cppalliance/capy/blob/develop/src/ex/detail/timer_service...
I do agree it's odd, especially considering that Corosio has timers, and cancel_at/cancel_after members.
I find the ergonomics of delay() and timeout() much better than those of timers, but that's a different concern to what's being discussed here.
This makes me wonder if it is safe to use delay() and timeout() with an io_context with a concurrency_hint of one, or not.
My general preference is that it is best to avoid spawning internal threads and instead design the API in such a way that the user provides a thread, if one is needed. This gives the user more control over resource management and allows for custom thread initialization, which may be necessary if the user's code is supposed to be run in that thread. Think of stuff like a custom thread stack size or CoInitialize().

I'm not sure which library timeouts and delays belong to, but I do agree that these features should be based on the IO reactor loop. IMHO, if Capy has to provide those features while it doesn't provide IO reactors, it should accept an externally-provided reactor to implement those features.
On Mon, Jun 29, 2026 at 8:10 AM Andrey Semashev via Boost <boost@lists.boost.org> wrote:
My general preference is that it is best to avoid spawning internal threads and instead design the API in such a way that the user provides a thread, if one is needed.
We try to do that, but sometimes the internal thread cannot be avoided. For example, domain name resolution is inherently synchronous. User code doesn't run in the implementation-defined thread which Corosio launches for this.

Thanks
On 29 Jun 2026 18:12, Vinnie Falco wrote:
On Mon, Jun 29, 2026 at 8:10 AM Andrey Semashev via Boost <boost@lists.boost.org> wrote:
My general preference is that it is best to avoid spawning internal threads and instead design the API in such a way that the user provides a thread, if one is needed.
We try to do that, but sometimes the internal thread cannot be avoided. For example, domain name resolution is inherently synchronous. User code doesn't run in the implementation-defined thread which Corosio launches for this.
Strictly speaking, asynchronous DNS resolvers do exist (e.g. c-ares). But I understand that they may not be available on a given system, and an implementation with an extra thread is needed as a fallback. In this case, I would still prefer an option for a user to provide his own thread for the resolver.
On Mon, 29 Jun 2026 at 00:02, Marcelo Zimbres Silva via Boost <boost@lists.boost.org> wrote:
Hi, some months ago I started implementing new benchmarks for Boost.Redis to compare it to the current state of other popular clients. While this was underway, Boost.Redis gained Corosio support, something that was recently announced by Ruben Perez on this mailing list.
Have you considered replicating your experiments on the cost of async abstractions [1], but for Capy/Corosio? I'd be really interested in the cost of tasks that complete immediately. Note that I'm talking about tasks rather than awaitables, so something like:

    capy::io_task<> queue_push(int value)
    {
        if (!full()) {
            container.push_back(value);
            co_return {};
        }
        // wait for space
    }

I'm asking because this is the kind of code that the "Application developers" user tier [2] has the ability to write, and hence the most abundant.

Thanks,
Ruben.

[1] https://github.com/boostorg/redis/blob/develop/doc/on-the-costs-of-async-abs...
[2] https://isocpp.org/files/papers/P4172R1.pdf
participants (4): Andrey Semashev, Marcelo Zimbres Silva, Ruben Perez, Vinnie Falco