This is my formal review of the capy and corosio libraries.
The review is based on 4 months of active
development on a large-scale production codebase which consists of a
total of 56 repositories in a standard platform release, of which 39
form part of the C++ framework, all of which use asio for their
underlying asynchronous implementation. Of those, 7 repositories
contain more concentrated use as part of the message passing
architecture by providing core pipeline and transport layer utilities.
The codebase in question is for a tier-1 financial market system that
has been in production for many years. The codebase builds using
gcc15, supporting C++23/26.
It is worth noting that this review has been written with the view
that capy/corosio is used to underpin the backend asynchronous
operations and networking. No attempt was made to have asio/corosio
interoperability.
I have framed my review around the original questions proposed by Jeff Garland.
1. What is your evaluation of the usefulness of the libraries?
In a high performance, ultra low latency environment, being able to
move data around efficiently is of the utmost importance. The capy
library is extremely useful for this. It has a very specific set of
functionality: coroutine tasks, buffers, streams, and executors. This
makes it easy to reason about where it fits into an existing codebase,
or how it is to be used for a new project. The same can be said
about corosio: fast and efficient networking is core to the
functioning of financial systems. Having a modern approach to network
programming is extremely useful.
2. What is your evaluation of the design?
Having actively used this library for the last 4 months, I can say
with confidence that capy is a well designed library. By separating
out buffers, streams, and executors, capy gives the user orthogonal
abstractions. That is to say, memory layout, asynchronous byte
transfer, and coroutine scheduling can be mixed and matched, optimized
independently, tested in isolation, and reused without pulling in
platform specific networking or each other's implementation details.
A key design decision is that it is coroutine-only. This has not been
retrofitted and therefore commits to one single asynchronous style.
The usual benefits of coroutines apply: flow control reads as
sequential code. Capy primitives (e.g. run_async, when_all) are
used instead of ad-hoc callback chains.
Corosio provides a layered design (similar to capy) that consists of
three layers, ranging from type-erased to native:
- abstract - eg: io_stream, type-erased and generic that works with
TCP/UDP/Unix domain socket/etc...
- concrete - the full protocol API (connect, bind, endpoints, socket
options) for each type.
- native - implementation specific inlined awaitables with no vtable.
The corosio design focuses on IO primitives, sockets, resolvers,
timers, etc... and does not try to be an execution framework. This
results in a cleaner codebase with clear ownership of problems, all of
which can be paired very naturally to specific networking problems,
like HTTP, or Websockets. By using the capy::io_result<T>, there is a
consistent way to handle errors and results. It is the same for
sockets, resolvers, timers, etc…
3. What is your evaluation of the implementation?
The capy and corosio implementations use modern C++20 techniques:
- concepts
- coroutines
- structured bindings
- std::stop_token
- PMR-style allocators
I have spent some time inside the capy source (mostly when it came to
raising bugs/issues/questions) and I found that there is a clear
layered design without platform coupling.
The layered design provides clear separation points within the code:
executor / run_async -> where and when coroutines resume
|
v
streams / sources/sinks -> the async byte transfer semantics
|
v
buffers / buffer sequences -> memory regions those operations use
Users are able to choose the layer as per their need, for example:
- Protocol parsing - BufferSink / BufferSource concepts (zero-copy).
- Connection logic - concrete stream types (or the type-erased any_stream&).
The higher-level behaviour is typically expressed as template free
functions that are constrained by concepts. Capy keeps concepts
(compile-time contracts) and wrappers (runtime polymorphism) as
separate layers, so you choose where the cost lands (native vs.
type-erased). This approach to implementation gives the user full
control over how they integrate capy into their projects.
It is also worth mentioning that a lot of the capy and corosio
specific compile errors that were seen were easy to follow and debug.
It was rare to see pages of obscure compile-time errors with little or
no information about what actually happened when the capy/corosio
related code was incorrect.
The corosio implementation has a very familiar look and feel,
borrowing much of the styling from well established asio principles.
Many of the APIs are very similar to asio. As a result, the biggest
challenge was thinking about the coroutine integration, rather than
having to re-think the application's network stack and how it behaves.
The capy and corosio source code quality is very high. When it came to
requesting features, an attempt was made to understand how things
currently work before offering up a solution, or, to check whether it
was feasible or not. The code was easy to read and understand. A
number of bugs were fixed and features added during this large-scale
refactoring project.
I found error handling to be very consistent across the library. The
structured binding approach is used everywhere and io_result never
throws, which is extremely important for the code I work on. However,
it is possible to throw std::system_error(ec) if that behaviour is
desired.
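
To show the shape of this pattern without reproducing capy's actual
definitions, here is a minimal sketch using only the standard library.
result_sketch, fake_read and read_or_throw are hypothetical stand-ins
(the real capy::io_result<T> is returned from awaited operations and is
not defined like this); the sketch only illustrates the structured
binding on an error code plus value, and the opt-in conversion to
std::system_error.

```cpp
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical stand-in for an io_result-like type; NOT capy's definition.
template <class T>
struct result_sketch {
    std::error_code ec;
    T value{};
};

// Hypothetical operation: failure is reported through ec, never thrown.
inline result_sketch<std::size_t> fake_read(std::string_view source,
                                            bool fail) noexcept {
    if (fail)
        return {std::make_error_code(std::errc::connection_reset),
                std::size_t{0}};
    return {std::error_code{}, source.size()};
}

// A caller that prefers exceptions can opt in at the call site.
inline std::size_t read_or_throw(std::string_view source, bool fail) {
    auto [ec, n] = fake_read(source, fail);
    if (ec)
        throw std::system_error(ec);
    return n;
}
```

The non-throwing path is simply `auto [ec, n] = fake_read(...)` followed
by a check of `ec`, which is the style used throughout our ported code.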
From a performance and benchmarking perspective, the implementation
(epoll in this case) is performant. It is on a par with asio under
production-style loads and it is definitely capable of low latency
message handling (more on this later). Having the option to use fully
inlined backend calls without introducing a second, parallel type
hierarchy or a separate way of thinking about sockets and streams is a
very attractive implementation detail.
4. What is your evaluation of the documentation?
This was my first real foray into the world of coroutines and how they
might be used in a production codebase. I am familiar with C++20
coroutines, but I had no real world experience with them. The capy
documentation gives a very good primer and introduction into
coroutines, concurrency, and all of the associated concepts. I found
the documentation clear, and easy to follow.
I have a lot of experience with asio, and reading through the capy
documentation allowed me to see the similarities between capy and asio
(in terms of buffers) and map them directly onto pre-existing concepts
that were lifted from asio. I noticed no discrepancies between the
documented behaviour and what I would expect.
For the new concepts introduced by capy, each section is well
presented and clearly written. The code examples are concise and go to
great lengths to illustrate most (all?) of the functionality. It was
rare that I would not find what I was looking for when looking over
the specific examples if I was unclear on how to do something. If
details were missing, the authors made the required documentation
updates.
As an experienced user of asio, my view on the corosio documentation
is that it is very well written and explains each of the concepts very
clearly. Users who have asio familiarity will be very comfortable with
this documentation.
As for users who may not be familiar with network programming, or
asio, they will get a gentle introduction with a number of tutorials
that are well written and include code examples that are well
commented. New users should be able to get up to speed quickly, given
the quality of the documentation.
5. Have you used either or both libraries? What was your experience?
As mentioned above, I have been using capy for 4 months now on a
large-scale production codebase that is underpinned by asio. The goal
of the project was to assess the viability of capy/corosio as an
alternative to asio. The asio code in question used lambda-style
callbacks and in some places, asio (stackless) coroutines, as well as
various uses of timers, buffers, and asio specific helpers for various
networking functionality.
The first major piece of work was to integrate capy into the core
message passing pipeline library that underpins the entire codebase. A
traits based approach was taken which facilitated using asio_traits or
capy_traits as the backend. This was configured as a compile-time
option. The system could be built with asio or capy.
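
The production traits are not shown in this review, so the following is
only a sketch of the general technique under stated assumptions:
asio_traits_sketch, capy_traits_sketch, pipeline_sketch and the
PIPELINE_USE_CAPY macro are all hypothetical, and launch() simply runs
the work inline where a real backend would hand it to its executor.

```cpp
#include <string_view>
#include <utility>

// Hypothetical backend traits. A real version would wrap asio or capy
// calls; here each one just runs the work inline.
struct asio_traits_sketch {
    static constexpr std::string_view name = "asio";
    template <class F>
    static void launch(F&& work) { std::forward<F>(work)(); }
};

struct capy_traits_sketch {
    static constexpr std::string_view name = "capy";
    template <class F>
    static void launch(F&& work) { std::forward<F>(work)(); }
};

// Compile-time selection, set once in the application build
// (PIPELINE_USE_CAPY is a made-up macro name).
#if defined(PIPELINE_USE_CAPY)
using backend_traits = capy_traits_sketch;
#else
using backend_traits = asio_traits_sketch;
#endif

// Library code is written once against the traits parameter and
// contains no per-backend conditionals.
template <class Traits = backend_traits>
struct pipeline_sketch {
    static constexpr std::string_view backend() noexcept {
        return Traits::name;
    }
    template <class F>
    static void submit(F&& work) {
        Traits::launch(std::forward<F>(work));
    }
};
```

Only the alias depends on the build option; everything below it sees a
single Traits type.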
A springboard approach was taken that wraps a callable with all of the
required coroutine machinery and is then executed inline, or posted
across a thread boundary. This integration was relatively unintrusive
aside from the compile-time defines that needed to be specified in a
small number of places. Even when capy moved from a
std::coroutine_handle to a capy::continuation model, the traits based
approach was easily modified to handle the continuation based
approach. Part of the goal with this springboard approach was to avoid
changing core infrastructure. Whilst the implementation may not have
been as optimal as it could have been, the performance was still very
comparable to the asio equivalent.
This traits based approach resulted in a very minimal set of
boilerplate code to determine which implementation was used at compile
time. This boiled down to a number of defines being set in the main
application code, meaning that as you moved lower into the library
code, these defines were not required and the codebase “looked normal”
with no “per-implementation” hacks to work around certain things.
The capy stream concepts and buffers were almost a direct drop-in
replacement for the asio buffers. Very little additional code was
required and in most cases, it was making sure that the new capy types
were compatible with the internal structures used within the
production codebase.
Using capy::run_async to launch the asynchronous behaviour at various
locations in the codebase provided very clear entry points to indicate
that asynchronous operations were about to occur. Support for
stop_token was already part of our asio codebase which was a custom
implementation of P0660 (our implementation pre-dated support in the
standard library and the version of gcc that was being used at the
time). As such, this custom implementation was replaced with
std::stop_token. The majority of this change was replacing the custom
namespace with std::.
A number of issues were raised during this development period and were
resolved very quickly, or alternatives presented (which ultimately
made it into the documentation). As a result, the usage of capy within
this codebase does not have any drawbacks or missing features.
When it came to integrating corosio, the most challenging aspect was
rewriting the lambda-style, callback-based code with a coroutine-based
approach. In particular, the existing callback code had to be
restructured so that lifetime issues were managed correctly. The
difference in behaviour between callbacks and coroutines required some
thought.
One very important lesson when moving from a lambda-style callback
chain to coroutines was that simply copying the callback lambda into
capy::run_async and having it co_return a capy::task<> was a massive
problem, due to IIFE/lifetime/UB traps. This is now very well
documented in the capy documentation and I would go as far as to say
lambdas (except for very simple usage) should be avoided. Class
methods or free functions should be preferred. These denote clearer
lifetime semantics and explicit coroutine entry points.
When it came to implementing corosio client and server components, the
flow control and APIs were all very familiar. The APIs are very
similar to asio, so it was possible to make fast progress.
The modular design approach taken in the production codebase made it
possible to do a side-by-side porting of each asio component to a
corosio implementation.
At a high level, the following application components were ported over
to use corosio:
- TCP based clients and servers
- UDP based transmitter and receiver.
- Application specific timer synchronisation classes.
- The core "application" that was used to start the io_context
objects had both an asio and corosio implementation.
It was of utmost importance that all of the existing tests worked with
an asio io_context as well as the corosio io_context (both native and
type-erased). Each of the corosio-equivalent classes has identical
interfaces so that each of the tests could be called in the same way.
A lot of work went into updating the existing tests so that any
io_context could be passed and the tests run. Depending on the type of
io_context, asio::dispatch or capy::run_async would be used. Care was
taken to avoid making behavioural changes to the tests, to ensure that
what previously passed would still pass with the corosio version of
the test runner.
This approach was also taken for all of the core benchmark scenarios,
so that a fair comparison could be made between the asio components
and corosio components. As a result of extensive test coverage, a
number of bugs relating to the corosio io_context were filed and
fixed.
A number of API changes were required upstream. These were mostly gaps
in functionality between corosio and some asio features that were
required by the system. For example, supporting the <=> operator for
endpoint objects. These changes were added upstream. As a result, all
of our use cases and requirements were satisfied. The ports for each
component were 100% capy/corosio compatible and did not require any
asio interoperability.
My overall experience in refactoring this codebase was very positive.
I was surprised at how quickly progress was made on porting entire
components. I was also very impressed by the speed at which issues
were resolved. This is why I do not have any specific gripes about
missing features or functionality. This work has been carried out in
parallel with the development of capy and corosio.
6. Are the libraries ready for inclusion in Boost?
Given the time spent with capy and corosio, I would say that yes, the
libraries are ready for inclusion in Boost.
The APIs are sufficiently mature and each has an adequate number of
features to be very useful. They provide a familiar approach to what
users may already be used to, and expecting, when it comes to
asynchronous network programming. Given the well defined design
principles, further development and maintenance of these libraries
will be possible.
7. If not, what changes would you recommend before acceptance?
See point 6.
8. Do the libraries fit well within the existing Boost ecosystem?
Yes, having a set of modern coroutine-only C++20 libraries that
provides access to executors, buffers, and streams, as well as
networking would be a welcome addition. Both capy and corosio would
fit well within the existing Boost ecosystem.
9. Are there API, naming, usability, extensibility, or implementation
concerns that should be addressed?
I do not have any concerns, mainly because any issues that I had
raised as part of this large-scale refactoring were addressed in one
form or another.
In most cases, when an API change was made (that was not at the
request of this reviewer), it generally caused very little disruption
and the change in the production codebase was minimal. This suggests
that the designs are mature and well thought out. Any user who is
already familiar with asio should see familiar concepts, along with
the typical extension points to allow further customisation, if
required.
Summary:
I vote for these libraries to be accepted into Boost.