Boost logo

Boost :

From: Ivica Siladic (ivica.siladic_at_[hidden])
Date: 2024-10-21 14:40:47


Hi Ruben,

thank you very much for your thorough review and valuable suggestions! And of course, for recommending our library to Boost!

> On 21.10.2024., at 10:43, Ruben Perez <rubenperez038_at_[hidden]> wrote:
>
> Hi all,
>
> This is my review of async_mqtt5.
>
> First of all, thanks Ivica and the Mireo team for submitting this
> library for Boost review, and Klemens for managing the review.
>
> - Does this library bring real benefit to C++ developers for real world
> use-case?
>
> Yes. MQTT clients are widely used in the IoT world. Boost.Asio is one
> of the strongest networking libraries in C++. Bringing these together
> in a high-quality library is valuable. The library already has already
> more than 140 stars on GitHub, which is a good sign.
>
> - Do you have an application for this library?
>
> Not right now. But I used to work in the IoT world, and I would have
> used it if I would have had it available back then.
>
> - Does the API match with current best practices?
>
> It does almost everywhere. Analysis follows.
>
> 1. It's only a high-level API, akin to Boost.Redis. This may not cover
> some very specialized use cases, but makes it much easier handling
> common tasks robustly. I don't think this is concerning, since the API
> is flexible enough. Moreover, the Mireo team has more field experience
> regarding the protocol than I do. A lower level client can be added in
> the future without breaking compatibility, if requested by users.
>
> 2. It correctly follows Boost.Asio universal async model (except for a
> gotcha with executors, see my next point), accounting for allocator
> customization, custom completion tokens, and per-operation
> cancellation. Looking at the code, it's clear that the developers bear
> a deep understanding of Asio, which is very valuable.
>
> 3. At the moment, the client requires the following: "The same
> executor must execute mqtt_client::async_run and all the subsequent
> async_xxx operations." (quoting
> https://spacetime.mireo.com/async-mqtt5/async_mqtt5/asio_compliance/executors.html).
> This is a deviation from the standard Asio model, not present in any
> of the libraries currently in Boost. The former statement makes the
> following malformed, without any diagnostic being emitted:
>
> cli.async_run(asio::bind_executor(ex1, ...)); // 1
> cli.async_publish(asio::bind_executor(ex2, ...)); // 2
> cli.async_publish(asio::bind_executor(ex3, ...)); // 3
>
> Where operations 2 and 3 may complete on any of ex1, ex2 or ex3.
>
> I think this limitation should be lifted, and that the above code
> should be legal. I'd suggest making stream writes child agents of
> async_run, instead of async_publish and similar functions (see
> https://lists.boost.org/Archives/boost/2024/10/258132.php for the
> discussion regarding this issue).
>
> 4. The client is templated on the stream type. This is a well-tested
> design pattern, but can lead to higher compile times. Writing dynamic
> clients (where the transport type or TLS usage is determined by a
> runtime variable) is also more involved. I think the current
> architecture is acceptable, but I'd consider adding a type-erased
> client in the future. While this shouldn't be too much effort for
> well-versed Asio developers, regular users may find it unsurmountable.
> Alternatively, I'd also be good with completely replacing the
> templated client with a type-erased one (up to the authors).
>
> 5. The library internally buffers outgoing and ingoing messages
> automatically, which is great because it's easier to maintain
> resilience. However, the only limit on the publishing side are the
> limited 2^16 packet IDs (restricted by the protocol), with no limit on
> the reading side. I'm slightly concerned about applications buffering
> too much on either side. I don't know if this is a real concern or
> just produced by my lack of MQTT field experience. Would it make sense
> to somehow introduce a per-topic limit on the number of unacknowledged
> messages on both the subscribing side (possibly specified when the
> client subscribes) and on the publishing side? If that's not possible,
> I'd like to see an example on how to implement this flow control limit
> on the publishing side, at least.
>
> 6. Some APIs use owning types (std::vector and std::string) where
> non-owning types (boost::core::span and std::string_view) could
> suffice. I'd suggest reviewing these.
>
> 7. TLS streams require a number of template specializations to set up.
> At the moment, this must be done by the user even for the usual
> asio::ssl::stream. This risks ODR violations if two libraries end up
> defining such specializations independently. I'd advise placing the
> required boilerplate code, together with helper typedefs, in an
> optional, separate header.
>
> 8. The API is really small, and I think that's great.
>
> - Is the documentation helpful and clear?
>
> It is in general well written and clear. It is however geared towards
> a user that knows Asio and MQTT pretty well. I'd suggest a couple of
> changes:
>
> 1. There are a lot of pages justifying design decisions made by the
> library. This has been very useful to me, but is geared towards
> advanced users. I'd attempt to gather all pages explaining how to use
> the library in the first section, and move pages with design
> rationales to a second section.
> 2. There is no page in the discussion addressing how to use
> async_publish and async_receive (only an example). This new page
> should likely go into the first section.
> 3. There is no page in the discussion addressing TLS.
> 4. I miss an example about custom authentication.
> 5. I don't think the multi-threaded example captures the problems that
> a multi-threaded program integrating the library may face. Running
> everything under a single strand negates the benefit of
> multi-threading. I'd target the publisher example, instead. It's
> important that you ensure that reading the sensor happens outside the
> strand.
>
> I have more minor comments, included at the bottom of this email.
>
> - Did you try to use it? What problems or surprises did you encounter?
>
> I built a sample publisher and subscriber, based on the examples.
> They're available here: https://github.com/anarthal/async-mqtt5-test
>
> The publisher reads two sensors at different frequencies, which is
> something that scales poorly when using threads. The async
> architecture makes this task trivial, which is great. Note that the
> publisher doesn't co_await async_publish, because that'd prevent
> further sensor reads, reducing the benefits of resilience. The
> publisher example in the docs may consider mentioning this.
>
> Debugging config issues is extremely difficult (see
> https://spacetime.mireo.com/async-mqtt5/async_mqtt5/auto_reconnect.html
> for a rationale). I initially tried to use a public test broker that
> rejected my connection because of high utilization. It took me a
> decent amount of time to figure it out - I tried debugging, enabling
> Asio handler tracking, until I finally launched Wireshark and figured
> it out. This is a problem during development but also in production,
> as logging when things go wrong is a usual practice.
>
> I think this should be somehow addressed, at the very least by
> suggesting to use Wireshark/tshark (or other tools that may be
> practical) in the docs. When using TLS, using
> SSL_CTX_set_keylog_callback
> (https://docs.openssl.org/1.1.1/man3/SSL_CTX_set_keylog_callback/#synopsis)
> to create a keylog file for Wireshark may also be useful, and
> mentioning how to do it may be useful.
>
> I'd at least consider the possibility of including some sort of
> logging or diagnostics generation, either by using a custom macro or
> abusing Asio handler tracking.
>
> I found that the implementation does not correctly shutdown TLS and
> websockets (no calls to asio::ssl::stream::async_shutdown and
> beast::websocket::stream::async_close). These should be added
> (https://github.com/mireo/async-mqtt5/issues/18).
>
> - What is your evaluation of the implementation?
>
> The implementation is well written and very easy to follow, which is great.
>
> Build times are pretty high, which is not great. The sample publisher
> takes 22s to build under clang-18, Linux, Debug, C++23 on my machine
> (30s in Release mode). Building with -ftime-trace, two points show up
> in the flame graph:
>
> 1. The client header includes boost/beast/websocket/stream.hpp, even
> when not using websockets. The header is very heavyweight and slows
> down compilation. I think the class can be forward declared if the
> definition of rebind_executor for websocket streams is changed to the
> following:
>
> template <typename Stream, bool deflate_supported, typename Executor>
> struct rebind_executor<boost::beast::websocket::stream<asio::ssl::stream<Stream>,
> deflate_supported>, Executor> {
> using other = typename boost::beast::websocket::stream<
> asio::ssl::stream<typename rebind_executor<Stream, Executor>::other>,
> deflate_supported
>> ;
> };
>
> Some other Beast headers are included in detail/connect_op.hpp.
> Connection attempts to distinguish whether the stream is a websocket
> using metaprogramming, and then runs Beast-specific code when a
> websocket is detected. I think this creates too much coupling between
> the libraries. My suggestion is to re-architect this a bit. I'd
> rewrite websocket detection as a separate trait, and then provide a
> definition for it in a separate header, thus making Beast a peer
> dependency.
>
> Beast is also use for get_lowest_layer. I don't know if this is
> strictly required, since asio::ip::tcp::socket does implement the
> layered protocol (it defines a lowest_layer_type and a
> get_lowest_layer function, see asio::basic_socket).
>
> 2. asio::append and asio::prepend are heavily used, which are heavy to
> instantiate. I'd suggest experimenting with replacing these by a
> custom handler class that specializes asio::associator to propagate
> associated characteristics, as suggested in this ML post:
> https://lists.boost.org/Archives/boost/2024/10/258132.php. This may or
> may not reduce compile times, but it's worth a try.
>
> The library has many Boost dependencies. While this might not itself
> impact compile times, it affects package managers. Using Boostdep, the
> dependency with most transitive dependencies (by far) is Spirit,
> pulling in variant, thread, preprocessor, phoenix, and a number of
> other libs.
>
> The obvious solution would be encouraging the author to remove the
> dependency. But I think we have something off in how we manage
> dependencies in Boost, since the parts of Spirit they're pulling don't
> require most of these libraries.
>
> Regarding testing, I like seeing both unit and integration tests. Code
> coverage is great. I haven't looked into them in great detail, but
> from some snippets I saw and Ivica's comments, they seem correctly
> architectured. I miss sanitizers, fuzz testing and some compilers. See
> my comments below.
>
> - Do you have experience with asio?
>
> Yes. I developed and maintain Boost.MySQL.
>
> - Did you work with MQTT in the past?
>
> I know the basic concepts and have read part of the spec for this
> review. I haven't used MQTT in production. I have worked in embedded
> systems before, though.
>
> - How much time did you spend on this review?
>
> I spent around 25h reading the documentation, reference, researching
> MQTT, building code snippets and discussing design decisions on the
> mailing list.
>
> - Final decision and comments
>
> My recommendation is to CONDITIONALLY ACCEPT async_mqtt5 into Boost.
> It's a great library and I'd love to see it in Boost.
>
> Here are a lot of feedback points, in decreasing order of importance.
>
> 1. Acceptance conditions: I think these should be addressed before
> Boost integration.
>
> - Executor correctness. The restriction of forbidding different
> executors for different operations should be lifted.
> - Making debugging configuration easier. This can have the form of a
> documentation page explaining the available tools and how to use them,
> or by enhancing the current API with logging facilities.
> - TLS/websocket shutdown code should be added, as per
> https://github.com/mireo/async-mqtt5/issues/18.
>
> 2. Recommendations. These are not acceptance criteria, but I strongly
> recommend considering them, at least:
>
> - Remove the hard dependency on Boost.Beast websocket in
> detail/rebind_executor.hpp by forward-declaring websocket::stream.
> - Change the websocket connect handling from if constexpr to traits,
> to remove the Beast hard dependency.
> - Consider creating an optional header containing the required
> boilerplate to use TLS-based connections without the need of template
> specializations. Helper typedefs could be beneficial.
> - Experiment with build times and -ftime-trace by reducing the usage
> of asio::prepend and Boost.Spirit in the implementation. If this
> happens to improve build times, consider applying the relevant
> changes.
> - Introduce fuzz testing for the encoders and decoders (I can aid with
> this if required).
> - Introduce address and undefined sanitizer builds (I can also aid with this).
> - Consider supporting total cancellation in async_publish and other
> functions. Probably requires fixing the executor issue.
> - Consider introducing a type-erased client that does not depend on
> the stream type. I'd consider acceptable entirely removing the
> templated one in favor of the type-erased one, if the authors feel
> comfortable with that.
> - Consider changing the mqtt_client::brokers signature to take a
> span<host_and_port>, where host_and_port contains a string and a port
> number. I think this better represents what the function does.
> - Consider changing usages of std::vector and std::string in public
> APIs to boost::core::span and std::string_view where it makes sense.
> - Consider introducing APIs to limit the message
> production/consumption rates per topic. If not feasible, introduce an
> example on how to rate-limit a publisher user-side.
> - Consider trimming some Boost dependencies to make the library
> lighter for package managers.
>
> 3. Minor comments. These are not acceptance criteria, either.
>
> - Consider adding CIs for clang 16 to 19.
> - Apply concept checking to completion tokens, using the
> BOOST_ASIO_COMPLETION_TOKEN_FOR macro for C++17 compatibility.
> - Consider using an inline variable to implement
> get_error_code_category, instead of a static one.
> - Place non-documented, non-public functions in the detail namespace.
> Candidate functions:
> - client_error_to_string, in error.hpp
> - client_ec_category, in error.hpp
> - Auxiliary functions and types implementing properties, in
> property_types.hpp
> - to_reason_code, in reason_codes.hpp
> - Consider making the library compatible with the
> BOOST_ASIO_USE_TS_EXECUTOR_AS_DEFAULT macro. Although legacy, other
> libraries (namely a postgresql one) require it.
> - Consider making the will class a plain struct, instead of a class.
> This would make it compatible with C++20 designated initializers.
> - Consider renaming the library and namespace to mqtt5 or mqtt,
> following what we have for MySQL and Redis. Note that typing
> boost::async_mqtt5 may be long.
> - Consider making constexpr global variables (like the ones in
> reason_codes.hpp) inline constexpr.
> - Consider enhancing support for Asio handler tracking (as per
> https://live.boost.org/doc/libs/1_86_0/doc/html/boost_asio/overview/core/handler_tracking.html)
> by using the BOOST_ASIO_HANDLER_LOCATION macro. This may mitigate the
> frustration of users experiencing connectivity problems.
> - Consider running the thread-safety example under the thread
> sanitizer. I've found a lot of surprises (and learnt a lot) by doing
> this.
> - Consider renaming client_service to something that doesn't include
> the term "service". When working with Asio, a service usually refers
> to something inheriting from asio::execution_context::service,
> allocated once per execution context. This is not the case for
> client_service, which initially created confusing expectations to me.
> - Consider replacing the SFINAE clause in authenticator() by a
> static_assert. Calling authenticator() with something not fulfilling
> the concept is always an error - the clearer the diagnostic, the
> better.
> - Consider whether asio::cancel_after/asio::cancel_at can be used to
> simplify some of the internal waiting logic.
> - Consider moving the code snippets shown in the discussion to a C++
> file and use Quickbook import statements to include them in the docs.
> This reduces the possibility of having out-of-sync snippets.
> - Consider specializing asio::associated_executor,
> asio::associated_immediate_executor and asio::associated_allocator to
> implement property propagation in detail::cancellable_handler.
>
>
> 4. Comments on docs. These are not acceptance criteria.
>
> - The reference lists is_authenticator
> (https://spacetime.mireo.com/async-mqtt5/async_mqtt5/ref/is_authenticator.html),
> but it looks like it's in the detail namespace. Either promote it to
> the public namespace, or hide it from the docs (you can still create a
> page for the concept even if you don't expose the actual type).
> - The reference lists the phrase "On immediate completion, invocation
> of the handler will be performed in a manner equivalent to using
> boost::asio::post.". This is not true, since you support immediate
> executors.
> - Some property types are listed as scalars, when they are really
> vectors of scalars. This might be confusing. For instance,
> https://spacetime.mireo.com/async-mqtt5/async_mqtt5/ref/publish_props.html
> lists subscription_identifier as std::string, when it's actually
> std::vector<std::string>. Same applies for user_property.
> - As of Boost 1.86, timeouts should be implemented using
> asio::cancel_after, rather than by awaitable operators or parallel
> groups. I suggest replacing the examples on timeouts by a single one
> using asio::cancel_after.
> - Type requirements for ExecutionContext
> (https://spacetime.mireo.com/async-mqtt5/async_mqtt5/ref/ExecutionContext.html)
> are not correct. You can probably just link to Asio's description:
> https://www.boost.org/doc/libs/1_86_0/doc/html/boost_asio/reference/ExecutionContext.html.
> - The multithreaded example recommends the following code to run
> callbacks though a strand:
>
> boost::asio::post(
> strand,
> [&client, &strand] {
> // Considering that the default associated executor of all
> completion handlers is the strand,
> // it is not necessary to explicitly bind it to async_run or
> other async_xxx's handlers.
> client.async_run(boost::asio::detached);
> }
> );
>
> This form of post is not equivalent to
> asio::post(asio::bind_executor(ex, ...)), and can yield surprising
> behavior if the token has a bound executor. Furthermore, post might be
> less efficient than dispatch, and should only be used if a reschedule
> is required (which is not the case). I'd advise to change the above
> by:
>
> boost::asio::dispatch(boost::asio::bind_executor(
> strand,
> [&client, &strand] {
> // Considering that the default associated executor of all
> completion handlers is the strand,
> // it is not necessary to explicitly bind it to async_run or
> other async_xxx's handlers.
> client.async_run(boost::asio::detached);
> }
> ));
>
> - The multi-threaded example uses asio::io_context and a vector of
> threads. This is error-prone. It can be simplified by using
> asio::thread_pool, asio::thread_pool::attach and
> asio::thread_pool::join.
> - The coroutine examples can be simplified with the latest Asio
> additions. At the moment, asio::deferred is recommended over
> asio::use_awaitable for efficiency reasons. With partial tokens,
> async_xxx(asio::as_tuple(asio::deferred)) can be replaced by
> async_xxx(asio::as_tuple).
> - The coroutines examples shouldn't use asio::detached with
> asio::co_spawn, as this swallows any thrown exceptions. It's better to
> use [](std::exception_ptr e) { if(e) std::rethrow_exception(e); }, as
> this propagates exceptions to main.
> - It's probably better to avoid the unified header in the examples. We
> should encourage including individual headers, instead, as this is
> better for compile times.
> - In https://spacetime.mireo.com/async-mqtt5/async_mqtt5/connection_maintenance.html:
> should be "the MQTT <protocol>" (missing word).
> - https://spacetime.mireo.com/async-mqtt5/async_mqtt5/optimising_communication.html:
> typo: "simultanious".
> - https://spacetime.mireo.com/async-mqtt5/async_mqtt5/ref/mqtt_client/operator_eq_.html:
> moved-from state probably supports being assigned to, too.
> - https://spacetime.mireo.com/async-mqtt5/async_mqtt5/ref/mqtt_client/tls_context.html:
> render error in the 1st template parameter.
> - https://spacetime.mireo.com/async-mqtt5/async_mqtt5/ref/mqtt_client/authenticator.html:
> render error in the 1st template parameter.
> - https://spacetime.mireo.com/async-mqtt5/async_mqtt5/ref/authority_path.html:
> member descriptions are empty.
>
>
> Thanks Ivica for proposing the library. I really hope to see it in
> Boost in the near future.
>
> Regards,
> Ruben.

dr. Ivica Siladic
Chief Technology Officer
Member of the Board
Mireo d.d.
T: +385 (1) 6636966
E: ivica.siladic_at_[hidden]
www.mireo.com


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk