From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2021-05-13 21:07:44

Beware of a long post.

On 5/13/21 5:27 PM, André Almeida wrote:
> Hi there,
> I'm the author of futex2[0], a WIP new set of Linux's syscalls that
> allows userspace to write efficient sync mechanisms. I would like to
> hear from Boost's developers if the project would benefit from this new
> interface.
> From Boost/sync's codebase, I can see that you are already familiar with
> futexes, but just in case:


> The detailed description of the API can be seen in the documentation
> patch[1]. Do you think that Boost would benefit from it?

Hi, and thank you for working on this and especially for including
64-bit futex support in the latest patches. I have already described
some of the use cases in my earlier post on LKML[1], but I'll try to
recap and expand on it here.

Boost contains many libraries, but only a few of them deal with
thread synchronization directly:

- Boost.Atomic implements atomic operations and also basic wait/notify
operations. Supports both inter-thread and inter-process synchronization.
- Boost.Interprocess implements inter-process communication primitives,
including synchronization.
- Boost.Sync and Boost.Thread implement inter-thread communication
primitives, including synchronization. (Note that Boost.Sync is not an
officially accepted library yet; you can consider it a work in progress
that is not yet an official part of Boost.)

A few other libraries are worth mentioning. Boost.Fiber and Boost.Log
implement custom thread synchronization primitives that use the futex API
directly. Some libraries may also be using low-level thread
synchronization APIs, such as pthread and WinAPI, but not futex directly.

Of the libraries I mentioned, the prime user of futex2 would be
Boost.Atomic. With the current implementation based on the existing futex
API, the important missing part is support for futex sizes other than 32
bits. This means that for atomics of any other size Boost.Atomic must
use an internal lock pool to implement waiting and notifying operations,
which increases thread contention. For inter-process atomics, this means
that waiting must be done using a spin loop, which is terribly
inefficient. So support for 8-, 16- and 64-bit futexes is very much
needed here.
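
To illustrate the lock pool fallback mentioned above, here is a minimal
sketch of how waiting on a 64-bit atomic can be emulated without a 64-bit
futex. This is not Boost.Atomic's actual code: pool_slot, slot_for,
wait_until_changed, notify_all_waiters and the pool size of 64 are all
names and values assumed for the example, and a standard mutex and
condition variable stand in for whatever the real pool uses.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical pool slot: it is shared by every atomic whose address
// hashes to it, which is where the extra contention comes from.
struct pool_slot {
    std::mutex mtx;
    std::condition_variable cv;
};

constexpr std::size_t pool_size = 64; // arbitrary size chosen for this sketch

inline pool_slot& slot_for(const void* addr) {
    static std::array<pool_slot, pool_size> pool;
    return pool[std::hash<const void*>{}(addr) % pool_size];
}

// Block until the 64-bit value differs from `old`; return the new value.
inline std::uint64_t wait_until_changed(const std::atomic<std::uint64_t>& a,
                                        std::uint64_t old) {
    pool_slot& s = slot_for(&a);
    std::unique_lock<std::mutex> lock(s.mtx);
    std::uint64_t v = old;
    s.cv.wait(lock, [&] {
        v = a.load(std::memory_order_acquire);
        return v != old;
    });
    return v;
}

// Wake all waiters parked on the slot, including waiters that belong to
// unrelated atomics which happen to share the slot.
inline void notify_all_waiters(const std::atomic<std::uint64_t>& a) {
    pool_slot& s = slot_for(&a);
    {
        // Taking the lock orders the wake-up after any value check that a
        // waiter is performing right now, so the wake-up cannot be lost.
        std::lock_guard<std::mutex> lock(s.mtx);
    }
    s.cv.notify_all();
}

inline std::uint64_t demo_lock_pool() {
    std::atomic<std::uint64_t> value{0};
    std::thread producer([&] {
        value.store(42, std::memory_order_release);
        notify_all_waiters(value);
    });
    std::uint64_t seen = wait_until_changed(value, 0);
    producer.join();
    assert(seen != 0);
    return seen;
}
```

With a native 64-bit futex, the waiter could block on the atomic's own
address and the shared slot would not be needed.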

Another potential use case for futex2 is the mass locking algorithms[2]
in Boost.Thread. Basically, the algorithm accepts a list of lockable
objects (e.g. locks or mutexes) and attempts to lock them all before
returning. Here, I imagine, the support for waiting on multiple futexes
could come in handy. It should be noted that the algorithms are generic,
so they must work on any type of lockable objects, including those that
do not use or expose a futex, so the optimization is not trivial or
universally applicable. However, if the algorithm is applied to
Boost.Thread primitives, and those expose a futex, this could work quite
well.
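
For readers unfamiliar with this kind of algorithm, below is a minimal
sketch of a generic "try and back off" scheme for two lockable objects.
It is only an illustration of the general technique, not necessarily what
Boost.Thread implements; lock_both and demo_lock_both are names made up
for the example.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Works with any types that provide lock(), try_lock() and unlock().
template <typename L1, typename L2>
void lock_both(L1& a, L2& b) {
    for (;;) {
        {
            std::unique_lock<L1> first(a);   // block on the first object
            if (b.try_lock()) {
                first.release();             // keep `a` locked on success
                return;
            }
        }                                    // `a` is unlocked here on failure
        std::this_thread::yield();
        {
            std::unique_lock<L2> second(b);  // now block on the other object
            if (a.try_lock()) {
                second.release();            // keep `b` locked on success
                return;
            }
        }
        std::this_thread::yield();
    }
}

inline bool demo_lock_both() {
    std::mutex m1, m2;
    lock_both(m1, m2);

    // Probe from another thread, since calling try_lock on a std::mutex
    // that the calling thread already owns is undefined behavior.
    bool m1_held = false, m2_held = false;
    std::thread probe([&] {
        m1_held = !m1.try_lock();
        m2_held = !m2.try_lock();
    });
    probe.join();

    m1.unlock();
    m2.unlock();
    return m1_held && m2_held;
}
```

Each failed attempt releases everything and retries. A kernel-side wait
on multiple futexes could, in principle, replace that retry loop when
all the objects expose a futex.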

Although Boost.Interprocess doesn't currently use futexes directly, I
imagine it would benefit from them, not least because pthread does
not provide robust condition variables, and robust mutexes alone are
often not enough for organizing inter-process communication. Robust IPC
is a recurring theme in Boost.Interprocess issues and PRs, so I think
some solution is needed here and futex could be a building block. In my
LKML post I have described one solution to this problem (that is
implemented in a project outside Boost), and there 64-bit futexes would
be very useful.

Alternatively, futex2 could offer a new API for implementing robust
primitives in userspace. I know the current futex2 patch set does not
implement robust futexes, and I'm not asking to implement them, but if
there are plans to eventually add robust futexes, here is a thought. The
new API should preferably support multiple users of this feature. That
is, the kernel API should allow any piece of userspace code (not just
libc) to mark individual futexes as robust, without having to maintain a
common list of robust futexes in userspace. Currently, this list is
maintained by libc internally, which prevents any futex user (other than
libc itself) from using robust futexes. But this feature should probably
be discussed with libc developers.

Other than the above, I can't readily remember potential use cases for
futex2 in Boost. We do use futexes (the currently existing futex API) in
Boost.Sync and other libraries and could use them elsewhere, but for
primitives like mutexes, condition variables, semaphores and events the
existing API is sufficient. We currently don't implement NUMA-specific
primitives, which might be a good future addition to Boost, but I can't
tell whether the new futex2 API would be sufficient for it. Better NUMA
support could be interesting for the thread pool implementation in
Boost.Thread, but I'm not familiar with that code and don't know how
useful futex2 would be there.

As for use cases outside Boost, the application that I described in the
LKML post would benefit not only from 64-bit futexes but also from the
ability to wait on multiple futexes. We are also using the futex bitset
API in order to reduce the number of woken threads blocked on a futex. The
bitset is used as a mask of events that each blocked thread subscribes
to. When the notifying thread wakes the waiters, it sets the bitset to
the mask of events that happened, so that only the threads that are
waiting for those events are woken up. I think this could be emulated
with multiple futexes in the futex2 design, although I'm not sure that
would be as efficient, since it would increase the number of futexes at
least twofold in our case (every thread subscribes to at least two events
most of the time). I can provide more details on this use case if you're
interested.
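
To make the bitset scheme concrete, here is a minimal, Linux-only sketch
based on the existing FUTEX_WAIT_BITSET and FUTEX_WAKE_BITSET operations.
It is not the code of the application mentioned above: the event
constants, wait_for_events, wake_for_events and demo_bitset are names
assumed for the example, and FUTEX_PRIVATE_FLAG assumes that all threads
belong to one process.

```cpp
#include <atomic>
#include <cassert>
#include <climits>
#include <cstdint>
#include <thread>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

// Assumption: the atomic has the same layout as a plain 32-bit word.
static_assert(sizeof(std::atomic<std::uint32_t>) == sizeof(std::uint32_t),
              "std::atomic<uint32_t> must be usable as a futex word");

// Hypothetical event bits that waiters can subscribe to.
constexpr std::uint32_t event_a = 1u << 0;
constexpr std::uint32_t event_b = 1u << 1;
constexpr std::uint32_t event_c = 1u << 2;

// Sleep while `word` still equals `expected`. Only wake-ups whose mask
// intersects `events` are delivered to this waiter.
inline long wait_for_events(std::atomic<std::uint32_t>& word,
                            std::uint32_t expected, std::uint32_t events) {
    return syscall(SYS_futex, reinterpret_cast<std::uint32_t*>(&word),
                   FUTEX_WAIT_BITSET | FUTEX_PRIVATE_FLAG, expected,
                   nullptr, nullptr, events);
}

// Wake every waiter that subscribed to at least one bit in `events`.
inline long wake_for_events(std::atomic<std::uint32_t>& word,
                            std::uint32_t events) {
    return syscall(SYS_futex, reinterpret_cast<std::uint32_t*>(&word),
                   FUTEX_WAKE_BITSET | FUTEX_PRIVATE_FLAG, INT_MAX,
                   nullptr, nullptr, events);
}

inline std::uint32_t demo_bitset() {
    std::atomic<std::uint32_t> generation{0};
    std::thread notifier([&] {
        generation.store(1, std::memory_order_release);
        wake_for_events(generation, event_b);   // only event_b happened
    });
    // The waiter subscribes to two events; a wake-up for event_c alone
    // would not reach it. The loop handles early and spurious returns.
    while (generation.load(std::memory_order_acquire) == 0)
        wait_for_events(generation, 0, event_a | event_b);
    notifier.join();
    return generation.load(std::memory_order_relaxed);
}
```

Emulating this with one futex per event would require each waiter to
block on several futexes at once, which is the cost described above.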

