On Wed, 28 Dec 2022, 12:01 Alexander Carôt via Boost-users, <boost-users@lists.boost.org> wrote:
Hello all,

I have a classic receive-handling structure for an asynchronous UDP socket:

void receiver::receive(){
    mySocket->async_receive_from(
        boost::asio::buffer(input, maxInputLength), senderEndpoint,
        boost::bind(&receiver::handleReceiveFrom, this,
                    boost::asio::placeholders::error,
                    boost::asio::placeholders::bytes_transferred));
}

void receiver::handleReceiveFrom(const boost::system::error_code &error,
                                 size_t bytes_recvd) {

    // SPECIFIC CODE HERE AND FINAL CALL OF RECEIVE FUNCTION

    this->receive();
}

Besides this, my app runs several realtime threads to which I have assigned maximum priority via:

pthread_t threadID = (pthread_t) sendThread.native_handle();
struct sched_param param;
int policy = SCHED_FIFO;
int max = sched_get_priority_max(policy);
param.sched_priority = max;
pthread_setschedparam(threadID, policy, &param);

Now I wonder to what extent these realtime threads can have an impact on the performance of my receiver. So far I assumed that the receive handler is executed immediately when a packet arrives at the NIC, but I am not sure whether the other realtime threads might delay this if they are scheduled first.

In other words: if we think of the receive handling as a thread that is triggered by incoming packets, does this thread also have realtime capabilities that can suffer from competing processes?

Thanks in advance for clarification,
best

When data is received on the NIC (i.e. the full frame has arrived and the FCS is correct), the NIC raises an interrupt, which triggers a switch into kernel mode and signals that a DMA transfer into the kernel's internal buffers has completed and must be acted on. The kernel then processes those buffers to reconstruct a queue of UDP packets, maps the IP destination to the registered file descriptors, and wakes the thread that was waiting for such a packet. That thread then instructs the kernel to copy the data from its internal buffers into the userland buffer, which causes another two context switches.

Your thread goes to sleep whenever it is waiting for data (unless you use some kind of poll or poll_one call), so your scheduling priority only affects things while you are processing a packet. Assuming processing times are small, you're unlikely to get scheduled out to begin with unless the machine is overloaded.
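
If you do want the thread to stay hot instead of sleeping, a minimal busy-polling sketch with Asio could look like this (the io_context reference and the stop flag are made up for illustration, not taken from your code):

#include <boost/asio.hpp>
#include <atomic>

// Spin on the io_context instead of blocking in run(): poll() executes
// whatever handlers are ready and returns immediately, so the thread
// never sleeps and keeps occupying its SCHED_FIFO slot.
void busy_poll(boost::asio::io_context& ioc, std::atomic<bool>& running)
{
    while (running.load(std::memory_order_relaxed)) {
        ioc.poll();
        // If the io_context ever runs out of work it stops, and poll()
        // becomes a no-op until it is restarted.
        if (ioc.stopped())
            ioc.restart();
    }
}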

If you're looking for low latency (microsecond scale), you should of course avoid all those context switches (including the interrupts) and copies, and do all the thread scheduling yourself rather than relying on the kernel.
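
As a rough sketch of what doing the scheduling yourself can look like (the core number is hypothetical; you would typically also isolate that core from the general scheduler, e.g. with isolcpus):

#include <pthread.h>
#include <sched.h>

// Pin a thread to one dedicated core and give it top SCHED_FIFO priority,
// so the kernel neither migrates nor preempts it. pthread_setaffinity_np
// is a GNU extension; SCHED_FIFO needs CAP_SYS_NICE or root.
bool pin_and_prioritize(pthread_t tid, int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    if (pthread_setaffinity_np(tid, sizeof(set), &set) != 0)
        return false;

    sched_param param{};
    param.sched_priority = sched_get_priority_max(SCHED_FIFO);
    return pthread_setschedparam(tid, SCHED_FIFO, &param) == 0;
}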

But Asio isn't a particularly good fit for anything that low-level or Linux-specific. While it can use io_uring, for example (so long as you're willing to be extremely careful about ODR; see the build note below), it's still considerably more limited than what you can achieve by using that API directly. It will at least avoid the redundant context switches at the end, though, since the kernel will write directly into your buffers.
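
For reference, opting in looks roughly like this (Boost 1.78 or later; both macros must be defined identically in every translation unit that includes Asio headers, which is exactly where the ODR hazard lies):

g++ -DBOOST_ASIO_HAS_IO_URING -DBOOST_ASIO_DISABLE_EPOLL app.cpp -luring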
Using io_uring directly, however, would allow you to control the affinity and scheduling of the realtime kernel thread that processes your data as it arrives from the NIC, and to switch from interrupts to pure busy polling, independently on the kernel side and in userland.
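
A minimal liburing sketch of that kind of setup (the ring size and idle timeout are arbitrary; SQPOLL requires elevated privileges on older kernels):

#include <liburing.h>

// Create a ring whose submission queue is serviced by a kernel polling
// thread pinned to the given CPU, so submitting I/O needs no syscall
// on the hot path.
bool setup_sqpoll_ring(io_uring& ring, unsigned cpu)
{
    io_uring_params params{};            // zero-initialized
    params.flags = IORING_SETUP_SQPOLL   // kernel thread polls the SQ...
                 | IORING_SETUP_SQ_AFF;  // ...pinned to sq_thread_cpu
    params.sq_thread_cpu = cpu;
    params.sq_thread_idle = 2000;        // ms of idle before it naps
    return io_uring_queue_init_params(256, &ring, &params) == 0;
}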

Bypassing the kernel entirely is, of course, potentially even better.