[Boost-bugs] [Boost C++ Libraries] #11895: Strand service scheduling is hurting ASIO scalability

Subject: [Boost-bugs] [Boost C++ Libraries] #11895: Strand service scheduling is hurting ASIO scalability
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2016-01-08 04:38:27

#11895: Strand service scheduling is hurting ASIO scalability
-------------------------------------------+----------------------------
 Reporter:  Chris White <chriswhitemsu@…>  |     Owner:  chris_kohlhoff
     Type:  Bugs                           |    Status:  new
Milestone:  To Be Determined               | Component:  asio
  Version:  Boost 1.60.0                   |  Severity:  Optimization
 Keywords:  strand scheduling priority     |
-------------------------------------------+----------------------------
This problem is best explained by walking through a scenario:

I have an io_service being run with one thread per CPU core. The CPU has 4
cores. I also have a strand. I then post 10 one-second operations to the
strand, and 30 one-second operations to the io_service. That's 40
cumulative seconds of work, and since there are 4 cores, that work should
optimally take 10 seconds when performed in parallel.

However, that's not what happens. The work in the io_service is given
precedence over the work in the strand. So first, the 30 seconds of
cumulative work in the io_service is performed in parallel which takes
roughly 7.5 seconds (30 seconds / 4 cores). Only after that work is
complete does the 10 seconds of work on the strand get performed. So it
ends up taking a total of 17.5 seconds to do all the work because CPU time
wasn't distributed optimally.
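
To make the arithmetic above concrete, here is a minimal sketch of a second-by-second model of the two scheduling policies. It is a simplified simulation and not Asio's actual scheduler: the function name `makespan` and its parameters are hypothetical, every task is assumed to take exactly one second, and at most one worker may run a strand task in any given second.

```cpp
#include <algorithm>

// Simplified model (an assumption, not Asio's real scheduler).
// `workers` threads each run one one-second task per tick. Free tasks may
// run on any worker; strand tasks run at most one per tick (serialized).
// Returns the number of seconds until all work is finished.
// Assumes workers >= 1.
int makespan(int workers, int free_tasks, int strand_tasks, bool drain_strand)
{
    int seconds = 0;
    while (free_tasks > 0 || strand_tasks > 0) {
        int idle = workers;

        // Policy A: the strand always gets one worker while it has work.
        if (drain_strand && strand_tasks > 0) {
            --strand_tasks;
            --idle;
        }

        // Remaining workers take work from the outer (io_service) queue.
        int run = std::min(idle, free_tasks);
        free_tasks -= run;
        idle -= run;

        // Policy B: the strand only runs when a worker is left over.
        if (!drain_strand && strand_tasks > 0 && idle > 0) {
            --strand_tasks;
            --idle;
        }

        ++seconds;
    }
    return seconds;
}
```

With the numbers from the scenario, `makespan(4, 30, 10, false)` gives 17 and `makespan(4, 30, 10, true)` gives 10. The model yields 17 rather than 17.5 because its one-second tasks are indivisible: the last two outer tasks occupy only two workers in the eighth second, so the strand can start then.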

If we think of the strand and the io_service as queues, what we have here
is a queue within a queue. The outer queue's work can be performed in
parallel, the inner queue's work is performed serially. It seems to make a
whole lot of sense that when control reaches the inner queue, that queue
should be serviced until it's empty. After all, there are other threads
that can still service the outer queue while that's happening. The
opposite is not true. By giving the outer queue priority, ASIO is
crippling the concurrency potential of the work in the inner queue, i.e.
the strand.

This simplified scenario is a very real performance problem for my
company's server application. As far as I can tell, the only options for
me to address the problem are to implement my own strand that doesn't give
control back to the io_service until its work queue is empty -or- put all
of my strands onto one io_service and post all non-strand work to a
second, separate io_service.
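
The first option could look roughly like the sketch below. This is not the attached implementation and not part of Boost.Asio: the class name `draining_strand` and the `executor` callback are hypothetical, and the sketch assumes that handlers do not throw and that the strand object outlives every handler posted to it.

```cpp
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical class, not part of Boost.Asio. Handlers run one at a time,
// in posting order, and the strand keeps its thread until the queue is empty.
class draining_strand
{
public:
    using task = std::function<void()>;
    // Assumed hook that hands a task to the outer queue for execution.
    using executor = std::function<void(task)>;

    explicit draining_strand(executor ex) : post_(std::move(ex)) {}

    void post(task t)
    {
        bool start = false;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(t));
            if (!running_) {
                running_ = true;   // claim the strand for one drain pass
                start = true;
            }
        }
        if (start)
            post_([this] { drain(); });   // scheduled once per busy period
    }

private:
    void drain()
    {
        for (;;) {
            task t;
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (queue_.empty()) {
                    running_ = false;   // release only when truly empty
                    return;
                }
                t = std::move(queue_.front());
                queue_.pop_front();
            }
            t();   // run outside the lock so handlers can post() again
        }
    }

    executor post_;
    std::mutex mutex_;
    std::deque<task> queue_;
    bool running_ = false;
};
```

With Boost.Asio, the executor might be a lambda that forwards to `io_service::post`, for example `[&io](std::function<void()> f) { io.post(std::move(f)); }`, where `io` is an assumed `boost::asio::io_service`. The trade-off is that a strand that is fed continuously holds on to one thread for as long as its queue is non-empty.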

I attached a sample application that demonstrates the scenario above,
complete with timings and a visual printout of the order of operations.
It also includes a stripped-down strand implementation that demonstrates
how the problem could be addressed. The application was built in Visual
Studio 2013 with Boost version 1.60.0.

-- 
Ticket URL: <https://svn.boost.org/trac/boost/ticket/11895>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.
