We have a client application that uses deadline_timer objects to implement timeouts for read and write operations on a socket. We have an io_service thread that runs boost::asio::io_service::run.
The _ioService is used by the resolver, by the read, write and connect timers (implemented with deadline_timer), and by the tcp::socket. On the socket we use only synchronous operations: connect, read_some and write_some.
The socket is blocking. For the resolver we also use the synchronous resolve operation.
Our problem is that although the read timer is set to 100 milliseconds, it does not fire, and the application stays blocked in read_some for about 15 minutes. In the onReadTimeout callback we close the socket. I can reproduce this behavior every time by switching the connection settings of the virtual machine that hosts the server to a dummy interface.
So far I have seen this behavior in the logs. I also took a core dump of the client process with gcore and looked at the stack traces of all threads.
I have added the stack trace as well. The thread that is blocked for 15 minutes is thread 18. Its stack trace contains boost::asio::io_service::run. The stack traces of thread 9 and thread 10 also contain boost::asio::io_service::run.
Thread 18 calls
_readTimer.expires_from_now(boost::posix_time::milliseconds(_readTimeout), ec);
_readTimer.async_wait(boost::bind(&BlockingTcpClient::onReadTimeout, this, boost::asio::placeholders::error));
with readTimeout set to 100, and then _pSocket->write_some. In BlockingTcpClient::onReadTimeout we close the socket, so read_some should block for at most 100 milliseconds.
Maybe the problem is that the read timer uses the same io_service whose thread is currently blocked in read_some.
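To make this hypothesis concrete, here is a small toy model of it. This is only a sketch and it does not use Boost.Asio: ToyLoop and run_scenario are hypothetical names made up for illustration, and the sleep is a stand-in for a read_some call that does not return promptly. It assumes that a single thread drives the loop, which may not match our real setup if threads 9 and 10 are running the same io_service.

```cpp
#include <chrono>
#include <deque>
#include <functional>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Toy stand-in for an event loop; NOT Boost.Asio. Handlers run one at a time
// on whichever thread calls run(), which is the property being illustrated.
class ToyLoop {
public:
    void post(std::function<void()> handler) {
        queue_.push_back(std::move(handler));
    }

    void run() {
        while (!queue_.empty()) {
            std::function<void()> handler = std::move(queue_.front());
            queue_.pop_front();
            handler();  // nothing else is dispatched until this returns
        }
    }

private:
    std::deque<std::function<void()>> queue_;
};

// Hypothetical scenario: a handler arms a "timeout" and then blocks.
std::vector<std::string> run_scenario() {
    std::vector<std::string> log;
    ToyLoop loop;

    loop.post([&] {
        // Stand-in for async_wait: the timeout handler is queued on the same loop.
        loop.post([&] { log.push_back("timeout handler ran"); });

        log.push_back("blocking read started");
        // Stand-in for a synchronous read_some that does not return promptly.
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        log.push_back("blocking read returned");
    });

    loop.run();  // a single thread drives the loop
    return log;
}
```

Calling run_scenario() returns the entries in the order "blocking read started", "blocking read returned", "timeout handler ran": the timeout handler is only dispatched after the blocking call has returned. If our case works the same way, the onReadTimeout callback would never get the chance to close the socket while read_some is blocked.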
Can anyone help us investigate or fix this issue?
If this is a known limitation, can anyone explain why this behavior happens?
Thank you!