Subject: Re: [Boost-users] [asio] Problem with async_read/async_read_some not reacting to incoming data
From: Bill Somerville (bill_at_[hidden])
Date: 2009-03-19 11:52:58
Stig Sandø wrote:
> We're rewriting a client with Boost.Asio but have run into some
> problems when stress-testing it. The client fetches textual and
> graphics data from a server over a single connection that is open at
> all times. When the client is receiving large amounts of graphics data
> it will, after a while, suddenly stop receiving data and eventually
> hit our timeouts; the last call issued is an
> async_read/async_read_some. We are keeping the server well-fed with
> requests so there should be graphics forthcoming without pause. The
> problem has been seen on win32, linux and darwin when testing on a
> gigabit net fetching raw 1080i graphics (4M for each field), and is
> most frequent on darwin. This is naturally an absolute show-stopper
> for us.
> So we are a bit at a loss as to what is going wrong and why
> async_read/async_read_some stops reacting in the middle of the
> fetch-queue, despite wireshark showing that the data is incoming.
> When using compression on the data the problem is harder to reproduce,
> which might suggest a race-condition somewhere. But our code is just
> using a single thread for io_service and all async-communication is
> triggered from this io-thread which has a work-object to keep the
> io_service spinning. We're also making sure there is at most one
> async_read and one async_write in effect at a time, roughly similar to
> the chat_client sample.
I would be suspicious of the 'incoming_request' queue. Where is that
data being popped from the queue? If it is not popped from the context
of the io_service thread, then it is not thread safe.
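
To illustrate the concern, here is a minimal sketch of a queue that is safe to push to from one thread and pop from another. It uses only the standard library. The class name RequestQueue, the std::string element type, and the try_pop interface are assumptions made for illustration; the real 'incoming_request' queue is not shown in the post. With Asio, the other common approach is to keep the queue unlocked and hand every access to the io thread via io_service::post.

```cpp
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the 'incoming_request' queue. Every access
// takes the same mutex, so a push on one thread cannot interleave with
// a pop on another.
class RequestQueue {
public:
    // Appends a request at the back of the queue.
    void push(std::string request) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(std::move(request));
    }

    // Moves the oldest request into 'out' and returns true, or returns
    // false without touching 'out' when the queue is empty.
    bool try_pop(std::string& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) {
            return false;
        }
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::deque<std::string> items_;
};
```

If only the io_service thread ever touches the queue, none of this locking is needed, which is why it matters where the pop happens.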
> Has anyone seen something similar or have any input on how best to
> figure out what goes wrong? Are there invariants that say you cannot
> read and write at the same time?
> Some symptoms are the same in each test. When we get the last image
> from the socket the buffersize is zero afterwards, and the next
> async_read request is to transfer_at_least(1). The async_read never
> calls the handler for completion of this byte, so Nagle would have
> kicked in. It is also fairly hard to strip down to a small example
> using a mock server.
> I have included some stripped-down code below in case that might be
> helpful in spotting something that we can't see.
-- Bill Somerville