[Boost-bugs] [Boost C++ Libraries] #7961: handle_connect called before the socket was actually connected

Subject: [Boost-bugs] [Boost C++ Libraries] #7961: handle_connect called before the socket was actually connected
From: Boost C++ Libraries (noreply_at_[hidden])
Date: 2013-02-01 16:02:59


#7961: handle_connect called before the socket was actually connected
-------------------------------------------------+--------------------------
 Reporter: Nikki Chumakov <nikkikom@…> | Owner: chris_kohlhoff
     Type: Bugs | Status: new
Milestone: To Be Determined | Component: asio
  Version: Boost 1.52.0 | Severity: Problem
 Keywords: asio epoll epoll_reactor |
-------------------------------------------------+--------------------------
 RedHat5.8, kernel 2.6.32.26-17.el5, glibc 2.5-81.el5_8.7
 The problem was detected with Boost 1.49 and was also confirmed with
 1.53b1.

 Problem: handle_connect (the connect completion handler) can be called
 before the TCP three-way handshake completes.

 Unfortunately, I could not reduce my application to a reasonably sized
 test case, so I prefer not to post it.
 There is a mail thread related to this bug on the
 [http://sourceforge.net/mailarchive/forum.php?thread_name=kegjks%24c82%241%40ger.gmane.org&forum_name=asio-users Asio Users Mail List].

 I believe there is a bug in epoll_reactor, in the way it handles the
 EPOLLHUP event on not-yet-connected sockets. Below I explain the details
 and symptoms.

 Here is the strace output of one such connect:

 {{{
 [pid 25441] socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 9

 [pid 25441] epoll_ctl(5, EPOLL_CTL_ADD, 9,
 {EPOLLIN|EPOLLPRI|EPOLLOUT|EPOLLERR|EPOLLHUP|EPOLLET, {u32=1811961744,
 u64=140515962017680}} <unfinished ...>
 [pid 25442] epoll_wait(5, <unfinished ...>
 [pid 25441] <... epoll_ctl resumed> ) = 0
 [pid 25442] <... epoll_wait resumed> {{EPOLLOUT, {u32=1811958752,
 u64=140515962014688}}, {EPOLLIN, {u32=1811943784, u64=140515961999720}},
 {EPOLLOUT|EPOLLHUP, {u32=1811961744, u64=140515962017680}}}, 128, 0) = 3

 [pid 25441] ioctl(9, FIONBIO, [1]) = 0

 [pid 25441] connect(9, {sa_family=AF_INET, sin_port=htons(80),
 sin_addr=inet_addr("xxx.xxx.193.11")}, 16) = -1 EINPROGRESS
 (Operation now in progress)
 *********** no epoll_wait after connect **********

 [pid 25441] epoll_ctl(5, EPOLL_CTL_MOD, 9,
 {EPOLLIN|EPOLLPRI|EPOLLOUT|EPOLLERR|EPOLLHUP|EPOLLET, {u32=1811961744,
 u64=140515962017680}}) = 0

 *********** calling handle_connect
 [pid 25442] getsockopt(9, SOL_SOCKET, SO_ERROR, [7782250667543887872],
 [4]) = 0
 [pid 25442] getpeername(9, 0x40450360, [140514150055964]) = -1 ENOTCONN
 (Transport endpoint is not connected)
 [pid 25442] write(2, "connect error: ", 15connect error: ) = 15
 [pid 25442] write(2, "Transport endpoint is not connec"..., 35Transport
 endpoint is not connected) = 35
 [pid 25442] write(2, "\n", 1
 ) = 1
 }}}

 As one can see, there is no epoll_wait after the ::connect call, yet the
 connect handler was called.

 So asio calls "::connect" and then immediately invokes the user's
 handle_connect handler, with no epoll_wait between ::connect and
 handle_connect that could report completion of the connect. Thus
 handle_connect is called before the socket is actually connected.
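
 In the trace, the handler side queries SO_ERROR and then fails in
 getpeername with ENOTCONN. SO_ERROR alone cannot tell "still in progress"
 from "connected", because no error is pending in either case. A minimal
 sketch of a defensive check that combines both calls follows;
 looks_connected is a hypothetical helper name and is not part of asio.

```cpp
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

// Hypothetical helper (not an asio API): returns true only if the socket
// has no pending error AND the kernel can report a peer address.
// Note: reading SO_ERROR clears the pending error on the socket.
bool looks_connected(int fd)
{
  int err = 0;
  socklen_t err_len = sizeof(err);
  if (::getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &err_len) != 0 || err != 0)
    return false; // the query failed, or the connect reported an error

  sockaddr_storage peer;
  socklen_t peer_len = sizeof(peer);
  // getpeername fails with ENOTCONN while the handshake is still pending.
  return ::getpeername(fd, reinterpret_cast<sockaddr*>(&peer), &peer_len) == 0;
}
```

 A connect handler could call such a check on the native descriptor
 before using the socket. This only detects the symptom; it does not
 fix the premature completion itself.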

 What may happen is:

 1. The main thread calls do_open and adds the socket to the epoll queue.
 2. The service thread calls epoll_wait, which returns several events
 INCLUDING one for that socket.
 3. The main thread calls async_connect (and modifies the socket in the
 epoll queue, but that does not matter at this point).
 4. The service thread processes, in a loop, the events it got from
 epoll_wait at step 2; when it processes that socket, the connect
 completion handler is called.
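
 Steps 1 and 2 can be observed in isolation, without asio. The sketch
 below registers a TCP socket that was never connected, using the same
 event mask as in the trace, and returns whatever a zero-timeout
 epoll_wait reports for it. The function name events_before_connect is
 made up for this illustration.

```cpp
#include <sys/epoll.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

// Returns the epoll event mask reported for a freshly created, never
// connected TCP socket (0 if nothing is reported), or -1 if a system
// call fails.
long long events_before_connect()
{
  int ep = ::epoll_create1(0);
  if (ep < 0)
    return -1;

  int fd = ::socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
  if (fd < 0)
  {
    ::close(ep);
    return -1;
  }

  // Same interest set as the EPOLL_CTL_ADD call in the strace output.
  epoll_event ev{};
  ev.events = EPOLLIN | EPOLLPRI | EPOLLOUT | EPOLLERR | EPOLLHUP | EPOLLET;
  ev.data.fd = fd;

  long long result = -1;
  if (::epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) == 0)
  {
    epoll_event out{};
    int n = ::epoll_wait(ep, &out, 1, 0); // zero timeout: do not block
    if (n >= 0)
      result = (n == 1) ? static_cast<long long>(out.events) : 0;
  }

  ::close(fd);
  ::close(ep);
  return result;
}
```

 In the trace above the kernel reported EPOLLOUT|EPOLLHUP for such a
 socket; the mask on other kernels may differ. Any event returned here
 predates ::connect, so a thread that handles it after step 3 acts on
 stale information.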

 A possible workaround is to ignore EPOLLHUP in
 epoll_reactor::descriptor_state::perform_io() until the socket reaches
 the 'connected' state.
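
 The idea can be sketched as a small filter applied to the event mask
 before it is dispatched. This is NOT the attached patch and does not
 reflect the real layout of epoll_reactor::descriptor_state; the struct,
 its field and the function name are assumptions made for illustration.
 Whether the EPOLLOUT bit of the same stale event needs separate handling
 is not addressed here.

```cpp
#include <sys/epoll.h>
#include <cstdint>

// Hypothetical stand-in for per-descriptor reactor state.
struct descriptor_sketch
{
  bool connected = false; // assumed to be set once the connect has completed
};

// Drops EPOLLHUP from the reported events while the descriptor is not yet
// connected; all other bits pass through unchanged.
std::uint32_t filter_events(const descriptor_sketch& d, std::uint32_t events)
{
  if (!d.connected)
    events &= ~static_cast<std::uint32_t>(EPOLLHUP);
  return events;
}
```

 Masking EPOLLHUP this way means a hang-up that arrives before the flag
 is set goes unreported, which is the kind of side effect that would need
 review.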

 I'm attaching a patch that works for me. It needs to be carefully
 reviewed because of possible unwanted side effects (e.g. lost socket
 disconnect error notifications).

-- 
Ticket URL: <https://svn.boost.org/trac/boost/ticket/7961>
Boost C++ Libraries <http://www.boost.org/>
Boost provides free peer-reviewed portable C++ source libraries.
