
Boost Users :

From: Ken Thomases (ken_at_[hidden])
Date: 2002-09-03 20:04:58


On 29 Aug 2002, at 20:12, "lparrab" <lparrab_at_[hidden]> wrote:
> I'm starting a multi-threaded program using Boost.Thread library:
> it is a "server", in which a thread loops around a "select" to
> accept new connections and see when a socket has data to be read, in
> which case puts the fds in queue from which other threads are taking
> them out to read the data.
> The problem I have, is that when a "ReceiverThread" finishes reading
> the data from the socket, it puts the fd back in the "all
> connections list", but by this time the other ("main")thread already
> copied the fds from the list and got into the next "select" loop,
> which means the socket from which the "receiver" just finished
> reading, will not be "selected" (monitored) until the next select
> loop, which will not be until a) a new connection comes or b) one of
> the other connected sockets gets some data that makes the main
> thread return from select.

The cleanest way to wake the main thread from select is to make one
of the file descriptors in its FDSET become readable. What you need
is a special file descriptor that is always in the FDSET that becomes
readable when some other thread needs the main thread to come out of
the select call. For this you can use a pipe.
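
To make that concrete, here is a minimal sketch of the setup, assuming
a POSIX system. The names WakePipe, open_wake_pipe and build_read_set
are mine, purely for illustration; they are not part of Boost or of
your code.

```cpp
#include <sys/select.h>
#include <unistd.h>
#include <algorithm>
#include <vector>

// Hypothetical helper type: holds both ends of the wake-up pipe.
struct WakePipe {
    int read_fd = -1;   // watched by the main thread in select()
    int write_fd = -1;  // written to by the other threads
};

// Creates the pipe. Returns false if pipe() fails.
bool open_wake_pipe(WakePipe& wp) {
    int fds[2];
    if (pipe(fds) != 0) return false;
    wp.read_fd = fds[0];
    wp.write_fd = fds[1];
    return true;
}

// Builds the read set for one pass through select(): every connection
// plus the pipe's read end, which is always present. Returns the
// highest fd in the set, for use as select()'s first argument (plus 1).
int build_read_set(const std::vector<int>& connections,
                   const WakePipe& wp, fd_set& readable) {
    FD_ZERO(&readable);
    FD_SET(wp.read_fd, &readable);
    int max_fd = wp.read_fd;
    for (int fd : connections) {
        FD_SET(fd, &readable);
        max_fd = std::max(max_fd, fd);
    }
    return max_fd;
}
```

The main thread would then call select(max_fd + 1, &readable, ...) and
test FD_ISSET(wp.read_fd, &readable) when it returns.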

Here's a way that requires minimal changes to your existing design:

When a thread needs the main thread to wake up, it writes a byte into
one end of the pipe. It doesn't matter what the byte contains; its
presence in the pipe is the "information" payload. When the main
thread comes out of the select, if its end of the pipe was among the
readable fds, then it should pull one byte out of the pipe (and
discard it). Alternatively, you can flush the pipe of all bytes, but
only if you take care not to introduce a race condition that way.
(Example of race condition: Main thread comes out of select, Main
thread builds new FDSET for next select call, Receiver thread puts an
fd back in the list and writes to pipe to announce it, Main thread
flushes pipe and blocks in select without ever having seen the new
fd. To fix this race condition, you would reverse the order: the Main
thread flushes the pipe first and builds the FDSET afterwards.)
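
A rough sketch of the two sides of that wake-up, again assuming POSIX.
The function names are hypothetical. The drain function implements the
"flush" variant and relies on the read end being non-blocking.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cerrno>

// Called by any thread that wants the main thread to leave select().
// The byte's value is irrelevant; only its presence matters.
bool wake_main_thread(int pipe_write_fd) {
    const char byte = 0;
    ssize_t n;
    do {
        n = write(pipe_write_fd, &byte, 1);
    } while (n < 0 && errno == EINTR);
    return n == 1;
}

// Puts the pipe's read end in non-blocking mode so draining never blocks.
bool make_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    return flags != -1 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Discards every byte currently in the pipe and returns how many there
// were. To avoid the race described above, the main thread calls this
// BEFORE it rebuilds the fd_set from the connection list, not after.
long drain_wake_pipe(int pipe_read_fd) {
    char buf[64];
    long total = 0;
    for (;;) {
        ssize_t n = read(pipe_read_fd, buf, sizeof buf);
        if (n > 0) { total += n; continue; }
        if (n < 0 && errno == EINTR) continue;
        break;  // pipe is empty (EAGAIN), closed, or in error
    }
    return total;
}
```

With this ordering, a byte written after the drain is still in the
pipe when the main thread reaches select, so select returns at once
and the next pass picks up the new fd.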

Here's the better design:

Consider: why do you need the Receiver threads to wake the Main
thread? To transfer responsibility for an fd from a Receiver thread
to the Main thread. So, this is similar to the way the Main thread
transfers fds to Receiver threads through a queue, and should be
implemented in a similar fashion. Instead of using the pipe just to
signal the Main thread to wake, actually use it to transfer the fds,
just as you use the queue in the other direction.

So now, when Receiver threads are done with an fd, they write the
actual fd into the pipe. The Main thread wakes because the other end
of the pipe was in its FDSET. It should read any and all fds out of
the pipe and add them to its "connection list" for its next pass
through select. (You can read just one fd from the pipe for each
time that select indicates the pipe is readable, or you can set the
pipe to be non-blocking and read until the pipe is empty.) This
approach is a better design because it implements the actual transfer
of responsibility and because the connection list should be private
to the Main thread. This way doesn't require that the Receiver
threads access the connection list, so the connection list doesn't
have to be protected by a mutex.
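
Here is a sketch of what I mean, assuming POSIX; the function names
are made up for illustration. It uses the non-blocking variant and
reads until the pipe is empty. Each write is sizeof(int) bytes, which
is well under PIPE_BUF, so POSIX guarantees that fds written by
different Receiver threads cannot be interleaved in the pipe.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cerrno>
#include <cstddef>
#include <vector>

// Creates the hand-back pipe with a non-blocking read end.
bool open_return_pipe(int& read_fd, int& write_fd) {
    int fds[2];
    if (pipe(fds) != 0) return false;
    int flags = fcntl(fds[0], F_GETFL, 0);
    if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }
    read_fd = fds[0];
    write_fd = fds[1];
    return true;
}

// Receiver side: hand a connection back by writing the fd value itself.
bool return_fd(int pipe_write_fd, int fd) {
    ssize_t n;
    do {
        n = write(pipe_write_fd, &fd, sizeof fd);
    } while (n < 0 && errno == EINTR);
    return n == static_cast<ssize_t>(sizeof fd);
}

// Main side: called when select() reports the pipe readable. Appends
// every returned fd to the main thread's private connection list and
// returns how many were collected.
std::size_t collect_returned_fds(int pipe_read_fd,
                                 std::vector<int>& connections) {
    std::size_t count = 0;
    for (;;) {
        int fd;
        ssize_t n = read(pipe_read_fd, &fd, sizeof fd);
        if (n == static_cast<ssize_t>(sizeof fd)) {
            connections.push_back(fd);
            ++count;
            continue;
        }
        if (n < 0 && errno == EINTR) continue;
        break;  // pipe is empty (EAGAIN), closed, or in error
    }
    return count;
}
```

Note that only the main thread ever touches the connections vector, so
no lock appears anywhere in this sketch.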

> I think having a thread waiting on each open connection is not the
> right approach here, so I thought of going a kind of "production line"
> approach in which a few threads (5-10) are reading the data from a
> bunch of open connections (50-100), the only thing they do is read the
> data, parse it and form a "Request" which they put in a queue for
> further processing and then go on to service the other connections.
> Then another group of threads is taking these requests and doing
> something, and putting Response objects in yet another queue, which
> another group of threads "Sender" is taking and sending through the
> socket.

Just be aware that this design may mean that the client sees
responses arrive out of order from their requests. If the client
only submits a request after any previous requests have been
satisfied, you're OK. If the client can have multiple requests
outstanding at once, the Request threads or the Sender threads may
finish out-of-order. Also, if the Sender threads don't write
atomically or with exclusive access, then the data from multiple
responses might be interleaved.
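
One way to get the exclusive access is a mutex per connection, held
for the whole response. The sketch below is only an illustration: the
Connection struct and send_response are hypothetical names, and I use
std::mutex as a stand-in for whatever mutex type you have (with
Boost.Thread, boost::mutex and boost::mutex::scoped_lock would play
the same role).

```cpp
#include <unistd.h>
#include <cerrno>
#include <cstddef>
#include <mutex>
#include <string>

// Hypothetical per-connection state shared by the Sender threads.
struct Connection {
    int fd;
    std::mutex send_mutex;  // serializes whole responses on this fd
};

// Writes one complete response while holding the connection's mutex,
// so bytes from two responses to the same client are never interleaved.
bool send_response(Connection& conn, const std::string& response) {
    std::lock_guard<std::mutex> lock(conn.send_mutex);
    std::size_t sent = 0;
    while (sent < response.size()) {
        ssize_t n = write(conn.fd, response.data() + sent,
                          response.size() - sent);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted, retry
            return false;                  // real error
        }
        sent += static_cast<std::size_t>(n);
    }
    return true;
}
```

This addresses only the interleaving; it does nothing about responses
being sent in a different order from their requests.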

This illustrates why it is cleaner to have a thread take full
ownership of a connection for its lifetime as others have suggested.
Of course, you're not being unreasonable in your approach, but you
can marry the two somewhat. You might have each worker thread
combine the Receiver, Request, and Sender jobs in sequence and then
relinquish the connection back to the main thread. It's not clear to
me that there is any advantage in terms of number of threads, thread
workload, or throughput to splitting those jobs up.

Hope that helps,
Ken

