From: dmoore99atwork (dmoore_at_[hidden])
Date: 2002-03-14 06:07:43


In the first case where overflow on the producer side can be
discarded - ok, I see how that's appropriate in certain application
domains.

> 2) those units of work which absolutely cannot be lost. In a
> lossy-overflow message-queue asynchronous MT architecture each thread
> working with these mission-critical units of work would retain a
> remembrance of each unit of work. There would be some expression of
> acknowledgement or completion by one or more downstream threads for
> each of these remembered mission-critical units of work. That
> acknowledgement would directly or indirectly make its way back to the
> remembrance data-structure/thread. The unit of work would be
> considered a success and would typically be removed from the
> data-structure remembering partially completed work which was
> submitted for downstream processing. If no acknowledgement ever came
> back before some deadline (measured in elapsed real-time or some other
> metric) for a remembrance of partially completed
> submitted-for-downstream-processing work, then that unit of work would
> be considered to be in need of resubmission. (By the way these units
> of work would need to be idempotent, which means that multiple
> performances of a request r for a unit of work yield the same result
> as a single performance of r (i.e., duplicates are okay), because
> maybe instead of the unit of work being lost, the acknowledgement was
> lost after the work was in fact successfully accomplished unbeknownst
> to the producing thread retaining the remembrance.) The unit of work
> needing resubmission would again be pushed onto the message-queue and
> eventually the mission-critical unit of work would be successfully
> accomplished on the first or second or third or nth try.
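
To make sure I am reading the quoted scheme correctly, here is a minimal
sketch of the producer-side bookkeeping it describes: remember each unit,
forget it on acknowledgement, and hand back anything whose deadline has
passed for resubmission. The names (RemembranceTable, remember,
acknowledge, collect_expired) are hypothetical and are not part of
Boost.Threads; the caller passes in the current time so the logic stays
independent of any real clock or queue.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical illustration only; none of these names come from Boost.
using Clock = std::chrono::steady_clock;

class RemembranceTable {
public:
    explicit RemembranceTable(Clock::duration timeout) : timeout_(timeout) {
        assert(timeout > Clock::duration::zero());
    }

    // Record that unit `id` was submitted downstream at time `now`.
    void remember(int id, Clock::time_point now) {
        deadline_[id] = now + timeout_;
    }

    // An acknowledgement came back: forget the unit. A duplicate or late
    // acknowledgement is a harmless no-op.
    void acknowledge(int id) { deadline_.erase(id); }

    // Return the ids whose deadline has passed and restart their
    // deadlines; the caller is expected to push them onto the queue again.
    std::vector<int> collect_expired(Clock::time_point now) {
        std::vector<int> expired;
        for (auto& entry : deadline_) {
            if (entry.second <= now) {
                expired.push_back(entry.first);
                entry.second = now + timeout_;
            }
        }
        return expired;
    }

    // Number of units still awaiting acknowledgement.
    std::size_t pending() const { return deadline_.size(); }

private:
    Clock::duration timeout_;
    std::map<int, Clock::time_point> deadline_;
};
```

Resubmitting a unit whose acknowledgement was merely lost is only safe
because the work is idempotent, as the quoted text requires.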

This makes sense to me, except that when we are talking about a
concrete realization of a semaphore which is incapable of tracking
more than "N" units of work, once unit N+1 is produced, you have a
situation where N+1 units of work exist, presumably consuming
resources such as memory.

The downstream processing can only possibly know about "N" of them,
because it relies on the semaphore to represent the count of
work items.

So, as you write, the burden of remembrance falls to the producer of
the units of work. Being able to resubmit, with duplicates being ok,
is fine, but once we reach "N+1" units of work, I can't see how the
system ever recovers, because even if every consumer were waiting on
the semaphore with a count of 0, there would still be that 1 extra
unit of work bouncing around in a producer, which is waiting for a
completion notification that's not coming....
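
Here is a small single-threaded model of the state I am describing. It is
only a sketch under my own assumptions: BoundedCount, try_post, try_wait
and produce_then_drain are hypothetical names rather than any real
semaphore API, and the overflowing post is simply dropped, as in the lossy
case.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of a counting semaphore with a hard ceiling.
class BoundedCount {
public:
    explicit BoundedCount(std::size_t max) : max_(max), count_(0) {
        assert(max > 0);
    }

    // Producer side: returns false (the post is dropped) at the ceiling.
    bool try_post() {
        if (count_ == max_) return false;
        ++count_;
        return true;
    }

    // Consumer side: non-blocking; returns false when the count is 0.
    bool try_wait() {
        if (count_ == 0) return false;
        --count_;
        return true;
    }

private:
    std::size_t max_;
    std::size_t count_;
};

struct Outcome {
    std::size_t consumed;        // units the consumers saw and completed
    std::size_t unacknowledged;  // units the producer still remembers
};

// Produce `produced` units against a ceiling of `max`, then let the
// consumers drain the count; each consumed unit is acknowledged.
Outcome produce_then_drain(std::size_t max, std::size_t produced) {
    BoundedCount sem(max);
    std::size_t remembered = 0;
    for (std::size_t i = 0; i < produced; ++i) {
        ++remembered;    // the producer remembers every unit it creates
        sem.try_post();  // but the count cannot grow past the ceiling
    }
    std::size_t consumed = 0;
    while (sem.try_wait()) {
        ++consumed;
        --remembered;    // acknowledgement reaches the producer
    }
    return Outcome{consumed, remembered};
}
```

With a ceiling of 3 and 4 units produced, the consumers complete 3 units
and the producer is left remembering 1 with the count back at 0. The model
stops there; it does not include the deadline-driven resubmission from the
quoted text.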

Thanks for the detailed info.
Dave


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk