Subject: Re: [boost] [transaction] New Boost.Transaction library under discussion
From: Bob Walters (bob.s.walters_at_[hidden])
Date: 2010-01-27 18:01:36
On Wed, Jan 27, 2010 at 5:15 PM, <strasser_at_[hidden]> wrote:
> Quoting Bob Walters <bob.s.walters_at_[hidden]>:
>> Yes, I can deal with it in that way. You mention that you have no
>> checkpoint, but I thought you would need occasional msync() of the
>> backing store, in order to eliminate the need for some of the logs,
> is that what you mean by checkpoints? I assumed you meant exporting the
> entire state from time to time, so checkpoint + log = current state.
Well, not the whole state, but rather just the changes since the last
checkpoint. In effect, it is the equivalent of writing to the log,
but doing a lazy msync() of the memory mapped region only once every
N seconds so that if there is good spatial locality to the updates
being done by the user, there is a chance of reducing the I/O load, and
also of using more sequential I/O. My checkpoint is (unfortunately)
more a matter of explicitly writing out the changes, rather than just
msync(), but the concept is the same, and so fits with the algorithm
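
To make the locality argument concrete, here is a rough sketch of why a
lazy checkpoint can write less than the log does. The helper name
dirty_pages() and the page-granular bookkeeping are my own assumptions
for illustration, not code from either library: many small updates
between two checkpoints collapse into a short, sorted list of pages.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper: given the byte offsets updated since the last
// checkpoint, return the distinct pages that need writing, in ascending
// order so the checkpoint can issue mostly sequential I/O.
std::vector<std::size_t> dirty_pages(
        const std::vector<std::size_t>& touched_offsets,
        std::size_t page_size) {
    assert(page_size > 0);
    std::set<std::size_t> pages;  // std::set both dedups and sorts
    for (std::size_t offset : touched_offsets)
        pages.insert(offset / page_size);
    return std::vector<std::size_t>(pages.begin(), pages.end());
}
```

With 4096-byte pages, five updates at offsets 10, 4100, 20, 4097 and 30
touch only pages 0 and 1, so the checkpoint writes two pages where the
log recorded five changes.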
> anyway, this is currently managed by the storage log itself, though this is
> not set in stone:
> in addition to a commit message there is a "success" message written to the
> log, i.e. "this transaction has reached disk". the log only removes old logs
> when there's been a success message for each transaction in it.
> so the RM can delay the success messages (and therefore the sync) as long as
> it wants, it is only prompted by storage_log::overflow()==true to post its
> success messages.
> so I guess it won't be too much coordination if we end up using this log.
It sounds like it. I can always have a thread which does periodic
checkpointing, then interacts with the log when prompted by overflow to
indicate the transactions which have been written to disk.
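
As a sanity check of my understanding, the commit/success protocol
could be modelled roughly as below. This is only a toy model: the class
sketch_log, its segment bookkeeping, and every signature here are
assumptions of mine, and only the overflow() idea comes from the
description quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>

// Hypothetical stand-in for the storage log discussed in this thread.
// None of these names or signatures come from the actual library.
class sketch_log {
public:
    using txid = unsigned long;

    sketch_log(std::size_t max_segments, std::size_t commits_per_segment)
        : max_segments_(max_segments), per_segment_(commits_per_segment) {
        assert(commits_per_segment > 0);
    }

    // Commit message: the transaction is durable in the log only.
    void commit(txid tx) {
        if (segments_.empty() || segments_.back().commits >= per_segment_)
            segments_.emplace_back();  // roll over to a new log segment
        segments_.back().commits++;
        segments_.back().pending.insert(tx);
    }

    // Success message: "this transaction has reached disk".
    void success(txid tx) {
        for (segment& s : segments_) s.pending.erase(tx);
        // Old segments go away only once every transaction in them has
        // a success message; the active (last) segment is always kept.
        while (segments_.size() > 1 && segments_.front().pending.empty())
            segments_.pop_front();
    }

    // True when the RM should checkpoint and post its success messages.
    bool overflow() const { return segments_.size() > max_segments_; }

    std::size_t segment_count() const { return segments_.size(); }

private:
    struct segment {
        std::size_t commits = 0;  // commit messages written to this segment
        std::set<txid> pending;   // committed, no success message yet
    };
    std::size_t max_segments_;
    std::size_t per_segment_;
    std::deque<segment> segments_;
};
```

The checkpoint thread would then poll overflow(), write out its changes,
and call success() for each transaction covered by that checkpoint.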
>> True. It would be great to ensure that when the different boost
>> transaction-capable libraries are used together, the log can be
> do you also include the TM log in this?
No. I'm assuming here that one/both of us eventually gets an RM
created which combines our two RMs under a common log, as you had
mentioned previously. As a result, the TM would recognize only one
RM, and thus could avoid any need for a log of its own, and just do 1
phase commit calls (pass-through). IIUC it also wouldn't need a log
in the case of 1 persistent RM and 1 non-persistent RM. So that
means all 3 libraries under discussion could be used together without
the overhead of a distributed TM having its own log and sync points.
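
Put as a rule of thumb, and with the caveat that this is only my
reading (the names rm_info and needs_tm_log are made up for this
sketch), the TM needs its own log only when more than one persistent
RM takes part in the transaction:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical description of a resource manager taking part in a
// transaction; not a type from any of the libraries under discussion.
struct rm_info {
    bool persistent;  // true if the RM must survive a crash
};

// Assumption from the discussion above: with zero or one persistent RM
// the TM can pass the commit through and needs no log of its own; with
// two or more it has to record the distributed commit decision itself.
bool needs_tm_log(const std::vector<rm_info>& rms) {
    const auto persistent_count =
        std::count_if(rms.begin(), rms.end(),
                      [](const rm_info& rm) { return rm.persistent; });
    return persistent_count > 1;
}
```

So a combined RM alone, or a combined RM plus a non-persistent RM, gives
false here, while the combined RM plus an RDBMS RM gives true.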
> e.g. when there is a RDBMS, and a logging boost library, the boost RM
> maintains a log and the TM does for the distributed transactions with the
> RDBMS RM.
> so the RM and the TM could also use a shared log, but I tend to think this
> is not worth the effort, as this would span the interface between RM and TM.
I think it's been done both ways, i.e. the TM has its own log resource,
or the TM shares a log with one of the RMs it is managing (e.g.
Oracle RDBMS). However, any sharing probably isn't worth much, because
the TM would still need its own sync()s as it orchestrates the
different RMs, even if it were sharing a log with one of them.