Subject: Re: [boost] ACID transactions/flushing/MySQL InnoDB
From: Dean Michael Berris (mikhailberis_at_[hidden])
Date: 2009-12-12 10:47:58

On Sat, Dec 12, 2009 at 11:27 PM, <strasser_at_[hidden]> wrote:
> Quoting Dean Michael Berris <mikhailberis_at_[hidden]>:
>> One other thing you can do is to pack the data into 512-byte divisible
>> chunks. If you know you're writing 2048 in one go, then you might have
>> to align to 512-byte chunks. Instead of writing "garbage" you can pack
>> null bytes past the 2048 bytes you originally wanted to store.
> I do have sector chunks, that was required for the reason alone that a used
> sector needs to be distinguishable from an unused sector, because of the
> zeroed out data at the end of the log.

I see. Do you have a separate index mechanism?

> I still insert the "garbage" before the data, I don't think I understood why
> inserting at the end would be preferable, if that was what you're trying to
> explain?

If you had an index mechanism that records where a certain log entry
starts (and the size of that entry), then putting the entry at the
beginning allows you to re-use the area after it (thus the zeros).
Instead of garbage data at the beginning of a multi-sector entry that
you're never going to be able to re-use, you get usable space after
the entry. This only works, though, if you have an index, or at least
a "header" as part of each entry that you write.

Imagining it this way:

 | header < offset, size > | entry |

You can then pack many of these entries in a single sector chunk. Of
course you're going to run into buffering and serialized management of
the chunk, but this allows you to save the space you otherwise would
be using for garbage data.
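
To make the layout above concrete, here is a minimal sketch of packing
several header-plus-entry records into one zero-filled sector chunk. The
names (EntryHeader, SectorChunk, kSectorSize) and the 32-bit field widths
are assumptions of mine for illustration, not something taken from your
code:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kSectorSize = 512;  // assumed sector size

// Hypothetical per-entry header: where the payload starts inside the
// chunk, and how many bytes it occupies.
struct EntryHeader {
    std::uint32_t offset;
    std::uint32_t size;
};

// One sector-sized chunk. The buffer is zero-initialised, so whatever
// follows the last entry is null bytes rather than garbage.
class SectorChunk {
public:
    // Appends header + payload; returns false if they do not fit.
    bool append(const void* data, std::uint32_t size) {
        assert(size > 0);  // a zero size is what marks unused space here
        const std::size_t needed = sizeof(EntryHeader) + size;
        if (needed > kSectorSize - used_) return false;

        EntryHeader header{
            static_cast<std::uint32_t>(used_ + sizeof(EntryHeader)), size};
        std::memcpy(bytes_.data() + used_, &header, sizeof header);
        std::memcpy(bytes_.data() + header.offset, data, size);
        used_ += needed;
        return true;
    }

    std::size_t used() const { return used_; }
    std::size_t free_space() const { return kSectorSize - used_; }
    const unsigned char* data() const { return bytes_.data(); }

private:
    std::array<unsigned char, kSectorSize> bytes_{};  // all zeros
    std::size_t used_ = 0;
};
```

A reader can walk the chunk header by header and stop at the first
header whose size is zero, which is where the re-usable tail begins.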

If you maintain a separate index, you can store the "headers"
independently of the data file. In that case you can keep updating the
index while writing the data to an mmap'ed file that's msync'ed
regularly, and keep the index "log style" if it's always append-only.
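
As a rough sketch of that arrangement, assuming a POSIX system: the
data goes through an mmap'ed region that is msync'ed, and a separate
append-only index records where each entry landed. IndexEntry and
write_mapped are hypothetical names, the index is only kept in memory
here, and a real implementation would keep the mapping open instead of
remapping on every write:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical index record kept apart from the data file.
struct IndexEntry {
    std::size_t offset;
    std::size_t size;
};

// Copies `size` bytes to `offset` of the file behind `fd` through a
// shared mapping, syncs the mapping, then appends to the index.
bool write_mapped(int fd, std::size_t file_size, std::size_t offset,
                  const void* data, std::size_t size,
                  std::vector<IndexEntry>& index) {
    if (offset > file_size || size > file_size - offset) return false;
    if (ftruncate(fd, static_cast<off_t>(file_size)) != 0) return false;

    void* base = mmap(nullptr, file_size, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) return false;

    std::memcpy(static_cast<char*>(base) + offset, data, size);
    const bool synced = msync(base, file_size, MS_SYNC) == 0;
    munmap(base, file_size);

    if (synced) index.push_back(IndexEntry{offset, size});  // append-only
    return synced;
}
```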

> if cache locality was your only concern that doesn't really matter here,
> since CPU usage is way below 100%.

That's for single threaded code, but when you run into multi-threaded
code running on multiple cores and/or hardware threads, then it's
going to be one of the killers for performance -- even if CPU usage is
way below 100%. Of course tuning that requires profiling data and
would be done on a case-by-case basis. Just something worth keeping in
mind IMO.

> there can be about 40000 un-flushed small transactions/s, so this is really
> about optimizing the syncs. in a mode that is only safe in case of an
> application crash, but not in case of a system failure (flush, but no sync)
> there are about 30000 transactions/s for the same test case.

Alright. In that case sector-aligned writing and a good OS IO
scheduling algorithm would be your best friends. :)

>> I'm not sure if this is part of the requirements in your design or
>> whether you want client threads to really block on transaction commits
>> (even though the syncing is lock-free).
> the worker threads are the user threads, so they need to block until the
> transaction is recorded committed (-> log sync).

I see. Have you looked at using futures to handle the communication
between user threads and the writer thread?
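
What I have in mind is roughly the following sketch. It uses
std::promise/std::future for brevity (a futures library with the same
shape would do), and LogWriter, commit() and the in-memory "log" are
made-up stand-ins for your writer thread and the real write + sync:

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <future>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

class LogWriter {
public:
    LogWriter() : worker_([this] { run(); }) {}

    ~LogWriter() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        ready_.notify_one();
        worker_.join();  // the worker drains the queue before exiting
    }

    // Called from a user thread. The future becomes ready once the
    // writer thread has recorded the transaction; its value is the
    // number of records in the log at that point.
    std::future<std::size_t> commit(std::string record) {
        std::promise<std::size_t> promise;
        std::future<std::size_t> result = promise.get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.emplace(std::move(record), std::move(promise));
        }
        ready_.notify_one();
        return result;
    }

private:
    void run() {
        for (;;) {
            std::unique_lock<std::mutex> lock(mutex_);
            ready_.wait(lock, [this] { return done_ || !pending_.empty(); });
            if (pending_.empty()) return;  // shutting down, nothing left
            auto item = std::move(pending_.front());
            pending_.pop();
            lock.unlock();

            log_.push_back(std::move(item.first));  // stand-in for write + sync
            item.second.set_value(log_.size());     // wakes the waiting user thread
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::pair<std::string, std::promise<std::size_t>>> pending_;
    std::vector<std::string> log_;  // touched only by the writer thread
    bool done_ = false;
    std::thread worker_;  // declared last so the members above exist first
};
```

A user thread would call commit() and then block in get() on the
returned future, so it still waits until the transaction is recorded,
but the writer thread is free to pick up several pending requests
between syncs.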

> you could only avoid some syncs when 2 or more independent transactions run
> concurrently and they both need to sync, those could be combined. but that's
> not my focus right now.

Alright. Thanks for indulging me and my suggestions. :)

Dean Michael Berris
