Boost logo

Boost :

Subject: Re: [boost] ACID transactions/flushing/MySQL InnoDB
From: Stefan Strasser (strasser_at_[hidden])
Date: 2009-12-10 17:02:37


thank you very much for both your answers.
but they don't solve the mystery, please see below.

Am Thursday 10 December 2009 16:40:12 schrieb Dean Michael Berris:
> > what I find surprising is the performance of MySQL for small
> > transactions. while my library spends 90% of its time waiting for the
> > disk to flush a transaction, InnoDB uses 100% CPU and barely uses any
> > disk.
>
> Am I reading this correctly, you're doing something with persistence
> of tuples to disk, allowing indexed access to this persisted data, and

persistence yes, tuples no. persistent objects.
in the backend that means: chunks of arbitrary sized data.
there is some indexing going on, but it isn't relevant to this case because
small transactions are dumped to the log on commit. the data files are only
updated from time to time, to avoid additional disk syncs.

Am Thursday 10 December 2009 16:56:27 schrieb Jose:
> This is the setting where you can change the performance vs durability
> tradeoff:
>
> If the value of innodb_flush_log_at_trx_commit is

thanks, I didn't know about this. but the value was 1 by default, so MySQL
claims to flush for each transactions.

but, including all the SQL parsing, running 10000 small transactions takes
about 4 seconds on my system.
while this little code snippet alone takes 6-7 seconds, with much more disk
usage:

  int fd=open("test.bin",O_RDWR | O_CREAT | O_NOATIME);
  for(int c=0;c<10000;++c){
    write(fd,&data,1);
    fsync(fd);
  }

neither fdatasync(), O_WRONLY, nor writing whole sectors instead of just a
byte make a difference. MySQL also claims to use fsync on the page you
linked.

do you have an explanation for this?

> then allowing someone to embed it in their application. This sounds a
> lot like an embedded database system and looking at SQLite and BDB

I have looked at SQLite, they seem to use the same method as I do.
judging by this FAQ entry, with similar results:
http://www.sqlite.org/faq.html#q19
(although I get much more than 60 transactions a second. I guess they have
calculated that from the RPM without actually trying it. fsync() returns
about 1600 times/s on my old 10 GB disk, presumable due to hardware
caching.).

> Another means that they do away with the flush is by using memory
> mapped regions in Linux/UNIX, and then delegate the flushing to the
> operating system.

I have implemented that, with an own set of problems, see this:
http://lists.boost.org/boost-users/2009/12/54449.php

> BTW, I am interested in what you're doing and maybe using a library
> that tackles the embedded database space in a modern C++ manner.
> Looking forward to hearing more about what you're working on.

I've posted a description for the Libraries Under Construction page (not yet
there) in this thread:

http://lists.boost.org/Archives/boost/2009/12/159900.php


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk