Subject: Re: [boost] Request for Interest in several Modules
From: Dominique Devienne (ddevienne_at_[hidden])
Date: 2012-01-12 05:39:25


On Thu, Jan 12, 2012 at 11:02 AM, Artyom Beilis <artyomtnk_at_[hidden]> wrote:
> So if you are looking for vendor specific functions
> that allow you to do some kind of data transfer directly
> from socket to memory buffer it is not for you, especially
> when only one API around supports it (OCI)

You have to use a vendor-specific API in the back-end anyway.

What matters is that the user-facing API does not prevent
optimizations in the back-end when you want to insert 100,000 rows
into a table. If the library forces you to call stmt.execute() for
each row, and thus incur a network round-trip to the server per row,
you're far from high-performance. I doubt Oracle OCI is the only
vendor or DBMS API that offers ways to improve such a case (and I'm
not talking about its Direct Path API, which bypasses the normal DML
path). As an example, you showed binding a scalar to a prepared
statement's placeholder, which Oracle supports of course, but it also
allows binding an array of scalars almost as simply, and sends them
all in a single round-trip. One could argue it's the job of the
back-end to accumulate the scalar values into an array behind the
scenes, similar to Oracle's prefetching but in reverse, but bulk
binding could also be part of the API itself, simply converted to a
loop for back-ends that do not support bulk array operations.
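
To make the last point concrete, here is a minimal C++ sketch of a
bulk call in the user-facing API with a loop fallback. All names
(backend, execute_one, execute_bulk, loop_backend, array_backend) are
hypothetical; they are not taken from the library under discussion or
from OCI, and the round-trip counter only simulates network traffic.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical back-end interface; not the API of any real library.
class backend {
public:
    virtual ~backend() = default;

    // Executes the prepared statement once with a single bound value.
    virtual void execute_one(int value) = 0;

    // Executes the statement once per value. The default is a plain
    // loop, for back-ends without native array binding.
    virtual void execute_bulk(const std::vector<int>& values) {
        for (int v : values) execute_one(v);
    }

    std::size_t round_trips() const { return round_trips_; }
    std::size_t rows() const { return rows_; }

protected:
    std::size_t round_trips_ = 0;  // simulated network round-trips
    std::size_t rows_ = 0;         // rows "sent" to the server
};

// Back-end with no array binding: inherits the loop fallback.
class loop_backend : public backend {
public:
    void execute_one(int) override { ++round_trips_; ++rows_; }
};

// Back-end that can bind a whole array: one simulated round-trip.
class array_backend : public backend {
public:
    void execute_one(int) override { ++round_trips_; ++rows_; }

    void execute_bulk(const std::vector<int>& values) override {
        if (values.empty()) return;  // nothing to send
        ++round_trips_;
        rows_ += values.size();
    }
};
```

Calling execute_bulk() with 100,000 values records 100,000 simulated
round-trips on loop_backend and a single one on array_backend, while
the calling code stays identical for both.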

I don't dispute that your wrapper is 2x or 3x faster than existing
wrappers (and I was in fact already taking prepared-statement reuse
for granted), but I was discussing the kind of large insert or select
described above. SQLite doesn't have the kind of bulk operations I
mention, simply because it's a server-less SQL engine talking directly
to the filesystem, with no network overhead, but for a client-server
RDBMS you simply cannot afford to ignore network round-trips if you
want the best performance.
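
As a rough back-of-the-envelope model of that cost, the sketch below
estimates the time spent purely on network latency for a given batch
size. The function names are hypothetical, the model ignores
server-side work and bandwidth, and any round-trip time plugged in is
an assumption rather than a measurement.

```cpp
#include <cstddef>

// Number of round-trips needed to send `rows` rows, `batch_size` rows
// at a time (ceiling division; batch_size must be greater than zero).
constexpr std::size_t round_trips(std::size_t rows, std::size_t batch_size) {
    return (rows + batch_size - 1) / batch_size;
}

// Time spent waiting on the network alone, in milliseconds.
constexpr double latency_ms(std::size_t rows, std::size_t batch_size,
                            double rtt_ms) {
    return static_cast<double>(round_trips(rows, batch_size)) * rtt_ms;
}
```

Assuming an illustrative 0.5 ms round-trip, latency_ms(100000, 1, 0.5)
gives 50,000 ms for one row per round-trip, whereas
latency_ms(100000, 1000, 0.5) gives 50 ms for batches of 1,000 rows.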

As you rightly point out, you don't aim to be an ORM (ORMs, by their
nature of traversing an object graph and doing small "scattered"
scalar inserts, often can't leverage bulk operations, and therefore
can't achieve the best performance) and instead concentrate on just
the DBMS connectivity. All I'm saying is that bulk inserts and selects
are precisely something a targeted library like this should address,
one way or another, to call itself high-performance.

Just my $0.02. --DD

