Boost logo

Boost :

Subject: Re: [boost] sqlpp11: SQL for C++
From: Dominique Devienne (ddevienne_at_[hidden])
Date: 2013-11-13 06:09:21


On Wed, Nov 13, 2013 at 7:28 AM, Gavin Lambert <gavinl_at_[hidden]>wrote:

> The GenerateQuery method returns a "query object" that defines the action
> to be done but not the specific parameter values on which it acts. It is
> expensive to create (because this is where all the parsing happens) but is
> (fairly) cheap to copy, and reusable without copying. This is analogous to
> preparing an SQL query.
>

I agree with Gavin's points. Support for prepared queries returning some
kind of functor / lambda taking as many typed variables as the query has
placeholders is essential IMHO, because unless you have true binding of
in-memory data, it's a "toy" to run literal SQL only. (whether it's the SQL
string literal parsed at compile time, or a compile time DSL matters less
to me).

But I'd like to flesh out my point that array-binding is essential to
performance. Most commenters focus on queries, i.e. selects, where you send
little data (just a few bind variables) and receive lots of data. That can
be made efficient in Oracle just by turning on prefetching on the statement
handle, w/o any other change to the client code. It's the OCI client
library which does the work for you transparently.

But when you insert or update, you have to send a lot more data than you
receive, and if you do a round-trip to the server on every row, you get
orders of magnitude performance differences sometimes. For example,
inserting 10,000 rows in a two-column table (an int ranging from 1 to
10,000, and a fixed "payload" string for all rows that read "azertyuiop"),
doing it the "normal" way takes 0.578 sec, while sending the 10,000 rows at
once take 0.007 sec. This timing is with the Oracle server and the client
on the same Windows laptop. (the same test running on Linux, talking to a
Linux server in the same room over the LAN gives 0.310 sec for scalar
inserts, < 0.001 sec for array insert). Increase the latency by using a
server on the WAN, and the difference can be even worse. But unlike the
select case, there's no prefetching equivalent, you must use separate APIs
to bind, and it can be done differently too (one array per col, or one
array of struct with a member for each col, or using dynamic binding via
callbacks that OCI calls to fill its "buffers" before sending the whole lot
server-side).

An API like SQLite doesn't need array-binding, because it's not
client-server, but server round-trips is what makes or breaks an
application's performance when dealing with databases, at least for those
which deal with high volume of data, which is often one of the primary
reason they are coded in C++ in the first place.

An API like sqlpp11 cannot be everything to everyone of course. You're very
much entitled to decide for yourself what is in-bounds and out-of-bounds.
But I thought I'd provide more info on the above so that you at least are
more aware of the problem space as I see it myself.

Cheers, --DD


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk