[boost] Re: [sqlite] Formal review of Boost.SQLite begins

8 Sep 2025

      I don't have the time (and courage...) to write a review.
But I feel compelled to comment on some of what I read.

On Sun, Sep 7, 2025 at 10:58 PM Maximilian Riemensberger via Boost
<boost@lists.boost.org> wrote:
...
* The `transaction` documentation and examples should emphasize the scope
   requirements for statements and their results used within a transaction.
That makes no sense to me. Your example with the scope made no sense to me.
Yet I've been using SQLite for almost 20 years, in raw C and various
C++ wrappers.
...
* I did not find any design documentation or rationale for the library's
   design decisions regarding ownership handling of sqlite3 resources.
Most reviewers have disclosed they don't have much experience with SQLite.
This shows on this point. You often don't own the main connection object.
Large and old code bases using SQLite have different subsystems, written
by different people, each wanting to use its favorite flavor of SQLite. And that
includes heavy weights like Qt and its Sql "module". We even have to use our
own Qt SQLite Driver to be able to inject a connection owned by lower layers
of our code.

So my point is essential to recognize that Boost.SQLite, if it arrives at all,
arrives very late, and has to integrate with the messy reality of own grown
and OSS wrappers, sometimes on the *same* connections. And as Klemens
wrote, you don't have to have to duplicate APIs you can into, or templatize
them, just because the connection is owning or not.

Whether the RAII or not the connection is the client-code's choice.
And most code shouldn't care at all, and just use it.
...
* resultset: The destructor steps through the entire resultset rows.  I don't
   think this is necessary since the sqlite3 docs would indicate that you can
   reset or finalize a statement at any time.
This also does not make sense to me, if it's true.
You can abandon a partially read "resultset" at any time.

I think maybe people don't realize that SQLite is an *embedded* DB engine.
The very term resultset is misleading. There's no resultset type in SQLite,
like there is in PostgreSQL's LIBPQ for example. The term resultset give
the impression the result of a query is "materialized" somehow. It's NOT!

An SQLite prepared statement is a bytecode program, compiled from SQL text.
And when you _step() that statement, you start the execution of that program
and move the PC ("Program Counter") to the generate the next row. I don't even
think the row is itself "materialized", it could be just a reference
to a particular
row on a table btree page, "paged-in" in the page cache. And when you read
a value from that row, SQLite dynamically decodes the row to get the requested
value, optionally doing implicit conversion on that value, if the way
its accessed
by the client code differs from that value's internal type (which can
be different
from the declared type of the column or expression it came from!).

So an SQLite statement is a compiled bytecode program, but it's also like
a cursor or in C++ terms a coroutine that is resumed on every _step() call
to advance to the next row. So it's not a resultset per se.

I saw from review that Klemens chose .read_next() for _step(), which
conveys the WRONG message IMHO. Such a name makes sense for
MySQL or PostgreSQL, which are client-server RDBMSs with a well defined
Network protocol over a socket, but there's no such thing in SQLite.

Trying to coerce an SQLite's statement into Boost.SQLite statement,
resultset, row, and field types, when it all comes from a single object
in SQLite itself (and its related API), leads to these various abstractions
all having to share the same handle, and brings about the lifetime
issues that displeases reviewers.
...
* field access: Retrieving text fields should follow recommended sqlite3
   practice: `sqlite3_column_text` followed by `sqlite3_column_bytes` to get the
   length.
Completely agree on that one. There's really not much difference between
an SQLite blob and text value, at the storage level, in both cases SQLite knows
the size. Yes, built-in SQLite SQL function assume null-terminated string (and
the SQLite APIs ensures all text buffer are null-terminated), you can
insert/select
them w/o any issues, so assuming null-termination for strings is a
mistake in the
API, and it's also inefficient to strlen in O(N) when SQLite knows the
size in O(1).

SQLite is single-threaded, through and through.
Statements are coroutines. You can have several statements,
and _step() their rows in any order, w/o issues, as long as it's all
done serially, and you decide whether SQLite ensures that for you,
or not, when you know you're already single-threaded from your own code.

Because it's single threaded, it can get away with the connection acting
as a kind of singleton, so when you insert rows in ROWID tables for example,
you don't ask the last-inserted-rowid from the statement, but from the
connection.
And a connection knows / tracks all its statements internally, there's
even an API
to iterate unfinished statements. So you could even "leak" your statements
in the C++ sense, and "fix"  that later, by iterating them via the connection,
and finalizing them. Or do that behind the back of C++ wrappers that think
they "own" the statement, and result in double-finalize, which may be harmless
and result in just an SQLITE_MISUSE, or something more nefarious.

It seems to me many reviewers looks at this review from the high ground
of experienced C++ developers, w/o knowing the messy realities of SQLite.
And perhaps missing the point I'm afraid.

I don't know what the right answer is, what the right API is.
But IMHO, with SQLite, any "resultset" abstraction will be
"leaky" I'm afraid. Ownership will be "messy", from having
to integrate with existing code bases and other wrappers,
and from lifetimes which are not purely C++ based.

Anyways, enough of a rant. Sorry about that... --DD

[boost] Re: [sqlite] Formal review of Boost.SQLite begins

Dominique Devienne