[boost][sqlite] Review

Hi Everyone, Again, I wanted to thank Klemens for his determination in developing, improving and submitting for review the Boost.Sqlite library. I also want to thank Mohammad for managing the review. My recommendation is to REJECT the library in its current shape, and encourage the author to resubmit the library in the future, when the issue listed below is addressed. I am *not* well familiar with SQLite, and my review is based on studying the documentation. I haven't played with the library. I see a great value in offering a wrapper over the native SQLite C API that offers clean, automatic, C++-style resource management. However, the way I see it, the library, in its current shape, does not offer this, or at least fails to communicate it in the documentation. This is my motivation for recommending rejection. As a C++ programmer I have two viable options. Either accept that I am dealing with the C interface, and write a resource-managing wrapper myself, or *trust* that a library performs a clean, correct resource management. I do not see a clear indication that class `connection` does the resource management right: 1. It is not documented 2. Constructor parameter take_ownership implies (but this is never documented) that a connection may have and use a connection handle and yet not manage its lifetime. How do I even tell if the connection object I have received does or does not manage the lifetime of the corresponding handle? 3. There are two functions for which it is not clear how they differ: release() ("Release the owned handle.") and close() ("Close the database connection."). What does function valid() check? having called close() or having called release() or having used `take_ownership`? I do not see a clear resource management in the class `statement`. Is `statement` associated with a resource that needs to be managed? The class has a comment "This transfers ownership to the resultset". Ownership of what? What is the full lifetime of this implied resource? Can execute() be called multiple times? If so, do you transfer the same ownership of something to multiple places? I do not see a clear resource management scheme in class `resultset`. I intuitively feel that some life-time-related thing happens each time I call read_next(). But I do not know when and what actually happens. I also do not know in the iterator interface whether it is operator++ or operator* that corresponds to read_next(). Or if the call to begin() also involves calling read_next(). THe documentation is silent about this important aspect. I fail to see how the example provided could be correct: sqlite::resultset rs = conn.query("select * from users;"); do { handle_row(r.current()); } while (rs.read_next()); // read it line by line The first call to `r.current()` is performed unguarded, without any check. How does this work given that the query can potentially return zero results? These things lead me to the conclusion that I would not be able to correctly manage resources if I used this library interface in its current shape. It is possible that the implementation is fine, and this is just the issue with the documentation. But in my opinion this aspect (resource management) is so important that it cannot be left undocumented, and having left it undocumented in the first place may be an indication that not enough thought has been given to the correct design of the resource management. I hope that the next incarnation of the library will address this aspect. Best regards, &rzej;

On Thu, Sep 4, 2025 at 11:53 AM Andrzej Krzemienski via Boost <boost@lists.boost.org> wrote:
Hi Everyone, Again, I wanted to thank Klemens for his determination in developing, improving and submitting for review the Boost.Sqlite library. I also want to thank Mohammad for managing the review.
Thank you, Andrzej, for the review. Your points are noted. I'm looking forward to Klemens' thoughts on the mentioned issues.

On Thu, Sep 4, 2025 at 4:23 PM Andrzej Krzemienski via Boost < boost@lists.boost.org> wrote:
Hi Everyone, Again, I wanted to thank Klemens for his determination in developing, improving and submitting for review the Boost.Sqlite library. I also want to thank Mohammad for managing the review.
My recommendation is to REJECT the library in its current shape, and encourage the author to resubmit the library in the future, when the issue listed below is addressed.
I am *not* well familiar with SQLite, and my review is based on studying the documentation. I haven't played with the library.
I see a great value in offering a wrapper over the native SQLite C API that offers clean, automatic, C++-style resource management. However, the way I see it, the library, in its current shape, does not offer this, or at least fails to communicate it in the documentation. This is my motivation for recommending rejection.
As a C++ programmer I have two viable options. Either accept that I am dealing with the C interface, and write a resource-managing wrapper myself, or *trust* that a library performs a clean, correct resource management.
I do not see a clear indication that class `connection` does the resource management right: 1. It is not documented 2. Constructor parameter take_ownership implies (but this is never documented) that a connection may have and use a connection handle and yet not manage its lifetime. How do I even tell if the connection object I have received does or does not manage the lifetime of the corresponding handle?
The connection can be passed to user-code from sqlite where the lifetime is managed externally. These are the only cases this option should be true.
3. There are two functions for which it is not clear how they differ: release() ("Release the owned handle.") and close() ("Close the database connection."). What does function valid() check? having called close() or having called release() or having used `take_ownership`?
Release releases ownership, so the destructor of connection will not close it (if it's owned). Close closes the database connection, i.e. sqlite3_close. Valid checks if the managed resources refers to a valid database connection, regardless of ownership. Calling close or release will mean valid() returns false.
I do not see a clear resource management in the class `statement`. Is `statement` associated with a resource that needs to be managed? The class has a comment "This transfers ownership to the resultset". Ownership of what? What is the full lifetime of this implied resource? Can execute() be called multiple times? If so, do you transfer the same ownership of something to multiple places?
The overload of execute qualified with && transfers ownership of the statement to the resultset, so you can do this: conn.prepare("select * from x where y = %1").execute(42); where as this, does not, so you can call executor multiple times: auto s = conn.prepare("select * from x where y = %1"); s.execute(42); s.execute(43); This is because I introduced multiple classes (statement, resultset, row) that all use the same type (sqlite3_stmt) underneath. I don't think replicating the C APi here with a single class is helpful, especially given the required order of operations, hence the resultset & row. Therefore I split the functionality over multiple classes but they all share a single resource. Because a prepared-statement is the way to do parameterized queries in sqlite, there'll be a lot of uses of a statement with a single execution, hence allowing the user not to keep the statement around, but just the resultset.
I do not see a clear resource management scheme in class `resultset`. I intuitively feel that some life-time-related thing happens each time I call read_next().
It does not.
But I do not know when and what actually happens. I also do not know in the iterator interface whether it is operator++ or operator* that corresponds to read_next().
++ like any other input_iterator.
Or if the call to begin() also involves calling read_next(). THe documentation is silent about this important aspect.
I fail to see how the example provided could be correct:
sqlite::resultset rs = conn.query("select * from users;");
do { handle_row(r.current()); } while (rs.read_next()); // read it line by line
The first call to `r.current()` is performed unguarded, without any check. How does this work given that the query can potentially return zero results?
You're right, there should be a call to `rs.done()` first if the table might be empty.
These things lead me to the conclusion that I would not be able to correctly manage resources if I used this library interface in its current shape.
It is possible that the implementation is fine, and this is just the issue with the documentation. But in my opinion this aspect (resource management) is so important that it cannot be left undocumented, and having left it undocumented in the first place may be an indication that not enough thought has been given to the correct design of the resource management.
I hope that the next incarnation of the library will address this aspect.
Best regards, &rzej; _______________________________________________ Boost mailing list -- boost@lists.boost.org To unsubscribe send an email to boost-leave@lists.boost.org https://lists.boost.org/mailman3/lists/boost.lists.boost.org/ Archived at: https://lists.boost.org/archives/list/boost@lists.boost.org/message/DIP5GYYP...

czw., 4 wrz 2025 o 20:54 Klemens Morgenstern via Boost < boost@lists.boost.org> napisał(a):
On Thu, Sep 4, 2025 at 4:23 PM Andrzej Krzemienski via Boost < boost@lists.boost.org> wrote:
Hi Everyone, Again, I wanted to thank Klemens for his determination in developing, improving and submitting for review the Boost.Sqlite library. I also want to thank Mohammad for managing the review.
My recommendation is to REJECT the library in its current shape, and encourage the author to resubmit the library in the future, when the issue listed below is addressed.
I am *not* well familiar with SQLite, and my review is based on studying the documentation. I haven't played with the library.
I see a great value in offering a wrapper over the native SQLite C API that offers clean, automatic, C++-style resource management. However, the way I see it, the library, in its current shape, does not offer this, or at least fails to communicate it in the documentation. This is my motivation for recommending rejection.
As a C++ programmer I have two viable options. Either accept that I am dealing with the C interface, and write a resource-managing wrapper myself, or *trust* that a library performs a clean, correct resource management.
I do not see a clear indication that class `connection` does the resource management right: 1. It is not documented 2. Constructor parameter take_ownership implies (but this is never documented) that a connection may have and use a connection handle and yet not manage its lifetime. How do I even tell if the connection object I have received does or does not manage the lifetime of the corresponding handle?
The connection can be passed to user-code from sqlite where the lifetime is managed externally. These are the only cases this option should be true.
Unless it is possible in those cases to pass a pointer to `connection` object from sqlite to user code, the next best option, as I see it, would be to have a connection_view` class, much like we have `string` and `string_view`.
I do not see a clear resource management in the class `statement`. Is
`statement` associated with a resource that needs to be managed? The class has a comment "This transfers ownership to the resultset". Ownership of what? What is the full lifetime of this implied resource? Can execute() be called multiple times? If so, do you transfer the same ownership of something to multiple places?
The overload of execute qualified with && transfers ownership of the statement to the resultset, so you can do this:
conn.prepare("select * from x where y = %1").execute(42);
where as this, does not, so you can call executor multiple times:
auto s = conn.prepare("select * from x where y = %1"); s.execute(42); s.execute(43);
This is because I introduced multiple classes (statement, resultset, row) that all use the same type (sqlite3_stmt) underneath.
I don't think replicating the C APi here with a single class is helpful, especially given the required order of operations, hence the resultset & row. Therefore I split the functionality over multiple classes but they all share a single resource.
Because a prepared-statement is the way to do parameterized queries in sqlite, there'll be a lot of uses of a statement with a single execution, hence allowing the user not to keep the statement around, but just the resultset.
I fully support introducing multiple classes (statement, resultset, row). I am still struggling with understanding what kind of resource is being managed here. From your description it looks like there is a "shared state": created in function prepare and then consumed by class resultset and released by the last of all result sets and the statement object. Is this right?
I do not see a clear resource management scheme in class `resultset`. I intuitively feel that some life-time-related thing happens each time I call read_next().
It does not.
But I do not know when and what actually happens. I also do not know in the iterator interface whether it is operator++ or operator* that corresponds to read_next().
++ like any other input_iterator.
Or if the call to begin() also involves calling read_next(). THe documentation is silent about this important aspect.
I fail to see how the example provided could be correct:
sqlite::resultset rs = conn.query("select * from users;");
do { handle_row(r.current()); } while (rs.read_next()); // read it line by line
The first call to `r.current()` is performed unguarded, without any check. How does this work given that the query can potentially return zero results?
You're right, there should be a call to `rs.done()` first if the table might be empty.
Sooo, the canonical way of iterating through the result set with the non-range interface would be: for (sqlite::resultset rs = conn.query("select * from users;"); !rs.done(); rs.read_next()) { handle_row(r.current()); } Is that right? Regards, &rzej;

On 9/5/25 22:23, Andrzej Krzemienski via Boost wrote:
czw., 4 wrz 2025 o 20:54 Klemens Morgenstern via Boost <
I do not see a clear resource management in the class `statement`. Is
`statement` associated with a resource that needs to be managed? The class has a comment "This transfers ownership to the resultset". Ownership of what? What is the full lifetime of this implied resource? Can execute() be called multiple times? If so, do you transfer the same ownership of something to multiple places?
The overload of execute qualified with && transfers ownership of the statement to the resultset, so you can do this:
conn.prepare("select * from x where y = %1").execute(42);
where as this, does not, so you can call executor multiple times:
auto s = conn.prepare("select * from x where y = %1"); s.execute(42); s.execute(43);
This is because I introduced multiple classes (statement, resultset, row) that all use the same type (sqlite3_stmt) underneath.
I don't think replicating the C APi here with a single class is helpful, especially given the required order of operations, hence the resultset & row. Therefore I split the functionality over multiple classes but they all share a single resource.
Because a prepared-statement is the way to do parameterized queries in sqlite, there'll be a lot of uses of a statement with a single execution, hence allowing the user not to keep the statement around, but just the resultset.
I fully support introducing multiple classes (statement, resultset, row). I am still struggling with understanding what kind of resource is being managed here. From your description it looks like there is a "shared state": created in function prepare and then consumed by class resultset and released by the last of all result sets and the statement object. Is this right?
Approaching this from a different angle, I would naively expect the following two pieces of code to do the same thing (they don't): // 1 void process_users(connection &conn) { statement st = conn.prepare("select * from users where name = ?"); resultset rs = st.execute({"allen"}); for (row const& r : rs) handle(r); } //2 void process_users(connection &conn) { resultset rs = conn.prepare("select * from users where name = ?").execute({"allen"}); for (row const& r : rs) handle(r); } And depending on what resultset is, the following two pieces of code also do the same thing or both break for the same reasons (they don't): //3 resultset get_users(connection & conn) { statement s = conn.prepare("SELECT * from users where name = ?"); resultset rs = s.execute({"allen"}); return rs; } void process_users(connection &conn) { resultset rs = get_users(conn); for (row const& r : rs) handle(r); } //4 resultset get_users(connection & conn) { resultset rs = conn.prepare("SELECT * from users where name = ?").execute({"allen"}); return rs } void process_users(connection &conn) { resultset rs = get_users(conn); for (row const& r : rs) handle(r); } Granted, 3 and 4 are documented to behave very differently: 3 is UB, 4 not.

On Fri, Sep 5, 2025 at 1:55 PM Maximilian Riemensberger via Boost < boost@lists.boost.org> wrote:
Approaching this from a different angle, I would naively expect the following two pieces of code to do the same thing (they don't):
// 1 void process_users(connection &conn) { statement st = conn.prepare("select * from users where name = ?"); resultset rs = st.execute({"allen"}); for (row const& r : rs) handle(r); }
//2 void process_users(connection &conn) { resultset rs = conn.prepare("select * from users where name = ?").execute({"allen"}); for (row const& r : rs) handle(r); }
I haven't been keeping up too much with this new library or review but how can these two snippets not do the exact same thing? What does each one do and why do they yield different results? - Christian

On 9/5/25 23:10, Christian Mazakas via Boost wrote:
On Fri, Sep 5, 2025 at 1:55 PM Maximilian Riemensberger via Boost < boost@lists.boost.org> wrote:
Approaching this from a different angle, I would naively expect the following two pieces of code to do the same thing (they don't):
// 1 void process_users(connection &conn) { statement st = conn.prepare("select * from users where name = ?"); resultset rs = st.execute({"allen"}); for (row const& r : rs) handle(r); }
//2 void process_users(connection &conn) { resultset rs = conn.prepare("select * from users where name = ?").execute({"allen"}); for (row const& r : rs) handle(r); }
I haven't been keeping up too much with this new library or review but how can these two snippets not do the exact same thing? What does each one do and why do they yield different results?
They yield the same results. They call different sqlite functions. The 1 snippet unnecessarily prepares the statement for rebinding and reexecution when rs gets destroyed. Not a big deal, admitted. But I didn't expect both snippets would do the exact same sqlite API calls. Any makes me feel a bit icky about the life-time management around statement. Best regards Max

The 1 snippet unnecessarily prepares the statement for rebinding and reexecution when rs gets destroyed. Not a big deal, admitted. But I didn't expect both snippets would do the exact same sqlite API calls. Any makes me feel a bit icky about the life-time management around statement.
The other aspect that is surprising is that even if we ignore the subtlety of the lifetime footgun, on the flip side, if what you’re saying is accurate about the statement preparation, and only if accurate, then this means that users could end up running queries that they meant to be prepared statements, but due to this non-apparent subtlety in the type’s interface, you could end up pessimising your code by mistake. Claudio

On Fri, Sep 5, 2025 at 11:55 PM Andrzej Krzemienski via Boost <boost@lists.boost.org> wrote:
Unless it is possible in those cases to pass a pointer to `connection` object from sqlite to user code, the next best option, as I see it, would be to have a connection_view` class, much like we have `string` and `string_view`.
Klemens, since this has been brought up regularly, what do you think about introducing a connection_ref type that is explicitly non-owning? The use cases for a non-owning connection seem limited to modules and sqlite::context<...>::get_connection(). Passing an instance of a connection_ref seems reasonable.

On Sat, Sep 6, 2025 at 11:57 PM Mohammad Nejati via Boost < boost@lists.boost.org> wrote:
On Fri, Sep 5, 2025 at 11:55 PM Andrzej Krzemienski via Boost <boost@lists.boost.org> wrote:
Unless it is possible in those cases to pass a pointer to `connection` object from sqlite to user code, the next best option, as I see it,
would
be to have a connection_view` class, much like we have `string` and `string_view`.
Klemens, since this has been brought up regularly, what do you think about introducing a connection_ref type that is explicitly non-owning? The use cases for a non-owning connection seem limited to modules and sqlite::context<...>::get_connection(). Passing an instance of a connection_ref seems reasonable.
While I think having a single type is easier, the regularity does show that I hold a minority position. It seems to be rather confusing and worrying to almost everyone else. I'll add a `connection_ref` type that is implicitly constructible from a `connection` and explicitly from a `sqlite3` pointer and make the connection always owning.
_______________________________________________ Boost mailing list -- boost@lists.boost.org To unsubscribe send an email to boost-leave@lists.boost.org https://lists.boost.org/mailman3/lists/boost.lists.boost.org/ Archived at: https://lists.boost.org/archives/list/boost@lists.boost.org/message/7TCMMZV3...

While I think having a single type is easier, the regularity does show that I hold a minority position. It seems to be rather confusing and worrying to almost everyone else.
I'll add a `connection_ref` type
if it serves as a counter datapoint, as someone with experience using SQLite in a C++ application, IMO I think introducing another type is also not the best option here. How is a connection_ref different from a reference or a pointer to connection? I can’t help but to feel this is quite unusual, and that it will impose a certain cognitive cost to understand code that otherwise could be simpler for little return. Claudio

On Sat, Sep 6, 2025 at 8:36 PM Claudio DeSouza via Boost <boost@lists.boost.org> wrote:
How is a connection_ref different from a reference or a pointer to connection? I can’t help but to feel this is quite unusual, and that it will impose a certain cognitive cost to understand code that otherwise could be simpler for little return.
A connection& or connection* allows moving a connection because it has a mutable reference. Furthermore a connection_ref can point to a sqlite3* without the need for a connection object at all.

sob., 6 wrz 2025 o 19:27 Mohammad Nejati via Boost <boost@lists.boost.org> napisał(a):
On Sat, Sep 6, 2025 at 8:36 PM Claudio DeSouza via Boost <boost@lists.boost.org> wrote:
How is a connection_ref different from a reference or a pointer to connection? I can’t help but to feel this is quite unusual, and that it will impose a certain cognitive cost to understand code that otherwise could be simpler for little return.
A connection& or connection* allows moving a connection because it has a mutable reference. Furthermore a connection_ref can point to a sqlite3* without the need for a connection object at all.
I do not understand the use case/constraint that Klemens describes, but I guess a pointer to connection will not suffice. Maybe, if the passing of the connection from sqlite to a user is *the only* use case, the following scheme would work: Keep the "take_ownership=false" constructor, but: * in the implementation make it private * in the documentation don't even mention it: it would be an implementation detail. Then, when need arises for sqlite to pass the connection to the user, create it privately, but return only a pointer to the `connection` object. In that scheme, the users do not even know about this non-owing state, and the thing is presented to them as if "the library somehow creates and destroys the resource-managed connection object at the right time". Regards, &rzej;
participants (6)
-
Andrzej Krzemienski
-
Christian Mazakas
-
Claudio DeSouza
-
Klemens Morgenstern
-
Maximilian Riemensberger
-
Mohammad Nejati