Boost logo

Boost :

From: Dominique Devienne (ddevienne_at_[hidden])
Date: 2020-04-14 13:19:06


On Mon, Apr 13, 2020 at 1:52 PM Ruben Perez via Boost <boost_at_[hidden]>
wrote:

> I have been working in a proof-of-concept implementation of PostgreSQL. You
> can find it here: https://github.com/anarthal/postgres-asio

Hi. Great, I've been following this thread from afar, because first I'm
using/learning ASIO
myself, and second because I call databases as part of the processing I do
in my ASIO-scheduled
"tasks". But I'm a PostgreSQL user, not a MySQL one, thus my relative lack
of interest until now :).

> It supports connections, queries and prepared statements with an API almost
> identical to the one proposed for Boost.MySQL. I have only implemented
> synchronous operations yet. I hope this helps prove that implementing other
> databases with a very similar API is feasible and helps push the MySQL
> library forward.
>

I'm confused. What's the point of implementing synchronous at all rather
than use
libpq directly? libpq is "official" and synchronous, so the *only* reason
I'd prefer your
re-implementation of the protocol would be to gain async via ASIO.

I haven't looked at the code, but I would also have assumed that any
synchronuous
API for a library that stated to the async, would be as a convenience layer
on top of
the async layer, doing the waiting/synchronization for you, to simplify
things when async
is not strictly necessary.

But what you wrote above seems to imply the contrary. Is the async API
implemented
on top of the sync API? Or you implement two separate APIs completely?

> Note that the Postgres library is just a proof of concept; it is very far
> from being a full implementation. At this point I lack the time to
> transform this library into a full implementation. If somebody wants to
> volunteer and progress this library into a full implementation (or just
> write something from scratch), I would appreciate it a lot - just message
> me.
>

It's hard to get traction on alternatives to official APIs (like libpq for
PostgreSQL),
w/o some pretty strong guarantees on testing. And testing with databases is
complicated
because of the need to access that remote DB; you can't easily mock a
database reliably.
I'm not sure how that would work within Boost. Are there precedent for
Boost libraries that
need "external" resources like full blown databases? Perhaps via VMs and/or
Docker?

> As always, any feedback is welcome.
>

A few things that came to my mind reading this thread, from my DB
experience, is that
the re-implement-the-low-level-protocol approach applies to very few DB
systems,
given that most commercial DBs use proprietary protocols (I know Oracle
does for sure).
Or the most used DB in the world (SQLite) doesn't have a network protocol
at all. So that
pretty much only leaves MySQL and PostgreSQL, no? So at the very least,
both should be
supported by a library that wants to gain widespread appeal IMHO. And from
my POV, you
chose the wrong one to start with :)

Another remark is that PostgreSQL has two modes, text (the default) and
binary to send row
data back and forth. We've tested both and went the most difficult route of
using the binary
route for performance reasons, which complicates things. Do you handle both
modes at the
protocol level?

I'd conclude by saying that from my perspective, the one-true-SQL-library
is a pipe dream
when you want to maximum performance, or when wanting to avoid the
lowest-common
denominator approach.

I've used Array-binding in Oracle, and XMLTYPE, and low-level C/C++ binding
of Spatial objects
in OCI, and pretty much no SQL abstraction libraries can do these things.
After doing lots of
Oracle and SQLite, and now doing PostgreSQL, the differences are actually
very large between
these DBs, to the point a lots of the optimization done for one are not
applicable to the other,
and many things just don't translate at all. If one stays within a *tiny*
subset of SQL and features,
maybe... But any non-trivial sophisticated DB use, you quickly find the
limits libraries that try
to be multi-DB IMHO.

So in your case, what matters most is the async ASIO-based design, and
having a framework
to develop async DB client libraries, but not necessarily exposing exactly
the same API for all
of them. What matters is to be able to start processing rows and sending
them downstream for
further processing, *BEFORE* the whole result set made it client-side.
Maybe what I'm saying
doesn't make sense, because I haven't studied your code and examples yet,
but FWIW,
these are the thoughts I had following this thread.

Thanks, --DD


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk