Subject: [boost] Requests for comments on a (partly) hypothetical non-relational serialization library
From: nathaniel_at_[hidden]
Date: 2010-06-19 16:38:14


Hi all: I've been working for a while on a variety of tools to facilitate
application development in normal (cross-platform) C++ and to avoid the
byzantine dependency chains (including needing multiple Boost versions)
which so often creep in, because real applications always seem to piece
together disparate parts with different build systems, different
requirements, even different ways of downloading the source code... pretty
soon you're not programming C++ anymore, you're tinkering with Python make
scripts, or Perl code generators, or learning Git or Subversion... know
what I'm saying?

Anyhow, I'm a fan of the Mongo database, but it's notoriously hard to build
even the drivers, and not really suited for simple SQLite-like object
serialization for persistence between runs of an application (even though
this is theoretically possible, it is poorly documented and still requires
linking against the entire Mongo system).

So I've decided to develop a serialization framework (not a database) with
some "NoSQL" features based on Mongo, but a lot easier to use. I believe
this framework could provide a foundation upon which useful, moderately
complex C++ applications could be designed, by providing extensions to the
library which are optional to use but which incorporate my work (I hope
that doesn't sound pedantic) on general application development, without
extra external dependencies. Specifically, these extensions would
include:

1) A tool for generating GUI code -- for wxWidgets, in particular -- from
archives that could be edited with a simple textual front-end, vaguely like
XAML;

2) A custom language based on Clojure -- a Lisp dialect originally
implemented by Rich Hickey on the JVM -- for expressing queries and
importing/exporting data from/to an archive;

3) Perl6-like regular expressions for matching against textual fields in an
archive;

4) AI-inspired algorithms for sorting, filtering, and in other ways
operating on archives.

My academic background is in AI -- actually, to be precise, I wrote a
doctoral dissertation in the philosophy of science, but I researched AI in
that context -- and I'm especially interested in non-relational database
theory because it better captures the process of modeling complex systems.
In general, non-relational databases are also more interesting from an AI
perspective, because the lack of a fixed schema means that operations like
sorting and filtering can require some "reasoning". I'm particularly
interested in application development because I think one concrete
application of AI research is to make tools like IDEs smarter. A
non-relational serialization library could potentially serve the
application development process not only by providing an easy way to
persist data, but also through IDE extensions or project generators: for
example, storing lists of debug breakpoints in an archive, parsing source
code for namespaces, types, etc. and storing the results in an archive, or
using an archive to represent all the controls in a GUI...
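
As a rough illustration of why sorting and filtering get more interesting
without a fixed schema, here is a minimal C++17 sketch using the breakpoint
example. Everything in it is hypothetical: Field, Record, int_field,
sort_by_int_field and filter_by_string are illustrative names, not part of
any existing library, and placing records that lack the sort field last is
only one of several reasonable policies.

```cpp
#include <algorithm>
#include <map>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// Hypothetical record type: field names mapped to loosely typed values.
// Different records may carry different sets of fields.
using Field = std::variant<long, std::string>;
using Record = std::map<std::string, Field>;

// Returns the integer stored under `key`, or nothing if the field is
// absent or holds a non-integer value.
std::optional<long> int_field(const Record& r, const std::string& key) {
    auto it = r.find(key);
    if (it == r.end()) return std::nullopt;
    if (const long* p = std::get_if<long>(&it->second)) return *p;
    return std::nullopt;
}

// Sorts records by an integer field. Records without a usable value for
// that field go after all the others, keeping their relative order.
void sort_by_int_field(std::vector<Record>& records, const std::string& key) {
    std::stable_sort(records.begin(), records.end(),
        [&key](const Record& a, const Record& b) {
            const auto x = int_field(a, key);
            const auto y = int_field(b, key);
            if (x && y) return *x < *y;
            return x.has_value() && !y.has_value();
        });
}

// Keeps only the records whose string field `key` equals `wanted`.
// Records that lack the field, or store a non-string there, are skipped.
std::vector<Record> filter_by_string(const std::vector<Record>& records,
                                     const std::string& key,
                                     const std::string& wanted) {
    std::vector<Record> out;
    for (const Record& r : records) {
        auto it = r.find(key);
        if (it == r.end()) continue;
        const std::string* s = std::get_if<std::string>(&it->second);
        if (s && *s == wanted) out.push_back(r);
    }
    return out;
}
```

Even in this tiny case the code has to decide what a missing or mistyped
field means, which is the kind of decision a fixed schema would have made
in advance.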

The library I have in mind would differ from Boost.Serialization by
providing explicit support for non-relational functionality, and also by
using a restricted type system along the lines of MongoDB and JSON: any
persistable data field would have to be marshalled into one of a few
predefined types, although users could explicitly extend the type system if
desired. As an alternative to writing persistence code directly in the C++
source (along the lines of, e.g., defining a serialize() function template
in namespace boost::serialization), the test and demo applications I've
been writing use external files, written in the (currently very minimal)
Clojure-like language I mentioned above, and an interpreter does the actual
serialization -- so the persistence strategy could be altered without
recompiling the application, even while it is running. I think this offers
new potential for using AI-style algorithms for things like tracking usage
patterns, because all of that could be implemented fully orthogonally to
the application itself.
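
To make the restricted type system a bit more concrete, here is a minimal
C++17 sketch of marshalling an application type into a handful of
predefined field types. It is an assumption-laden illustration, not the
library's API: Value, Document, Breakpoint, marshal, value_to_text and
document_to_text are all hypothetical names, the set of types is only
loosely modelled on JSON, and nested arrays and documents are left out to
keep it short.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <variant>

// Hypothetical restricted type system: every persisted field must be one
// of these alternatives (null, boolean, integer, floating point, string).
using Value = std::variant<std::monostate, bool, std::int64_t, double,
                           std::string>;
using Document = std::map<std::string, Value>;

// An application type that needs persisting (hypothetical example).
struct Breakpoint {
    std::string file;
    int line;
    bool enabled;
};

// Marshals a Breakpoint into the restricted types, field by field.
Document marshal(const Breakpoint& bp) {
    Document doc;
    doc["file"] = Value{bp.file};
    doc["line"] = Value{static_cast<std::int64_t>(bp.line)};
    doc["enabled"] = Value{bp.enabled};
    return doc;
}

// Renders one value as text. Strings are quoted but not escaped.
std::string value_to_text(const Value& v) {
    struct Visitor {
        std::string operator()(std::monostate) const { return "null"; }
        std::string operator()(bool b) const { return b ? "true" : "false"; }
        std::string operator()(std::int64_t i) const {
            return std::to_string(i);
        }
        std::string operator()(double d) const {
            std::ostringstream os;
            os << d;
            return os.str();
        }
        std::string operator()(const std::string& s) const {
            return "\"" + s + "\"";
        }
    };
    return std::visit(Visitor{}, v);
}

// Renders a whole document as JSON-like text, fields in key order.
std::string document_to_text(const Document& doc) {
    std::string out = "{";
    bool first = true;
    for (const auto& [key, value] : doc) {
        if (!first) out += ", ";
        first = false;
        out += "\"" + key + "\": " + value_to_text(value);
    }
    return out + "}";
}
```

With these definitions, document_to_text(marshal(Breakpoint{"main.cpp",
42, true})) yields {"enabled": true, "file": "main.cpp", "line": 42}. A
user-defined extension of the type system would amount to adding another
alternative to Value and another case to the visitor.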

So, that's the project I've sort of assigned myself, and I would appreciate
any comments and ideas on what I could do to make this the kind of library
C++ programmers would consider trying out. Thanks in advance.
 

