From: Wesley W. Terpstra (terpstra_at_[hidden])
Date: 2002-11-19 09:42:21
Sorry for the latency; was ill.
A status update (since I was working without internet while sick).
I have a working vector<T> wrapper as a prototype. It supports every
operation and type specificied on the SGI site except i->foo(), and
vector<T>::pointer is void*. The iterators are random access, if mutable
then *x = t; works and *x is convertable to T.
One interesting detail was this code fragment:
typedef JFA::VectorDB<IntTraits> DB;
typedef DB::Session Session;
JFA::Environment env = JFA::Guess("/tmp");
DB db(env, "test");
JFA_BEGIN_TRANSACTION(t, env) // opens retry block
Session s = t(db);
int k = 0;
Session::iterator e = s.end(); // getting iterators is expensive
for (i = s.begin(); i != e; ++i) k += *i;
JFA_END_TRANSACTION(t) // commits and closes block
The interesting thing to note is that s.begin() and s.end() are MUTABLE
iterators. Accessing these requires write-locking the sector they are on.
Yes I could require ((const)s).begin(), but I think that this is bulky.
However, I found that by returning a proxy object, I could delay the locking
till *i. This means that the above code does not write-lock at all.
(It also reads 40M in under half a second on my budget hardware :-)
So... I am beginning to lean towards the "don't do that" approach where I
simply don't allow the user to call member methods on items in the
container. (And not let them take pointers) This allows at least the above
optimization and a few others (like *i = *j; -- no deserialize&serialize)
and probably more I don't forsee yet.
On Wed, Nov 13, 2002 at 06:06:42PM +0200, Bohdan wrote:
> "Wesley W. Terpstra" <terpstra_at_[hidden]> wrote in message
> > You get a transaction object from the environment.
> > You get a database object from the environment.
> > You combine the two to get a session object.
> > The session looks like an stl map (it issues iterators).
> > When you commit the transaction all the sessions using that transaction are
> > invalidated as are all the iterators that they issued.
> Not necessary. Generally transaction class has two methods
> for commiting: commit and commit retaining (or checkpoint).
> Later retains transaction contex.
Good point. Presently my transaction manager does not support checkpoint (I
keep all the dirty write buffers in RAM [I know... just no time to make it
I think a checkpoint operation could simply write and release all the
unreferenced dirty buffers to disk without invalidating the sessions,
iterators, and references. So, this does not collide with my current design.
> Hand-written serialization is very error-prone thing. IMHO it would be good
> to supply some default serialization capability for user.
I do agree; but I want to reduce coupling.
I have not seen a serialization library compelling enough to bind to it.
> As i understand you are going to mange binary data blocks and sort them
> using part of this block.
Yes, that is exactly what I am doing.
> IMO this is too limited appoach. Actually, "key part" of your MapDatabase
> sould be relational table. I mean that user will be limited to some set of
> key types (int, char, varchar,datetime). Some complicated key types will
> be not possible in this case.
Yes, I lost a lot of sleep over this.
However, I eventually concluded that ANY tuple of ANY fundamental types can
be serialized into a string with an approriate lexical sort order.
(I actually have a formal constructive proof of this)
[ It is possible to deal with variable data by escaping nulls and appending
a null on the end, it is sufficient just dump floats and ints in network
byte-order, etc. ]
Since my primary goal is speed (I even expose the underlying sector buffers
to the stl-like wrappers so they can avoid virtual method calls for within
sector op.s), a user callback comparison function would be quite costly.
> But! I did not say this is poor way. I'm just not sure.
I am not sure either, but I do know that it is a greatly simplifying
assumption. I also know that a sufficiently power-serialization library
could make the tuple case work.
Finally, one of my on-disk structures is using front-coding.
So I really do need lexical sort order for this case.
> > > 1. You need disk to reduce memory usage.
> > > 2. You need disk to persist objects.
> > I desire both of the two properties, you want me to choose? :-)
> Definitely! i don't think you can mix this two approaches.
> Personally i prefer 2. You can use it for 1, but not vice versa.
> In 1 you put (automatically) on disk only something you don't need.
> In 2 you put in memory only something you need.
I see. Since I am transactional and use a reference count to hold pages in
memory I think I am #2. However, as I stated, I desire both. :-)
But if you mean "Am I a glorified VM swapper?" then no.
> > Is your comment here pertaining to mmap()?
> If i understand you correctly mmap means some kind of page file.
> I think it is very limited approach for solving 1. But it is not very smart.
I agree. I only mentioned it to forestall replies suggesting it.
mmap could actually persist objects too.
An entirely different project might be to make an stl-allocator that only
gave addresses on mmap()'d memory. Then require all stored types to only use
this allocator for their pointers. Such a system would be persistent and
also take only whatever memory the kernel paged in.
However, it would seek like a son-of-a-bitch (once too big for the VM), not
be transactional, and only be suitable for very specific types. (those that
use the allocator)
> I'm sure you should read about ODMG c++ interface.
> I hope i'm not too annoying :o)
Ok ok... I just don't want to make an object database. :-)
I will read more about it tonight.
> > However, with my trick (deriving from the class) I can still catch the
> > destruction even though it is on the stack.
> And what is the benefit of this "catch".
> Destruction of c++ object and destruction of object in a file are
> completely different things. Your task is to make it transparent to
The "catch" is so that I could reserialize the object to the database when
the destructor is invoked (eg: trap the most recent state). Not destroy the
object on disk.
> > > > Consider two iterators i&j which (happen by chance to) point to the
> > > > same object. [added: which has member methods set_a and set_b]
> > > > i->set_a(j->set_b(4) + 2);
> > > >
> > > > Oops. You would expect both changes to work since they are
> > > > presumably modifying different member variables. However, since
> > > > i-> and j-> both read from disk and deserialised to a temporary,
> > > > we are modifying two different temporaries. Therefore only one of
> > > > the changes (whoever's object destroys last) will be made.
> > > if i & j belong to same transaction (or no transactions) than
> > > assert(&(*i)==&(*j) ). It can be implemented if MapDatabase::iterator
> > > is smart enough "smart pointer" :)
> > But this is solution #3, I was trying to avoid deserialized cache and
> > only use the database's underlying sector-oriented cache. So, the same
> > object got deserialized twice and returned by value on the stack twice.
> I see, but this is very slow approach ( constantly deserializing ), and it is
> very cumbersome for user.
I disagree that it is cumbersome; the deserialization is transparent.
The deserialization occured on the "i->...", only two bytes of code! :-)
I do agree that it is slow. Really slow if the user gets too used to the
idea of calling member methods on things in the container.
Interestingly, the same problem happens if:
x = *i;
y = *j;
x.set_a(y.set_b(4) + 2);
*i = x;
*j = y;
Here the serialization and deserialization is explicit, but the same error
happens if i and j point to the same record. (Again, this is the same
transaction, so there is no locking issue)
I am strongly leaning towards the "don't do that" solution these days partly
because it is clear to the user when he can expect deserialization. On the
other hand, it is more bulky for simple operations (as we saw above).
The interface as it stands in my current prototype:
x = *i; deserializes
x = v[off]; deserializes
v[off] = x; serializes
*i = x; serializes
*i = *j; memory copies
v[off].member_method(); does not compile
v[off].member_variable; does not compile
(*i).member_method(); does not compile
(*i).member_variable; does not compile
^--- note, this actually does not violate the STL since the result
of v[off] and *i is only required to be convertible to T
i->member_method(); does not compile
i->member_variable; does not compile
^--- this does violate the stl... But, at least i::pointer is void*
so that you might have a clue. :-)
> I i'm very interested i your class. It is very close to main field of my
> professional interests (object/relational databases). If you are
> interested i'd like to help with this.
Well, sure, give me a couple weeks to brush up the code a bit and I will
send you a tarball. There is no public cvs because I am not sure in what
form I am going to release the package (bsd / gpl / whatever).
> BTW, i have good example idea for your class: database for e-mail letters.
> possible keys: id, subject, send date, sender ...
> body: text
> Letters is something you can't keep constantly in memory, and you should
> have capability of extracting them by key.
Actually, you are talking to the author of
This database is being written partly as the next generation database for my
email archiver. Heh. The one lurker uses now has to many seeks during
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk