From: Bohdan (warever_at_[hidden])
Date: 2002-11-13 11:06:42


"Wesley W. Terpstra" <terpstra_at_[hidden]> wrote in message
news:20021112185744.GA850_at_ito.tu-darmstadt.de...
> Yes, I already have that solved:

Wow, I underestimated what you have now.
If you already have transactions and locking, this is cool.

> You get a transaction object from the environment.
> You get a database object from the environment.
> You combine the two to get a session object.
> The session looks like an stl map (it issues iterators).
> When you commit the transaction all the sessions using that transaction are
> invalidated as are all the iterators that they issued.

Not necessarily. Generally a transaction class has two methods
for committing: commit and commit retaining (or checkpoint).
The latter retains the transaction context.
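
A minimal C++ sketch of the difference between the two commit methods. The
names (Transaction, put, commit_retaining) are hypothetical and are not taken
from Wesley's library or from any real database API; the only assumption is
that pending writes are buffered until a commit flushes them.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical transaction: writes are buffered until they are committed.
class Transaction {
public:
    explicit Transaction(std::map<std::string, std::string>& store)
        : store_(store), active_(true) {}

    void put(const std::string& k, const std::string& v) { pending_[k] = v; }

    // Plain commit: flush the pending writes and end the transaction context.
    void commit() { flush(); active_ = false; }

    // Commit retaining (checkpoint): flush, but keep the context alive, so
    // sessions and iterators tied to this transaction could stay valid.
    void commit_retaining() { flush(); }

    bool active() const { return active_; }

private:
    void flush() {
        for (const auto& kv : pending_) store_[kv.first] = kv.second;
        pending_.clear();
    }

    std::map<std::string, std::string>& store_;   // stand-in for the database
    std::map<std::string, std::string> pending_;  // uncommitted writes
    bool active_;
};

void checkpoint_demo() {
    std::map<std::string, std::string> store;
    Transaction t(store);
    t.put("a", "1");
    t.commit_retaining();
    assert(store.at("a") == "1" && t.active());   // flushed, context retained
    t.put("b", "2");
    t.commit();
    assert(store.at("b") == "2" && !t.active());  // flushed, context ended
}
```

With this split, only commit() would need to invalidate sessions and their
iterators.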

> I would rather leave this entirely in user control. It would be a matter of
> adding a single template class if they wished to bridge the two themselves,
> so there is no functionality lost.

Hand-written serialization is a very error-prone thing. IMHO it would be good
to supply some default serialization capability for the user.

>
> Further, it is important to control the serialized format so that the
> lexical sort order has desirable properties. This fine grained control is
> not available under boost::serialization without writing a stream object for
> each desired lexical-sort serialization.

As I understand it, you are going to manage binary data blocks and sort them
using part of each block. IMO this is too limited an approach. Actually, the
"key part" of your MapDatabase should be a relational table. I mean that the
user would be limited to some set of key types (int, char, varchar, datetime).
Some complicated key types would not be possible in this case.
But! I did not say this is a poor way. I'm just not sure.
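
To illustrate why the serialized format matters for sort order, here is a
small C++ sketch. It assumes the store compares keys as raw unsigned bytes;
the function names (encode_be, encode_le, bytes_less) are made up for this
example. A 32-bit unsigned key written most significant byte first sorts
bytewise in numeric order, while the same key written least significant byte
first does not.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>

// Most significant byte first: bytewise (lexical) order of the encoded
// keys equals numeric order of the original values.
std::string encode_be(std::uint32_t v) {
    std::string out(4, '\0');
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<char>((v >> (8 * (3 - i))) & 0xFF);
    return out;
}

// Least significant byte first, as a raw memory dump on a little-endian
// machine would produce.
std::string encode_le(std::uint32_t v) {
    std::string out(4, '\0');
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<char>((v >> (8 * i)) & 0xFF);
    return out;
}

// Compare two keys as sequences of unsigned bytes.
bool bytes_less(const std::string& a, const std::string& b) {
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        [](char x, char y) {
            return static_cast<unsigned char>(x) < static_cast<unsigned char>(y);
        });
}

void check_order() {
    assert(bytes_less(encode_be(1), encode_be(256)));   // numeric order kept
    assert(!bytes_less(encode_le(1), encode_le(256)));  // numeric order broken
}
```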

> I agree that it is not designed for this purpose, but I think there are
> many beneficial emergent properties. It does not actually have to exactly
> conform; just conform with the subset used in existing practice.

Sure, the interface should be close to what the user has already seen.

> I know from experience with a previous product that even something that
> closely approximates a std::map is highly useful. I just want to bring that
> approximation a bit closer so that code really can ignore the difference.

> > Well, there are two ways:
> > 1. You need disk to reduce memory usage.
> > 2. You need disk to persist objects.
> > I'm not sure which one is yours. Did you ?
>
> Did I ?

Well ... English isn't my native language :)

>
> I desire both of the two properties, you want me to choose? :-)

Definitely! I don't think you can mix these two approaches.
Personally I prefer 2. You can use it for 1, but not vice versa.
In 1 you put on disk (automatically) only what you don't need.
In 2 you put in memory only what you need.

> Is your comment here pertaining to mmap()?

If I understand you correctly, mmap means some kind of page file.
I think it is a very limited approach to solving 1, and not a very smart one.

> > > This would unfortunately break any code which took a T& or T*
> > > from an iterator and held on to it.
> >
> > If you want to allow pointers and use them after application restart
> > then use smart pointers:
> >
> > I've heard something about some system/processor tricks which allow
> > to persist pointers, but i do not think it is good way.
>
> You are considering serialization. I have no interest in this topic.
> My implementation presumes that you have picked one of the many available
> serialization tools or rolled your own.
>
> This is especially not-so-important because I do not plan on supporting
> storing anything other than by-value. Further, since it looks like the stl
> I know all the type information and there is no polymorphism.
>
> The T& and T* I was referring to are those obtained from the map::iterator
> class that is walking records. The user might dereference this iterator and
> take a reference to the object. Rather than telling them to use a smart
> pointer, better would be to say: just keep the iterator!

An iterator is a kind of smart pointer, or vice versa :)

>
> I am just concerned with breaking existing stl code.
> If possible, I would rather this could work, but I don't see how.

I'm sure you should read about the ODMG C++ interface.
I hope I'm not too annoying :o)

> > ODMG proposes new/delete operators for persistent object
> > creation/destruction.
> > In this case you can construct your object in a somewhat bigger memory chunk,
> > which can contain some other implementation-specific per-object information.
>
> I do not think this will help; the returned object would be living on the
> stack which puts control of its new/delete out of my hands.

No! The object does not live on the stack; it lives in the cache. The user
works only with a smart pointer to it, db_ref<MyObject>.
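
A rough C++ sketch of this idea. Only the name db_ref<MyObject> comes from the
text above; the Cache class, its get() method, and the fields of MyObject are
assumptions made for illustration, and this is not the ODMG interface itself.
The point is that every db_ref for the same key reaches the same in-memory
object.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

struct MyObject { int a = 0; int b = 0; };  // hypothetical persistent type

// Hypothetical smart pointer: a shared handle to an object held in the cache.
template <class T>
class db_ref {
public:
    explicit db_ref(std::shared_ptr<T> p) : p_(std::move(p)) {}
    T* operator->() const { return p_.get(); }
    T& operator*() const { return *p_; }
private:
    std::shared_ptr<T> p_;
};

// Hypothetical cache: at most one in-memory object per key.
template <class T>
class Cache {
public:
    db_ref<T> get(const std::string& key) {
        std::shared_ptr<T>& slot = objects_[key];
        if (!slot) slot = std::make_shared<T>();  // a real store would deserialize here
        return db_ref<T>(slot);
    }
private:
    std::map<std::string, std::shared_ptr<T>> objects_;
};

void same_object_demo() {
    Cache<MyObject> cache;
    db_ref<MyObject> i = cache.get("obj-1");
    db_ref<MyObject> j = cache.get("obj-1");
    assert(&*i == &*j);              // both refs reach the same cached object
    j->b = 4;
    i->a = j->b + 2;
    assert(i->a == 6 && i->b == 4);  // neither change is lost
}
```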

> However, with
> my trick (deriving from the class) I can still catch the destruction even
> though it is on the stack.

And what is the benefit of this "catch"?
Destruction of a C++ object and destruction of an object in a file are
completely different things. Your task is to make it transparent to
the user.

>
> > > This seems like a good idea, but it is fraught with complications.
> > > Consider two iterators i&j which (happen by chance to) point to the
> > > same object.
> > > i->set_a(j->set_b(4) + 2);
> > >
> > > Oops. You would expect both changes to work since they are
> > > presumably modifying different member variables. However, since
> > > i-> and j-> both read from disk and deserialised to a temporary,
> > > we are modifying two different temporaries. Therefore only one of
> > > the changes (whoever's object destroys last) will be made.
> >
> > If your class is going to have concurrent transactions, then your
> > example shows how concurrency works.
>
> These iterators are in the same transaction.
>
> > If i & j belong to the same transaction (or no transaction), then
> > assert(&(*i)==&(*j) ). It can be implemented if MapDatabase::iterator is
> > a smart enough "smart pointer" :)
>
> Well if it is a smart pointer then you are using solution #2: cache.
>
> But this is solution #3, I was trying to avoid deserialized cache and only
> use the database's underlying sector-oriented cache. So, the same object got
> deserialized twice and returned by value on the stack twice.

I see, but this is a very slow approach (constantly deserializing), and it is
very cumbersome for the user.
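
The lost update in the quoted i->set_a(j->set_b(4) + 2) example can be
reproduced with a small C++ sketch. The Temp proxy below is hypothetical: it
assumes each access copies the whole record out of the store and writes the
whole copy back on destruction. The two accesses are spelled out as named
objects so that the destruction order is fixed.

```cpp
#include <cassert>
#include <map>

struct Record { int a = 0; int b = 0; };

// Hypothetical by-value proxy: copies the record out of the store on
// construction and writes the whole copy back on destruction.
class Temp {
public:
    Temp(std::map<int, Record>& store, int key)
        : store_(store), key_(key), copy_(store[key]) {}
    ~Temp() { store_[key_] = copy_; }
    Record* operator->() { return &copy_; }
private:
    std::map<int, Record>& store_;
    int key_;
    Record copy_;
};

Record lost_update_demo() {
    std::map<int, Record> store;
    store[1] = Record();
    {
        Temp i(store, 1);  // both copies start as {a=0, b=0}
        Temp j(store, 1);
        j->b = 4;          // j's copy becomes {a=0, b=4}
        i->a = j->b + 2;   // i's copy becomes {a=6, b=0}
    }  // j is destroyed first and writes {0, 4}; then i overwrites with {6, 0}
    return store[1];
}
```

Here lost_update_demo() returns a record with a == 6 and b == 0: the write to
b is overwritten by the copy that is destroyed last.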

> I implemented a stl-like wrapper for libdb3 at a place I worked, but I could
> not keep that code. It took approach "#1: don't do that!" to calling methods
> through the iterator.
>
> After that, I wrote a custom database without a nice api because I had a
> very specific use which I could not get fast enough with btrees and hashes.
> It was not transactional, however, and I later discovered an even better
> layout (nearly 0 seeks!).
>
> This newest attempt is a fully transactional database with several unusual
> backing data structures. I have a nice raw iterator class that operates on
> memory chunks for the key&data. It is mostly complete, but lacks a
> user-friendly api.
>
> I am trying to get it to look more like the stl because the wrapper I wrote
> for my previous employer was invaluable. And all the reasons at the top.

I'm very interested in your class. It is very close to the main field of my
professional interests (object/relational databases). If you are interested,
I'd like to help with this.

BTW, I have a good example idea for your class: a database for e-mail letters.
Possible keys: id, subject, send date, sender ...
Body: text
Letters are something you can't keep constantly in memory, and you should
have the capability of extracting them by key.
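
A small C++ sketch of what such a letter store could look like. Plain std::map
and std::multimap stand in for the on-disk map, and all names (Letter,
MailStore, ids_from) are invented for this example; it assumes ids are unique.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Letter {
    std::string subject;
    std::string sender;
    std::string body;
};

class MailStore {
public:
    // Assumes each id is added only once.
    void add(int id, const Letter& l) {
        by_id_[id] = l;
        by_sender_.insert(std::make_pair(l.sender, id));
    }

    // Look a letter up by its primary key; returns nullptr if absent.
    const Letter* find(int id) const {
        auto it = by_id_.find(id);
        return it == by_id_.end() ? nullptr : &it->second;
    }

    // Collect the ids of all letters from one sender via the secondary index.
    std::vector<int> ids_from(const std::string& sender) const {
        std::vector<int> ids;
        auto range = by_sender_.equal_range(sender);
        for (auto it = range.first; it != range.second; ++it)
            ids.push_back(it->second);
        return ids;
    }

private:
    std::map<int, Letter> by_id_;                // stand-in for the on-disk map
    std::multimap<std::string, int> by_sender_;  // secondary index: sender -> id
};
```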

regards,
bohdan

