
From: Scott Woods (scottw_at_[hidden])
Date: 2005-02-08 00:01:53


----- Original Message -----
From: "Jonathan Turkanis" <technews_at_[hidden]>
To: <boost_at_[hidden]>
Sent: Tuesday, February 08, 2005 4:27 PM

> > * there is no knowledge of bytes consumed, instead I only remember Mb
> > * the only "addressing" is by ordinal, e.g. log[ 68755 ] so my maximum
> >   addressable space is a function of 32-bit integers (the ordinal) and
> >   the bytes consumed by each log entry.
>
> Could you elaborate on these points? What is the interface for accessing
> the data?
>

The limitations of 32-bit integers first arose when I was dealing with huge
single files, so I moved to striping. The limit cropped up again once I was
dealing with seriously large volumes of logging.

Starting with Roman's template names and adding mine:

typedef stxxl::vector<double> actual_stripe;
typedef striped_vector<actual_stripe> stripes;
typedef striped_vector<stripes> striped_stripes;

Knowing when to "close off" a stripe and open the next is driven
by an "enum OPTIMAL_MAXIMUM". For a vector this was some reasonable
byte figure for the local filesystem. The "striped_vector" both expects
and conforms to a minimal concept, i.e. it can itself be passed as a
"stripe-type". The calculation of OPTIMAL_MAXIMUM within
striped_vector<striped_vector<...> > quickly blew out the 32-bit limit,
so I changed it to a calculation in megabytes.
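
To put rough numbers on that overflow, here is a small sketch. The stripe
size (64 MB), the number of stripes per level (256) and all of the names are
assumptions picked for illustration; none of them come from the real
striped_vector.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// All figures below are hypothetical, chosen only for illustration.
const std::uint64_t bytes_per_mb      = 1024ull * 1024ull;
const std::uint64_t stripe_bytes      = 64ull * bytes_per_mb; // one file: 64 MB
const std::uint64_t stripes_per_level = 256;                  // stripes per striped_vector

// Exact capacity in bytes of a stripe wrapped in `depth` levels of striping.
inline std::uint64_t capacity_bytes(unsigned depth) {
    std::uint64_t total = stripe_bytes;
    for (unsigned i = 0; i < depth; ++i) total *= stripes_per_level;
    return total;
}

// True when the value can be held in a 32-bit unsigned integer.
inline bool fits_in_32_bits(std::uint64_t value) {
    return value <= std::numeric_limits<std::uint32_t>::max();
}

// The same capacity in whole megabytes, narrowed to 32 bits.
inline std::uint32_t capacity_mb(unsigned depth) {
    const std::uint64_t mb = capacity_bytes(depth) / bytes_per_mb;
    assert(fits_in_32_bits(mb)); // holds up to depth 3 with these figures
    return static_cast<std::uint32_t>(mb);
}
```

With these figures a single 64 MB stripe fits in a 32-bit byte count, but one
level of striping (16 GB) already does not. Counted in megabytes the same
capacity is only 16384.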

I would include real examples of template usage, but I suspect that the
unrelated material (serialization techniques) would confuse things. So here
is some simplified source:

struct stored_log {};
typedef stxxl::vector<stored_log> log_stripe;
typedef striped_vector<log_stripe> log_vector; // Folder of files

//
//

class application_storage
{
    folder_device home;
    log_vector log;

..
..

 log.open( home .. );
..
..
stored_log sl;
..
..
log.push_back( sl );
..
..
stored_log &application_storage::line( unsigned long i ) { return log[ i ]; }

My implementation of an out-of-core vector is nowhere near as complete as
Roman's. My container is required to collect a sequence of data and provide
random access to it. Taking this to heart, there is no means by which you
can modify something once it has been "push_back'd". The ramifications of
this with respect to STL concepts and the algorithms that rely on them are
obvious: anything that assigns through an iterator or reference (sorting,
for instance) cannot be used. The container is write-once-read-many.
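
As a rough illustration of the write-once-read-many idea, here is a minimal
in-memory sketch. It is not my actual container: the name
worm_striped_vector is made up, each std::vector stands in for a file-backed
stripe, and a stripe is closed off after a fixed element count rather than
at a byte figure like OPTIMAL_MAXIMUM.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical in-memory stand-in for a striped, append-only vector.
// Each inner std::vector plays the role of one file-backed stripe.
template <typename T, std::size_t StripeCapacity>
class worm_striped_vector {
    static_assert(StripeCapacity > 0, "a stripe must hold at least one element");

public:
    // Append only: open a new stripe when the current one is full.
    void push_back(const T& value) {
        if (stripes_.empty() || stripes_.back().size() == StripeCapacity) {
            stripes_.emplace_back();
            stripes_.back().reserve(StripeCapacity);
        }
        stripes_.back().push_back(value);
        ++size_;
    }

    // Read only: there is no non-const operator[], so stored elements
    // cannot be modified through the container.
    const T& operator[](std::size_t i) const {
        assert(i < size_);
        return stripes_[i / StripeCapacity][i % StripeCapacity];
    }

    std::size_t size() const { return size_; }
    std::size_t stripe_count() const { return stripes_.size(); }

private:
    std::vector<std::vector<T> > stripes_;
    std::size_t size_ = 0;
};
```

Because the only element access returns a const reference, an attempt such
as v[ 3 ] = x fails to compile. That is one way to enforce "no modification
after push_back" at the interface rather than by convention.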

Hope this was a successful elaboration.

Cheers.

