Boost logo

Boost Users :

From: Beman Dawes (bdawes_at_[hidden])
Date: 2005-02-27 21:18:32


At 11:23 AM 2/23/2005, John.Wismar_at_[hidden] wrote:
>Does Boost have a library that defines a file-based container, perhaps
>based on a B-tree or equivalent, for very large collections? Or does
>anyone know of such a thing, or has there been talk about developing one?
>
>We have a few file-based collections with millions of entries. We don't
>especially want to install a database. The file is too large to load
into
>memory. The file's format could be changed to make it compatible with
the
>type of collection I'm thinking of, but for the moment, we're doing
binary
>searches by reading the file on disc. As you might guess, this can be
>quite slow, especially when the file is on a DVD and not the HD.
>
>Anyway, I'd like to be able to write something like:
>
>//file.data contains 100,000,000 entries
>file_map<int, int> some_data("file.data");
>file_map<int, int>::iterator it = some_data.find(56789012);
>
>and have the container (for example via a B-tree or whatever) be able to
>locate the item with very few disc accesses.
>
>Any libraries that can do this type of thing?

In 1999 I did some work on a B-tree implementation for Boost, but became
discouraged. For maximum portability and very high performance, it would
have had to have a lot of limitations; memcopy'able POD types only, an
interface that differed in several ways from the standard library
containers, etc. Even with those limitations, the prototype took twice as
long for typical operations as the old C code B-tree I've used for 20
years.

Perhaps someone else will give it a try; a B-tree is an incredibly valuable
tool and it would be great to have a good one in Boost.

--Beman


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net