Boost logo

Boost :

From: Arkadiy (arkadiy_at_[hidden])
Date: 2003-06-30 14:15:29


We may use this to provide a better table implementation for the Relational
Template Library, we are currently working on.

Right now we are using a sorted vector instead of a set, to implement our
relational tables, because set doesn't allow us to search on a prefix of a
key. Like if a table is indexed on a, b, c, we are not able to use
equal_range on a, b with the set. Sorted vector allows us to do this, but
sorted vector has obvious problems with inserts/deletes.

Do you think your library can provide both search on a key prefix and fast
inserts/deletes?

Arkadiy

"JOAQUIN LOPEZ MU?Z" <joaquin_at_[hidden]> wrote in message
news:149e4a1475ec.1475ec149e4a_at_tid.es...
Hi all,

A few weeks ago I prompted for interest in a multiply
indexed set container, whth alas little response from the
community. In the meanwhile, I continued working in
multiindex_set and now the library is 90% completed,
save docs. It is available at

http://groups.yahoo.com/group/boost/files/multiindex.zip

and have been checked to compile with MSVC++ 6.5 and
gcc 3.2 under cygwin.

IMHO multiindex_set is an useful resource and can help
programmers when they are in need for data structures more
complex than simple std::sets and std::multisets. In these
cases multiindex_set is a more robust solution than the usual
resort to ad hoc constructs based on containers of iterators
and stuff like that.

I am aware the complete lack of docs is a big barrier to the
evaluation of the library, so the following serves as a basic
introduction to multiindex_set that hopefully can motivate the
reader into trying the library. An extensive example of use is
provided in the URL above showing the whole set of capabilities
offered by the container.

* Usage: As std::set takes a comparison predicate to index
the elements contained, multiindex_set is instantiated with a
boost::tuple of such predicates, which we call indices.
For instance, say we have class employee with sorting
predicates based on some id (upon which employe::operator <
is written), name and age. One can construct a set providing
indices to all three orderings like this:

  typedef
    multiindex_set<
      employee,
      tuple<
        std::less<employee>,
        less_by<employee,std::string,&employee::name>,
        less_by<employee,int,&employee::age> > >
    employee_set;

less_by is a facility intended to help construct a sorting predicate
based on a single member of the class. In general, one can use
whatever predicate with the right signature.
Unless explictly stated the fist index (std::less<employee> in our
example) is taken to be *unique*, while all others allow for duplicates.
This means one can insert employees with the same age but not
the same id. multiindex_set allows for specification of zero, one or
more unique indices, much in the same way as a relational table does.
So, the following constructs a multiindex_set with unique indices
for id and social security number:

  typedef
    multiindex_set<
      employee,
      tuple<
        std::less<employee>,
        less_by<employee,std::string,&employee::name>,
        less_by<employee,int,&employee::age>,
        less_by<employee,int,&employee::ss_number> >,
      unique_indices<0,3> // id and ss number
>
    employee_set;

Once constructed, multiindex_set holds as many internal orderings
as indices are specified. Each index has the same interface as a
std::set container. By itself, multiindex_set inherits the functionality
of the first (primary) index.

  employee_set es;
  // retrieves a "view" based on the first index (name)
  employee_set::index_type<1>::type& i1=es.get<1>();

  // ordered by id
  std::cout<<"by id"<<std::endl;
  std::copy(
    es.begin(),
    es.end(),
    std::ostream_iterator<employee>(std::cout));

  // ordered by name
  std::cout<<"by name"<<std::endl;
  std::copy(
    i1.begin(),
    i1.end(),
    std::ostream_iterator<employee>(std::cout));

* Special lookup operations: When elements are expensive to
construct, it is a nuissance to have to build a whole object
just to search for a particular member of it. multiindex_set duplicates
the std::set lookup facilities to help you perform searchs without
supplying entire objects:

i1.find("John"); // find John using the name index

For this to work the predicates must be extended to accept the
appropriate subobjects (less_by automatically does this).
One can even pass in "compatible" sorting predicates, where
compatible means "inducing a weaker compatible ordering", as
the following example shows:

i1.lower_bound('J',employee::comp_initial()
// first employee whose name begins with J, using the
// name index and the special employee::comp_initial predicate.

* Update: given an element pointed to by a multiindex_set iterator,
one can change its contents via the update member function:

employee_iterator it=es.find(0); // employee with id==0
employee e=*it;
e.id=100; // change her ID
es.update(it,es); // and update the record.

updating is done efficiently (constant time) if the ordering
of the updated element remains the same with respect to all
indices, and logarithmically for other cases. Interestingly enough,
iterators and references to an element remain valid after updating.
update provides strong exception-safety, in the sense of
Stroustrup's TC++PL sp. ed.

* Implementation: multiindex_set has been written from scatch
with Boost in mind, so it uses a fair number of other Boost
libraries, most importantly the MPL, which helps build the internal
hierarchy of indices in compile-time. Run-time wise, multiindex_set
is optimal in terms of space and complexity, yielding results
equivalent to STL sets (wich BTW can be simulated as particular
instantiations of multiindex_set).
Indices are constructed as red-black trees. For this I've resorted
to SGI STL tree implementation, which raises a concern about
copyrights. Strictly speaking, only the basic RB algorithms have
been used, but no attempt has been made to hide this fact. So
I wonder which is the legal status of the library in this respect
(now seems a good time to ask given the current interest in the
Boost licensing terms).

Sorry for the long post. Sugggestions and comments about the
library, as well of demonstrations of interest or disinterest in having
it promoted to Boost, are most welcome.

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo

_______________________________________________
Unsubscribe & other changes:
http://lists.boost.org/mailman/listinfo.cgi/boost


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk