Boost logo

Boost :

From: JOAQUIN LOPEZ MU?Z (joaquin_at_[hidden])
Date: 2007-06-10 17:46:53


----- Mensaje original -----
De: Andrey Semashev <andysem_at_[hidden]>
Fecha: Sábado, Junio 9, 2007 10:22 pm
Asunto: Re: [boost] [multi_index] User-defined indices
Para: "\"JOAQUIN LOPEZ MU?Z\"" <boost_at_[hidden]>

> Hello JOAQUIN,
>
> Friday, June 8, 2007, 8:31:55 PM, you wrote:
>
> > Hello Andrey,
>
> > That feature is there on the future work section, but
> > for the moment being I don't really know when/if is going to
> > be provided. The main problem with documenting the internal
> > interface of indices is that I'd be bound to freeze it, and
> > as it happens I've had ocassionally the need to modify the
> > interface or augment it to support new funcionalities or
> > optimize some aspects.
>
> It's a pity to hear that. Although I share your worries about
> interface modification problems, I believe that as all major STL-
style
> indices are covered in the lib, the main direction of its
> evolution is to allow users to extend it. You won't be able
> to provide indices for every real life case, though you will
> cover quite a number of them.
>
> I'm not familiar with the implementation, but I think it should be
> possible to provide at least basic user-level interface without
> constraining you much to do some magic behind the scene in your
> out-of-box indices. I had some experience in developing similar
> containers (a bit more case-specific, though) and it almost always
> comes down to a std::list of nodes and whatever-you-need containers
> that hold iterators to the list of nodes and represent indices. If
> your design is similar to that, I don't see many problems if a user
> sees list-like interface of the main container and some helper
> facilities to implement index iterators and projecting.

It is not exactly like that, if I'm understanding your
description correctly. Indices do not hold iterators to some
base container, which would be the simplest implementation
technique, but they're rather much more tightly coupled
to each other internally. For instance, a multi_index_container
consisting of N indices holds only one node per element --this
node is constructed by metaprogrammatically aggregating
in one single struct the N headers for each index. The savings
achieved by this technique are calculated at

  http://boost.org/libs/multi_index/doc/performance.html

The inner workings of the indices are also constrained
by some internal APIs that must be complied with so that
multi_index_containers can assemble and use the indices.

> > I'm curious about those special indices you've been in need
> > of. If they are of general interest and you want to contribute
> > the code maybe they can be adapted and included into the lib.
> > Also, maybe if you explain their purpose we can come up with
> > a way to have the functionality with the current public
> > interface of Boost.MultiIndex.
>
> Well, I'm not sure it's generic enough to be contributed. The
> container I developed holds elements (not necessarily std::pairs).
> Each element contains a string member which is a wildcard as a key.
> The wildcard may contain fixed characters and placeholders '?' (any
> single char) and '*' (any number of any chars). Overlapping wildcards
> (two wildcards are overlapping if they are different, but there is a
> string which they both match to) are allowed to coexist in the
> container, but not the same ones.

Do '*'s always happen at the end of the string or are they allowed
to be used in the middle of a expression, for instance like in
"ba*ing"?

> A user, having a complete string, is able to find an element whose
> wildcard matches most closely (i.e. is the most detailed
> among all wildcards in the container that match the string).

In O(log n) time?

> Additionally, he is able to find all elements whose wildcards match
> the string. On top of that, it should be possible to operate on the
> container elements by having their unique integer identifiers (which
> essentially means another index).
>
> The major requirement was lookup performance, even in a reasonable
> expense of storage overhead and other operations speed. A typical
> sizes would be tens to hundreds of thousands of elements.
>
> I failed to make use of B.MI ordered or hashed indices because of the
> constraints they apply to the CompatibleKey, CompatibleCompare,
> CompatibleHash and CompatiblePred objects I would have to provide.
>
> My technique was based on a number of hash tables and a little
> knowledge of the wildcards nature. :) Do you consider this kind of
> index could be generalized more to be included in B.MI (some kind of
> "like_index" or "matching_index")?

Could be. The scenario looks very attractive, and rather intriguing
too. Could I learn more about your implementation techniques?
This does not seem an algoritmhically trivial problem at first sight.

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk