Boost logo

Boost :

Subject: Re: [boost] [stl_ext_adv] ] Interest in advanced data structures
From: Vadim Stadnik (vadimstdk_at_[hidden])
Date: 2011-11-30 22:20:40


On Thu, Dec 1, 2011 at 11:26 AM, Vicente J. Botet Escriba <
vicente.botet_at_[hidden]> wrote:

>
>> how is your library related/compared to Boost.MultiIndex?
>>>>>>
>>>>> As far as I understand library multi-index provides multiple views of a
>>>> collection of data. The submitted containers and data structures can be
>>>> quite useful at implementation level for this type of applications,
>>>> since
>>>> one advanced data structure efficiently supports many types of STL
>>>> sequences and associative containers.
>>>>
>>>> I don't see how your library could improve the efficiency of
>>> Boost.MultiIndex. Maybe you could show some figures comparing both
>>> libraries.
>>>
>>> As for applications to other Boost libraries, I think that in the
>>>> current
>>>> shape the new data structures and containers can increase the
>>>> performance
>>>> of Accumulators and Interval.
>>>>
>>>> How your library can improve the performance of these libraries?
>>> Note that I'm not against your library, that I don't know well, but just
>>> I
>>> want to see what is the added value.
>>>
>>> I am not familiar with implementation details of multi-index library. My
>> reply only concerns the general use of new containers in multi-view
>> applications.
>>
>
> What do you mean on the last sentence?
>
> The demonstration is actually available in the example of
>> fast counting of intersections of one interval with sequence of intervals.
>> This solution uses one container, which stores all objects of intervals,
>> plus two containers, which represent ordered views of lower and upper
>> endpoints of these intervals. The ordered views support efficient search
>> operations and provide random access iterators. For this reason the
>> algorithm of intersection counting has logarithmic running time against
>> linear time if the solution was based on standard containers. In
>> principle,
>> the approach can be extended and applied to range queries using multiple
>> views.
>>
> What I mean is that it seems to me that Boost.MultiIndex provides already
> several index for a container, including random access index. So, what is
> exactly the added value.
> If your use case can be implemented already using Boost.MultiIndex I will
> consider to propose a specific abstraction on top of Boost.Container as it
> is done in Boost.Bimap. If it is not possible, because a specific kind of
> container must be added to Boost.MultiIndex, I will prefer Boost.Multiindex
> integrates this new container if the use case is of general purpose.
>

The last sentence is very close to what I mean. The proposed library is a
generalized and quite efficient extension of standard containers as
described in the documentation. The containers and algorithms of this
library are designed to become modules and provides support for other more
advanced libraries and solutions.

> I guess that you could study Boost.MultiIndex to see if your library
> provides something that is not doable with multi-index, or can be done
> better or that improves the performances. I ensure you will not lost your
> time.

I will have a deeper look at multi-index library. Also, I would like to ask
the following question, which can clarify the usefulness of new containers
for multi-index:

Can multi-index library solve the problem of counting of intersections of
one interval with an arbitrary sequence of intervals in O(log N) time and
provide the same running time for update operations?
The proposed library can support this efficiency, since its containers have
logarithmic cost of find, insert and erase operations and random access
iterator for both ordered and un-ordered data.

>
> New library can support efficient accumulators through the fast member
>> function accumulate() with logarithmic running time in the worst case.
>>
> Doesn't accumulate has O(log N) complexity for containers that have random
> access? If not, maybe Boost could include a boost::accumulate function that
> provides this complexity (if not already exists) in a non-intrusive way.

Standard containers, such as std::vector, std::list, std::set etc., provide
linear computational cost of std::accumulate() through iterators. If Boost
library has already implemented algorithm accumulate() with logarithmic
complexity, please send me a reference or a link, this is very interesting.

>
> Efficient accumulators are not yet implemented in this project. The
>> section
>> "Iterators and accumulators" of this document
>>
>> http://dl.dropbox.com/u/**51041088/stl_ext_adv_doc/doc/**iterators.html>
>>
>> discusses pros and cons of this option. I am not sure whether to add this
>> functional or not, this is why I ask Boost community if there is an
>> interest in this functionality.
>>
> IIUC you are proposing to store the accumulation of all the lower nodes in
> each node of the container, and maintain this information up to date while
> inserting and removing elements, isn't it?
>
Just a minor clarification: the accumulated value of all elements in the
range [0 , position) to be stored as a data member of an iterator.

> I have not found a use case for that, but several times I have needed to
> have each node store the accumulation of each one of its children.
>
The augmented B+ tree (class bp_tree_idx_acc) constructs these data for
every internal node or in other words for every level of the tree except
the lowest, which stores container elements.

> I will see this as a possible addition to Boost.MultiIndex as a special
> container/index that stores the accumulation of the preceding members, with
> the pros and cons. Whether it is worth including this in Boost.MultiIndex
> is an open question, that Joaguin could answer if needed.
>

Thank you, that would be very helpful.

Regards,
Vadim Stadnik


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk