Boost logo

Boost :

Subject: [boost] [stl_ext_adv] ] Interest in advanced data structures
From: Vadim Stadnik (vadimstdk_at_[hidden])
Date: 2011-11-30 04:58:11


Hi all,

Advanced data structures (ADS) offer useful STL extensions that are not
possible or not efficient with basic data structures, such as arrays,
linked lists and red-black trees.

The project

https://github.com/vstadnik/stl_ext_adv

http://dl.dropbox.com/u/51041088/stl_ext_adv_doc/index.html

focuses mainly on a generalized and unified computer representation of
sequences using the augmented B+ trees. It provides the following STL
extensions:

- The sequence container supports random access iterators and a union of
interfaces of STL containers: vector, list and deque. This sequence offers
the advantage of efficient update operations with logarithmic running time
compared with linear running time of the same operations of std::vector.

- The associative containers (set, multiset, map and multimap) with random
access iterators.

- The enhanced random access iterators effectively address the fundamental
problem of the iterator invalidation.

- The function accumulate() with logarithmic computational complexity can
significantly improve performance of many practically important algorithms
and applications, such as the numerical integration, the statistics, the
data analysis etc.

The examples demonstrate how to improve the computational complexity from
O(N) to O(log N) in the following applications and problems:

- numerical integration.
- calculation of mean value and standard deviation;
- calculation of weighted mean value;

- testing if any number of consecutive elements are equal;

- counting intersections of one interval with sequence intervals;

Is there an interest in this project and implemented STL extensions?

Interest in potentially useful, but not yet implemented algorithms and
specialized containers:

1. The function accumulate() with constant computational cost. With this
function some user algorithms, such as the numerical integration, can come
close to the absolute theoretical limit of the computational cost. The
function can be implemented through iterators enhanced with functionality
of accumulators. For more details, see the section “Random access
iterators” in the documentation.

2. The functions sequence::splice() with logarithmic running time in the
worst case. Note that functions std::list::splice() provide in many cases
constant computational cost, however, in the worst case the cost is linear.

3. Any other potentially useful specialized containers, iterators or
algorithms?

Augmented B+ trees are versatile and powerful data structures. They can be
designed to maximize efficiency of particular algorithms and programming
solutions. The limitation of specialized containers is that the
computational cost of some other operations may increase.

About augmented red black trees.

The augmented red black trees can also support various STL extensions. The
red black trees are yet out of the scope of this project for the reasons
explained in the documentation. However, historically augmented red-black
trees are probably the first ADS that could extend STL, since the method of
augmenting red black trees was described as early as in 1990 [T.H. Cormen,
C.E. Leiserson and R.L. Rivest.” Introduction to Algorithms” (1st ed.). MIT
Press and McGraw-Hill, 1990]. This means, in particular, that the very
first version of STL could provide associative containers with random
access iterators. Also, as I understand the previously submitted project
https://github.com/boost-vault/Containers/blob/master/countertree.zipimplements
containers based on the augmented red black trees.

It is hardly possible to make an exact prediction, but without a doubt
future versions of STL will benefit from advanced data structures.

Regards,
Vadim Stadnik


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk