Boost logo

Boost :

From: Michael S. Tsirkin (mtsirkin_at_[hidden])
Date: 2000-07-06 16:13:17


Hello!
Quoting r. Dietmar Kuehl (dietmar_kuehl_at_[hidden]) "Re: [boost] Union/find and splay trees":
> Hi,
>
> --- "Michael S. Tsirkin" <mtsirkin_at_[hidden]> wrote:
> > Does there exist a library for union/find type of problem?
>
> I don't have a current implementation but the union/find algorithm is
> very simple and very powerful. But before sketching the "correct"
> algorithm, let me point out how union/find should be view!
>
> A typical approach is to use a special data structure (eg. a collection
> of trees) where the elements are registered. This is, IMO, the wrong
> approach. It is superior to consider the representative of set as a
> property of an element where the elements are identified by the same
> kind of entities used to represent the representative. Maybe this sound
> familiar to some people on the list: Use a property accessor to access
> the representative of an element. In principle uniting two sets can
> also be done via the normal property accessor methods but probably
> using a special function, eg. 'unit()', instead.

Are you talking about interface, or implementation?
Changing property on each element for sets being merged
will talke you all the way to N^2 ...
Some special structure (tree-like) was used in all books that
I have seen.

>
> Some information on property accessors can be found in the graph
> directory where this stuff is kind of essential. However, eg. for the
> union/find stuff it is also applicable for other uses.
>
> > I have written a simple library that uses a forest of
> > splay trees to solve this in O(NlogN), with low memory overhead.
>
> I think the best algorithm (which also happens to be *much* simpler
> than a splay tree) uses O(n * inverse(Ackermann(n))) or something like
> this (don't nail me down on the details). The algorithm is rather
> simple: Individual elements, ie. set containing only on element, are
> represented by themselves and simply store a "pointer" to themselves.
> When uniting two sets, the pointer in one of the representives is
> replaced by the pointer to the other representative. When looking for
> a representative for a given element, the chain is walked up and the
> current pointers in the nodes encountered are replaced by a pointer to
> the representative (there are some variations on this and I don't know
> exactly which is theoretically or practically best approach; as far as
> I remember, these differ...). That's all.

Maybe the textbook that I used was not very good, but
1- it gave a much more complex algorithm ( I do not understand
exactly what you write, but it took more than two lines :) )
and there was more memory overhead than splay trees, where
you pay only for 3 pointers per value stored.

Reasons for memory overhead:
- for union by rank, you have to save additional integer, which is the
  rank.

- for path compression, you have unlimited number of children in each
  node. This means you must at least have a list of nodes, that is
  pointer to next and previous in the list, which gives two more
  pointers. (we need these to make it possible to go over all
  elements in the set).

All in all, memory overhead is (at least) twice as much as that of splay tree
implementation.
Therefore, both methods are valid, they just represent different
speed/memory tradeoffs (like list and slist for example).

>
> I can probably dig up some old implementation or hack a new one if
> there is interest...
>

Please do,
    MST

-- 
This message content is not part of Intel's views or affairs
Michael S. Tsirkin
    >   Four things are to be strengthened: Torah,and good deeds,
    >   prayer and one's good manners (Berachoth)

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk