Hi Jeremiah,

Thanks for your reply.

I ended up writing my own version, it is quite a simple algorithm.

It allowed me to do several of the following optimizations (which are relevant for my particular usage):

Return the merged rep directly from link(), instead of a void method - zero overhead.
Track the set size internally. My runtime decision to incrementally merge sets depends on the momentary size of each set.
Replace rank with size - the set size differences are equivalent to the set ranks, so I used the size instead of rank, and thus kept the same mem-footprint and runtime.
Track the number of disjoint sets (decreases by 1 for each link() call).

Warm regards,

Adi

> Hi,
> I'm using boost::disjoint_sets_with_storage<>.
> After calling link(rep1, rep2), I want to know the new rep - the representative of the merged set.
>
> Obviously, I can call find_set(rep1) to find it.
> I assume that the extra call to find_set incurs an additional run-time cost per call to link.
> Is this true or is the call immediate (as in returning the immediate parent of rep1)?
If you are using path halving, it appears that it is basically immediate
(looks up the parent of whichever representative you gave it, then looks
up the parent of that, and does nothing else if the sets were directly
joined). Path compression seems to be more complicated but not do too
much more work in your use case.
> If it does incur a run-time cost, then we know that the new-rep is either rep1 or rep2.
> Is there some other way to know which one it is?
It does not appear so using the public interface.

> On a related matter, I am also tracking the size of the sets for each link().
> Does knowing the size of the rep set before the link help in determining which of rep1 or rep2 will be the new rep?

The test is not based on the sizes, but on the ranks of the sets being
merged. Search for "union by rank" in
https://en.wikipedia.org/wiki/Disjoint-set_data_structure for more
explanation of how the representative for the merged set is chosen.
> Is there a simple way to extend boost::disjoint_sets_with_storage<> to
> do the size tracking internally (currenlty I keep an external size
> vector per rep to keep track of the link()s).

When do you need to know the sizes? Only at the end of whatever process
is building up the disjoint set structure? If so, calling compress_sets()
then iterating directly over the parent map (the parents() member) might
be the easiest way to go. Otherwise, disjoint_sets_with_storage<> does
not directly contain any of the logic for maintaining the data structure,
so it would not be hard to make your own edited version that maintains
sizes. That may not be any better than the approach you are already
using, though.
-- Jeremiah Willcock