Boost logo

Boost :

Subject: Re: [boost] Interest in some Algorithms
From: Dean Michael Berris (mikhailberis_at_[hidden])
Date: 2016-01-24 23:48:09


So sorry for the delay on the questions here. Some answers inline below.

On Wed, Jan 20, 2016 at 4:31 AM Phil Endecott <
spam_from_boost_dev_at_[hidden]> wrote:

> Steven Watanabe wrote:
> > On 01/18/2016 05:13 AM, Dean Michael Berris wrote:
> >> I wanted to ask whether anybody's interested in a couple of
> >> algorithms, to be made part of the Boost.Algorithm library.
> >>
> >> One is tentatively named ?set_inplace_difference' which has
> >> the following synopsis:
> >>
> >> template <class I1, class I2>
> >> I1 set_inplace_difference(I1 f1, I1 l1, I2 f2, I2 l2);
> >>
> >> template <class I1, class I2, class C>
> >> I1 set_inplace_difference(I1 f1, I1 l1, I2 f2, I2 l2, C comp);
>
> (I've fixed an obvious typo above...)
>
> >> It's essentially a set_difference which re-uses the storage of
> >> the range [f1, l1). It takes elements from [f1, l1) that are in
> >> [f2, l2) and moves them to [p, l1) where p is the partition point
> >> (or the new 'end'). This is best used in conjunction with 'erase'
> >> on vectors and/or std::deque. These algorithms return p.
> >>
> >> Another is tentatively named 'set_inplace_intersection' which
> >> instead of the above which does the difference, does an intersection
> >> instead.
> >>
> >> Requirements on I1 and I2 are that they are RandomAccess iterators,
> >> and that they are sorted. I haven't figured out whether
> >> partially-sorted is a sufficient condition. I also haven't figured
> >> out whether it would work for weaker iterator classes, but so far
> >> my implementation relies on std::lower_bound and std::upper_bound.
> >
> > Why do you need lower/upper_bound? set_difference
> > can be implemented easily with a straight linear
> > scan through the two sequences.
>
>
That is true, and I suppose that's acceptable for some values of N. In
practice, when you have two sorted ranges, you want to be able to eliminate
as many of the elements as you can without having to walk the whole range.

> If you do that, the complexity is at least O(N) even for the best
> case where the ranges don't overlap.
>
> Here is my attempt at this algorithm, which might be O(N log N) worst
> case if I'm thinking straight. It is O(1) when the ranges don't overlap.
> Dean, is this anything like what you've implemented?
>
>
It's close, I've implemented this without the recursion.

>
> template <typename I1, typename I2>
> I1 set_inplace_difference(I1 f1, I1 l1, I2 f2, I2 l2)
> {
> // If either range is empty, there is nothing to do:
> if (f1 == l1 || f2 == l2) return l1;
>
> // If the ranges are both singletons, compare the values:
> if (l1-f1 == 1 && l2-f2 == 1) {
> if (*f1 == *f2) return f1; else return l1;
> }
>
> // If ranges don't overlap, there is nothing to do:
> if (*(l2-1) < *f1 || *f2 > *(l1-1)) return l1;
>
>
If you can use std::lower_bound and std::upper_bound here, there's a
simpler test. :)

> // Recurse, splitting the larger range:
> if (l2-f2 > l1-f1) {
> // Second range is larger. Split it at its midpoint and
> // subtract each half in turn.
> I2 m2 = f2 + (l2-f2)/2;
> I1 p = set_inplace_difference(f1,l1, m2,l2);
> return set_inplace_difference(f1,p, f2,m2);
>
> } else {
> // First range is larger. Split it at its midpoint and
> // subtract from each half in turn.
> I1 m1 = f1 + (l1-f1)/2;
> I1 p = set_inplace_difference(f1,m1, f2,l2);
> I1 q = set_inplace_difference(m1,l1, f2,l2);
> // Now merge the two results.
> return std::rotate(p,m1,q);
> }
> }
>
>
This looks very interesting Phil. I suppose if we measure this on really
big overlapping ranges, there might be an issue with the recursion. You can
leverage the fact that the ranges are already sorted, using
std::lower_bound and std::upper_bound to find contiguous ranges that can be
rotated out of the way incrementally.

Seems there is some interest for this, Marshall do I just send you a pull
request for the implementation? What is the process here?

Cheers


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk