Boost logo

Boost :

Subject: Re: [boost] Formal Review: Boost.RangeEx - missing algorithms
From: vicente.botet (vicente.botet_at_[hidden])
Date: 2009-02-22 10:08:01


----- Original Message -----
From: "Neil Groves" <neil_at_[hidden]>
To: <boost_at_[hidden]>
Sent: Sunday, February 22, 2009 3:11 PM
Subject: Re: [boost] Formal Review: Boost.RangeEx - missing algorithms

>
> Dear Vicente,
>
> On Sun, Feb 22, 2009 at 12:13 PM, vicente.botet <vicente.botet_at_[hidden]>wrote:
> <snip>
> </snip>
>
>>
>> May be it will be godd to add the equivalences. The user can always include
>> them in their application if they found useful.
>>
>
> I'm not sure about this issue. I really want to obtain feedback from as many
> people as possible. I understand the appeal of providing everything that is
> in <algorithm> but I dislike setting what is, in essence, a bad example of
> how to write algorithms. My personal preference is to drop the inferior
> technique, this will hopefully make it easy to do the right thing, and hard
> to do the wrong thing. I am, of course, open to feedback on this issue.
> Ultimately the library is for it's users.
 
It's OK. I will wait from feedback from others also.

>> > The _n operations are replaced if the input is a model of the
>> > RandomAccessRangeConcept with boost::copy( output, input | sliced(0, n) )
>> > for example, but on reflection this can and must be improved before
>> release.
>> > If I simply add a first_n adaptor then we can replace all _n algorithms
>> with
>> > a superior adaptor for all valid ranges. Hence I propose to add something
>> > that would allow:
>> > boost::copy( output, input | first_n(n) );
>> >
>> > I am pleased that you made me justify dropping the _n versions, because
>> > frankly my solution is currently insufficient to be a perfect substition.
>>
>> Yes, somthing like that should work, but what will be the difference
>> between
>> first_n(n) and sliced(0, n)?
>
>
> The key difference is that while sliced works perfectly fine for
> RandomAccessRange's it does not work for SinglePassRange's. It is possible
> to make a first_n work for all valid Ranges. Hence I can't correctly state
> that I have currently provided the tools to make the _n versions of the
> algorithms always replacable with a superior alternative. I shall, because I
> want to get rid of them, I should have provided a better alternative first.
 
Ok, I see.

>> > As for is_sorted and is_heap, these seem like useful additions to the
>> > algorithm_ext.hpp.
>> >
>> > And horror of horrors, the random_sample is missing due to no good
>> > justifiable reason what-so-ever. This I will address.
>>
>> waiting for them.
>>
>> >>
>> >> There are surely hidden reasons to don't include some of them.
>> >>
>> <snip>
>>
>> > I definately like the idea of carefully adding partitioning that is
>> > compatible with Boost.Range. I need to revist posts and suggestions made
>> by
>> > others on this list.
>> > IIRC Dr. Arnold Schodl had some interesting ideas, and partitioning is
>> > actively being worked on.
>> >
>> >> But a dynamic one would also be interesting. What do you think?
>> >>
>> >
>> > The only comment I can make at this point is that it requires some
>> > investigation. There are others that have investigate partitioning in
>> much
>> > more detail and I it would be silly for me to comment before studying
>> their
>> > contributions. It appears to be something that can be extended and made
>> > compatible with range very easily without affecting the core. Hopefully I
>> > can still find enough time to make this happen before release.
>>
>> Could you give me some pointers?

I have already found the thread in this ML, but I'm not sure we address the same partition concept. I need to reread it.

>> I have just a last question, is there any difference between
>> using directly a Range
>> Range range;
>>
>> using a sub_range
>> boost::sub_range<Range> sub_rng(boost::begin(range), boost::end(range));
>>
>> and using a sliced adaptor
>> range | sliced(0, boost:size(range))
>> ?
>
> The idea is that the algorithm may be written without regard for what actual
> range is being used. There are reasons to have iterator_range, sub_range and
> special returns from the adaptors. I believe these are all documented, so
> I'm going to be cheeky and ask that you look in the documentation for
> sub_range.

Sorry, I was not enough explicit. My question was related to performances. So is there any performance difference between ...

BTW, if I need to make a function that gets the i-th partition of a range divided on k-parts

template <typename Range>
??? get_partition(Range& range, unsigned i, unsigned parts) {
        std::size_t size = boost::size(range);
        if (i==parts - 1)
            return range | sliced(boost::begin(range_)+i*(size/parts_), boost::end(range_));
        else
            return range | sliced(boost::begin(range_)+i*(size/parts_), boost::begin(range_)+((i+1)*(size/parts_))-1);
}

Which will be the type of get_partition?
Should this be done in another way?

> Indeed it is a core design goal that the user may provide free
> functions to allow other containers to work with the range library
> seamlessly. This indeed is exemplified by the code that allows ATL and MFC
> 'collections' to operate as ranges without requiring modification. The
> relevant documentation is here index.html -> reference -> extending the
> library.
>
> The use of anything that is a valid model of the appropriate Range Concept
> must work. It's a defect if it doesn't.

I'm not sure if my example is a counter-example. I'm trying to define a parallel_sort function taking a Range as parameter. I have started with

template <typename Range>
void parallel_sort(Range& range, unsigned cutoff=10000) {
    pool_type pool( boost::tp::poolsize( 2) );
    x_sort fct( pool, cutoff);
    fct.execute(range);
}

class x_sort {
private:
  pool_type & pool_;
  unsigned cutoff_;

  template <typename Range>
  void seq_( Range& range) {
    sort_fct()(range);
  }

  template <typename Range>
  void par_( Range& range) {
    unsigned size = boost::size(range);
    if ( size <= cutoff_) return seq_( range);
    else {
        boost::sub_range<Range> left(boost::begin(range), boost::begin(range)+(size/2));
        boost::sub_range<Range> right(boost::begin(range)+(size/2)+1, boost::end(range));
        task_type task(pool_.submit(
                boost::bind(& x_sort::par_<Range>,
                    boost::ref( * this), boost::ref(left))));
        this->par_(right);
        task.wait();
        inplace_merge_fct()(range, boost::begin(range)+(size/2));
    }
  }
public:
  x_sort( pool_type & pool, unsigned cutoff) : pool_( pool), cutoff_( cutoff) {}
  template <typename Range> void execute( boost::sub_range<Range>& range) {
    par_( range);
  }
};

The problem is that the instantiation of par_<Range> requests the instantiation of par_<boost::sub_range<Range>>, and ..., which fall in an recursive instantiation that do not finish.
Should the types boost::sub_range<Range> and boost::sub_range<boost::sub_range<Range> > be convertibles?
I have taken a workaround, create a sub_range before calling to the internal function execute.

template <typename Range>
void parallel_sort(Range& range, unsigned cutoff=10000) {
    pool_type pool( boost::tp::poolsize( 2) );
    x_sort fct( pool, cutoff);
    boost::sub_range<Range> rng(range);
    fct.execute(range);
}

The internal templates have a boost::sub_range<Range> parameter instead of a Range, so the single tempalte instantiation is par_<boost::sub_range<Range> >.

class x_sort {
private:
  pool_type & pool_;
  unsigned cutoff_;

  template <typename Range>
  void seq_( boost::sub_range<Range>& range) {
    sort_fct()(range);
  }

  template <typename Range>
  void par_( boost::sub_range<Range>& range) {
    unsigned size = boost::size(range);
    if ( size <= cutoff_) return seq_( range);
    else {
        boost::sub_range<Range> left(boost::begin(range), boost::begin(range)+(size/2));
        boost::sub_range<Range> right(boost::begin(range)+(size/2)+1, boost::end(range));
        // fork a new sub-action t1 in pool
        task_type task(pool_.submit(
                boost::bind(& x_sort::par_<Range>,
                    boost::ref( * this), boost::ref(left))));
        this->par_(right);
        task.wait();
        inplace_merge_fct()(range, boost::begin(range)+(size/2));
    }
  }
public:
  x_sort( pool_type & pool, unsigned cutoff)
  : pool_( pool), cutoff_( cutoff)
  {}

  template <typename Range>
  void execute( boost::sub_range<Range>& range)
  {
    par_( range);
  }
};

Is this the correct way to use Range and sub_range?

Best,
Vicente


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk