Boost logo

Boost :

Subject: Re: [boost] Formal Review: Boost.RangeEx - missing algorithms
From: Neil Groves (neil_at_[hidden])
Date: 2009-02-22 10:41:41


Dear Vincente,

On Sun, Feb 22, 2009 at 3:08 PM, vicente.botet <vicente.botet_at_[hidden]>wrote:

> ----- Original Message -----
> From: "Neil Groves" <neil_at_[hidden]>
> To: <boost_at_[hidden]>
> Sent: Sunday, February 22, 2009 3:11 PM
> Subject: Re: [boost] Formal Review: Boost.RangeEx - missing algorithms
>
>
> >
> > Dear Vicente,
> >
> > On Sun, Feb 22, 2009 at 12:13 PM, vicente.botet <
> vicente.botet_at_[hidden]>wrote:
> > <snip>
> > </snip>
> >
> >>
> >> May be it will be godd to add the equivalences. The user can always
> include
> >> them in their application if they found useful.
> >>
> >
> > I'm not sure about this issue. I really want to obtain feedback from as
> many
> > people as possible. I understand the appeal of providing everything that
> is
> > in <algorithm> but I dislike setting what is, in essence, a bad example
> of
> > how to write algorithms. My personal preference is to drop the inferior
> > technique, this will hopefully make it easy to do the right thing, and
> hard
> > to do the wrong thing. I am, of course, open to feedback on this issue.
> > Ultimately the library is for it's users.
>
> It's OK. I will wait from feedback from others also.
>
> >> > The _n operations are replaced if the input is a model of the
> >> > RandomAccessRangeConcept with boost::copy( output, input | sliced(0,
> n) )
> >> > for example, but on reflection this can and must be improved before
> >> release.
> >> > If I simply add a first_n adaptor then we can replace all _n
> algorithms
> >> with
> >> > a superior adaptor for all valid ranges. Hence I propose to add
> something
> >> > that would allow:
> >> > boost::copy( output, input | first_n(n) );
> >> >
> >> > I am pleased that you made me justify dropping the _n versions,
> because
> >> > frankly my solution is currently insufficient to be a perfect
> substition.
> >>
> >> Yes, somthing like that should work, but what will be the difference
> >> between
> >> first_n(n) and sliced(0, n)?
> >
> >
> > The key difference is that while sliced works perfectly fine for
> > RandomAccessRange's it does not work for SinglePassRange's. It is
> possible
> > to make a first_n work for all valid Ranges. Hence I can't correctly
> state
> > that I have currently provided the tools to make the _n versions of the
> > algorithms always replacable with a superior alternative. I shall,
> because I
> > want to get rid of them, I should have provided a better alternative
> first.
>
> Ok, I see.
>
> >> > As for is_sorted and is_heap, these seem like useful additions to the
> >> > algorithm_ext.hpp.
> >> >
> >> > And horror of horrors, the random_sample is missing due to no good
> >> > justifiable reason what-so-ever. This I will address.
> >>
> >> waiting for them.
> >>
> >> >>
> >> >> There are surely hidden reasons to don't include some of them.
> >> >>
> >> <snip>
> >>
> >> > I definately like the idea of carefully adding partitioning that is
> >> > compatible with Boost.Range. I need to revist posts and suggestions
> made
> >> by
> >> > others on this list.
> >> > IIRC Dr. Arnold Schodl had some interesting ideas, and partitioning is
> >> > actively being worked on.
> >> >
> >> >> But a dynamic one would also be interesting. What do you think?
> >> >>
> >> >
> >> > The only comment I can make at this point is that it requires some
> >> > investigation. There are others that have investigate partitioning in
> >> much
> >> > more detail and I it would be silly for me to comment before studying
> >> their
> >> > contributions. It appears to be something that can be extended and
> made
> >> > compatible with range very easily without affecting the core.
> Hopefully I
> >> > can still find enough time to make this happen before release.
> >>
> >> Could you give me some pointers?
>
> I have already found the thread in this ML, but I'm not sure we address the
> same partition concept. I need to reread it.
>

I do too!

>
> >> I have just a last question, is there any difference between
> >> using directly a Range
> >> Range range;
> >>
> >> using a sub_range
> >> boost::sub_range<Range> sub_rng(boost::begin(range),
> boost::end(range));
> >>
> >> and using a sliced adaptor
> >> range | sliced(0, boost:size(range))
> >> ?
> >
> > The idea is that the algorithm may be written without regard for what
> actual
> > range is being used. There are reasons to have iterator_range, sub_range
> and
> > special returns from the adaptors. I believe these are all documented, so
> > I'm going to be cheeky and ask that you look in the documentation for
> > sub_range.
>
> Sorry, I was not enough explicit. My question was related to performances.
> So is there any performance difference between ...
>

Actually having re-read my paragraph I state something I don't really mean.
The user of the algorithm must be able to call it for any range that models
a valid Range concept. In very few cases when implementing an algorithm you
can benefit from using the sub_range as a tool to help build the algorithm.

There is no performance difference between iterator_range, and sub_range to
the best of my knowledge and belief. I have not yet measured so this is
slightly speculative.

>
> BTW, if I need to make a function that gets the i-th partition of a range
> divided on k-parts
>
> template <typename Range>
> ??? get_partition(Range& range, unsigned i, unsigned parts) {
> std::size_t size = boost::size(range);
> if (i==parts - 1)
> return range | sliced(boost::begin(range_)+i*(size/parts_),
> boost::end(range_));
> else
> return range | sliced(boost::begin(range_)+i*(size/parts_),
> boost::begin(range_)+((i+1)*(size/parts_))-1);
> }
>
> Which will be the type of get_partition?
> Should this be done in another way?

I can't see any reason why you should not be able to do it this way.

What this example shows is that I was wrong to state that the return type of
adaptors did not need to be known. I will add the return type information
into the documentation.

The answer in this particular case is:
iterator_range<typename range_iterator<Range>::type >

>
> > Indeed it is a core design goal that the user may provide free
> > functions to allow other containers to work with the range library
> > seamlessly. This indeed is exemplified by the code that allows ATL and
> MFC
> > 'collections' to operate as ranges without requiring modification. The
> > relevant documentation is here index.html -> reference -> extending the
> > library.
> >
> > The use of anything that is a valid model of the appropriate Range
> Concept
> > must work. It's a defect if it doesn't.
>
> I'm not sure if my example is a counter-example. I'm trying to define a
> parallel_sort function taking a Range as parameter. I have started with
>

> template <typename Range>
> void parallel_sort(Range& range, unsigned cutoff=10000) {
> pool_type pool( boost::tp::poolsize( 2) );
> x_sort fct( pool, cutoff);
> fct.execute(range);
> }
>
> class x_sort {
> private:
> pool_type & pool_;
> unsigned cutoff_;
>
> template <typename Range>
> void seq_( Range& range) {
> sort_fct()(range);
> }
>
> template <typename Range>
> void par_( Range& range) {
> unsigned size = boost::size(range);
> if ( size <= cutoff_) return seq_( range);
> else {
> boost::sub_range<Range> left(boost::begin(range),
> boost::begin(range)+(size/2));
> boost::sub_range<Range> right(boost::begin(range)+(size/2)+1,
> boost::end(range));
> task_type task(pool_.submit(
> boost::bind(& x_sort::par_<Range>,
> boost::ref( * this), boost::ref(left))));
> this->par_(right);
> task.wait();
> inplace_merge_fct()(range, boost::begin(range)+(size/2));
> }
> }
> public:
> x_sort( pool_type & pool, unsigned cutoff) : pool_( pool), cutoff_(
> cutoff) {}
> template <typename Range> void execute( boost::sub_range<Range>& range)
> {
> par_( range);
> }
> };
>

I think it's a good example of where the differing range types are not as
easy to implement an algorithm as we would wish. The finished algorithm
should be able to work with anything that is a model of a Range Concept.

>
> The problem is that the instantiation of par_<Range> requests the
> instantiation of par_<boost::sub_range<Range>>, and ..., which fall in an
> recursive instantiation that do not finish.
> Should the types boost::sub_range<Range> and
> boost::sub_range<boost::sub_range<Range> > be convertibles?
> I have taken a workaround, create a sub_range before calling to the
> internal function execute.
>
> template <typename Range>
> void parallel_sort(Range& range, unsigned cutoff=10000) {
> pool_type pool( boost::tp::poolsize( 2) );
> x_sort fct( pool, cutoff);
> boost::sub_range<Range> rng(range);
> fct.execute(range);
> }
>
> The internal templates have a boost::sub_range<Range> parameter instead of
> a Range, so the single tempalte instantiation is
> par_<boost::sub_range<Range> >.
>
> class x_sort {
> private:
> pool_type & pool_;
> unsigned cutoff_;
>
> template <typename Range>
> void seq_( boost::sub_range<Range>& range) {
> sort_fct()(range);
> }
>
> template <typename Range>
> void par_( boost::sub_range<Range>& range) {
> unsigned size = boost::size(range);
> if ( size <= cutoff_) return seq_( range);
> else {
> boost::sub_range<Range> left(boost::begin(range),
> boost::begin(range)+(size/2));
> boost::sub_range<Range> right(boost::begin(range)+(size/2)+1,
> boost::end(range));
> // fork a new sub-action t1 in pool
> task_type task(pool_.submit(
> boost::bind(& x_sort::par_<Range>,
> boost::ref( * this), boost::ref(left))));
> this->par_(right);
> task.wait();
> inplace_merge_fct()(range, boost::begin(range)+(size/2));
> }
> }
> public:
> x_sort( pool_type & pool, unsigned cutoff)
> : pool_( pool), cutoff_( cutoff)
> {}
>
> template <typename Range>
> void execute( boost::sub_range<Range>& range)
> {
> par_( range);
> }
> };
>
> Is this the correct way to use Range and sub_range?
>

Yes, this is exactly what I intend. The public algorithm exposes an
interface that can accept any Range Concept, and the internal implementation
has to, occasionally jump through some hoops to handle sub_range etc. I
haven't been able to think of a way to eliminate this need. Fortunately for
most algorithm implementations it is not necessary, and I think the extra
function call is not a huge deal.

If anyone knows a technique to avoid the need to use sub_range that
simplifies matters, I would gladly use it.

>
>
> Best,
> Vicente
>

Thanks,
Neil Groves

>
>
>
> _______________________________________________
> Unsubscribe & other changes:
> http://lists.boost.org/mailman/listinfo.cgi/boost
>


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk