Subject: [boost] [Range] New I/O functions for ranges
From: Indiana (indi.in.the.wired_at_[hidden])
Date: 2014-09-27 17:21:44


I'm currently putting together a proposal to the standards committee
about a set of range I/O functions, but i think they would make a good
addition to Boost as well. They would probably fit in the Range
library - i don't think they warrant a library all of their own.

The stuff i wrote for the standards proposal is available at
http://cpp.indi.frih.net/rangeio/ - it goes into quite a bit of detail
about the interface, and links to a sample implementation. But here's
the executive summary:

There is a single output function - currently called "write_all" -
that takes a range and, optionally, a delimiter, like this:

// BEGIN CODE /////////////////////////
auto const r = std::array<int, 4>{1, 2, 3, 4};

cout << write_all(r); // prints "1234"
cout << "{ " << write_all(r, ", ") << " }"; // prints "{ 1, 2, 3, 4 }"
// END CODE ///////////////////////////

It takes anything that works with range-for, so it works with range adaptors:

// BEGIN CODE /////////////////////////
struct is_even { auto operator()(int x) const -> bool { return x % 2 == 0; } };

cout << write_all(r | reversed | filtered(is_even{})); // prints "42"
// END CODE ///////////////////////////

Note also that the delimiter doesn't have to be a string (the
reference has some examples of that).

Why not just use std::copy() and stream iterators? Several reasons -
the most obvious of which you can see above: this syntax is much
cleaner, much less error-prone, well-integrated with the existing
stream I/O mechanisms, and it produces properly delimited output
(unlike the delimited version of ostream_iterator).
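
To make the delimiter point concrete, here is a rough sketch comparing
the two behaviours. It is not the sample implementation: write_delimited
and the two wrapper functions are made-up names used only for
illustration.

```cpp
#include <algorithm>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the proposal): writes the elements of r
// to os, putting delim between elements only, never after the last one.
template <typename Range, typename Delim>
std::ostream& write_delimited(std::ostream& os, Range const& r, Delim const& delim)
{
    bool first = true;
    for (auto const& e : r)
    {
        if (!first)
            os << delim;
        os << e;
        first = false;
    }
    return os;
}

// The ostream_iterator version: the "delimiter" is really a suffix,
// so it also appears after the last element.
inline std::string with_ostream_iterator(std::vector<int> const& v)
{
    std::ostringstream os;
    std::copy(v.begin(), v.end(), std::ostream_iterator<int>{os, ", "});
    return os.str();
}

// The helper version: the delimiter appears only between elements.
inline std::string with_helper(std::vector<int> const& v)
{
    std::ostringstream os;
    write_delimited(os, v, ", ");
    return os.str();
}
```

For the range {1, 2, 3, 4}, with_ostream_iterator() yields
"1, 2, 3, 4, " while with_helper() yields "1, 2, 3, 4".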

But also, stream formatting is properly handled:

// BEGIN CODE /////////////////////////
auto const r = std::array<int, 3>{1, 2, 3};

std::cout.width(3);
std::cout.fill('0');

// ---------------------------
// The old way:
using std::begin;
using std::end;
std::copy(begin(r), end(r), std::ostream_iterator<int>{std::cout, ", "});
// even with Boost.Range, this just reduces to:
//boost::copy(r, std::ostream_iterator<int>{std::cout, ", "});

// prints: "001, 2, 3, "

// ---------------------------
// The new way (in place of the copy above, same width and fill):
std::cout << write_all(r, ", ");

// prints "001, 002, 003"
// END CODE ///////////////////////////
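
One way such formatting awareness could work is to remember the
stream's width on entry and re-apply it before every element, since
each formatted insertion resets the width to 0. The sketch below
assumes that approach; range_writer and write_each are hypothetical
names, and the real write_all may well do this differently.

```cpp
#include <array>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the object returned by write_all(r, delim).
template <typename Range>
struct range_writer
{
    Range const& range;
    char const* delim;
};

template <typename Range>
range_writer<Range> write_each(Range const& r, char const* delim)
{
    return range_writer<Range>{r, delim};
}

template <typename Range>
std::ostream& operator<<(std::ostream& os, range_writer<Range> const& w)
{
    std::streamsize const width = os.width(); // width requested by the caller
    bool first = true;
    for (auto const& e : w.range)
    {
        if (!first)
        {
            os.width(0);     // the delimiter itself is not padded
            os << w.delim;
        }
        os.width(width);     // re-apply: the previous insertion reset it to 0
        os << e;
        first = false;
    }
    return os;
}

// Usage: the same width/fill setup as in the example above.
inline std::string demo_padded()
{
    std::array<int, 3> const r = {{1, 2, 3}};
    std::ostringstream os;
    os.width(3);
    os.fill('0');
    os << write_each(r, ", ");
    return os.str(); // "001, 002, 003"
}
```

The fill character needs no special handling because, unlike the
width, it persists across insertions.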

And of course this works as expected in chained expressions:

// BEGIN CODE /////////////////////////
auto const r = array<int, 3>{1, 2, 3};

cout << '(' << setw(3) << setfill('0') << write_all(r, ',') << ')';

// prints: "(001,002,003)"
// END CODE ///////////////////////////

It is also possible to get more information about errors, and react
better to them. If there is an error in the formatting or in the
stream during an output operation using std::copy() and
ostream_iterator, the algorithm will chug merrily along until it's all
the way through the range even though it's no longer writing anything.
You will have no way to detect where the problem occurred - did it
fail after the first element was written, or *before*, or did it fail
after the millionth element? If the range is being lazily generated,
that could be particularly wasteful, or even problematic.
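
The "chugging along" behaviour is easy to demonstrate. The sketch
below uses std::transform instead of std::copy purely so that a lambda
can count how many elements are visited; copy_result and
copy_to_failed_stream are made-up names for the demonstration.

```cpp
#include <algorithm>
#include <cstddef>
#include <ios>
#include <iterator>
#include <sstream>
#include <vector>

struct copy_result
{
    std::size_t elements_visited; // how many elements the algorithm touched
    std::size_t chars_written;    // how many characters reached the stream
};

inline copy_result copy_to_failed_stream(std::vector<int> const& v)
{
    std::ostringstream os;
    os.setstate(std::ios_base::failbit); // simulate an earlier output error

    std::size_t visited = 0;
    // The algorithm has no idea the stream is dead, so it keeps going.
    std::transform(v.begin(), v.end(),
                   std::ostream_iterator<int>{os, ", "},
                   [&visited](int x) { ++visited; return x; });

    return copy_result{visited, os.str().size()};
}
```

For a five-element vector this visits all five elements and writes
zero characters.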

By contrast, write_all() stops immediately when there is an error in
the output stream. It is even possible to get more information about the
last range write operation by capturing the temporary object, like
this:

// BEGIN CODE /////////////////////////
auto const r = array<int, 3>{1, 2, 3};

auto p = write_all(r);
cout << p;

if (!cout)
{
  cerr << "only " << p.count << " elements written\n";
  cerr << "next element would have been " << *p.next << '\n';
}
// END CODE ///////////////////////////
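
As a rough idea of how the count and next members could be maintained,
here is a sketch under stated assumptions: tracked_writer and
track_write are hypothetical names, limited_buf is only a test helper
that makes a stream fail on demand, and the operator takes the writer
by non-const reference for simplicity (so, unlike write_all, it cannot
be used directly on a temporary).

```cpp
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>
#include <vector>

// Test helper: a stream buffer that accepts at most `limit` characters
// and reports failure for anything beyond that.
class limited_buf : public std::streambuf
{
public:
    explicit limited_buf(std::size_t limit) : limit_(limit) {}
    std::string const& str() const { return data_; }

protected:
    int_type overflow(int_type ch) override
    {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);
        if (data_.size() >= limit_)
            return traits_type::eof(); // no room left: signal an error
        data_.push_back(traits_type::to_char_type(ch));
        return ch;
    }

private:
    std::string data_;
    std::size_t limit_;
};

// Hypothetical writer object that records its progress.
template <typename Iterator>
struct tracked_writer
{
    Iterator next;     // first element not yet written successfully
    Iterator last;     // end of the range
    std::size_t count; // number of elements written successfully
};

template <typename Range>
auto track_write(Range const& r) -> tracked_writer<decltype(r.begin())>
{
    return tracked_writer<decltype(r.begin())>{r.begin(), r.end(), 0};
}

template <typename Iterator>
std::ostream& operator<<(std::ostream& os, tracked_writer<Iterator>& w)
{
    // Stop as soon as an insertion leaves the stream in a failed state.
    while (w.next != w.last && os << *w.next)
    {
        ++w.next;
        ++w.count;
    }
    return os;
}

// Usage: the buffer has room for "1020" only, so writing 30 fails.
inline std::size_t demo_elements_written()
{
    std::vector<int> const v = {10, 20, 30};
    limited_buf buf(4);
    std::ostream os(&buf);
    auto p = track_write(v);
    os << p;
    return p.count;
}
```

After the failed write, count is 2 and next refers to the element 30,
which is the information std::copy() cannot give you.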

On the input side, there are 7 functions:
* overwrite
* back_insert
* front_insert
* insert
* back_insert_n
* front_insert_n
* insert_n

Overwrite simply replaces the contents of a range with whatever is
read from input:

// BEGIN CODE /////////////////////////
auto r = array<int, 100>{};

cin >> overwrite(r); // reads up to 100 ints
// END CODE ///////////////////////////
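
Conceptually, overwrite behaves like the loop below. This is only a
sketch of the idea as a free function with a made-up name
(overwrite_from); it is not the proposal's implementation, and the
real function is used with operator>> as shown above.

```cpp
#include <array>
#include <cstddef>
#include <istream>
#include <sstream>
#include <utility>

// Hypothetical free-function version of the idea behind overwrite():
// read into successive elements of r until the range is full or a read
// fails. Returns the number of elements actually replaced.
template <typename Range>
std::size_t overwrite_from(std::istream& in, Range& r)
{
    std::size_t n = 0;
    for (auto& e : r)
    {
        auto tmp = e;          // read into a copy so a failed read
        if (!(in >> tmp))      // does not clobber the existing element
            break;
        e = std::move(tmp);
        ++n;
    }
    return n;
}
```

With the input "1 2 3" and a five-element array, the first three
elements are replaced and the last two are left untouched.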

The *insert functions map to the *insert_iterators in the standard
library, using push_back(), push_front(), or insert(). In each case:

in >> ???insert(r[, i]);

is basically equivalent to:

copy(istream_iterator<value_type_of_r>{in},
     // use braces on the next line, or face the most vexing parse
     istream_iterator<value_type_of_r>{},
     ???inserter(r[, i]));
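
For reference, here is that standard-library equivalent written out
for the back_insert case with a vector of int. The function name
read_all_ints is made up for the example.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

// Reads ints from in until extraction fails (end of input or bad data)
// and appends each one to the result with push_back().
inline std::vector<int> read_all_ints(std::istream& in)
{
    std::vector<int> v;
    std::copy(std::istream_iterator<int>{in},
              std::istream_iterator<int>{}, // end-of-stream iterator
              std::back_inserter(v));
    return v;
}
```

With the input "1 2 3 x 4" this yields {1, 2, 3} and leaves the stream
in a failed state at the 'x'.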

The *insert_n functions are for the common case where you know in
advance how many elements will be read, at maximum - or for when you
want to limit the number of elements read in a single read operation.
This:

in >> ???insert_n(r[, i], n);

is conceptually equivalent to:

copy_n(istream_iterator<value_type_of_r>{in}, n, ???inserter(r[, i]));

except it can handle formatting and errors properly.
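
The error-handling difference matters because copy_n() assumes that n
elements are actually available and has no way to stop early if the
input runs out. A bounded read that does stop early could look like
the sketch below; back_insert_up_to is a hypothetical name, and this
is a plain loop rather than the proposal's implementation.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <utility>
#include <vector>

// Hypothetical sketch of the idea behind back_insert_n(): append at most
// n elements read from in, stopping at the first failed read.
// Returns the number of elements actually appended.
template <typename Container>
std::size_t back_insert_up_to(std::istream& in, Container& c, std::size_t n)
{
    std::size_t read = 0;
    for (; read < n; ++read)
    {
        typename Container::value_type tmp{};
        if (!(in >> tmp))
            break;                   // input ran out or was malformed
        c.push_back(std::move(tmp));
    }
    return read;
}
```

With the input "1 2 3 4 5" and n = 3, exactly three elements are
consumed and the next read from the stream yields 4. With the input
"1 2" and n = 5, the function appends two elements and returns 2.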

The reference at http://cpp.indi.frih.net/rangeio/ goes into much more
detail, of course.

Why add it to Boost? Because these operations (reading/writing ranges)
are so common they are almost universal, and there is no standard
facility that handles them easily or well. They are not *difficult* to
implement, but they are far from trivial - for example, you might
think write_all() is just a simple range-for loop or for_each(), and
it is... up until you add error checking, the delimiter, and
formatting awareness.

A sample implementation already exists - it can be implemented
portably in pure standard C++11. (Can it be implemented in C++03?
Good question. It might be possible with some Boost help, but it might
require dependencies on libraries other than Boost.Range. If this
proposal finds approval, that's something i can look into.)

Would there be any interest in adding these functions to Boost.Range?

