From: Thorsten Ottosen (nesotto_at_[hidden])
Date: 2005-05-15 16:35:56


"Pavol Droba" <droba_at_[hidden]> wrote in message
news:464931589.20050515230222_at_topmail.sk...
| Hi Boosters,
|
| There was a discussion about char[] support in the Boost.Range
| library. The issue seems important and I'd like to express my
| ideas about a possible solution.
|
| First, let's summarize the problems and goals.
| The problems:
| char[] and possibly any other type that can be used as a c-string
| (this includes wchar_t, but also int, long, etc. when used as
| Unicode code points) might represent two different things:
| 1.) c-string literal
| 2.) arbitrary c-array
|
| The two views differ in how the length is calculated. The results
| are totally incompatible and, what's worse, can lead to accidental
| access violations when used improperly.
|
| An example:
| char str[] = "Hello";
| // typeof(str) is char[6], str=={'H','e','l','l','o',0}
|
| In the c-string view, str has 5 letters and ends at the 'o'.
| So the range should be <'H','o')
| In the c-array view, str is 6 elements long and ends with '\0'.
| The range is <'H','\0')
|
| From the user's perspective, both views are equally important;
| however, depending on the usage scenario, one might be preferable
| over the other. The important aspect to keep in mind is that the
| choice is strictly relative. For example, for string algorithms the
| c-string literal view is the obvious default, while for a data
| processing library the second choice is better.
|
| The current implementation is not ideal. First of all, there is a
| difference between char[], wchar_t[] and the rest of the types.
| This brings some confusion. Secondly, it is not possible to use
| char[] as an ordinary array.

No default can meet everyone's expectations.
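
To make the two views concrete, here is a minimal sketch in plain C++. The helper names string_view_length and array_view_length are hypothetical and not part of Boost.Range; they only show how the two length calculations disagree for the same array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: length under the c-string view.
// Scans for the first '\0', so the array MUST be null-terminated.
template <std::size_t N>
std::size_t string_view_length(const char (&s)[N]) {
    return std::strlen(s);
}

// Hypothetical helper: length under the c-array view.
// Uses the compile-time extent, so a trailing '\0' is counted as an element.
template <std::size_t N>
std::size_t array_view_length(const char (&)[N]) {
    return N;
}
```

For char str[] = "Hello" the first helper yields 5 and the second yields 6. Calling the first helper on an array without a terminator, such as {'h','e','l','l','o'}, reads past the end, which is the kind of access violation described in the quoted text.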

| The goals:
| From the problem analysis above, the following goals can be derived:
|
| 1. we need to support both views equally

if you remove "equally" I agree.

| 2. a user must be always able to explicitly specify what type
| of view he requires
| 3. it should be possible for a library writer to select default
| view for his library.
| However point (2) must hold, so the user must be able
| to override this default.
| 4. Support must be present in the Boost.Range library.
| It is not feasible to ask library writer to provide
| specific workarounds/hacks. It would simply break the idea
| of Boost.Range library as a unified interface to range-like
| data structures.
|
| The solution:
|
| I propose to have two free-standing functions
| as_string() and as_array() (naming is not important now).
|
| Both should have the same generic signature:
|
| template<typename RangeT>
| boost::sub_range<RangeT> as_string(RangeT& aRange);
| template<typename RangeT>
| boost::sub_range<RangeT> as_array(RangeT& aRange);
|

+ const overloads

| By default, the functions only copy the input range to the target.
| However, for types like char[], the results will differ:
| as_string() will create a sub_range delimiting the string
| literal (using char_traits<char>::length, for instance), while
| as_array() will use the compile-time boundaries.

sounds fair.
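
A rough sketch of the proposed pair, assuming nothing about the eventual Boost interface: char_range below is a hypothetical stand-in for boost::sub_range, and only char arrays are handled. The pass-through overloads model the "only copy the input range" default for something that is already a range.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for boost::sub_range over a char array.
struct char_range {
    const char* first;
    const char* last;
    std::size_t size() const { return static_cast<std::size_t>(last - first); }
};

// c-string view: the range stops at the first '\0' (array must be terminated).
template <std::size_t N>
char_range as_string(const char (&a)[N]) {
    return char_range{a, a + std::strlen(a)};
}

// c-array view: the range covers the full compile-time extent.
template <std::size_t N>
char_range as_array(const char (&a)[N]) {
    return char_range{a, a + N};
}

// Already a range: both manipulators simply copy their input.
inline char_range as_string(char_range r) { return r; }
inline char_range as_array(char_range r) { return r; }
```

With these definitions, as_string(as_array(x)) keeps the array boundaries chosen by the inner call, because the outer call only sees a char_range and copies it.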

| In addition, we might consider opening this interface to
| user-defined types, even if I'm not sure how it can be used.

With ADL. The library says:

using boost::as_string;
foo( as_string(bar));
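
A small sketch of how that customization point could work. Everything here is hypothetical: namespace lib stands in for boost, describe stands in for foo, and std::string is returned instead of a sub_range to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

namespace lib {  // hypothetical stand-in for namespace boost
    // Library-provided fallback: c-string view of a char array.
    template <std::size_t N>
    std::string as_string(const char (&a)[N]) {
        return std::string(a, std::strlen(a));
    }

    template <typename T>
    std::string describe(const T& value) {
        using lib::as_string;     // make the fallback visible
        return as_string(value);  // unqualified call, so ADL can find user overloads
    }
}

namespace user {
    struct token { std::string text; };

    // User-defined overload, found through ADL because token lives in user.
    inline std::string as_string(const token& t) { return t.text; }
}
```

Calling lib::describe("abc") uses the library fallback, while lib::describe(user::token{"xyz"}) picks user::as_string without the library knowing about token.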

| Please note that once one of these manipulators has been applied to
| a range, any subsequent application will have no effect.
|
| Let's see how this facility can be used:
|
| A library writer can set the default by writing an algorithm like
| this:
|
| template<typename RangeT>
| ... AnAlgorithm(const RangeT& aRange)
| {
| boost::sub_range<RangeT> StrRange=as_string(aRange);
|
| // Do something with StrRange
| }
|
| If a user calls AnAlgorithm directly:
| char str[]="hello";
| AnAlgorithm(str);
|
| str will be converted to a range delimiting a string literal.
| However, he can also use as_array():
|
| char str[]={'h', 'e', 'l', 'l', 'o'};
| AnAlgorithm(as_array(str));
|
| This time no conversion will take place, since as_array() returns
| sub_range.
|
| Note that for AnAlgorithm it does not matter which default is
| used in the Range library.
|
| Open questions:
|
| - I have intentionally not included a proposal for the default view
| that the Range library should provide.
| The goal of this solution is to provide a way that is not dependent
| on this.
| I'd like to leave it for the discussion. Right now it seems that
| most of the people who entered the discussion prefer the c-array view.
| I would prefer the c-string view, but I'm probably biased by the fact
| that I'm the author of the StringAlgo library.

I prefer the string view too.

That's how Boost.Range was designed.

There is one problem with the default today IMO: char[]
should call char_traits<char>::length();

| - There is room for possible extensions to the basic proposal.
| For instance, as_string() might have a second parameter that
| identifies a terminator.
|
| - String literal length can be calculated in two ways: either by
| using strlen() (or the like), or by using the compile-time size (N)
| decreased by 1 (N-1).

For const char[], this would be the way to go, and that is also how it is
implemented by default.
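
To illustrate the two calculations mentioned in the quoted text, here is a minimal sketch with hypothetical helper names. They agree for an array initialized from a literal of exactly that size, and disagree for a larger buffer that is only partially filled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: compile-time size minus the terminator (N-1).
template <std::size_t N>
constexpr std::size_t length_from_extent(const char (&)[N]) {
    return N - 1;
}

// Hypothetical helper: run-time scan for the first '\0'.
template <std::size_t N>
std::size_t length_from_scan(const char (&s)[N]) {
    return std::strlen(s);
}
```

For const char lit[] = "hello" both helpers return 5. For char buf[16] = "hi" the extent-based helper returns 15 while the scan returns 2, which is why N-1 is only safe when the array is known to hold exactly one literal.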

-Thorsten

