Subject: Re: [boost] [ICL] #6853: boost::icl::contains(NaN) returns true
From: Joachim Faulhaber (afojgo_at_[hidden])
Date: 2012-05-08 12:39:27


Hi Andrew, list,

NaN is not my favorite topic; it's more of a NaTo (not a Topic ;)

2012/5/7 Andrew Hundt <athundt_at_[hidden]>:
>>
>>
>> IMO it would seem reasonable and useful to 'support' NaN.
>>
>>  (If you mean that an interval cannot be bounded by a NaN, and a NaN
>> cannot be contained by any
>> set?)
>>
>
> Yes, this is exactly what I had in mind. I'm sure data sets with invalid
> data points are fairly common across many domains, particularly in cases
> where measurements are involved such as scientific data.
>
>
>>
>> But the library author is 'in charge' and Joachim obviously thinks it
>> would be too complicated to do
>> this, so probably we have to accept that for now.
>>
>
> Understood and agreed, I'm simply trying to facilitate discussion since it
> seemed like an interesting and fairly reasonable use case.

I don't think the ICL should support NaN. Moreover, I believe Boost
libraries and generic concepts should not integrate NaN in general. In
a way I conceive of NaN as being "anti-generic". Wherever I run into
the NaN phenomenon, it tends to jeopardize simplicity and elegance in
generic designs.

Some thoughts and observations:

(1) NaN seems to be a paradox: "I am a number that is not a number".
But the paradox exists only in the naming: NaN = "not a number". In
the implementation NaN is of course a number:

double zero = 0.0;
double not_a_double = zero/zero;
BOOST_CHECK(tr1::isnan(not_a_double)); //will pass

Now, *all values* of a datatype should obey certain axioms. If we want
to be able to sort them, which we certainly want to do with floating
point numbers at times, we need e.g. Induced Equivalence:

if !(x<y) and !(y<x) then x =^= y
where =^= is
  an equivalence relation (for a strict weak ordering)
  or an identity relation (for a total ordering).

Sorted Associative Containers in the STL rely on such concepts and
their axioms. NaN poisons the generic design:

For all double x: since !(x<NaN) and !(NaN<x), it follows that NaN =^=
x. NaN is equivalent to all "regular" double values, although those
values are not equivalent to each other, so =^= is no longer
transitive and operator< is not a strict weak ordering once NaN is
admitted. In practice, if you insert NaN into an empty
std::set<double> you cannot insert any other value and every
find-operation results in NaN. In a way NaN is the black hole of
generic designs ;)

//---- code -------------
set<double> NaN_set;

double NaN = std::numeric_limits<double>::quiet_NaN();
NaN_set.insert(NaN);
cout << "NaN_set.size() = " << NaN_set.size() << endl;
NaN_set.insert(42.0);
cout << "NaN_set.size() = " << NaN_set.size() << endl;

set<double>::iterator it = NaN_set.find(42.0);
if(it != NaN_set.end())
  cout << "NaN_set contains: " << *it << endl;
else
  cout << "42.0 not found\n";
//---- edoc -------------
//Output:
NaN_set.size() = 1
NaN_set.size() = 1
NaN_set contains: 1.#QNAN
//-----------------------
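
The equivalence argument above can also be checked without any
container. The following is only a minimal sketch; the helper names
induced_equiv, transitive_on and demo_nan_equivalence are made up for
this illustration and are not part of the STL or the ICL. It shows that
NaN is "equivalent" to 1.0 and to 2.0, while 1.0 and 2.0 are not
equivalent to each other, so transitivity is lost:

```cpp
#include <cassert>
#include <limits>

// Induced equivalence of operator<: x and y count as equivalent
// when neither is less than the other.
inline bool induced_equiv(double x, double y)
{
    return !(x < y) && !(y < x);
}

// Checks transitivity of the induced equivalence for one triple:
// if a ~ b and b ~ c, then a ~ c must hold as well.
inline bool transitive_on(double a, double b, double c)
{
    if (induced_equiv(a, b) && induced_equiv(b, c))
        return induced_equiv(a, c);
    return true; // premise not met, nothing to violate
}

// Usage example (hypothetical helper, for illustration only).
inline void demo_nan_equivalence()
{
    const double NaN = std::numeric_limits<double>::quiet_NaN();

    assert(induced_equiv(1.0, NaN));        // 1.0 ~ NaN
    assert(induced_equiv(NaN, 2.0));        // NaN ~ 2.0
    assert(!induced_equiv(1.0, 2.0));       // but 1.0 and 2.0 differ
    assert(!transitive_on(1.0, NaN, 2.0));  // transitivity is broken
}
```

Without NaN in the triple the check passes, which is why ordinary
double values work fine as keys of sorted containers.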

(2) Being part of a standard (IEEE-754), NaN comes with a certain
authoritative appearance and tends to leak from the domain of floating
point numbers into other areas like unlimited integer implementations
or e.g. the boost::date_time library. The idea of "not_a_date_time"
made the integration of boost::date_time with the ICL unnecessarily
clumsy. I had to write adapter code that would be completely
unnecessary in a generic design without not_a_date_time. I am happy
the boost::chrono people did not follow the NaN-path (and did not fall
into the black hole ;) Chrono and ICL therefore are working perfectly
together.

(3) Undefinedness, invalidness, over/underflow and infinities for data
types are concepts that do not live at the same abstraction level as
the data values themselves. I think it is a bad idea to try to code
them into the level of naked values.

(4) After some experiences with NaN, infinity and generic concepts I
have come to the conclusion that a concept or "abstract data type"
shall not include any "special values" that are not reachable by
fundamental operations of the datatype (e.g. x/0, infinities) and do
not obey the laws that are fundamental to its semantics. I think we
can work with such notions in separate concepts, like e.g. an
infinite_interval<T>. T does not need to have an infinity-value but
infinite_interval<T> may have the property to be infinite on one or
both sides. Datatypes T can stay simple and the infinity handling is
the responsibility of concept infinite_interval<T>.
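
To make the infinite_interval<T> idea a bit more concrete, here is a
rough sketch. It is not ICL code and not a proposed interface; the
member names (bounded, from, until, all, contains, is_infinite) and the
right-open convention are assumptions made only for this illustration.
The point is that unboundedness is stored in the interval, so T only
needs operator< and, in this sketch, a default constructor for the
unused bound:

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: an interval [lower, upper) whose sides may be
// unbounded. T itself needs no infinity value.
template <class T>
class infinite_interval
{
public:
    static infinite_interval bounded(const T& lo, const T& up)
    { return infinite_interval(true, lo, true, up); }

    static infinite_interval from(const T& lo)   // [lo, +inf)
    { return infinite_interval(true, lo, false, T()); }

    static infinite_interval until(const T& up)  // (-inf, up)
    { return infinite_interval(false, T(), true, up); }

    static infinite_interval all()               // (-inf, +inf)
    { return infinite_interval(false, T(), false, T()); }

    bool is_infinite() const { return !has_lower_ || !has_upper_; }

    // A missing bound is simply never compared against.
    bool contains(const T& x) const
    {
        const bool above_lower = !has_lower_ || !(x < lower_);
        const bool below_upper = !has_upper_ || (x < upper_);
        return above_lower && below_upper;
    }

private:
    infinite_interval(bool has_lo, const T& lo, bool has_up, const T& up)
        : has_lower_(has_lo), lower_(lo), has_upper_(has_up), upper_(up) {}

    bool has_lower_;
    T    lower_;   // ignored when has_lower_ is false
    bool has_upper_;
    T    upper_;   // ignored when has_upper_ is false
};

// Usage example with a T that has no notion of infinity at all.
inline void demo_infinite_interval()
{
    typedef infinite_interval<std::string> names;

    assert(names::from("m").contains("zebra"));
    assert(!names::from("m").contains("apple"));
    assert(names::all().contains(""));
    assert(!names::bounded("a", "m").is_infinite());
}
```

std::string serves as T here precisely because it has no infinity
value; the interval still can be open-ended on either side.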

I'd love to work on those ideas and more, but unfortunately I don't
have time to develop much code in boost quality currently :(

Cheers,
Joachim

-- 
Interval Container Library [Boost.Icl]
http://www.joachim-faulhaber.de
