
From: Paul A Bristow (pbristow_at_[hidden])
Date: 2008-04-30 05:35:32


 

>-----Original Message-----
>From: boost-bounces_at_[hidden]
>[mailto:boost-bounces_at_[hidden]] On Behalf Of John Maddock
>Sent: 30 April 2008 09:53
>To: boost_at_[hidden]
>Subject: Re: [boost] [Math/nextafter] A question of naming functions...

>>> T edit_distance(T a, T b)
>>>
>>> Returns the number of floating point representations between values
>>> a and b.
>>>
>>> So the questions are: can you think of any better names, or are
>>> these OK?
>>>
>>> And should edit_distance return a signed or absolute value?
>>
>> I'm at a loss for a better name (representation_distance seems overly
>> verbose but interval_size might be ok), but do have a couple questions
>> about this function's behavior.
>>
>> Why is the return value templated? Shouldn't it be some fixed
>> diff_t? I tentatively agree with the other comment that a signed
>> distance is preferable to unsigned.
>
>I was undecided about this: but for large intervals a std::difference_t
>wouldn't be large enough, for example edit_distance(0.0, 1.0) is approx
>1.97753e+032. Of course if you use for intervals that stretch several
>orders of magnitude then you get what you deserve I guess!

I can see the problem here: would you need a difference type with about as many bits as the floating-point type?

But I'm struggling to conceive of uses for this where the distance is so large.
Perhaps someone can suggest some?

Can't the inevitable integer overflow just be accepted when it happens?
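
To make the size question concrete, here is a minimal sketch, assuming a 64-bit IEEE 754 double. The names ordered_bits and float_steps are hypothetical and are not part of the proposed library. It maps each double onto a signed 64-bit integer so that adjacent representable values map to adjacent integers, and then subtracts.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper (not a Boost function): map a double onto a signed
// integer so that neighbouring representable values differ by exactly 1.
// Assumes a 64-bit IEEE 754 double; NaN is not supported.
inline std::int64_t ordered_bits(double x)
{
    static_assert(sizeof(double) == sizeof(std::int64_t), "assumes 64-bit double");
    static_assert(std::numeric_limits<double>::is_iec559, "assumes IEEE 754");
    assert(!std::isnan(x));

    std::int64_t i;
    std::memcpy(&i, &x, sizeof i);  // copy the bit pattern without aliasing issues

    // Negative doubles are stored as sign + magnitude, so their raw bits grow
    // as the value decreases. Reflect them so the mapping is monotonic;
    // -0.0 and +0.0 both map to 0.
    return i < 0 ? std::numeric_limits<std::int64_t>::min() - i : i;
}

// Hypothetical name: signed number of 'next' steps needed to get from a to b.
// The subtraction can overflow when a and b have opposite signs and both
// have very large magnitudes; that case is deliberately not handled here.
inline std::int64_t float_steps(double a, double b)
{
    return ordered_bits(b) - ordered_bits(a);
}
```

Under these assumptions float_steps(0.0, 1.0) is 4607182418800017408, which still fits in 64 bits, so for double an integer as wide as the floating-point type is enough unless the interval spans both signs at very large magnitudes. Wider floating-point types would need a correspondingly wider integer.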

I also find the name edit_distance very unintuitive. In fact, I struggle to see how, going by the definition in Wikipedia, it is
'technically accurate'.

"In information theory and computer science, the edit distance between two strings of characters is the number of operations
required to transform one of them into the other. There are several different algorithms to define or calculate this metric".

In this case, don't we have a clearly defined *fixed* sequence of things, each of which is reached by 'iterating' from one value to
another by calling a 'next' so many times?
So the distance is a count of the number of 'nexts' - or gaps or steps or hops, or preds or succs. So it feels to me that the result
is naturally an integer type. If you use T then it is OK for floating-point types (but loses exactness eventually), but it precludes
use for some types for which next_* functions do make sense but which are not useful for holding an integer result.
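
The 'count of nexts' view can be written down almost literally. This is only a sketch: the name nexts is the one floated in this message, std::nextafter from C++11 <cmath> stands in for the proposed next_* functions, and std::ptrdiff_t is just one possible integer result type. It is practical only for short distances, since it really does take one step at a time.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical function: signed number of 'next' steps from a to b,
// positive when b > a and negative when b < a. NaN is not supported.
// The loop takes one step per representable value, so the running time
// is proportional to the distance.
template <class T>
std::ptrdiff_t nexts(T a, T b)
{
    assert(!std::isnan(a) && !std::isnan(b));

    std::ptrdiff_t count = 0;
    while (a < b) {               // step upwards towards b
        a = std::nextafter(a, b);
        ++count;
    }
    while (a > b) {               // step downwards towards b
        a = std::nextafter(a, b);
        --count;
    }
    return count;
}
```

For example, nexts(1.0, std::nextafter(1.0, 2.0)) is 1 and the reverse call gives -1. The result is exact because it is an integer count, whereas returning T would start to lose exactness once the count exceeds the precision of T.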

But I'm still struggling to find a better name, though 'nexts(a, b)' is short.

Paul

---
Paul A Bristow
Prizet Farmhouse, Kendal, Cumbria UK LA8 8AB
+44 1539561830 & SMS, Mobile +44 7714 330204 & SMS
pbristow_at_[hidden]
 
