
Subject: Re: [boost] [random] Quantization effects in generating floating point values
From: Thijs van den Berg (thijs_at_[hidden])
Date: 2015-03-05 09:18:17

Next message: Louis Dionne: "[boost] [Hana] Informal review request"
Previous message: Amarnath V A: "Re: [boost] GSOC 2015 : Project on Concurrent Hash Tables"
In reply to: John Maddock: "[boost] [random] Quantization effects in generating floating point values"
Next in thread: Steven Watanabe: "Re: [boost] [random] Quantization effects in generating floating point values"

On Thu, Mar 5, 2015 at 2:39 PM, John Maddock <jz.maddock_at_[hidden]>
wrote:

> First off, I notice there are no examples for generating floating point
> values in Boost.Random, so maybe what follows is based on a
> misunderstanding, or maybe not...
>
> Let's say I generate values in [0,1) like so:
>
> boost::random::mt19937 engine;
> boost::random::uniform_01<boost::random::mt19937, FPT> d(engine);
>
> FPT x = d(); // etc; named x so it does not clash with the distribution d
>
> Where FPT is some floating point type.
>
> Now my concern is that we're taking a 32-bit random integer and
> "stretching" it to a floating point type with rather more bits (53 for a
> double, maybe 113 for a long double, even more in the multi-precision
> world). So quantization effects will mean that there are many values which
> can never be generated.
>
> It's true that I could use independent_bits_engine to gang together
> multiple random values and then pass that to uniform_01, however that
> supposes we have an unsigned integer type available with enough bits.
> cpp_int from boost.multiprecision would do it, and this does work, but the
> conversions involved aren't particularly cheap. It occurs to me that an
> equivalent to independent_bits_engine but for floating point types could be
> much more efficient - especially in the binary floating point case.
>
> So I guess my questions are:
>
> Am I worrying unnecessarily? and
> What is best practice in this area anyway?
>
> Thanks, John.
>

I've worried about this in the past, but I've accepted that using a 64-bit
integer engine instead of a 32-bit one is good enough. A 64-bit engine supplies
more random bits than the 53-bit significand of a 64-bit double can hold, and a
probability resolution of 2^-64 is enough in practice when computing statistics
over a large number of random draws (1 trillion draws is about 2^40, far below
2^64).
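
As a minimal sketch of why 64 bits are enough for a double, the snippet below
keeps the top 53 bits of one 64-bit draw and scales them into [0, 1). It uses
the standard-library std::mt19937_64 so that it is self-contained; the
Boost.Random engine of the same name could be dropped in. The helper names
to_unit_double and draw_unit_double are made up for illustration, and this is
not necessarily what uniform_01 does internally.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>
#include <random>

// Map one 64-bit draw to a double in [0, 1) by keeping its top 53 bits.
// Every result is a multiple of 2^-53, so the full significand is used.
inline double to_unit_double(std::uint64_t bits) {
    static_assert(std::numeric_limits<double>::digits == 53,
                  "sketch assumes an IEEE 754 binary64 double");
    // bits >> 11 is below 2^53, so the conversion to double is exact.
    return std::ldexp(static_cast<double>(bits >> 11), -53);
}

// Hypothetical helper: one draw from a 64-bit Mersenne Twister.
inline double draw_unit_double(std::mt19937_64& engine) {
    return to_unit_double(engine());
}
```

Typical use would be to construct a std::mt19937_64 once and call
draw_unit_double(engine) per sample. The largest possible result is
1 - 2^-53, so 1.0 itself is never returned.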

When using floating point random numbers there are two main error sources:
* the finite resolution of the random engine (e.g. 32 bits in your
example). This determines the number of different random values you can
generate.
* the non-linearity of the float representation. This determines
the number of distinct values you can generate in a small interval. E.g.
there are many more float values close to zero than close to 1 when you
convert the mt19937 integers to floats in the interval [0,1).
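
To put rough numbers on both effects, the sketch below compares the grid
spacing of a 32-bit draw scaled into [0, 1), which is 2^-32, with the spacing
of representable doubles at a given point. It assumes an IEEE 754 binary64
double; the function names are hypothetical and exist only for this
illustration.

```cpp
#include <cmath>
#include <limits>

// Distance from x to the next representable double above it.
inline double gap_above(double x) {
    return std::nextafter(x, std::numeric_limits<double>::infinity()) - x;
}

// Spacing of the grid produced by scaling a 32-bit integer into [0, 1).
inline double grid_spacing_32() {
    return std::ldexp(1.0, -32);
}

// Roughly how many representable doubles lie between the grid point x and
// the next grid point above it; none of the ones in between can be produced
// from a single 32-bit draw.
inline double doubles_skipped_near(double x) {
    return grid_spacing_32() / gap_above(x);
}
```

Under these assumptions doubles_skipped_near(0.5) is 2^21 (about 2 million),
while doubles_skipped_near(std::ldexp(1.0, -10)) is 2^30 (about 1 billion):
the float grid gets finer towards zero, while the 32-bit grid stays fixed.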

Most statistical computations involve floating point arithmetic, so
you'll run into the second kind of issue anyway. In that respect I would find
it theoretically interesting (but I don't actually need it) to have
random floating point numbers with a fixed exponent. That would remove the
non-linearity of the float representation.
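
One possible way to get such fixed-exponent values, sketched here under the
assumption of an IEEE 754 binary64 double with the same byte order as 64-bit
integers, is to write 52 random bits straight into the significand field while
pinning the exponent field to that of 1.0. The result lies in [1, 2), where
all doubles are evenly spaced by 2^-52. The function name is hypothetical, and
nothing like it is claimed to exist in Boost.Random.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>
#include <random>

// Build a double in [1, 2) from the top 52 bits of a 64-bit draw.
inline double fixed_exponent_double(std::uint64_t bits) {
    static_assert(std::numeric_limits<double>::is_iec559 &&
                      sizeof(double) == sizeof(std::uint64_t),
                  "sketch assumes an IEEE 754 binary64 double");
    // Sign 0, biased exponent 0x3FF (i.e. 2^0), significand field all zero.
    const std::uint64_t exponent_of_one = 0x3FF0000000000000ULL;
    // Keep 52 random bits and place them in the significand field.
    const std::uint64_t pattern = exponent_of_one | (bits >> 12);
    double result;
    std::memcpy(&result, &pattern, sizeof result);  // well-defined bit copy
    return result;
}
```

Calling fixed_exponent_double(engine()) with a std::mt19937_64 engine gives one
sample per draw. Subtracting 1.0 from the result is exact and yields values in
[0, 1) on a uniform grid of spacing 2^-52, at the cost of never producing the
finer-grained doubles that exist close to zero.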

