Boost logo

Boost :

From: Kris Thielemans (kris.thielemans_at_[hidden])
Date: 2003-06-04 18:41:01


Hi

I've had a look at random/poisson_distribution. The crucial part there goes

  result_type operator()()
  {
    // TODO: This is O(_mean), but it should be O(log(_mean)) for large
_mean
    RealType product = RealType(1);
    for(int m = 0; ; ++m) {
      product *= _rng();
      if(product <= _exp_mean)
        return m;
    }
  }

where _rng is a UniformRandomNumberGenerator and _exp_mean = exp(-_mean).

I must say I don't really understand why this would give you a Poisson
random number. I do think that it is terribly inefficient though as it call
rng potentially a lot of times.

My own code looks rather different. It is based on inverting the comulative
distribution for Poisson statistics. (Also, for large mean I use the usual
'normal' approximation).

Unfortunately, my code does not look like boost::random framework yet (it's
just a function), but I'd be happy to let somebody else to do that for me
:-). Here it is

  typedef boost::mt19937 base_generator_type;
  // initialize by reproducible seed
  static base_generator_type generator(boost::uint32_t(42));
  boost::uniform_01<base_generator_type,char> random01(generator);

// Generate a random number according to a Poisson distribution
// with mean mu
int generate_poisson_random(const float mu)
{
  // check if mu is large. If so, use the normal distribution
  // note: the threshold must be such that exp(threshold) is still a floating point number
  if (mu > 60.F)
  {
    boost::normal_distribution<base_generator_type> randomnormal(generator, mu, sqrt(mu));
    const double random = randomnormal();
    return random<0 ? 0 :
      round(random); // note: this is just floating point rounding
  }
  else
  {
    double u = random01();
  
    // prevent problems of n growing too large (or even to infinity)
    // when u is very close to 1
    if (u>1-1.E-6)
      u = 1-1.E-6;
  
    const double upper = exp(mu)*u;
    double accum = 1.;
    double term = 1.;
    int n = 1;
  
    while(accum <upper)
    {
      accum += (term *= mu/n);
      n++;
    }
    
    return (n - 1);
  }
}

As you see, this calls the random number generator only once. Should be far more efficient I think. Am I wrong?

I realise that one would need a 'proof' that this works. If necessary I can dig up the math. I did also check some histograms with 1 million random numbers or so.

Finally, a good implementation would do something about the more or less arbitrary constants (ok. you could just make them part of the template).

Any reactions/suggestions?

All the best

Kris Thielemans
Imaging Research Solutions Ltd


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk