Subject: Re: [Boost-users] Mersenne Twister 19937 suffers from "lattice effect"?
From: Steven Watanabe (watanabesj_at_[hidden])
Date: 2012-01-26 17:58:14
On 01/26/2012 10:57 AM, Matthias Nagel wrote:
> I tried to implement a two-dimensional multi-modal gaussian mixture and used the "mt19937" as the underlying generator. After I drew some samples (from the gaussian mixture) the outcome was not as I expected. After some bug tracking I suppose that the reason is something that is called "lattice effect" in German math literature. A pseudo-random generator suffers from a lattice effect, if one draws a sequence of samples, groups them together into N-dimensional vectors and these N-dimensional vectors are not uniformly distributed in the N-dimensional euclidean space. Does anybody know if the boost implementation has this problem?
mt19937 is equidistributed in 623 dimensions.
The algorithm is standard. Any implementation
is going to have the same problems. The only
differences will be the seeding and the distributions.
> In the following, I will present my problem (and my algorithm) more elaborately, especially how I implemented my multi-modal gaussian that is the origin of my problem:
> Let's assume we want a 2-dimensional gaussian mixture with M gaussians.
> 1) Create three lists (stl vector objects in my case), each with M entries.
> The first list stores the weight of each gaussian. The weights sum up to 1.
> The second list stores the mean value of each gaussian, i.e. a two-dimensional vector (in the mathematical sense, not the stl vector).
> The third list stores a 2-by-2 covariance matrix for each gaussian
> 2) Create a "mt19937" object, a "discrete_distribution" object and a "normal_distribution" object
> The "mt19937" is the common source of randomness for "discrete_distribution" and "normal_distribution".
> The "discrete_distribution" produces integers between 0 and M-1 (indexing the M gaussians) according to the given weights
> The "normal_distribution" is parameterized with mean = 0 and variance = 1, i.e. it is the standard normal distribution.
> 3) Drawing one sample from the gaussian mixture
> - Draw one sample from the mt19937 and feed it into the discrete_distribution object. Denote the result by "index".
> - Draw a second sample from the mt19937 and feed it into the normal_distribution object. Denote the result by "u".
> - Draw a third sample from the mt19937 and feed it into the normal_distribution object. Denote the result by "v".
> - Pick up the mean vector and covariance matrix from the lists with respect to "index"
> - Transform the vector (u,v) according to the selected mean vector and covariance matrix. Denote the result "(x,y)".
> - Return (x,y)
That's not quite how the distributions work. discrete_distribution
uses two inputs to produce one output. normal_distribution
uses two inputs to produce two outputs.
> Note that three samples are drawn from mt19937 to obtain one sample from the gaussian mixture.
> The result of this algorithm depends on the number of gaussians that build the mixture.
> 1) If M equals 1, the result looks fine. The (x,y) samples form an ellipsoid in the 2-dimensional euclidean space
> 2) If M equals 2, one gaussian looks like an ellipsoid, but it is too small. The second "gaussian" looks like a torus. Obviously all samples (u,v) with large coordinates are assigned to one gaussian, all (u,v) near zero are assigned to the other gaussian
> 3) If M equals 3, the "gaussians" look like segments of an ellipsoid. In each ellipsoid one third is missing.
> 4) For higher M the result is not correct either, but not as vivid and easy to explain as in the "small" cases.
> As soon as I switch to random_device instead of mt19937 the phenomenon disappears.
Can you post small self-contained code for this?
Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net