|
Boost Users : |
Subject: Re: [Boost-users] Mersenne Twister 19937 suffers from "lattice effect"?
From: Steven Watanabe (watanabesj_at_[hidden])
Date: 2012-01-26 17:58:14
AMDG
On 01/26/2012 10:57 AM, Matthias Nagel wrote:
> Hello,
>
> I tried to implement a two-dimensional multi-modal gaussian mixture and used the "mt19937" as the underlying generator. After I drew some samples (from the gaussian mixture) the outcome was not as I expected. After some bug tracking I suppose that the reason is something that is called "lattice effect" in German math literature. A pseudo-random generator suffers from a lattice effect, if one draws a sequence of samples, groups them together into N-dimensional vectors and these N-dimensional vectors are not uniformly distributed in the N-dimensional euclidean space. Does anybody know if the boost implementation has this problem?
>
mt19937 is equidistributed in 623 dimensions.
The algorithm is standard. Any implementation
is going to have the same problems. The only
differences will be the seeding and the distribution
algorithms.
> In the following, I will present my problem (and my algorithm) more elaborately, especially how I implemented my multi-model gaussian that is the origin of my problem:
>
>
> Lets assume we want a 2-dimensial gaussian mixture with M gaussian.
>
> 1) Create three lists (stl vector objects in my case), each with M entries.
> The first list stores the weight of each gaussian. The weights sum up to 1.
> The second list stores the mean value of each gaussian, i.e. a two-dimensional vector (in the mathematical sense, not the stl vector).
> The third list stores a 2-by-2 covariance matrix for each gaussian
>
> 2) Create a "mt19937" object, a "discrete_distribution" object and a "normal_distribution" object
> The "mt19937" is the common source of randomness for "discrete_distribution" and "normal_distribution".
> The "discrete_distribution" produces integers between 0 and M (number of gaussians) according the the given weights
> The "normal_distribution" is parameteized with mean = 0 and variance = 1, i.e. it is the standard normal distribtion.
>
> 3) Drawing one sample from the gaussian mixture
> - Draw one sample from the mt19937 and feed it into the discrete_distribution object. Denote the result by "index".
> - Draw a second sample from the mt19937 and feed it into the normal_distribution object. Denote the result by "u".
> - Draw a third sample from the mt19937 and feed it into the normal_distribution object. Denote the result by "v".
> - Pick up the mean vector and covariance matrix from the lists with respect to "index"
> - Transform the vector (u,v) according to the selected mean vector and covariance matrix. Denote the result "(x,y)".
> - Return (x,y)
>
That's not quite how the distributions work. discrete_distribution
uses two inputs to produce one output. normal_distribution
uses two inputs to produce to outputs.
> Lets remark that three samples are drawn from mt19937 to obtain one sample from the gaussian mixture.
>
> The result of this algorithm depends on the numbers of gaussians that build the mixture.
>
> 1) If M equals 1, the result looks fine. The (x,y) samples form an ellipsoid in the 2-dimensional euclidean space
>
> 2) If M equals 2, one gaussian looks like an ellipsoid, but it is too small. The second "gaussian" looks like a torus. Obviously all samples (u,v) with large coordinates are assigned to one gaussian, all (u,v) near zero are assigned to the other gaussian
>
> 3) If M equals 3, the "gaussians" look like segments of an ellipsoid. In each ellipsoid one third is missing.
>
> 4) For higher M the result is not correct either, but not as vivid and easily to explain as the "small" cases.
>
> As soon as I switch to random_device instead of mt19937 the phenomena disappears.
>
Can you post small self-contained code for this?
In Christ,
Steven Watanabe
Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net