Transforming uniform variables to normal variables
Hello,
How would you transform variables with uniform distribution [0,1] to variables with a normal normal distribution in Matlab?
Thank you
John
Accepted Answer
Kye Taylor
on 12 Apr 2012
I will assume that your uniform random variables are stored in an array like the one created by the command
X = rand(2000,1); % the number of points is arbitrary
To create a sample of random variables drawn from a normal distribution with parameters (mu,sigma) defined as
mu = 0;
sigma = 1;
use the command
Y = mu + sqrt(2)*sigma*erfinv(2*X-1);
The right-hand side of the equation above is the inverse of the CDF of a Normal(mu,sigma) random variable, evaluated at X. See http://en.wikipedia.org/wiki/Inverse_transform_sampling
More Answers (2)
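As a quick sanity check of that formula, here is a minimal sketch that uses only base MATLAB functions (erf and erfinv). The normal CDF is F(y) = (1 + erf((y-mu)/(sigma*sqrt(2))))/2, and solving F(y) = x for y gives the expression for Y above. The values of mu and sigma, the seed, and the sample size are arbitrary choices for illustration, not anything from the original question.

```matlab
rng(0);                      % fixed seed so the run is repeatable (arbitrary)
mu = 2;                      % illustrative target mean (assumption)
sigma = 3;                   % illustrative target standard deviation (assumption)
X = rand(1e5,1);             % uniform samples on (0,1)

% Inverse normal CDF applied to the uniform samples
Y = mu + sqrt(2)*sigma*erfinv(2*X-1);

% Push Y back through the normal CDF; this should recover X
U = 0.5*(1 + erf((Y-mu)/(sqrt(2)*sigma)));
roundTripErr = max(abs(U - X));

fprintf('sample mean = %.3f, sample std = %.3f\n', mean(Y), std(Y));
fprintf('max round-trip error = %.2e\n', roundTripErr);
```

The sample mean and standard deviation should land close to mu and sigma, and the round-trip error should be at the level of floating-point roundoff.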
Peter Perkins
on 12 Apr 2012
John, presumably you know about the randn function, which generates standard normal values. So I guess that your original question really was, "if I have uniform values, how do I transform them," and not, "how do I generate normal random values."
You said "normal normal distribution". I can't tell if this is a typo, or if you mean "standard normal", i.e. N(mean=0, std=1). If you mean, "transform to the normal distribution that corresponds to the lognormal," then all this is kind of pointless, since you can just take the log of data drawn from a lognormal to transform it to normal. But if you really mean "transform to a standard normal", then
1) As a philosophical point (and I suspect that this is what Kye was getting at when he said, "The CDF of a random variable is NOT a uniform random variable"), you're doing a standard thing in a theoretically invalid context. If you draw values from a fully known LN(mu,sigma), and use cdf('logn',x,mu,sigma) to transform them, the exact distribution of the transformed values is indeed U(0,1). But (I assume) you don't know the true lognormal distribution that your data came from, and so the transformation you're doing is only approximate. What you end up with will be as if drawn from something only approximately standard uniform. But since the assumption of log-normality is an approximation anyway, ... People do do this, but you have to be careful. Ask yourself why you are making this transformation, and if there is another way to attack whatever problem you are trying to solve.
2) You seem to have the Statistics Toolbox. So to transform from lognormal to uniform, you can use logncdf (or use the cdf function as you did), and to transform from uniform to normal, you can use the norminv function (or use the icdf function). But you have to use the right parameters in each case. For the lognormal->uniform, you'll want to use the mu/sigma lognormal parameters as MATLAB defines them. For the uniform->normal transformation, you'll want to use the mu/sigma normal parameters of your target distribution (which are just 0 and 1, if you do mean "standard normal").
When Kye said, "... you are only evaluating the lognormal density", I think he meant "lognormal cumulative probability". Your use of the cdf function is the correct transformation, modulo the caveats in (1) above about using estimated parameters. But you may find logncdf and norminv simpler than erfcinv and cdf/icdf.
3 Comments
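To make points (1) and (2) concrete, here is a rough sketch of the lognormal -> uniform -> standard normal chain. It assumes the Statistics Toolbox is available (lognrnd, logncdf, norminv, lognfit). The parameter values muLN and sigmaLN, the seed, and the sample size are made up for illustration.

```matlab
% Requires the Statistics Toolbox (lognrnd, logncdf, norminv, lognfit)
rng(1);                                  % arbitrary seed for repeatability
muLN = 0.5;                              % hypothetical "true" lognormal mu
sigmaLN = 0.8;                           % hypothetical "true" lognormal sigma
x = lognrnd(muLN, sigmaLN, 5000, 1);     % simulated lognormal data

% Known parameters: lognormal -> uniform -> standard normal
u = logncdf(x, muLN, sigmaLN);           % exactly U(0,1) in distribution
z = norminv(u, 0, 1);                    % standard normal target, mu=0, sigma=1

% With known parameters this is the same as standardizing log(x)
zDirect = (log(x) - muLN)/sigmaLN;
maxDiff = max(abs(z - zDirect));

% Estimated parameters: the result is only approximately N(0,1)
parmhat = lognfit(x);                    % parmhat = [muHat sigmaHat]
zHat = norminv(logncdf(x, parmhat(1), parmhat(2)), 0, 1);

fprintf('max |z - zDirect| = %.2e\n', maxDiff);
fprintf('mean(zHat) = %.4f, std(zHat) = %.4f\n', mean(zHat), std(zHat));
```

Note that zHat is centered and scaled by construction, so its mean and standard deviation look perfect even though the parameters were estimated. That is exactly why a good-looking result here says little about whether the lognormal assumption itself holds.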
Peter Perkins
on 13 Apr 2012
Lots of people using copulas use nonparametric marginals to transform to the unit hypercube. See this extended example
http://www.mathworks.com/products/statistics/demos.html?file=/products/demos/shipping/stats/nonparametricCDFdemo.html
that ships with the Statistics Toolbox. You may also find this section of the Statistics Toolbox User Guide helpful:
<http://www.mathworks.com/help/toolbox/stats/brklrj3.html#bqttfgl-1>
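For a feel of what a nonparametric marginal transform can look like, here is a minimal rank-based sketch in base MATLAB. It is not a reproduction of the linked example, just the simplest variant I can think of: replace each observation by its rank divided by n+1, then map to normal scores. The example data, seed, and sample size are arbitrary, and ties are not handled.

```matlab
rng(2);                        % arbitrary seed for repeatability
x = exp(randn(1000,1));        % example data with a skewed marginal (arbitrary)
n = numel(x);

% Rank of each observation (continuous data, so ties are not handled)
[~, order] = sort(x);
r = zeros(n,1);
r(order) = (1:n)';

u = r/(n+1);                   % empirical CDF values, strictly inside (0,1)
z = sqrt(2)*erfinv(2*u-1);     % uniform -> standard normal scores

fprintf('min(u) = %.4f, max(u) = %.4f\n', min(u), max(u));
fprintf('mean(z) = %.2e, std(z) = %.4f\n', mean(z), std(z));
```

Dividing by n+1 rather than n keeps u away from 1, so erfinv never returns Inf. The transform is monotone, so the ordering of the data is preserved.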
Kye Taylor
on 12 Apr 2012
You know that the histogram shouldn't exactly match a Gaussian... Nevertheless, if you use the command above, you are guaranteed that Y will be sampled according to a normal distribution with parameters mu and sigma. Try adding more bins and more points to see the normal's shape:
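mu = 0; % mean of the target normal
sigma = 1; % standard deviation of the target normal
X = rand(5000,1); % uniform samples on (0,1)
Y = mu + sqrt(2)*sigma*erfinv(2*X-1); % inverse normal CDF applied to X
[n,xout] = hist(Y,50); % counts and bin centers for 50 bins
figure,hold on
% normalize the counts so the histogram integrates to 1
plot(xout,n/sum(n)/(xout(2)-xout(1)),'k.')
% exact Normal(mu,sigma) density at the bin centers
plot(xout, 1/sqrt(2*pi*sigma^2)*exp(-(xout-mu).^2/2/sigma^2));
legend('Observed density', 'Actual density');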
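If you would rather have a number than a picture, the same comparison can be sketched without plotting. This version assumes MATLAB R2014b or later, where histcounts can normalize to a density directly. The seed and the larger sample size are my own arbitrary choices.

```matlab
% Requires R2014b or later for histcounts
rng(3);                                  % arbitrary seed for repeatability
mu = 0;
sigma = 1;
X = rand(1e5,1);                         % more points than above (arbitrary)
Y = mu + sqrt(2)*sigma*erfinv(2*X-1);

% Histogram normalized so that bar heights times bin widths sum to 1
[pdfEst, edges] = histcounts(Y, 50, 'Normalization', 'pdf');
centers = edges(1:end-1) + diff(edges)/2;

% Exact Normal(mu,sigma) density at the bin centers
pdfTrue = 1/sqrt(2*pi*sigma^2)*exp(-(centers-mu).^2/2/sigma^2);

area = sum(pdfEst .* diff(edges));       % should be 1 up to roundoff
maxAbsErr = max(abs(pdfEst - pdfTrue));  % shrinks as the sample grows

fprintf('area under histogram = %.6f\n', area);
fprintf('max |observed - actual| density = %.4f\n', maxAbsErr);
```

The remaining gap between the observed and actual density is sampling noise plus binning error, so it should shrink as you add points, but it will not reach zero.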