Transforming uniform variables to normal variables

John on 12 Apr 2012
Hello,
How would you transform variables with uniform distribution [0,1] to variables with a normal normal distribution in Matlab?
Thank you
John

Accepted Answer

Kye Taylor on 12 Apr 2012
I will assume that your uniform random variables are stored in an array just like the one created with the commands
X = rand(2000,1); % the number of points is arbitrary
To create a sample of random variables drawn from a normal distribution with parameters (mu,sigma) defined as
mu = 0;
sigma = 1;
use the command
Y = mu + sqrt(2)*sigma*erfinv(2*X-1);
The right-hand side of the equation above is the inverse of the CDF of a Normal(mu,sigma) random variable, evaluated at X. See http://en.wikipedia.org/wiki/Inverse_transform_sampling
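As a quick sanity check of the formula above, the following sketch applies the normal CDF (written with erf) to Y and confirms that it recovers X. The names U_back and max_err are illustrative and not part of the original answer. Only base MATLAB functions are used.

```matlab
% Inverse transform sampling round trip (illustrative sketch).
mu = 0;
sigma = 1;
X = rand(2000,1);                          % uniform samples on (0,1)
Y = mu + sqrt(2)*sigma*erfinv(2*X-1);      % inverse normal CDF applied to X

% Normal CDF written with erf: F(y) = (1 + erf((y-mu)/(sigma*sqrt(2))))/2
U_back = 0.5*(1 + erf((Y-mu)/(sigma*sqrt(2))));

max_err = max(abs(U_back - X));            % should be at rounding-error level
```

If max_err is tiny, Y is exactly the set of normal quantiles that correspond to the uniform values in X.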
  1 Comment
John on 12 Apr 2012
Hello Kye,
Thank you for your reply.
Yes - I think this is what I want to do.
Although, when I plot a histogram of Y it does not have the typical Normal shape?
What I am really doing is:
I have some data that I know follows a logNormal distribution.
I transformed the data to uniform using its CDF.
And now I'm trying to transform it to normal.
Here are my commands:
DT=load('Departure Times (hr).txt');
Y = cdf('Lognormal',DT,2.2268,0.36631);
So, if I get the mean and SD of my uniform variables and apply your formula will that transform it to Normal?
Thank you
John


More Answers (2)

Peter Perkins on 12 Apr 2012
John, presumably you know about the randn function, which generates standard normal values. So I guess that your original question really was, "if I have uniform values, how do I transform them," and not, "how do I generate normal random values."
You said "normal normal distribution". I can't tell if this is a typo, or if you mean "standard normal", i.e. N(mean=0, std=1). If you mean, "transform to the normal distribution that corresponds to the lognormal," then all this is kind of pointless, since you can just take the log of data drawn from a lognormal to transform it to normal. But if you really mean "transform to a standard normal", then
1) As a philosophical point (and I suspect that this is what Kye was getting at when he said, "The CDF of a random variable is NOT a uniform random variable"), you're doing a standard thing in a theoretically invalid context. If you draw values from a fully known LN(mu,sigma), and use cdf('logn',x,mu,sigma) to transform them, the exact distribution of the transformed values is indeed U(0,1). But (I assume) you don't know the true lognormal distribution that your data came from, and so the transformation you're doing is only approximate. What you end up with will be as if drawn from something only approximately standard uniform. But since the assumption of log-normality is an approximation anyway, ... People do do this, but you have to be careful. Ask yourself why you are making this transformation, and if there is another way to attack whatever problem you are trying to solve.
2) You seem to have the Statistics Toolbox. So to transform from lognormal to uniform, you can use logncdf (or use the cdf function as you did), and to transform from uniform to normal, you can use the norminv function (or use the icdf function). But you have to use the right parameters in each case. For the lognormal->uniform, you'll want to use the mu/sigma lognormal parameters as MATLAB defines them. For the uniform->normal transformation, you'll want to use the mu/sigma normal parameters of your target distribution (which are just 0 and 1, if you do mean "standard normal").
When Kye said, "... you are only evaluating the lognormal density", I think he meant "lognormal cumulative probability". Your use of the cdf function is the correct transformation, modulo caveat (1) above about using estimated parameters. But you may find logncdf and norminv simpler than erfinv and cdf/icdf.
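The two-step transformation described in (2) might look like the sketch below. It assumes the Statistics Toolbox is available. Because the original data file is not available here, DT is a synthetic stand-in generated from the parameters quoted in the question, so those parameters are treated as exactly known. The names mu_ln, sigma_ln, Z_direct and max_diff are illustrative.

```matlab
% Lognormal -> uniform -> standard normal (requires Statistics Toolbox).
% Synthetic stand-in for the departure-time data; the parameters are the
% ones quoted in the question and are treated here as exactly known.
mu_ln = 2.2268;
sigma_ln = 0.36631;
DT = exp(mu_ln + sigma_ln*randn(5000,1));  % lognormal samples

U = logncdf(DT, mu_ln, sigma_ln);          % lognormal -> uniform(0,1)
Z = norminv(U, 0, 1);                      % uniform -> standard normal

% Shortcut: standardizing log(DT) should give the same values.
Z_direct = (log(DT) - mu_ln)/sigma_ln;
max_diff = max(abs(Z - Z_direct));
```

The agreement between Z and Z_direct illustrates the earlier point that, for lognormal data, taking the log already yields normal values, so the detour through the uniform distribution only matters when the marginal is not lognormal.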
  3 Comments
John on 13 Apr 2012
Hello Peter and Kye,
Thank you for taking the time to respond in detail to my question. I appreciate the effort.
Briefly, what I am doing is modelling dependent random variables using a copula function. To do this, I believe the method is first to transform the random variables to a uniform distribution using their CDFs. Next, transform the uniform variables to normal variables using the inverse standard normal CDF. This then allows you to estimate the product normal distribution between the normal variables.
What Peter is suggesting is pretty much exactly what I want to do.
I used an automatic distribution fitting tool in Excel to find the distribution of the random variables in the first place, and that is where I got the shape parameters, mu, sigma, etc.
However, is there a way in MATLAB to transform the data to a uniform distribution without knowing its distribution in the first place?
Also, a Dagum distribution is the best fit for my data, but it is not a supported CDF in MATLAB. Lognormal is actually only the 10th best fit for my data. This is why I am asking if there is a method to transform to uniform without having to use a supported CDF.
Thank you
John
Peter Perkins on 13 Apr 2012
Lots of people using copulas use nonparametric marginals to transform to the unit hypercube. See this extended example
http://www.mathworks.com/products/statistics/demos.html?file=/products/demos/shipping/stats/nonparametricCDFdemo.html
that ships with the Statistics Toolbox. You may also find this section of the Statistics Toolbox User Guide helpful:
<http://www.mathworks.com/help/toolbox/stats/brklrj3.html#bqttfgl-1>
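One simple nonparametric option is a rank-based empirical CDF, sketched below. This is not necessarily the method used in the linked example, which may smooth the estimate. The stand-in data x and the names order, ranks, U and Z are all illustrative. Only base MATLAB functions are used.

```matlab
% Nonparametric marginal transform via ranks (illustrative sketch).
x = exp(0.5*randn(1000,1));        % stand-in data; any continuous sample works
n = numel(x);

[~, order] = sort(x);              % order(k) is the index of the k-th smallest value
ranks = zeros(n,1);
ranks(order) = (1:n)';             % rank of each observation (ties broken by position)

U = ranks/(n+1);                   % empirical CDF values, strictly inside (0,1)
Z = sqrt(2)*erfinv(2*U-1);         % uniform -> standard normal, as in Kye's formula
```

Dividing by n+1 rather than n keeps every value of U away from 1, so the inverse normal CDF never returns Inf.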



Kye Taylor on 12 Apr 2012
You know that the histogram shouldn't exactly match a Gaussian... Nevertheless, if you use the command above, you are guaranteed that Y will be sampled according to a normal distribution with parameters mu and sigma. Try adding more bins and more points to see the normal's shape:
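mu = 0;
sigma = 1;
X = rand(5000,1); % more points than before
Y = mu + sqrt(2)*sigma*erfinv(2*X-1); % inverse normal CDF applied to X
[n,xout] = hist(Y,50); % counts and bin centers for 50 bins
figure,hold on
plot(xout,n/sum(n)/(xout(2)-xout(1)),'k.') % counts normalized to a density
plot(xout, 1/sqrt(2*pi*sigma^2)*exp(-(xout-mu).^2/2/sigma^2)); % normal pdf
legend('Observed density', 'actual density');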
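A numerical check can complement the plot. The sketch below compares a few sample statistics of Y with their theoretical values. The names sample_mean, sample_std and frac_within_1sd are illustrative, and the results vary from run to run because X is random.

```matlab
% Numerical check to accompany the histogram (illustrative sketch).
mu = 0;
sigma = 1;
X = rand(5000,1);
Y = mu + sqrt(2)*sigma*erfinv(2*X-1);

sample_mean = mean(Y);                        % expected to be close to mu
sample_std  = std(Y);                         % expected to be close to sigma
frac_within_1sd = mean(abs(Y-mu) <= sigma);   % about 0.68 for a normal sample
```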
