Why is the rand function not uniform in large intervals?
pankaj singh
on 23 Jun 2016
Answered: pankaj singh
on 27 Jun 2016
I am using the rand function to generate uniformly distributed random numbers in the interval [10e-6, 1]. But the function generates numbers that are close to 1 (RATHER THAN BEING UNIFORM OVER THE ENTIRE INTERVAL). I have tried with 10 numbers and with 100 numbers, and I found that most of the numbers generated are close to 1. How, then, can this be a uniform distribution?
3 comments
Adam
on 23 Jun 2016
rand always generates numbers between 0 and 1. What you then do with those to get them into a range you are interested in is entirely up to you.
John D'Errico
on 23 Jun 2016
Rand IS uniform, and it generates numbers in the range from 0 to 1. If you are mis-using the results of rand in some way, then expect strange results. So show what you wrote.
Accepted Answer
John D'Errico
on 26 Jun 2016
Edited: John D'Errico
on 26 Jun 2016
Sigh. I think this is a misunderstanding of what uniformly random means. A fairly common one too. Since you seem not interested in showing your code, one can only guess the problem though. There can be no certain answer if I do not see your code. (addendum: The OP has since added a comment that indicates my conjecture is exactly on target.)
You are generating uniformly random numbers on a target interval [1e-6,1].
A uniform distribution implies that for ANY sub-interval [a,b] contained in the global window [1e-6,1], where we have
1e-6 <= a <= b <= 1
the expected fraction of events we will observe in that sub-interval is:
(b - a)/(1 - 1e-6)
If you generate N samples, then the expected number of events in the sub-interval is:
N*(b - a)/(1 - 1e-6)
So expect to see a number of events that is proportional to the sub-interval width. You won't see exactly that many, since this is a random sampling.
So a uniform random sampling on the interval [0,1] would have roughly 10% of the samples in each bin [0,0.1], [0.1,0.2], [0.2,0.3], etc.
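To make the 10%-per-bin claim concrete, here is a minimal sketch that counts samples in ten equal-width bins. The sample size N and the fixed seed are my own assumptions, chosen only so the run is repeatable; they are not from the original question.

```matlab
% Sketch: check that rand puts roughly 10% of the samples in each tenth of [0,1].
rng(0);                          % fixed seed (assumption), only for repeatability
N = 1e6;                         % sample size (assumption); larger N gives tighter agreement
x = rand(1,N);                   % uniform samples on (0,1)
edges = 0:0.1:1;                 % ten equal-width bins
counts = histcounts(x, edges);   % number of samples falling in each bin
frac = counts/N;                 % observed fraction per bin
disp(frac)                       % each entry should be close to 0.1
```

The fractions will not be exactly 0.1, but they should get closer as N grows.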
Now in your case, you are sampling on the interval [1e-6,1]. You find that very few samples occur right down at the bottom end, say between [1e-6,1e-5].
Let's use the rule above to see what fraction of the samples SHOULD occur in that interval. Let's say that we generate a sample size of 1000 values in the overall interval. Seems pretty big to me.
1000*(1e-5 - 1e-6)/(1 - 1e-6)
ans =
0.009
Hmm. I only expect to see 0.009 samples in that sub-interval, whereas in the sub-interval [0.9,1] I would expect to see
1000*(1 - 0.9)/(1 - 1e-6)
ans =
100
So 100 events in the subinterval [0.9,1].
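The same comparison can be checked by simulation. This is only a sketch under assumptions of my own: a larger N than the 1000 used above (so the tiny interval has a chance of catching a few samples), a fixed seed, and the variable names lo, hi, sub, expected and observed, none of which come from the question.

```matlab
% Sketch: compare expected and observed counts in two sub-intervals of [1e-6,1].
rng(0);                               % fixed seed (assumption), only for repeatability
N = 1e6;                              % sample size (assumption)
lo = 1e-6;  hi = 1;                   % overall interval
x = lo + (hi - lo).*rand(1,N);        % uniform samples on [lo,hi]
sub = [1e-6 1e-5; 0.9 1];             % rows are the sub-intervals [a,b] to inspect
expected = zeros(size(sub,1),1);
observed = zeros(size(sub,1),1);
for k = 1:size(sub,1)
    a = sub(k,1);  b = sub(k,2);
    expected(k) = N*(b - a)/(hi - lo);        % rule from above: N*(b-a)/(1-1e-6)
    observed(k) = sum(x >= a & x <= b);       % samples that actually landed in [a,b]
    fprintf('[%g, %g]: expected %.1f, observed %d\n', a, b, expected(k), observed(k));
end
```

Both counts should track the sub-interval width, which is the whole point: the sampling is uniform, and the bottom interval is simply very narrow.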
Is this truly uniform sampling? YES!!!!!!!!! Of course it is! You need to understand that the first interval I showed is a terribly tiny interval.
If you asked to generate a sampling that is uniformly probable over that region, but what you REALLY wanted was some sort of sampling that is uniform in a log space, then you needed to use a proper random sampling scheme!
For example, try this:
R = 10.^(rand(1,1000)*6 - 6);
Look at some percentiles of this sampling scheme:
Min 1.005e-06
1.0% 1.144e-06
5.0% 2.036e-06
10.0% 4.529e-06
25.0% 4.117e-05
50.0% 0.001265
75.0% 0.03185
90.0% 0.2554
95.0% 0.5309
99.0% 0.8607
Max 0.9928
It is NOT uniform, at least not in the domain [1e-6,1]. But the log10 of those numbers WILL be uniformly distributed. So, we will expect roughly 50% of the log10 values to be less than -3.
Min -5.998
1.0% -5.941
5.0% -5.691
10.0% -5.344
25.0% -4.385
50.0% -2.898
75.0% -1.497
90.0% -0.5928
95.0% -0.275
99.0% -0.06513
Max -0.003133
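A quick way to check the log-uniform claim is to count samples per decade. The sketch below uses the same sampling scheme as above, but with a larger N and a fixed seed, both of which are my assumptions rather than part of the original run.

```matlab
% Sketch: verify that log10(R) is roughly uniform on [-6,0].
rng(0);                                % fixed seed (assumption), only for repeatability
N = 1e6;                               % sample size (assumption)
R = 10.^(rand(1,N)*6 - 6);             % log-uniform samples between 1e-6 and 1
counts = histcounts(log10(R), -6:0);   % one bin per decade
frac = counts/N;                       % each decade should hold about 1/6 of the samples
fracBelow = mean(log10(R) < -3);       % should be close to 0.5
disp(frac)
disp(fracBelow)
```

Each decade getting about the same share is exactly the behavior that plain rand on [1e-6,1] does not give you.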
Again, it won't be perfect. But a sample size of 1000 is not really that huge. These predictions only become valid in the limit as N grows to a really large number.
Again, it is just a wild guess.
3 comments
Stephen23
on 26 Jun 2016
Edited: Stephen23
on 26 Jun 2016
@pankaj singh: Instead of ten samples (which is too few for showing any kind of probability distribution trend), here is your code with one million samples:
N = 1e6;
dmin = 10e-6;
dmax = 1 ;
d = dmin + (dmax-dmin).*rand(1,N);
hist(d)
and a histogram of those random values:
Does this look like a uniform distribution to you? If not, what would you expect it to look like?
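As a rough illustration of why most values seem to sit close to 1, the sketch below bins the same kind of sample by decade instead of by equal width. The decade edges and the fixed seed are my own assumptions; dmin and dmax are taken from the code above.

```matlab
% Sketch: the share of a uniform sample on [dmin,dmax] that falls in each decade.
rng(0);                                    % fixed seed (assumption), only for repeatability
N = 1e6;
dmin = 10e-6;
dmax = 1;
d = dmin + (dmax-dmin).*rand(1,N);         % uniform samples, as in the code above
edges = [dmin 1e-4 1e-3 1e-2 1e-1 dmax];   % decade boundaries (assumption)
frac = histcounts(d, edges)/N;             % fraction of samples in each decade
disp(frac)                                 % the last decade [0.1,1] holds about 90%
```

Since [0.1,1] covers about 90% of the interval's width, it receives about 90% of the samples. That is what uniform means here, even though it looks lopsided when you think in decades.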
John D'Errico
on 26 Jun 2016
Edited: John D'Errico
on 26 Jun 2016
Did you read my response? If not, then why not? I spent, what, an hour writing that response to you. Then you asked exactly the same question that I just answered.
READ MY ANSWER. In that answer, I explained why you are confused, why you are not getting the kind of sampling that you want to see. I then show how to achieve the sampling that you want. But if you will ask a question and won't bother to read the answers you get and think about what they say, how can I do more? (Sorry if I seem frustrated.)
More Answers (2)
Roger Stafford
on 26 Jun 2016
Edited: Roger Stafford
on 26 Jun 2016
It seems clear from his most recent comment that where Pankaj says "uniform" he actually means a "logarithmic" distribution where there would be as many samples in the interval [10^(-6),10^(-5)] as in the interval [10^(-1),10^(0)], and indeed in any interval [10^(-k),10^(-k+1)], 1<=k<=6. If that is the case, the proper code would be:
r = 10.^(-6*rand(1,n));
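If the bounds are something other than 1e-6 and 1, the same idea can be written for a general interval. This is only a sketch: the helper name logunif and the variables lo, hi and n are hypothetical, introduced here for illustration, and both bounds are assumed to be positive.

```matlab
% Sketch: log-uniform sampling between two positive bounds lo < hi.
% logunif is a hypothetical helper name, not a built-in MATLAB function.
logunif = @(lo,hi,n) 10.^(log10(lo) + (log10(hi) - log10(lo)).*rand(1,n));

lo = 1e-6;  hi = 1;  n = 1000;   % example values (assumption)
r = logunif(lo, hi, n);          % log10(r) is uniform on [log10(lo), log10(hi)]
```

With lo = 1e-6 and hi = 1 this reduces to 10.^(6*rand(1,n) - 6), which has the same distribution as the line above.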
1 comment
Roger Stafford
on 26 Jun 2016
Oops! I didn't notice the same answer given by John earlier on.