Generating dispersed (non-integer) random matrix/array that sums to a particular value

Question

J AI on 28 Jun 2020

https://la.mathworks.com/matlabcentral/answers/555907-generating-dispersed-non-integer-random-matrix-array-that-sums-to-a-particular-value

Edited: J AI on 28 Jun 2020

The most commonly suggested tool (in fact the only one I could find) for generating random numbers less than 1 that sum to 1 is Random Vectors with Fixed Sum by Roger Stafford. However, I noticed that the generated data is not well dispersed, e.g.:

P = randfixedsum(10,10000,1,0.05,0.9); % a 10-by-10000 matrix where each column of P sums to 1 and each element is between 0.05 and 0.9
find(any(P>0.5))
ans =
  1×0 empty double row vector

So far, every single time I have tried it, the result is an empty vector: every value stays below 0.5. Is there a way I could generate more dispersed data that would include values across the whole range from 0.05 to 0.9 (for the above example)?

Thanks in advance for your kind help.

FYI: I have tried this (took help from one of the MATLAB answers)

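function P = rand_fixed_sum_2(p,n) % p = number of columns, n = number of rows; each column sums to 1
    P = zeros(n,p);                    % preallocate the output
    n1 = 10^(n-1);                     % size of the integer grid to draw from
    for j = 1:p
        a = sort(randperm(n1,n));      % n distinct integers from 1..n1, sorted
        b = diff(a);                   % the n-1 gaps between consecutive integers
        b(end+1) = n1 - sum(b);        % last entry, so that the entries total n1
        P(:,j) = (b/sum(b))';          % normalize so the column sums to 1
    end
end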

But obviously the value of n1 is not feasible for higher dimensions (n>5). However, for lower dimensions, by tweaking n1, I could get much more dispersed data.
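
A continuous analogue of the function above avoids the integer grid n1 altogether. The following is only a sketch: the name rand_fixed_sum_3 is hypothetical, it uses base MATLAB only, and it enforces no bounds other than 0 and 1. It cuts the unit interval at n-1 sorted uniform points and takes the gaps as one column:

```matlab
% Sketch (hypothetical helper): n-by-p matrix whose columns each sum to 1.
% Each column holds the gaps between n-1 sorted uniform cut points on [0,1].
rand_fixed_sum_3 = @(p,n) diff([zeros(1,p); sort(rand(n-1,p),1); ones(1,p)], 1, 1);

P = rand_fixed_sum_3(5,10);   % 10-by-5 matrix
disp(sum(P,1))                % each column sum is 1 up to rounding
```

The gaps between sorted uniform cut points are uniformly distributed over the nonnegative vectors that sum to 1, so this works for any n, but it should not be expected to be more dispersed than randfixedsum called with bounds 0 and 1.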

Answer 1

John D'Errico on 28 Jun 2020

https://la.mathworks.com/matlabcentral/answers/555907-generating-dispersed-non-integer-random-matrix-array-that-sums-to-a-particular-value#answer_458233

Edited: John D'Errico on 28 Jun 2020

I think you do not understand what you are asking.

randfixedsum indeed produces results that are uniformly distributed within the subset in question. That is, any point in the 10-dimensional space that satisfies the requirements of a fixed sum is equally likely to arise.

However, that does not mean that it is at all probable you would find something that satisfies your goal, of "dispersion".

For example, suppose you were to choose one element that is greater than 0.5? Then the probability that the other 9 elements were ALL small enough that the sum is 1, is pretty low. In the 9 dimensional space that remains, that event would be actually very uncommon.

Thus, you want to generate 10 numbers, all of which lie between 0.05 and 0.9, such that the sum is 1.

Suppose, just suppose that one of the numbers was say, 0.6? Now what are the odds that you can find 9 other numbers that make the total sum exactly 1, but none of them are less than 0.05? SURPRISE! It can never be done.

In fact, if any single element were larger than 0.55 in this example, your goal could never be met. If one element is as large as even 0.55+eps, it is mathematically impossible to find 9 other numbers, all of which are between 0.05 and 0.9, such that their sum is 0.45-eps.
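
The bound is a one-line calculation. As a sketch, write n = 10 for the number of elements, a = 0.05 for the lower bound and s = 1 for the target sum (these symbols are introduced only for this illustration). The largest element is biggest when the other n-1 elements all sit at the lower bound:

```latex
\[
x_{\max} \;\le\; s - (n-1)\,a \;=\; 1 - 9 \times 0.05 \;=\; 0.55
\]
```

So the requested upper bound of 0.9 can never be reached; the effective range of every element is [0.05, 0.55].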

Next, suppose one element was even as large as 0.5? Just one element that large?

Now the other 9 elements must all be very close to 0.05. What is the probability of that event? Not surprisingly, it is pretty darn small. I can compute the actual probability of such an event to happen if you need. Being too lazy to think at this time of day...

X = randfixedsum(10,10000000,1,0,0.9);
sum(max(X) >= 0.5)
ans =
      195844

So 1.96e5 such events in 1e7. A little under 2% of the time. As expected, a rare event, and that is EXACTLY as it should be.
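
The simulated rate can be cross-checked against a closed form, assuming the upper bound of 0.9 has a negligible effect (it excludes only about 1e-8 of the probability mass). For a point uniformly distributed on the simplex x_1 + ... + x_n = 1 with x_i >= 0, a single coordinate satisfies P(x_i >= t) = (1-t)^(n-1). For t >= 1/2, almost surely at most one coordinate can be that large, so the n events are disjoint and their probabilities add:

```latex
\[
P\Bigl(\max_i x_i \ge t\Bigr) = n\,(1-t)^{\,n-1}, \qquad t \ge \tfrac{1}{2},
\]
\[
P\Bigl(\max_i x_i \ge 0.5\Bigr) = 10 \times 0.5^{9} = \frac{10}{512} \approx 0.01953 .
\]
```

This agrees with the observed 195844/1e7, about 0.01958, to within sampling error.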

You ask for dispersion. But you don't seem to understand what dispersion means or what it implies in this context.

If I look at the distribution of the maximum of all 10 elements, I get something that is actually pretty reasonable.

X = randfixedsum(10,10000,1,0.05,0.9);
Percentiles of max(X), in increasing order (intermediate percentile labels not preserved):
   Min     0.1207
           0.1342
           0.1445
           0.1524
           0.1674
           0.1884
           0.2167
           0.2503
           0.2738
           0.3143
   Max     0.4039

Most of the time, we get a maximum value that is pretty small in context. And that is because the sample truly is uniformly distributed around the constraint space. One point in that space is equally as likely to arise as any other point. But that does NOT mean that the maximum is ever likely to be larger than 0.55. In fact, that would be an impossible event.
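
A short calculation shows how rare a value of 0.5 or more is with the lower bound at 0.05. It is a sketch that assumes uniform sampling over the constrained set, and it uses the fact that a uniform point on the standard simplex in n dimensions satisfies P(max >= t) = n(1-t)^(n-1) for t >= 1/2. Shifting and rescaling each element maps the constrained set onto the standard simplex (the upper bound 0.9 becomes y_i <= 1.7 and is never active), and x_i >= 0.5 becomes y_i >= 0.9:

```latex
\[
y_i = \frac{x_i - 0.05}{1 - 10 \times 0.05} = \frac{x_i - 0.05}{0.5},
\qquad \sum_{i=1}^{10} y_i = 1, \quad y_i \ge 0,
\]
\[
P\Bigl(\max_i x_i \ge 0.5\Bigr) = P\Bigl(\max_i y_i \ge 0.9\Bigr)
  = 10 \times (1 - 0.9)^{9} = 10^{-8}.
\]
```

With 10000 columns, the expected number of columns containing such a value is about 1e-4, which is consistent with the empty result reported in the question.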

Suppose instead that we change the way things are generated. Instead of requiring that the minimum be 0.05, just make it 0. How do the statistics change?

X = randfixedsum(10,10000,1,0,0.9);
Percentiles of max(X), in increasing order (intermediate percentile labels not preserved):
   Min     0.1395
           0.1681
           0.1902
           0.205
           0.2353
           0.2784
           0.3359
           0.401
           0.4492
           0.5479
   Max     0.8123

As you now see, the maximum element is now considerably larger. In the same size sample, I once got something as large as 0.8123. There is now much more room for those "dispersed" events to arise.

1 comment

J AI on 28 Jun 2020

Edited: J AI on 28 Jun 2020

Oh wow. I really appreciate your detailed, painstaking explanation. I can see how I got the whole thing messed up with my requirements. Thank you so much for clearing it up with such clarity.
