Asked by David C
on 25 Oct 2012

MATLAB provides built-in functions to generate random numbers with a uniform or Gaussian (normal) distribution. My question is: if I have a discrete distribution or histogram, how can I generate random numbers that follow that distribution (provided the number of samples I generate is large enough)?

Please post here if anyone knows of a good method of doing this.

Thanks, David

Answer by Jonathan Epperl
on 27 Oct 2012

Accepted Answer

Since nobody has offered any suggestions, here's one. Suppose your discrete distribution is an N-by-2 matrix PD, with the discrete values in the first column and the probability of each value in the second, so that sum(PD(:,2))==1.

Then map the probabilities to the unit interval and use rand. What I mean by that:

% These are your values and the corresponding probabilities:
PD = [
    1.0000 0.1000
    2.0000 0.3000
    3.0000 0.4000
    4.0000 0.2000];
% Then turn the probabilities into a cumulative distribution
D = cumsum(PD(:,2));
% D = [0.1000 0.4000 0.8000 1.0000]'

Now for every r generated by rand, if it lies between D(i) and D(i+1), it corresponds to the outcome PD(i+1,1), with the obvious extension at i==0 (an r below D(1) gives PD(1,1)). Here's one way you could do that, though I'm sure there are better ones:

R = rand(100,1); % Your trials
p = @(r) find(r<D,1,'first'); % find the first index i such that r < D(i)
% Now these are the results of your random trials, as row indices into PD
% (use PD(rR,1) to get the values; here the values happen to equal the indices)
rR = arrayfun(p,R);
% Check whether the distribution looks right:
hist(rR,1:4)
% It does: roughly 10% are 1, 30% are 2, and so on
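The per-draw find call above can also be written without arrayfun. What follows is only a sketch under the same assumptions as the answer (an N-by-2 matrix PD whose second column sums to 1); the names idx, samples and freq are introduced here for illustration and do not appear in the original post.

```matlab
% Values and probabilities, as in the answer above
PD = [1 0.1; 2 0.3; 3 0.4; 4 0.2];
D  = cumsum(PD(:,2));            % cumulative distribution (column vector)
n  = size(PD, 1);

R = rand(100000, 1);             % uniform draws

% For each draw, count how many entries of D it has reached; adding 1 gives
% the index of the first entry with r < D(i). min() guards against
% round-off leaving D(end) slightly below 1.
idx = min(sum(bsxfun(@ge, R, D'), 2) + 1, n);

samples = PD(idx, 1);            % map row indices back to the actual values

% Empirical frequencies, to compare against the target probabilities
freq = accumarray(idx, 1, [n 1]) / numel(R);
disp([PD(:,2) freq]);
```

With this many draws the two displayed columns should agree to roughly two decimal places. The comparison builds a 100000-by-4 logical matrix, so for distributions with very many bins a loop or a binning function may be the better choice.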

If you want more help you should post a minimal example of the form in which you have the discrete distribution.

David C
on 3 Jan 2013

Thanks.

Roger Stafford
on 3 Jan 2013

Eric Auld
on 24 Jan 2018


Answer by Image Analyst
on 28 Oct 2012

Answer by Theron FARRELL
on 30 Apr 2019

Edited by Theron FARRELL
on 30 Apr 2019

Hi there,

I use this naive function to generate artificial outliers for machine learning. I hope it will be of some help in your case.

function [Out_Data, Out_PDF, CHist] = Complement_PDF(Hist, Data_Num, p)
% Generate a 1D vector of data whose PDF is the complement of the input
% histogram. The larger Data_Num is, the more closely Out_PDF resembles CHist.
% Input
%   Hist:     PDF/histogram of the data (row vector)
%   Data_Num: Desired number of data points to generate
%   p:        Precision, given as the number of digits after the decimal point
% Output
%   Out_Data: Generated data as per the complementary PDF
%   Out_PDF:  The complementary PDF as per Out_Data (as bin counts)
%   CHist:    The complementary PDF as per Hist
% Example
%   Hist = [1, 6, 7, 100, 0, 0, 0, 2, 3, 5];
%   Data_Num = 100000;
%   p = 3;

Hist = Hist/sum(Hist);
CHist = 1 - Hist;
CHist = CHist/sum(CHist);
CDF_CHist = cumsum(CHist);
% Round the CDF to p digits
CDF_CHist = double(int32(CDF_CHist*10^p))/10^p;
Out_Data = zeros(1, Data_Num);
Out_PDF = zeros(1, length(CDF_CHist));

for i = 1:Data_Num
    % Generate a uniformly distributed variable, rounded to p digits
    x = double(int32(rand*10^p))/10^p;
    % Inversely index CDF
    Out_Data(i) = Inverse_CDF(x, CDF_CHist);
    % Recover the bin index; round rather than floor, so that floating-point
    % error in (ind/N)*N cannot yield ind-1 (or an invalid index of 0)
    temp = round(Out_Data(i) * length(CDF_CHist));
    Out_PDF(temp) = Out_PDF(temp) + 1;
end

figure;
subplot 221, bar(Hist);
subplot 222, bar(CHist);
subplot 223, plot(CDF_CHist);
subplot 224, bar(Out_PDF);
end

function [y] = Inverse_CDF(x, CDF_CHist)
CDF_CHist_Ext = [0, CDF_CHist];
% Default to the last bin, used when x reaches the top of the CDF
y = 1;
for ind = 1:length(CDF_CHist)
    if (x >= CDF_CHist_Ext(ind)) && (x < CDF_CHist_Ext(ind+1))
        y = ind/length(CDF_CHist);
        break;
    end
end
end
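For comparison, the same complement-and-sample idea can be sketched without the loop and without rounding to p digits. This is an illustrative variant, not the function above: the names N, CDF and bin are assumptions introduced here, and Out_PDF is normalised to frequencies rather than kept as counts.

```matlab
% Example histogram taken from the header comment of Complement_PDF
Hist = [1, 6, 7, 100, 0, 0, 0, 2, 3, 5];
Data_Num = 100000;               % number of samples to draw
N = numel(Hist);

Hist  = Hist / sum(Hist);        % normalise counts to probabilities
CHist = 1 - Hist;                % complement: rare bins become likely
CHist = CHist / sum(CHist);      % renormalise so the complement sums to 1
CDF   = cumsum(CHist);           % cumulative distribution over the N bins

x = rand(1, Data_Num);           % uniform draws

% bin(k) is 1 plus the number of CDF entries that x(k) has reached, i.e. the
% index of the first entry with x(k) < CDF(i). min() guards against
% round-off leaving CDF(end) slightly below 1.
bin = min(sum(bsxfun(@ge, x, CDF'), 1) + 1, N);

Out_Data = bin / N;              % same encoding as Complement_PDF: ind/N
Out_PDF  = accumarray(bin(:), 1, [N 1])' / Data_Num;   % empirical frequencies

disp([CHist; Out_PDF]);          % the two rows should be close
```

Skipping the rounding step means the sampled frequencies are not limited to p digits of resolution, at the cost of building an N-by-Data_Num logical matrix in memory.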
