Histogram with normal distribution and fixed-width bins?

Marmi Afrin on 31 Jul 2012
Hello
I am trying to plot a histogram of a data set with fixed-width bins and overplot a normal distribution fitted to the data. I have been using the histfit function for the fit, but histfit does not seem to offer a way to plot fixed-width bins, while hist() does. Can someone please help me with this: how do I plot a histogram with fixed-width bins and overplot a fitted normal distribution on it, so that I also get the mean and standard deviation? Thanks.
2 comments
Ilya on 31 Jul 2012
Can you clarify what you mean by "fixed-width bins"? The histfit function can only plot data using bins of equal width. Do you mean that you need to specify the bin width?
Marmi Afrin on 31 Jul 2012
I mean that I want the bars to have the same fixed width in every histogram. The range is different for each data set, but I am plotting them together in subplots. I am computing the systematic uncertainty of a machine, and from run to run the lowest and highest values seem to vary anywhere between 0.6 and 1.2, so using a fixed number of bins does not give all the histograms the same bin width. I am making one histogram for each of eight data sets taken from the same experiment, so eight histograms in total. The data is unbounded and each set has a slightly different range, so keeping nbins fixed gives a different bin width from histogram to histogram. I know I can solve this by using hist() instead of histfit(), but I also need to overplot a normal distribution on each histogram. I hope that makes it clear. Thanks for the reply!


Answers (2)

Ilya on 1 Aug 2012
Edited: Ilya on 1 Aug 2012
This lets you specify the range and bin width:
% Generate data
N = 100;
x = normrnd(0,1,N,1);
% Fit
[muhat,sigmahat] = normfit(x);
% Bin
xmin = -5; % lower bound
xmax = 5; % upper bound
h = 0.5; % bin width
edges = xmin:h:xmax;
% Histogram
n = histc(x,edges);
figure;
bar(edges,n,'histc');
% Overlay the distribution
hold on; % 'hold' alone toggles the state; 'hold on' is explicit
f = @(z) N*h*normpdf(z,muhat,sigmahat);
fplot(f,[xmin xmax],'color','r');
hold off;
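For several data sets with different ranges, the same idea can be applied per subplot. The following is only a sketch under stated assumptions: the eight data sets are synthetic placeholders, the bin width h = 0.05 and the names data, gauss, edges and counts are illustrative, and the normal pdf is written out by hand so that only base functions are needed. Each set gets its own edges, but all edges are integer multiples of h, so every bar has the same width:

```matlab
% Sketch only: synthetic data stand in for the eight measured runs.
h = 0.05;                        % common bin width (assumed value)
K = 8;                           % number of data sets
data = cell(K,1);
for k = 1:K
    data{k} = (0.9 + 0.02*k) + 0.1*randn(200,1);   % hypothetical run k
end

% Normal pdf written out so that only base functions are needed
gauss = @(z,mu,sigma) exp(-0.5*((z-mu)./sigma).^2) ./ (sigma*sqrt(2*pi));

edges = cell(K,1);
counts = cell(K,1);
muhat = zeros(K,1);
sigmahat = zeros(K,1);
figure;
for k = 1:K
    x = data{k};
    x = x(~isnan(x));            % ignore missing values
    muhat(k) = mean(x);          % fitted mean
    sigmahat(k) = std(x);        % fitted standard deviation
    % Edges are integer multiples of h, so every subplot gets width h
    edges{k} = h*(floor(min(x)/h) : floor(max(x)/h) + 1);
    counts{k} = histc(x, edges{k});
    subplot(2,4,k);
    bar(edges{k}, counts{k}, 'histc');
    hold on;
    z = linspace(edges{k}(1), edges{k}(end), 200);
    % Scale the pdf by (number of points)*(bin width) to match the counts
    plot(z, numel(x)*h*gauss(z, muhat(k), sigmahat(k)), 'r');
    hold off;
    title(sprintf('mean = %.3f, std = %.3f', muhat(k), sigmahat(k)));
end
```

Snapping the edges to multiples of h keeps the bin width identical across subplots while letting each histogram cover only the range its own data needs.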
7 comments
Marmi Afrin on 2 Aug 2012
Yes, I know my fits look good. They were done with the histfit function, but my advisor wasn't happy that the histograms didn't all have the same bin width. That is why I posted here: I want the same normal distribution fit, but with the histogram bars drawn at the same bin width.
Marmi Afrin on 6 Aug 2012
Ilya, I applied the above to my code with N = sum(~isnan(data(:,1))) for the normalization. The problem is that the muhat and sigmahat values that come out are different from the mean and standard deviation that the Statistics Toolbox reports after plotting. The muhat and sigmahat values I get from running the code on my data files are reasonable, but the values from the Statistics Toolbox are large and nowhere close. Of course, changing N doesn't do anything to them, since that is just the normalization for the Gaussian curve. Any tips on that, or do you need more info to help me with that aspect? Thanks!
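One way to narrow down such a mismatch is to compute everything from one NaN-free vector and compare it with statistics taken over the bin counts. This is a sketch under assumptions, not a diagnosis: the column col is synthetic, and the idea that the other numbers might come from the plotted bar heights rather than from the raw values is a guess, since the thread does not say how they were obtained.

```matlab
% Hypothetical column with missing entries, standing in for data(:,1)
col = 1 + 0.05*randn(500,1);
col(1:10:end) = NaN;             % mark every 10th entry as missing

valid = col(~isnan(col));        % drop NaNs once, reuse the same vector
N = numel(valid);                % equals sum(~isnan(col))
muhat = mean(valid);             % mean of the raw values
sigmahat = std(valid);           % sample standard deviation of the raw values

% Statistics of the bin counts are a different quantity altogether
edges = 0.5:0.05:1.5;
n = histc(valid, edges);
fprintf('raw data:   mean = %.4f, std = %.4f\n', muhat, sigmahat);
fprintf('bin counts: mean = %.4f, std = %.4f\n', mean(n), std(n));
```

If the large values resemble the second line rather than the first, they describe the bar heights and not the measurements.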



Walter Roberson on 31 Jul 2012
Is your input data bounded or unbounded? If it is bounded, then it is a logical error to fit a normal distribution to it, as normal distributions are always unbounded. If your data is unbounded, then it is a logical error to attempt to plot it with fixed-width bins on a finite display medium.
3 comments
Walter Roberson on 31 Jul 2012
Your data is bounded. By definition, then, it is not normally distributed, so fitting a normal distribution to it will give you garbage.
Please examine the properties of the normal distribution. You will see that for any finite x, the probability that a value is less than or equal to x is always strictly positive.
If your data has some finite lower bound, then for your actual distribution the probability that the data is less than the lower bound would be 0, which rules out a normal distribution (for which that probability might be very small but would still be positive).
Bounded data is never normally distributed.
If your data is bounded, you might perhaps have a beta distribution, but never a normal distribution.
Ilya on 1 Aug 2012
Walter, any sample is bounded. This does not imply that any sample is not drawn from a normal distribution.
R = 1000;
N = 100;
minofx = zeros(R,1); % one minimum per replicate, so preallocate R entries
for r=1:R
minofx(r) = min(randn(N,1));
end
hist(minofx);
There may be things Marmi is not telling us about the data, but most certainly you cannot conclude that his fits are garbage just based on what he said.
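The same point can be put in numbers. As a rough sketch (the helper Phi and the choice of a -4 sigma threshold are illustrative assumptions, and the normal CDF is built from the base function erfc), the chance of a single draw falling below -4 sigma is positive but tiny, so a sample of 100 points will almost always look bounded there:

```matlab
% Standard normal CDF written with the base function erfc
Phi = @(z) 0.5*erfc(-z./sqrt(2));

N = 100;                         % sample size, as in the simulation above
z = -4;                          % threshold in units of sigma (illustrative)
pBelow = Phi(z);                 % P(one draw <= z): small but greater than 0
pNoneBelow = (1 - pBelow)^N;     % P(no draw in a sample of N falls below z)

fprintf('P(single draw <= %g sigma) = %.3g\n', z, pBelow);
fprintf('P(no draw out of %d below) = %.4f\n', N, pNoneBelow);
```

An observed lower limit in a finite sample is therefore what a normal distribution would usually produce, not evidence against it.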
