Using chi2gof to test two distributions

18 visualizaciones (últimos 30 días)
Allie
Allie el 6 de Feb. de 2019
Editada: Sim el 14 de Ag. de 2024
I want to use the chi2gof to test if two distributions come from a common distribution (null hypothesis) or if they do not come from a common distribution (alternative hypothesis). I have binned observational data (x), binned model data (y), and the bin edges (bins). Both the observational and model data are counts per bin.
x= [41 22 11 10 9 5 2 3 2]
y= [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665]
bins=[0:9:81]
Because the data is already binned and because I'm testing x against y, I used the following code
[h,p,stat]=chi2gof(x,'Edges',bins,'Expected',y)
Manual calculation of the chi2 test statistic results in 4.6861 with a probablity of p=.7905. The above function however, produces a very different result. The resulting stats show different bin edges than designated, the ovserved counts per bin do not match x, the chi2 test statistic is ~87, and p<0.001. Could someone please explain why I'm getting such dramatically different results?

Respuesta aceptada

Jeff Miller
Jeff Miller el 7 de Feb. de 2019
Sorry, the x's really do have to be the data values. Try this:
bins=[0:9:81]
xvals = bins(1:end-1)+4.5; % Here are some fake data values that belong in each bin.
xcounts= [41 22 11 10 9 5 2 3 2] % These are the counts of the data values in each bin.
y= [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];
[h,p,stat]=chi2gof(xvals,'Edges',bins,'Expected',y,'Frequency',xcounts,'EMin',1)
This will give you your 4.68. By default, chi2gof groups small bins (less than 5) together, and 'EMin' tells it not to do that.
  2 comentarios
Allie
Allie el 7 de Feb. de 2019
This worked! Thank you

Iniciar sesión para comentar.

Más respuestas (2)

Jeff Miller
Jeff Miller el 6 de Feb. de 2019
It looks like chi2gof expects the values in x to be the actual, original scores, not the bin counts. Try adding 'Frequency',x to the parameter list.
  1 comentario
Allie
Allie el 7 de Feb. de 2019
Editada: Allie el 7 de Feb. de 2019
This did not work. The stat output is below. As you can see, it changed the edges and expected values from what I originally input and the chi2stat became even bigger.
stat =
chi2stat: 234.4383
df: 5
edges: [0 9 18 27 36 45 81]
O: [12 30 22 0 41 0]
E: [38.0520 24.2655 15.4665 9.8595 6.2895 11.0670]

Iniciar sesión para comentar.


Sim
Sim el 14 de Ag. de 2024
Editada: Sim el 14 de Ag. de 2024
Shouldn't you use the two-sample chi-square test?
The Chi-squared test needs binned data. However, as far as I understand, you need to give the raw data, and not the binned data, as inputs of CHI2TEST2.
Indeed, CHI2TEST2 places the raw data into bins:
bins = unique([x1(:,1); x2(:,1)]); % create a bin for each unique value

Categorías

Más información sobre Hypothesis Tests en Help Center y File Exchange.

Etiquetas

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by