How the CI of GLM parameters are calculated?

Question

TingTing el 8 de Ag. de 2014

0
Enlazar

Enlace directo a esta pregunta

https://la.mathworks.com/matlabcentral/answers/146900-how-the-ci-of-glm-parameters-are-calculated

Respondida: Tom Lane el 12 de Ag. de 2014

I need to have the distributions of the coefficients estimated by glm model. I tried to find them and tested if the ones I found are the right ones by calculating the 95% CI from them and comparing those to those given by “predict” command. And the result turned out disappointing.

For example, if I use glm with gamma as distribution for response variable. The model is called mdl and the dataset is called dataset. After fitting the model, I wrote:

shape=1/mdl.Dispersion; [newf newc] = predict(mdl,dataset); scale=newf*mdl.Dispersion;

I tried to see if this is correct by calculating the 95% CI bounds for the first data point.

>> gamcdf(newc(1,2),shape,scale(1))

ans =

0.6329

>> gamcdf(newc(1,1),shape,scale(1))

ans =

0.4629

I think the different between them should be 0.95, but here 0.6329-0.4629 is obviously not.

So my understanding is wrong?

I also tried with normal distribution

>> mu=mdl.Fitted; >> variance=mdl.Dispersion; >> [newf newc] = predict(mdl,dataset); >> sigma=sqrt(variance);

>> normcdf(newc(1,2),newf(1),sigma)

ans =

0.5910

>> normcdf(newc(1,1),newf(1),sigma)

ans =

0.4152

Shouldn't the difference between the last two be 0.95? Is my understanding wrong or...?

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Iniciar sesión para comentar.

Iniciar sesión para responder a esta pregunta.

Answer 1

Tom Lane el 12 de Ag. de 2014

0
Enlazar

Enlace directo a esta respuesta

https://la.mathworks.com/matlabcentral/answers/146900-how-the-ci-of-glm-parameters-are-calculated#answer_148324

Abrir en MATLAB Online

I'm not exactly following what you are trying to do. It looks like you are using the estimated standard deviation of the observed response values as a measure of uncertainty in the predicted values. That's not exactly right. Imagine you had huge amounts of data. There would still be residual standard deviation, but the predicted values would be known much more accurately than that.

Here's an illustration of this using a regular linear model:

>> load fisheriris
>> x = meas(:,1:3);
>> y = meas(:,4);
>> lm = fitlm(x,y,'linear');
>> [newf,newc] = predict(lm,meas(1,1:3))
newf =
    0.2163
newc =
    0.1613    0.2712

Now here's how I can calculate those values myself:

>> newx = [1;meas(1,1:3)'];
>> C = lm.CoefficientCovariance;
>> b = lm.Coefficients.Estimate;
>> b'*newx
ans =
    0.2163
>> newf + [1 -1] * tinv(.025,lm.DFE) * sqrt(newx'*C*newx)
ans =
    0.1613    0.2712

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Iniciar sesión para comentar.

How the CI of GLM parameters are calculated?

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Respuestas (1)

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Ver también

Categorías

Etiquetas

Productos

Community Treasure Hunt

How the CI of GLM parameters are calculated?

0 comentarios Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Respuestas (1)

0 comentarios Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Ver también

Categorías

Etiquetas

Productos

Community Treasure Hunt

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos