Getting NaN when computing partialcorr (no NaNs in data)

Question

1 vote

Hi, I am using partialcorr on series of data and it sometimes returns NaNs. Why is that? I am sure I have no NaNs in my data and no missing or empty entries. Sometimes using partialcorr([x y], 'rows','complete') helps, but it does not always fix the problem. Thanks for any help.

4 comments

ARIEL YEHUDA GOLDSTEIN on 4 May 2021

Encountering the same issue. I hope someone can help.

dpb on 10 Oct 2022

Edited: dpb on 10 Oct 2022


tF=readtable(websave('Test_data.txt','https://www.mathworks.com/matlabcentral/answers/uploaded_files/125764/Test_data.txt'));
partialcorr([tF.flower_date,tF.cum_temp],[tF.Var1,tF.Var2])
ans = 2×2
     1   NaN
   NaN   NaN
fitlm(tF,'predictorVars',{'cum_temp','Var1','Var2'},'ResponseVar','flower_date','intercept',true)
Warning: Regression design matrix is rank deficient to within machine precision.
ans = 
Linear regression model:
    flower_date ~ 1 + Var1 + Var2 + cum_temp

Estimated Coefficients:
                   Estimate       SE         tStat       pValue  
                   ________    _________    _______    __________

    (Intercept)           0            0        NaN           NaN
    Var1             17.841      0.25253     70.647    1.8066e-59
    Var2           -0.42291     0.016155    -26.178    1.5975e-34
    cum_temp        0.36047    0.0049775     72.419    4.1539e-60


Number of observations: 64, Error degrees of freedom: 61
Root Mean Squared Error: 3.28
R-squared: 0.845,  Adjusted R-Squared: 0.84
F-statistic vs. constant model: 167, p-value = 1.9e-25

So partialcorr isn't lying to us; let's see what's going on between the independent variables themselves...

corrcoef([tF.cum_temp,tF.Var1,tF.Var2])
ans = 3×3
    1.0000   -0.9174   -0.4560
   -0.9174    1.0000    0.7726
   -0.4560    0.7726    1.0000

OK, none of the off-diagonal entries is identically ±1. Although cum_temp is very highly correlated with Var1 (-0.92), and Var1 and Var2 are fairly highly correlated with each other (0.77), no single pair is perfectly collinear. So the conclusion has to be that cum_temp is a linear combination of the other two. Let's check that next:

fitlm(tF,'predictorVars',{'Var1','Var2'},'ResponseVar','cum_temp','intercept',true)
ans = 
Linear regression model:
    cum_temp ~ 1 + Var1 + Var2

Estimated Coefficients:
                   Estimate    SE    tStat    pValue
                   ________    __    _____    ______

    (Intercept)      427       0      Inf       0   
    Var1             -61       0     -Inf       0   
    Var2               1       0      Inf       0   


Number of observations: 64, Error degrees of freedom: 61
R-squared: 1,  Adjusted R-Squared: 1
F-statistic vs. constant model: 8.54e+29, p-value = 0

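This last fit shows that cum_temp is predicted exactly by a linear combination of Var1 and Var2 (cum_temp = 427 - 61*Var1 + Var2). Once Var1 and Var2 are controlled for, nothing of cum_temp is left to correlate with, which explains the NaN results above.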

This probably means that Var1 and Var2 are derived rather than observed variables. That may cast doubt on the rest of the prior analyses as well, depending on how those variables were defined and on what prevented the same result from showing up in other cases.
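A quicker way to spot this kind of exact dependence, without fitting a model, is to check the rank of the design matrix. The sketch below uses made-up stand-in data (t, z1, z2 and x are hypothetical, not the Test_data.txt columns) in which x is built as an exact linear combination of z1 and z2. It uses only base MATLAB functions.

```matlab
% Hypothetical stand-in data: z1, z2 play the role of Var1, Var2,
% and x plays the role of cum_temp (an exact combination of the two).
t  = (1:64)';
z1 = cos(0.3*t);
z2 = sin(0.7*t);
x  = 5 + 2*z1 - 3*z2;            % coefficients chosen only for illustration

% Design matrix with an intercept column; full column rank would be 4.
A = [ones(size(t)) x z1 z2];
fprintf('rank = %d of %d columns\n', rank(A), size(A,2));

% Pairwise correlations alone do not reveal the dependence:
% the largest off-diagonal magnitude stays below 1.
R = corrcoef([x z1 z2]);
maxOffDiag = max(abs(R(~eye(3))));
fprintf('largest |off-diagonal correlation| = %.4f\n', maxOffDiag);
```

A rank lower than the number of columns points to the same situation as the rank-deficiency warning from fitlm above, even though no pairwise correlation is ±1.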


Answer 1

Adam Danz on 4 May 2021

0 votes

See similar question: getting a NaN in correlation coefficient

The same basic problem is happening with the partial correlation.

MATLAB's partialcorr follows the steps explained in Wikipedia's Partial Correlation article.
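For reference, the residual-based definition can be written as follows. The notation is mine, not taken from partialcorr.m: e_X and e_Y denote the residuals left after regressing X and Y on Z with an intercept, so both have zero mean.

```latex
\rho_{XY\cdot Z}
  = \frac{\sum_{i=1}^{n} e_{X,i}\, e_{Y,i}}
         {\sqrt{\sum_{i=1}^{n} e_{X,i}^{2}}\;\sqrt{\sum_{i=1}^{n} e_{Y,i}^{2}}}
```

If every e_{X,i} is zero, both the numerator and the denominator vanish.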

When correlating variable X with variable Y while controlling for variable Z, X (or Y) may be predicted exactly, or almost exactly, by Z, so its residuals are 0 or very close to 0. To avoid returning a spurious correlation driven by floating-point round-off, the partialcorr function detects residuals close to 0 and sets them to exactly 0. If you look at the equation in the wiki article, it is clear why NaN is returned in those cases: the numerator and denominator are both 0, and 0/0 = NaN.
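A minimal sketch of that mechanism in base MATLAB. This is not the code in partialcorr.m. The data are hypothetical, and the tolerance is an illustrative choice, not the one the toolbox uses.

```matlab
% Hypothetical data: x is an exact linear function of the controls z;
% y has variation that z does not explain.
t = (1:64)';
z = [cos(0.3*t) sin(0.7*t)];
x = 5 + 2*z(:,1) - 3*z(:,2);
y = z(:,1) + cos(1.1*t);

Z  = [ones(size(t)) z];          % controls plus intercept
rx = x - Z*(Z\x);                % residual of x after regressing on z
ry = y - Z*(Z\y);                % residual of y after regressing on z

% rx contains only round-off noise, so snap it to exactly zero.
% (Illustrative tolerance; an assumption, not the toolbox's rule.)
tol = 1e-10*norm(x);
if norm(rx) < tol
    rx(:) = 0;
end

% Correlation of the residuals: 0/0, so r is NaN.
r = (rx'*ry) / (norm(rx)*norm(ry));
disp(r)
```

Without the snapping step, the ratio would be computed from round-off noise and could come out as any value between -1 and 1. With the Statistics and Machine Learning Toolbox available, calling partialcorr(x,y,z) on the same data should be a reasonable way to compare.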

The partialcorr.m file contains helpful comments by its authors explaining this, just above the lines of code that compute the correlation coefficients (R2021a).
