regress and stats
The regress function can return a stats output that includes R^2, among other things. I am trying to understand the relationship between R^2 and corrcoef. In simple linear regression (one response variable y and one independent variable x), R = corrcoef(x, y), and also R = corrcoef(y, y_from_regress_function). However, when I have, say, two independent variables x1 and x2, the first relationship does not hold. One relationship should still hold, though: R from the regress output should still equal corrcoef(y, y_from_regress_function). Any suggestions on why MATLAB does not produce the expected R^2 in multiple regression? Here is the code I use:
X = [ones(size(x1)) x1 x2 x1.*x2];
[b,bind,r,rint,stats] = regress(y,X);
model = b(1) + b(2)*x1 + b(3)*x3 + b(4).*x1.*x2;
corr = corrcoef(model,y);
I expected stats(1) = corr^2, but it is not. Any suggestions?
2 Comments
the cyclist
on 20 Jan 2012
It would be helpful if you used the Code button to format your code more readably.
the cyclist
on 20 Jan 2012
It would also be helpful if you posted code with specification of x1, etc., such that it is a self-contained example that exhibits the issue. That saves people who might help you a lot of guesswork, and gives a common example to work with.
Answers (2)
Léon
on 24 Jan 2012
This is not really a MATLAB question, since it is about econometrics/statistics.
In every case the coefficient of determination R^2 is the ratio of the explained sum of squares to the total sum of squares, R^2 = SSE / SST (SSE here meaning the explained sum of squares, not the sum of squared errors). In the bivariate case we can show that the Pearson correlation coefficient is sufficient to describe the explanatory power of the model, so that r^2 = R^2. That is, the covariance between y and x (where x is the single explanatory variable), normalized by the standard deviations of y and x, gives r, and r^2 is the share of the variation in y that the model explains.
In other words, R^2 in the bivariate case can be rewritten as r_(y,x) * (beta_x * s_x / s_y). Hence the coefficient of determination is also the correlation coefficient weighted by the standardized regression coefficient of x. This relationship carries over to the trivariate and multivariate cases, where R^2 can be expressed as the sum of all bivariate correlations between y and each regressor, each weighted by its standardized regression coefficient.
So the point is that in the bivariate case the standardized regression coefficient equals the correlation coefficient (!), such that R^2 = r * r = r * (beta_x * s_x / s_y) (for the bivariate case).
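The relations described above can be written compactly. This is only a sketch, assuming ordinary least squares with an intercept; the symbols \hat{y} (fitted values), b_j (estimated slopes), \beta_j^{*} (standardized coefficients) and k (number of regressors) are notation introduced here for illustration and do not come from the post.

```latex
\begin{align}
R^2 &= \frac{\sum_i (\hat{y}_i - \bar{y})^2}{\sum_i (y_i - \bar{y})^2}
     = r_{y,\hat{y}}^{2}, \\
R^2 &= \sum_{j=1}^{k} \beta_j^{*}\, r_{y,x_j},
     \qquad \beta_j^{*} = b_j \,\frac{s_{x_j}}{s_y}, \\
k = 1:\quad \beta^{*} &= r_{y,x}
     \;\Longrightarrow\; R^2 = r_{y,x}\cdot r_{y,x} = r_{y,x}^{2}.
\end{align}
```

The first line is the relation that holds for any number of regressors: R^2 equals the squared correlation between observed and fitted y. With more than one regressor, R^2 is in general no longer the square of any single r_(y,x_j).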
I hope this helps you see the relation between R^2 and the correlations among your variables more clearly. But once again, this is the subject of elementary econometrics courses and books, and you should be aware of these things before using such models, which might otherwise give you biased or wrong results.
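The weighted-sum relation from this answer can be checked numerically. The following is a minimal sketch on simulated data: the seed, sample size, coefficients and variable names (betaStd, ryx, R2_sum) are all made up for illustration, and it assumes the Statistics Toolbox function regress is available.

```matlab
% Check: R^2 equals the sum of corr(y, x_j) weighted by standardized slopes.
rng(1);                                  % arbitrary seed, for repeatability only
n  = 200;
x1 = randn(n,1);
x2 = 0.5*x1 + randn(n,1);                % deliberately correlated with x1
y  = 2 + 1.5*x1 - 0.8*x2 + randn(n,1);   % hypothetical data-generating model

X = [ones(n,1) x1 x2];                   % design matrix with intercept
[b,~,~,~,stats] = regress(y,X);

% Standardized slopes: b_j * s_xj / s_y (intercept excluded)
betaStd = b(2:3) .* [std(x1); std(x2)] / std(y);

% Bivariate correlations of y with each regressor
C   = corrcoef([y x1 x2]);
ryx = C(1,2:3)';

R2_sum = sum(betaStd .* ryx);
fprintf('sum(beta* .* r) = %.6f, stats(1) = %.6f\n', R2_sum, stats(1));
```

The two printed values should agree up to rounding, while neither ryx(1)^2 nor ryx(2)^2 alone is expected to match stats(1).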
Tom Lane
on 24 Jan 2012
One problem is that the model you fit is not the same as the "model" value you computed afterward. Or maybe the "x3" was just a typo. Either way, here's some code showing that the square of the correlation between the observed and fitted y is equal to the R^2 value in the stats output:
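x1 = randn(100,1); x2 = 5*rand(100,1);   % simulated predictors
y = 100 + 10*x1 - 4*x1.*x2 + 3*x2.^2;    % response; the x2.^2 term is not in the fitted model
X = [ones(size(x1)) x1 x2 x1.*x2];       % design matrix: intercept, x1, x2, interaction
[b,bind,r,rint,stats] = regress(y,X);
model = X*b;                             % fitted values from the same design matrix
corr = corrcoef(model,y)                 % 2x2 matrix; off-diagonal is corr(fitted, y)
sqrt(stats(1))                           % square root of R^2; matches the off-diagonal above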
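One further detail that may matter when comparing the numbers: corrcoef returns a 2x2 matrix, so an expression like corr^2 is a matrix product and not the squared correlation. The sketch below repeats the example above with a scalar comparison; the seed and the names yhat, C, rFit and R2 are chosen here for illustration only.

```matlab
% Compare R^2 from regress with the squared correlation of fitted vs. observed y.
rng(0);                                  % arbitrary seed, for repeatability only
x1 = randn(100,1);
x2 = 5*rand(100,1);
y  = 100 + 10*x1 - 4*x1.*x2 + 3*x2.^2;

X = [ones(size(x1)) x1 x2 x1.*x2];       % same design matrix as in the answer
[b,~,~,~,stats] = regress(y,X);
yhat = X*b;                              % fitted values

C    = corrcoef(yhat,y);                 % 2x2 matrix [1 r; r 1]
rFit = C(1,2);                           % scalar correlation, fitted vs. observed
R2   = stats(1);

fprintf('rFit^2 = %.6f, stats(1) = %.6f\n', rFit^2, R2);

% Matrix power versus element-wise square:
C^2                                      % diagonal 1 + rFit^2, off-diagonal 2*rFit
C.^2                                     % off-diagonal rFit^2
```

Picking out C(1,2) before squaring, or using C.^2, gives a value that should agree with stats(1) up to rounding.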