Overfitting and what is it?

Question

HAAAAH on 12 Sep 2019

Direct link to this question:

https://la.mathworks.com/matlabcentral/answers/480071-overfitting-and-what-is-it

Commented: Image Analyst on 13 Sep 2019

I have a data set that I will divide into a "training set" and a "validation set", using 70% of the data for training and 30% for validation. I fitted a linear regression to the training set and then used that fit to predict the values in the validation set. My predictions are considerably smaller than the correct values. One explanation could be that the linear regression fitted to the training set does not apply to new data (in this case, the validation data). Or is there another reason?

"And a model’s ability to predict future outcomes in light of what you discovered here". What does this question mean?

The formula for the fitting error, where y1 = y * e^(t) is the estimated value of y:

$$E(\text{error}) = \frac{1}{n}\sqrt{\sum_{i=0}^{n} \left(y_{1,i} - y_i\right)^2}$$
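
The procedure described in the question (70/30 split, straight-line fit on the training set, error on both sets) could be sketched roughly as follows. This is only an illustration in plain Python: the data are synthetic, and the function names `fit_line`, `fitting_error`, and `split_fit_evaluate` are made up here, not taken from the original post.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def fitting_error(y_est, y_true):
    """Error as written in the question: (1/n) * sqrt(sum of squared differences)."""
    n = len(y_true)
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(y_est, y_true))) / n

def split_fit_evaluate(xs, ys, train_fraction=0.7, seed=0):
    """Randomly split the data, fit on the training part, report both errors."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    n_train = int(round(train_fraction * len(xs)))
    train, valid = indices[:n_train], indices[n_train:]
    a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
    err_train = fitting_error([a + b * xs[i] for i in train], [ys[i] for i in train])
    err_valid = fitting_error([a + b * xs[i] for i in valid], [ys[i] for i in valid])
    return (a, b), err_train, err_valid

# Synthetic data (assumed for illustration only): y = 2 + 3x plus Gaussian noise.
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2 + 3 * x + rng.gauss(0, 0.5) for x in xs]
(a, b), err_train, err_valid = split_fit_evaluate(xs, ys)
print(f"a={a:.3f} b={b:.3f} train error={err_train:.4f} validation error={err_valid:.4f}")
```

One thing worth noting: with 1/n outside the square root, this quantity shrinks as n grows even when the per-point scatter stays the same, so the values for a 70% set and a 30% set are not directly comparable. The root-mean-square error, sqrt(sum/n), does not have that dependence on set size.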


Torsten on 12 Sep 2019

"One explanation could be that the linear regression fitted to the training set does not apply to new data (in this case, the validation data). Or is there another reason?"

How can we tell without knowing your data and how it was gathered?

"And a model’s ability to predict future outcomes in light of what you discovered here". What does this question mean?

What question?


Answer 1

Image Analyst on 12 Sep 2019

Direct link to this answer:

https://la.mathworks.com/matlabcentral/answers/480071-overfitting-and-what-is-it#answer_391539

You can't overfit a linear regression. Overfitting is basically where your model goes through, or nearly through, all of your data points. For example, if you had 4 data points and fit a cubic, that would be overfitting. If you have N data points and fit a polynomial of order, say, N/2 or so, then you might be overfitting. But that won't happen with a line, since it won't go through all your data points unless you have only 2 points.

It could be that your validation data does not fit the model that was determined from the other 70% of the points, but I doubt it, since you probably chose them randomly. I'd think the error on the 30% was much worse than on the other 70% just due to randomness. In general, the testing/validation error will be worse than the error on the training points, but it could be better.
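
The "4 data points and a cubic" case can be illustrated with a small sketch. The numbers below are invented for the illustration (roughly y = x with some scatter), and the helper names `fit_line` and `interpolate` are hypothetical. The cubic is evaluated with Lagrange interpolation, which is one way among several to get the unique cubic through four points.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def interpolate(xs, ys, x):
    """Evaluate the unique polynomial of degree len(xs)-1 through all points (Lagrange form)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# Made-up training data, roughly y = x with scatter, plus one unseen point.
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [0.1, 0.8, 2.3, 2.9]
new_x, new_y = 4.0, 4.1

a, b = fit_line(train_x, train_y)
line_train_residuals = [abs(a + b * x - y) for x, y in zip(train_x, train_y)]
cubic_train_residuals = [abs(interpolate(train_x, train_y, x) - y)
                         for x, y in zip(train_x, train_y)]

line_prediction = a + b * new_x
cubic_prediction = interpolate(train_x, train_y, new_x)

print("max training residual, line :", max(line_train_residuals))
print("max training residual, cubic:", max(cubic_train_residuals))
print("prediction at new x, line :", line_prediction, "error:", abs(line_prediction - new_y))
print("prediction at new x, cubic:", cubic_prediction, "error:", abs(cubic_prediction - new_y))
```

With these particular numbers, the cubic reproduces every training point exactly, yet at the unseen point x = 4 it predicts about 0.9, while the straight line predicts about 4.0 against an observed 4.1. Zero training error combined with poor prediction on new data is the signature of overfitting.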


Bruno Luong on 12 Sep 2019

"You can't overfit a linear regression."

Of course you can

Linear regression means the model is linear with respect to the parameters; it does not necessarily mean fitting a line or plane.

You can fit

$$y_j = \sum_i A_i \, f_i(x_j)$$

to some data, and it is easy to overfit.

Spline fitting falls under this formulation, and people can overfit with it. Wahba et al. are well known for studying regularization methods to make a "right" fit.
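
A rough sketch of this formulation, assuming a least-squares fit solved through the normal equations: the basis functions are passed as a list of callables, and the model stays linear in the coefficients A_i whatever the f_i are. The data, the polynomial basis, and all function names below are assumptions chosen for illustration. Normal equations are used only to keep the sketch dependency-free; they are not the most numerically robust choice.

```python
import math

def solve_linear_system(matrix, rhs):
    """Solve a square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution

def fit_linear_model(basis, xs, ys):
    """Least-squares coefficients A_i for y ~ sum_i A_i * f_i(x)."""
    design = [[f(x) for f in basis] for x in xs]
    k = len(basis)
    gram = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    moment = [sum(row[i] * y for row, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(gram, moment)

def predict(basis, coeffs, x):
    return sum(a * f(x) for a, f in zip(coeffs, basis))

def rms_error(basis, coeffs, xs, ys):
    squared = sum((predict(basis, coeffs, x) - y) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(squared / len(xs))

# Made-up data: the underlying relation is y = x; training values carry +/-0.3 scatter.
train_x = [-1.0, -0.6, -0.2, 0.2, 0.6, 1.0]
train_y = [x + 0.3 * (-1) ** i for i, x in enumerate(train_x)]
valid_x = [-0.8, -0.4, 0.0, 0.4, 0.8]
valid_y = list(valid_x)

small_basis = [lambda x, p=p: x ** p for p in range(2)]  # 1, x
large_basis = [lambda x, p=p: x ** p for p in range(6)]  # 1, x, ..., x^5

results = {}
for name, basis in (("2 basis functions", small_basis), ("6 basis functions", large_basis)):
    coeffs = fit_linear_model(basis, train_x, train_y)
    results[name] = (rms_error(basis, coeffs, train_x, train_y),
                     rms_error(basis, coeffs, valid_x, valid_y))
    print(name, "train RMS:", results[name][0], "validation RMS:", results[name][1])
```

Both fits are linear regressions in the sense above. With as many basis functions as training points, the training error drops to essentially zero while the validation error becomes several times larger than that of the two-term model.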

Image Analyst on 13 Sep 2019

You're right. What I meant was fitting a straight line through data, which is not the same as fitting a higher-order polynomial as a sum of weighted x^n terms (which is still linear regression).
