How to deal with missing values in a time series?
I have a few time series that are to be used in a regression. For some of them, the first few or last few values are missing. How should I deal with this so that MATLAB does not give an error?
Answers (2)
Fangjun Jiang
on 17 Nov 2011
It depends on your needs. You could fill the gaps with zeros or NaN, or you could fill them with values interpolated from the known data.
Richard Willey
on 17 Nov 2011
Hi Yoshiko,
The treatment of missing data is a fairly complicated topic. The choice of techniques to handle a missing data problem very much depends on how you plan to use the resulting data.
If you plan to generate a regression model from your data then your best course of action is to code the missing data points as NaNs. The regress command in Statistics Toolbox will then ignore any row that contains a missing data point.
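As a rough sketch of the NaN approach described above: the data, the variable names (`x1`, `x2`, `y`) and the coefficients below are all made up for illustration, and the example assumes Statistics Toolbox is installed so that `regress` is available. It marks the missing leading and trailing values as NaN and lets `regress` drop the affected rows.

```matlab
% Hypothetical data: y is the response, x1 and x2 are predictor series.
% All names and values are invented for this example.
n  = 10;
t  = (1:n)';
x1 = t;
x2 = [0.5 0.1 0.9 0.3 0.7 0.2 0.8 0.4 0.6 0.0]';
y  = 2 + 3*x1 - x2;            % exact linear relation, no noise

% Code the unavailable observations as NaN.
x1(1:2) = NaN;                 % first two values of x1 are missing
x2(end) = NaN;                 % last value of x2 is missing

X = [ones(n,1) x1 x2];         % column of ones gives the intercept term
b = regress(y, X);             % rows containing NaN are left out of the fit

% The same fit with the row selection written out explicitly.
ok = ~any(isnan([y X]), 2);    % true for rows with no missing values
b2 = X(ok,:) \ y(ok);
```

Here `b` and `b2` should agree, since both fits use only the seven complete rows. Writing the row mask out yourself is handy when you want to see how many observations actually went into the model.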
I would strongly advise against using interpolation to substitute new values for the missing data points.
- This will impact any future analysis you do with this data and potentially bias metrics like R^2
- Using an interpolation technique for extrapolation can produce very inaccurate results
In a similar vein, coding these missing data points with any kind of numeric value (say 0 or -9999) can cause significant problems, because the regression algorithm will treat that value as a valid number.
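If the data already arrive with a numeric placeholder, one option is to recode it as NaN before any analysis. The snippet below is only a sketch: the series `x` and the placeholder value `-9999` are assumptions made for the example, so adjust them to whatever your data source actually uses.

```matlab
% Hypothetical series in which -9999 was used as a missing-value code.
x = [-9999; -9999; 1.2; 1.5; 1.9; 2.4; -9999];

sentinel = -9999;                % assumed placeholder value
x(x == sentinel) = NaN;          % recode the placeholders as NaN

nMissing = sum(isnan(x));        % number of missing observations
xMean    = mean(x(~isnan(x)));   % mean over the valid observations only
```

After the recoding, the placeholders can no longer leak into summary statistics or a fit as if they were real measurements.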