How to implement kNN imputation on test set without data leakage?
I am using knnimpute to handle missing data for machine learning. My data is split into a training set and a test set (mTrain and mTest). Using knnimpute on the training set is straightforward. For the test set, however, I need the missing values to be imputed from the nearest neighbors in the training set, to prevent data leakage. How can I apply knnimpute to the test set in this way? Does anybody have an idea how to code that?
1 comment
Zexi Yang
17 Aug 2022
Why do you have to impute the test set using nearest neighbors from the training set? You can just use nearest neighbors from the test set without any data leakage. Data leakage is when you impute the training set using data from the test set.
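For anyone who still wants the train-only variant described in the question, here is a minimal sketch of one way it might be done. It does not call knnimpute; it is a hand-rolled nearest-neighbor fill. The function name imputeFromTrain, the toy data, and the choice to use only complete training rows as donors are assumptions made for illustration, not something from the original post. It assumes rows are observations and columns are features, and it relies on implicit expansion (R2016b or later).

```matlab
% Toy data (made up for illustration): rows = observations, columns = features.
mTrain = [1 2 3; 2 3 4; 10 11 12];
mTest  = [1.1 NaN 3.1; NaN 11 12];

% Impute the test set using the single nearest training row (k = 1).
mTestImp = imputeFromTrain(mTrain, mTest, 1);
disp(mTestImp)

% Hypothetical helper, not part of any toolbox: fills NaNs in mTest using
% only rows of mTrain as donors, so no test row influences another test row.
function mTestImp = imputeFromTrain(mTrain, mTest, k)
    % Assumption of this sketch: only complete training rows act as donors.
    donors = mTrain(~any(isnan(mTrain), 2), :);
    mTestImp = mTest;
    for i = 1:size(mTest, 1)
        miss = isnan(mTest(i, :));
        if ~any(miss) || all(miss)
            continue  % nothing to impute, or no observed feature to match on
        end
        % Euclidean distance computed on the features observed in this test row.
        diffs = donors(:, ~miss) - mTest(i, ~miss);
        d = sqrt(sum(diffs.^2, 2));
        [~, idx] = sort(d);
        nn = idx(1:min(k, numel(idx)));
        % Fill each missing feature with the mean over the k nearest donors.
        mTestImp(i, miss) = mean(donors(nn, miss), 1);
    end
end
```

If the features are on different scales, it is probably worth standardizing both sets with the training mean and standard deviation before computing distances, so that scaling does not leak test information either.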
Answers (0)