How can I do k-fold cross-validation with MATLAB's built-in k-means?
Naif Almusalam on 19 Nov 2017
I have a data matrix and am able to run MATLAB's built-in k-means on it. However, I would also like to use k-fold cross-validation, if possible. I am not even sure whether cross-validation can be used with clustering, or whether it is limited to classification. Best Regards
Bernhard Suhm on 15 Dec 2017
Cross-validation is indeed typically used in the context of classification, since it's a method to measure accuracy on "unseen" data without having to explicitly set aside a separate test set.
However, if you do want to compare the "accuracy" of different clustering methods, Tibshirani described an approach in which you essentially compare the clustering obtained on the test set alone with the "closest" clusters derived from the training set. You can find more detail about this, e.g., at https://stats.stackexchange.com/questions/87098/can-you-compare-different-clustering-methods-on-a-dataset-with-no-ground-truth-b
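To make the train/test comparison concrete, here is a minimal sketch for a single split, loosely following the "prediction strength" idea from that line of work. It is not a definitive implementation: the synthetic data, the half/half split, the choice k = 2, and all variable names are assumptions made for illustration. It uses kmeans and pdist2 from the Statistics and Machine Learning Toolbox.

```matlab
% Sketch only: data, split, and k are illustrative assumptions.
rng(1);                                    % reproducible synthetic data
X = [randn(60,2); randn(60,2) + 6];        % two well-separated blobs
k = 2;

% Random half/half split into training and test rows.
n = size(X,1);
perm = randperm(n);
Xtrain = X(perm(1:floor(n/2)), :);
Xtest  = X(perm(floor(n/2)+1:end), :);

% Cluster each part independently.
[~, Ctrain] = kmeans(Xtrain, k, 'Replicates', 5);
testLabels  = kmeans(Xtest,  k, 'Replicates', 5);

% Assign every test point to its closest training centroid.
[~, predLabels] = min(pdist2(Xtest, Ctrain), [], 2);

% For each test cluster, compute the fraction of point pairs that the
% training centroids also place in the same cluster.
agreement = nan(k,1);
for j = 1:k
    members = predLabels(testLabels == j);
    m = numel(members);
    if m > 1
        same = bsxfun(@eq, members, members');           % m-by-m co-membership
        agreement(j) = (sum(same(:)) - m) / (m*(m-1));   % exclude the diagonal
    end
end

% Worst-case agreement over the test clusters (NaN entries are ignored).
predictionStrength = min(agreement);
fprintf('Prediction strength for k = %d: %.3f\n', k, predictionStrength);
```

Values close to 1 suggest that the training and test clusterings agree for this k; repeating the calculation for several values of k (and several splits) gives a basis for comparison.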
kmeans itself has no "built-in" cross-validation option, but you can use crossvalind to program it yourself; see the example near the bottom of https://www.mathworks.com/help/bioinfo/ref/crossvalind.html.
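A k-fold loop built on crossvalind might look like the sketch below. The held-out score (mean squared distance from each test point to its nearest training centroid) is an assumption chosen for illustration, not something prescribed by the linked documentation, and the synthetic data and variable names are likewise hypothetical. crossvalind requires the Bioinformatics Toolbox; kmeans and pdist2 require the Statistics and Machine Learning Toolbox.

```matlab
% Sketch only: data, k, fold count, and the scoring rule are assumptions.
rng(1);                                    % reproducible synthetic data
X = [randn(60,2); randn(60,2) + 6];        % two well-separated blobs
k = 2;                                     % number of clusters to evaluate
numFolds = 5;

% Assign each row of X to one of numFolds folds.
foldId = crossvalind('Kfold', size(X,1), numFolds);

heldOutCost = zeros(numFolds,1);
for f = 1:numFolds
    testMask = (foldId == f);

    % Fit centroids on the training folds only.
    [~, C] = kmeans(X(~testMask,:), k, 'Replicates', 5);

    % Score the held-out fold: squared distance to the nearest centroid.
    d = pdist2(X(testMask,:), C);
    heldOutCost(f) = mean(min(d, [], 2).^2);
end

fprintf('Mean held-out squared distance for k = %d: %.3f\n', k, mean(heldOutCost));
```

Note that this kind of distance score tends to keep shrinking as k grows, so it is better suited to comparing clustering methods at a fixed k than to picking k on its own.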