How can I get the Ward distance change to find an optimal number of clusters?
Using hierarchical clustering, I would like to get the change in Ward distance at every step in order to find an optimal number of clusters. I can plot the dendrogram, but I would like the actual distances and the number of clusters that they correspond to.
Bernhard Suhm on 29 Dec 2017
The evalclusters function determines the optimal number of clusters for you. Passing 'linkage' selects agglomerative clustering with Ward linkage as the algorithm. You have a choice of cluster evaluation criteria: 'CalinskiHarabasz' and 'DaviesBouldin' compare the between-cluster and within-cluster distances in slightly different ways, and there are also 'gap' and 'silhouette' criteria. The output object from evalclusters contains the criterion values for each number of clusters along with the optimal value.
So for example,
eva = evalclusters(X,'linkage','CalinskiHarabasz','KList',2:6)
with input data in the matrix X will evaluate 2 to 6 clusters and provide output that includes the following:
InspectedK: [2 3 4 5 6]
CriterionValues: [180.0914 300.2080 254.8927 220.7171 199.2285]
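If you want the raw Ward merge distances themselves, as asked in the question, one option is to read them from the third column of the linkage output. The following is only a rough sketch: the data X is made up for illustration, and picking the number of clusters just before the largest jump is a common heuristic rather than anything evalclusters does.

```matlab
% Made-up example data (not from the original post): three separated 2-D groups
rng(0);
X = [randn(20,2); ...
     randn(20,2) + 10; ...
     randn(20,2) + repmat([10 -10], 20, 1)];

n = size(X,1);
Z = linkage(X,'ward');                 % (n-1)-by-3 matrix, one row per merge

mergeDist   = Z(:,3);                  % Ward distance at each merge step
numClusters = (n-1:-1:1)';             % clusters remaining after each merge
jump        = [NaN; diff(mergeDist)];  % change relative to the previous merge

% Show the last few merges, which are usually the interesting ones
T = table(numClusters, mergeDist, jump);
disp(T(end-5:end,:));

% Heuristic (an assumption, not part of evalclusters):
% stop just before the merge that causes the largest jump in distance
[~, j] = max(diff(mergeDist));
kSuggested = n - j;                    % clusters remaining after merge step j

% Cluster assignments for that choice
idx = cluster(Z,'maxclust',kSuggested);
```

Row i of Z describes the i-th merge, so after that merge n-i clusters remain. That is how the distances line up with the number of clusters in the table above.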