Scatter-plot of data in which the cluster membership is coded by colors.

Mark S on 31 May 2021
Edited: Mark S on 1 Jun 2021
Hello, I have created a dendrogram of my given data.
NumCluster = 1566;
dist = pdist(alldata, 'euclidean');
GroupsMatrix = linkage(dist, 'complete');
clust = cluster(GroupsMatrix, 'maxclust', NumCluster);
E = evalclusters(alldata,clust,'CalinskiHarabasz')
[H,T,perm] = dendrogram(GroupsMatrix, 1566, 'colorthreshold', 'default');
I now want to create a scatter plot of the data in which cluster membership is coded by color. I tried to implement it like this:
gscatter (alldata(:,1),alldata(:,2), E.OptimalY,'rbgk','xod')
Update:
The error is gone now, but every point in the scatter plot has the same color. How can I show cluster membership with different colors? Also, is my E set up correctly for the number of clusters? My E has 1566 clusters, and I do not know whether that is okay.
2 comments
the cyclist on 31 May 2021
It can be difficult to diagnose issues without the data. Specifically, we don't see how E is defined. Can you upload the data, such that we can actually run your code and reproduce the error?


Answers (1)

KSSV on 1 Jun 2021
Edited: KSSV on 1 Jun 2021
gscatter (alldata(:,1),alldata(:,2),clust,'rbgk','xod')
Check E.OptimalY: it is empty ([]), so all the points are drawn with the same color and marker. Pass clust as the grouping variable instead.
3 comments
KSSV on 1 Jun 2021
Edited: KSSV on 1 Jun 2021
alldata = csvread('Data.csv') ;
NumCluster = 10; % <-----change cluster number here
dist = pdist(alldata, 'euclidean');
GroupsMatrix = linkage(dist, 'complete');
clust = cluster(GroupsMatrix, 'maxclust', NumCluster);
E = evalclusters(alldata,clust,'CalinskiHarabasz') ;
gscatter (alldata(:,1),alldata(:,2),clust)
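Since Data.csv is not attached to the thread, here is a minimal sketch of the same pipeline on synthetic data, so it can be run as-is. The three Gaussian blobs, the seed, and NumCluster = 3 are assumptions made for illustration only; they are not the original data. It needs the Statistics and Machine Learning Toolbox, like the code above.

```matlab
% Synthetic stand-in for Data.csv (assumption: two numeric columns).
rng(1);                                        % fixed seed for reproducibility
alldata = [randn(50,2); randn(50,2) + 6; randn(50,2) + [6 -6]];

NumCluster = 3;                                % illustrative value
dist = pdist(alldata, 'euclidean');            % pairwise distances
GroupsMatrix = linkage(dist, 'complete');      % complete-linkage tree
clust = cluster(GroupsMatrix, 'maxclust', NumCluster);  % one label per row

% The third argument of gscatter is the grouping variable:
% points that share a label are drawn in the same color.
figure;
gscatter(alldata(:,1), alldata(:,2), clust);
xlabel('column 1');
ylabel('column 2');
```

Because clust holds one integer label per row of alldata, the plot legend lists one entry per cluster. With a very large NumCluster the colors repeat, so neighboring groups become hard to tell apart.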
Mark S on 1 Jun 2021
Edited: Mark S on 1 Jun 2021
Thanks, it works fine. One additional question: how can I find the optimal number of clusters? Is it best to vary NumCluster myself, or is there another method? I found this in the MATLAB help:
https://it.mathworks.com/matlabcentral/answers/76879-determining-the-optimal-number-of-clusters-in-kmeans-technique
klist=2:500;%the number of clusters you want to try
myfunc = @(X,K)(kmeans(X, K));
eva = evalclusters(alldata,myfunc,'CalinskiHarabasz','klist',klist)
classes=kmeans(alldata,eva.OptimalK);
With this I get OptimalK = 3, but I am not sure whether that is right. Is this a valid way to compute the optimal number of clusters?
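A minimal sketch of the same idea applied to the complete-linkage clustering used earlier in this thread, rather than to kmeans. evalclusters also accepts a matrix of clustering solutions with one column per candidate, so the tree can be cut at several cluster counts and each cut scored. The synthetic data, the seed, and the candidate range 2:8 are assumptions for illustration, not values from the original data set.

```matlab
% Synthetic stand-in for the real data (assumption: two numeric columns).
rng(1);                                        % fixed seed for reproducibility
alldata = [randn(50,2); randn(50,2) + 6; randn(50,2) + [6 -6]];

klist = 2:8;                                   % candidate cluster counts (assumption)
GroupsMatrix = linkage(pdist(alldata, 'euclidean'), 'complete');

% Cut the same tree once per candidate; each column is one solution.
solutions = zeros(size(alldata, 1), numel(klist));
for i = 1:numel(klist)
    solutions(:, i) = cluster(GroupsMatrix, 'maxclust', klist(i));
end

% Score every column with the Calinski-Harabasz criterion.
eva = evalclusters(alldata, solutions, 'CalinskiHarabasz');

% Plot the best-scoring solution, colored by cluster membership.
figure;
gscatter(alldata(:,1), alldata(:,2), eva.OptimalY);
```

Here eva.OptimalK is the best-scoring number of clusters among the candidates, and eva.OptimalY holds the matching labels. Note that kmeans starts from random centroids, so the kmeans-based OptimalK above can change between runs unless the random seed is fixed with rng.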
