problem with hierarchical clustering
Hi guys, I need your help to solve this problem. I need to perform hierarchical clustering for a university project. I wrote this code:
a = [.2 0 0 .2 .6 0; 0 0 0 0 1 0; 0 .11 0 .11 .22 .56; 0 0 .25 0 0 .75; .1 0 0 .3 .5 .1; .06 .16 .08 .12 .04 .53];
b = [0 0 0 .2 .4 .4; 0 .24 .06 .06 .29 .35; 0 0 0 0 0 0; .33 .33 0 0 0 .33; .05 0 .09 .27 .45 .14; .21 .11 0 .11 .24 .34];
c = [.3 .1 0 .1 0 .5; 0 0 0 0 0 0; 0 0 0 0 0 0; .21 .12 .18 .24 .06 .18; 0 0 0 0 0 0; .29 .14 .04 .25 .11 .18];
d = [.18 .18 .09 .18 .09 .27; 0 0 .17 .33 0 .5; 0 0 .29 .29 0 .43; 0 .18 .09 .09 0 .64; 0 0 0 0 0 0; .15 .14 .05 .18 .01 .47];
e = [0 0 0 0 0 0; .18 .18 .27 .18 .18 0; 0 0 0 0 0 0; 0 0 0 0 0 0; .11 .2 .17 .03 .2 .29; .2 0 0 0 .1 .7];
f = [0 0 0 0 0 0; 0 .43 0 .14 .14 .29; 0 0 0 0 0 0; 0 0 0 1 0 0; 0 0 0 0 0 0; .18 .09 .05 .14 .14 .41];
g = [.21 .07 0 .21 .21 .29; 0 0 .67 0 0 .33; .05 .11 .37 .21 .11 .16; .14 .09 .05 .23 .23 .27; .13 .11 .18 .11 .13 .33; 0 0 .25 .25 .50 .0];
h = [0 0 0 0 0 0; .15 .2 .15 .2 .15 .15; .11 .1 .11 0 0 .67; 0 0 0 0 0 1; .11 .11 .22 .11 .11 .33; .27 .18 .14 .14 .14 .14];
i = [.1 0 .1 .1 .4 .3; 0 0 0 0 0 0; .12 .04 .08 .08 .42 .25; 0 0 0 0 0 0; .32 .09 .12 .03 .27 .17; .17 0 .02 .13 .43 .26];
l = [.29 0 0 .07 0 .64; 0 0 0 0 0 0; 0 0 0 0 0 0; 0 0 0 0 0 1; 0 0 0 1 0 0; .31 .04 .00 .02 .13 .51];
m = [0 .5 0 0 0 .5; 0 0 0 0 0 1; 0 0 0 0 0 1; .09 .18 0 0 .09 .64; 0 0 0 0 0 1; .08 .06 .01 .13 .05 .68];
n = [.26 .09 .00 .04 .17 .43; 0 0 0 0 0 0; 0 0 0 0 0 0; .29 0 0 0 .29 .43; .17 0 0 .17 .17 .5; .20 .04 .02 .1 .24 .41];
o = [0 .25 0 .25 0 .5; 0 0 0 0 0 0; 0 0 0 0 0 0; .27 0 0 .14 .09 .5; 0 0 0 0 0 0; .41 .05 .05 .19 .08 .22];
p = [.25 0 0 .17 .08 .5; 0 0 0 0 0 0; 0 0 0 0 0 0; 1 0 0 .33 0 .57; 0 0 0 0 0 0; .06 .06 00 .25 .00 .62];
q = [.29 .24 0 .1 .14 .24; 1 0 0 0 0 0; 0 0 0 0 0 0; 0 0 0 0 0 0; .33 .04 0 .06 .43 .14; 0 0 0 0 .33 .67];
r = [.22 .04 .09 .04 .35 .26; 0 0 0 0 0 0; .25 .05 .05 .05 .25 .35; 0 0 .14 .14 .33 .33; 0 0 .08 .15 .23 .54; .17 .06 .17 .11 .17 .33];
s = [.37 .21 00 .16 00 .26; 0 .5 0 0 0 .5; 0 0 0 0 0 0; 0 0 0 0 0 0; .45 0 0 .23 .05 .27; .05 .05 .05 0 .18 .68];
t = [0 0 0 .67 .17 .17; 0 0 0 0 0 0; 0 0 0 0 0 0; 0 0 0 0 0 0; .20 0 0 .2 0 .6; .22 .02 .04 .07 0 .64];
u = [.08 00 .08 .25 0 .58; 0 0 0 0 0 0; 0 0 0 0 0 0; .08 0 .08 .33 0 .5; 0 0 0 0 0 0; .13 .03 .06 .1 0 .67];
v = [.11 0 .11 0 .33 .44; 0 0 0 0 0 0; 0 0 0 0 0 1; 0 0 0 0 0 0; 0 0 .07 .07 .22 .63; .09 .04 .09 .07 .20 .52];
all_data = [a;b;c;d;e;f;g;h;i;l;m;n;o;p;q;r;s;t;u;v];
size(all_data)
Y = pdist(all_data);
Z = linkage(Y,'average');
dendrogram(Z);
I defined twenty matrices, so why does the dendrogram show more than twenty nodes? Where am I going wrong?
Sorry for my bad English.
2 comments
Brendan Hamm
on 30 Sep 2016
On the first iteration, the two closest points are combined into a new node, and distances are then measured from this node to the others.
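To make that merging step concrete, here is a minimal sketch using four made-up 1-D points (not the data from the question). It assumes the Statistics and Machine Learning Toolbox is available for pdist and linkage, and the variable names are only illustrative.

```matlab
% Four 1-D points, chosen only for illustration (not the poster's data).
X = [0; 1; 5; 9];
Y = pdist(X);                  % pairwise Euclidean distances
Z = linkage(Y, 'average');     % each row of Z records one merge

m = size(X, 1);                % number of original points (leaves)
firstPair = sort(Z(1, 1:2));   % the two leaves merged on the first iteration
firstDist = Z(1, 3);           % distance at which they were merged
newNodeIndex = m + 1;          % index linkage assigns to that new cluster
lastPair = sort(Z(end, 1:2));  % the final merge joins two earlier clusters
lastDist = Z(end, 3);          % average distance between those clusters
```

Each row of Z is one merge, so m leaves always produce m - 1 rows. Indices greater than m in the first two columns refer to clusters created by earlier rows.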
Answers (1)
Kelly Kearney
on 30 Sep 2016
You concatenated twenty 6 x 6 matrices vertically. This results in 120 rows in your matrix, and pdist treats each row as one observation, hence 120 leaves in your dendrogram (assuming you actually show all the leaves; by default, dendrogram truncates the number of displayed leaves to 30).
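The row count can be checked directly. The sketch below uses random stand-in matrices rather than the values from the question (an assumption made purely to keep the example short) and requires the Statistics and Machine Learning Toolbox.

```matlab
% Stand-in data: 20 random 6x6 matrices (hypothetical, not the poster's values).
rng(0);
mats = arrayfun(@(k) rand(6), 1:20, 'UniformOutput', false);

all_data = vertcat(mats{:});   % stacks the rows: 20*6 = 120 rows, 6 columns
nObs = size(all_data, 1);      % pdist treats every row as one observation

Z = linkage(pdist(all_data), 'average');
nMerges = size(Z, 1);          % one merge per row of Z, i.e. nObs - 1

% Uncomment to plot:
% dendrogram(Z);      % default view collapses the tree to at most 30 leaf nodes
% dendrogram(Z, 0);   % passing 0 as the second argument shows all 120 leaves
```

With the default call, the plotted leaf nodes can each stand for several original rows, which is why the plot shows 30 nodes rather than 20 or 120.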
2 comments
Kelly Kearney
on 3 Oct 2016
Sure, if that makes sense for your application. You'll just need to reshape the matrix so all 36 property variables are in a single row for each leaf.
I would start by changing your variable naming system. The hard-coded alphabetical names are just asking for trouble. A 3D matrix or cell array will let you store the same data with much easier indexing.
So instead of
a = [.2 0 0 .2 .6 0; 0 0 0 0 1 0; 0 .11 0 .11 .22 .56; 0 0 .25 0 0 .75; .1 0 0 .3 .5 .1; .06 .16 .08 .12 .04 .53];
b = [0 0 0 .2 .4 .4; 0 .24 .06 .06 .29 .35; 0 0 0 0 0 0; .33 .33 0 0 0 .33; .05 0 .09 .27 .45 .14; .21 .11 0 .11 .24 .34];
c = [.3 .1 0 .1 0 .5; 0 0 0 0 0 0; 0 0 0 0 0 0; .21 .12 .18 .24 .06 .18; 0 0 0 0 0 0; .29 .14 .04 .25 .11 .18];
d = [.18 .18 .09 .18 .09 .27; 0 0 .17 .33 0 .5; 0 0 .29 .29 0 .43; 0 .18 .09 .09 0 .64; 0 0 0 0 0 0; .15 .14 .05 .18 .01 .47];
try this:
ndata = 4;
var = nan(6,6,ndata);
var(:,:,1) = [.2 0 0 .2 .6 0; 0 0 0 0 1 0; 0 .11 0 .11 .22 .56; 0 0 .25 0 0 .75; .1 0 0 .3 .5 .1; .06 .16 .08 .12 .04 .53];
var(:,:,2) = [0 0 0 .2 .4 .4; 0 .24 .06 .06 .29 .35; 0 0 0 0 0 0; .33 .33 0 0 0 .33; .05 0 .09 .27 .45 .14; .21 .11 0 .11 .24 .34];
var(:,:,3) = [.3 .1 0 .1 0 .5; 0 0 0 0 0 0; 0 0 0 0 0 0; .21 .12 .18 .24 .06 .18; 0 0 0 0 0 0; .29 .14 .04 .25 .11 .18];
var(:,:,4) = [.18 .18 .09 .18 .09 .27; 0 0 .17 .33 0 .5; 0 0 .29 .29 0 .43; 0 .18 .09 .09 0 .64; 0 0 0 0 0 0; .15 .14 .05 .18 .01 .47];
(I'm only demonstrating with 4 matrices for now, but you should list all 20 and set ndata = 20.)
Now you can easily reshape that array into the nleaf x nproperty array needed by pdist (20 x 36, in your example):
Y = pdist(reshape(var,[],ndata)');
Z = linkage(Y,'average');
dendrogram(Z, 0);
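As a quick check of what that reshape produces, and as a sketch of the cell-array alternative mentioned above, the following uses random stand-in matrices (hypothetical data, not the values from the question). The names data, stack, X1, and X2 are illustrative only.

```matlab
% Hypothetical stand-in data: 4 random 6x6 matrices in a cell array.
rng(0);
ndata = 4;
data = arrayfun(@(k) rand(6), 1:ndata, 'UniformOutput', false);

% Variant 1: stack into a 6x6xndata array, then reshape as in the answer.
stack = cat(3, data{:});
X1 = reshape(stack, [], ndata)';     % ndata x 36, one row per matrix

% Variant 2: flatten each matrix column-wise into a single row.
rows = cellfun(@(M) M(:)', data, 'UniformOutput', false);
X2 = vertcat(rows{:});               % also ndata x 36

sameLayout = isequal(X1, X2);        % both variants give the same rows

Z = linkage(pdist(X1), 'average');   % ndata leaves, ndata - 1 merges
```

Either way, each 6 x 6 matrix becomes one 36-element observation, so the dendrogram has one leaf per matrix. A name such as stack also avoids shadowing MATLAB's built-in var function, which computes variance.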