Matlab find unique column-combinations in matrix and respective index

Question

Benvaulter on 22 Mar 2017

https://la.mathworks.com/matlabcentral/answers/331309-matlab-find-unique-column-combinations-in-matrix-and-respective-index

Edited: Jan on 23 Mar 2017

I have a large matrix with many rows and a limited (but larger than 1) number of columns, containing values between 0 and 9, and would like to find an efficient way to identify the unique row-wise combinations and their indices, to then build sums (somewhat like a pivot logic). Here is an example of what I am trying to achieve:

a =

   1     2     3
   2     2     3
   3     2     1
   1     2     3
   3     2     1

uniqueCombs =

   1     2     3
   2     2     3
   3     2     1

numOccurrences =

 2
 1
 2

indices:

[1;4]
[2]
[3;5]

From matrix a, I want to first identify the unique combinations (row-wise), then count the number of occurrences and identify the row indices of each combination.

I have achieved this by generating strings with num2str and strcat, but this method appears to be very slow. Along these lines, I have tried to find a way to form a new unique number by concatenating the values horizontally, but Matlab does not seem to support this (e.g. from [1;2;3] build 123). Sums won't work because they would remove the possibility to identify unique combinations. Any suggestions on how to best achieve this? Thanks!
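For reference, the "build 123 from 1, 2, 3" idea can be written as a matrix product with powers of ten. This is only a sketch under the assumptions stated in the question (integer entries between 0 and 9) plus an assumed limit of about 15 columns so that every key stays exact in double precision; the variable names are illustrative.

```matlab
% Example matrix taken from the answers in this thread
A = [1 2 3; 2 2 3; 3 2 1; 1 2 3; 3 2 1];

% Assumption: every entry is an integer in 0..9 and there are at most
% 15 columns, so each key is represented exactly as a double.
nCols   = size(A, 2);
weights = (10 .^ (nCols-1:-1:0)).';   % [100; 10; 1] for three columns
keys    = A * weights;                % one number per row, e.g. [1 2 3] -> 123

% Unique keys, and for every row of A the position of its key in uniqueKeys
[uniqueKeys, ~, groupIdx] = unique(keys);
numOccurrences = accumarray(groupIdx, 1);   % rows per unique key
```

For general data, unique(A, 'rows') as used in the answers avoids the digit and column-count assumptions.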

Answer 1

Guillaume on 22 Mar 2017

https://la.mathworks.com/matlabcentral/answers/331309-matlab-find-unique-column-combinations-in-matrix-and-respective-index#answer_259890

More or less the same as Jan's, using accumarray instead of splitapply (I'm still old school!):

A = [ 1     2     3
      2     2     3
      3     2     1
      1     2     3
      3     2     1];
[B, ~, ib] = unique(A, 'rows');
numoccurences = accumarray(ib, 1);
indices = accumarray(ib, find(ib), [], @(rows){rows});  % find(ib) simply generates (1:size(A,1))'
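Since the question mentions building sums per combination (the pivot-like step), here is a minimal sketch of how the same ib vector could be reused for that. The values column is hypothetical data invented for illustration; it is not part of the original question.

```matlab
A = [1 2 3; 2 2 3; 3 2 1; 1 2 3; 3 2 1];
values = [10; 20; 30; 40; 50];          % hypothetical data, one value per row of A

[B, ~, ib] = unique(A, 'rows');         % ib(k) is the row of B that equals A(k, :)
groupSums  = accumarray(ib, values);    % sum of values for each unique combination

% Show each unique combination next to its sum
disp([B, groupSums]);
```

accumarray sums by default, so no function handle is needed for this case.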

Guillaume on 23 Mar 2017

Edited: Guillaume on 23 Mar 2017

I suspect that accumarray will be faster as it is built-in compiled code whereas splitapply is m code, but I haven't conducted any test.

Note: for the indices,

indices = accumarray(ib, (1:numel(ib))', [], @(rows){rows});

is probably slightly faster, just not as concise.

Jan on 23 Mar 2017

Edited: Jan on 23 Mar 2017

@Guillaume: I compare this with cellfun: older versions of Matlab shipped the C source of this Mex function. There, calling a function handle is very expensive, because the Matlab layer has to be called. Therefore the built-in operations that are selected by passing a string are much faster: 'length', 'isclass', etc.

Then using a compiled Mex function is not a real benefit, because mexCallMATLAB has some overhead. This might concern accumarray also. I guess that your accumarray approach is faster than the loop, but I know that it looks very cryptic ;-)
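To illustrate the two calling styles of cellfun mentioned here, a small sketch follows. The cell array C is made-up example data, and no timing claim is attached; the snippet only shows that the string form and the function-handle form return the same result.

```matlab
C = {1:3, 'abcd', [], {1, 2}};          % made-up example data

lenString = cellfun('length', C);       % legacy string form, handled inside cellfun
lenHandle = cellfun(@length, C);        % function-handle form, calls length per cell

isDbl = cellfun('isclass', C, 'double');   % legacy string form with a class name argument
```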

But now I can leave the speculations and run a test: With

A = randi([1, 100], 1e5, 3); % Test data

my loop takes 14.75 seconds, your accumarray approach takes 0.44 seconds. The results differ in the order of the indices. So perhaps this is wanted:

[B, iB, iA] = unique(A, 'rows');
indices     = accumarray(iA, (1:numel(iA)).', [], @(r){sort(r)});

The result is clear: @Benvaulter, please unaccept my answer and select Guillaume's, and of course use it also to save time and energy.
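For anyone who wants to repeat the comparison, a rough sketch using tic/toc is shown below. It uses fewer rows than the 1e5 in the test above so that the loop finishes quickly, and the absolute times will depend on the machine and Matlab release. The variable names are illustrative.

```matlab
A = randi([1, 100], 1e4, 3);    % smaller test data than above, to keep the run short

[~, ~, iA] = unique(A, 'rows');
n = max(iA);                    % number of unique rows

% Loop approach: one find per group
tic;
idxLoop = cell(n, 1);
for k = 1:n
    idxLoop{k} = find(iA == k);
end
tLoop = toc;

% accumarray approach, with sorted indices inside each cell
tic;
idxAcc = accumarray(iA, (1:numel(iA)).', [], @(r){sort(r)});
tAcc = toc;

fprintf('loop: %.3f s, accumarray: %.3f s\n', tLoop, tAcc);
sameResult = isequal(idxLoop, idxAcc);   % both approaches should agree after sorting
```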

Answer 2

Jan on 22 Mar 2017

https://la.mathworks.com/matlabcentral/answers/331309-matlab-find-unique-column-combinations-in-matrix-and-respective-index#answer_259879

Edited: Jan on 23 Mar 2017

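A = [ 1     2     3; ...
      2     2     3; ...
      3     2     1; ...
      1     2     3; ...
      3     2     1];
[B, iB, iA] = unique(A, 'rows');   % iA(k) is the row of B that equals A(k, :)
G = unique(iA);                    % group numbers 1..size(B, 1)
% The grouping vector needs one entry per data element, so iA is grouped by itself
numOccurrences = splitapply(@numel, iA, iA);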

I cannot test a method to obtain the indices list as wanted. I assume this works with splitapply also. A simple loop approach at least:

n = length(G);
indices = cell(1, n);
for k = 1:n
  indices{k} = find(iA == G(k));
end
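The splitapply variant assumed above could look like the following sketch. It relies on splitapply accepting a function that wraps each group in a 1-by-1 cell, and it sorts inside each group so the result does not depend on the order in which elements are passed; the variable rowNumbers is an illustrative name.

```matlab
A = [1 2 3; 2 2 3; 3 2 1; 1 2 3; 3 2 1];
[B, ~, iA] = unique(A, 'rows');

rowNumbers = (1:numel(iA)).';                          % row index of every row of A
indices = splitapply(@(r){sort(r)}, rowNumbers, iA);   % one cell of row indices per group
numOccurrences = cellfun(@numel, indices);             % group sizes from the index lists
```

For the example matrix this should give the index lists [1;4], [2] and [3;5] from the question.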

[EDITED] Code is tested now. Use the much faster solution of Guillaume for productive work.

Benvaulter on 23 Mar 2017

Perfect solution to my problem - thanks a lot!
