Removing duplicate rows (not "unique")

Michael Siebold on 4 May 2016
Commented: saad sulaiman on 5 Nov 2022
I have a matrix with many (1e5+) rows and I want to remove both copies of all duplicate rows. Is there a fast way to do this? (This function needs to be run many times.)
4 comments
jgg on 4 May 2016
You can use the other calling methods to get replicate counts.
a = [1 2; 1 2; 2 3; 2 4; 2 5; 4 2; 4 2; 1 3; 1 3; 4 5];
[C,ia,ic] = unique(a,'rows');
[count key] = hist(ic,unique(ic));
Then you can just select the keys with non-unit counts and drop them.
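One possible way to finish the idea above (a sketch, not necessarily what jgg had in mind) is to count how often each unique row occurs and keep only the rows whose count is 1. Here `accumarray` stands in for `hist`, and the names `counts`, `keep` and `A_reduced` are illustrative.

```matlab
a = [1 2; 1 2; 2 3; 2 4; 2 5; 4 2; 4 2; 1 3; 1 3; 4 5];

% ic(k) is the index of the unique row that a(k,:) maps to
[~, ~, ic] = unique(a, 'rows');

% counts(j) = number of rows of a equal to the j-th unique row
counts = accumarray(ic(:), 1);

% Keep rows whose group occurs exactly once (original order is preserved)
keep = counts(ic(:)) == 1;
A_reduced = a(keep, :);
```

For this `a`, the rows that survive are `[2 3]`, `[2 4]`, `[2 5]` and `[4 5]`; both copies of `[1 2]`, `[4 2]` and `[1 3]` are dropped.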
Michael Siebold on 4 May 2016
Perfect and thanks a million! I kept messing with ia and ic, but just wasn't thinking histogram... Would you mind submitting this as an answer so I can accept it?


Accepted Answer

Roger Stafford on 5 May 2016
Edited: Roger Stafford on 5 May 2016
Let A be your matrix.
[B,ix] = sortrows(A);
f = find(diff([false;all(diff(B,1,1)==0,2);false])~=0);
s = ones(length(f)/2,1);
f1 = f(1:2:end-1); f2 = f(2:2:end);
t = cumsum(accumarray([f1;f2+1],[s;-s],[size(B,1)+1,1]));
A(ix(t(1:end-1)>0),:) = []; % <-- Corrected
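Since the code above is terse, here is the same sequence of steps traced on a toy matrix, with the intermediate values written out as comments. The input and the helper names `d` and `dup` are made up for illustration; the logic is unchanged.

```matlab
A = [1 2; 2 3; 1 2; 4 5; 1 2];   % toy input: row [1 2] occurs three times

[B, ix] = sortrows(A);           % B = [1 2; 1 2; 1 2; 2 3; 4 5], ix = [1;3;5;2;4]
d = all(diff(B,1,1) == 0, 2);    % d = [1;1;0;0]: sorted row k equals row k+1
f = find(diff([false; d; false]) ~= 0);  % f = [1;3]: a run of equal rows spans sorted rows 1..3
f1 = f(1:2:end-1);               % first sorted row of each duplicate run
f2 = f(2:2:end);                 % last sorted row of each duplicate run
s = ones(length(f)/2, 1);

% +1 at each run start, -1 just past each run end; the running sum is
% positive exactly on the sorted rows that belong to a duplicate run
t = cumsum(accumarray([f1; f2+1], [s; -s], [size(B,1)+1, 1]));
dup = t(1:end-1) > 0;            % dup = [1;1;1;0;0] in sorted order

A(ix(dup), :) = [];              % maps back to original rows 1, 3 and 5
```

After the last line, `A` is `[2 3; 4 5]`. Sorting brings equal rows next to each other, so one pass of `diff` is enough to find them, which is likely where the speed comes from.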
6 comments
Michael Siebold on 5 May 2016
Edited: Michael Siebold on 5 May 2016
And this solution is even faster than the first suggestion in the comments! Thanks for all the help!
saad sulaiman on 5 Nov 2022
Greetings.
How could we apply this code to a mesh where we have coordinate points for each triangle, so that we remove the internal edges, i.e. the edges shared by two triangles?
Thanks in advance.
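One way to read this question: list every triangle edge as a row, then drop every edge that occurs more than once, so that only the boundary edges remain. A minimal sketch, assuming the mesh is available as a hypothetical `tri` array of vertex indices (one triangle per row) rather than as raw coordinates; all variable names are illustrative.

```matlab
% Two triangles sharing the edge between vertices 2 and 3 (hypothetical mesh)
tri = [1 2 3; 2 4 3];

% All three edges of every triangle, one edge per row
E = [tri(:,[1 2]); tri(:,[2 3]); tri(:,[3 1])];

% Sort each edge's endpoints so that [3 2] and [2 3] compare equal
E = sort(E, 2);

% Count occurrences of each distinct edge and keep the ones seen only once
[~, ~, ic] = unique(E, 'rows');
counts = accumarray(ic(:), 1);
boundary = E(counts(ic(:)) == 1, :);
```

Here the shared edge `[2 3]` is removed and the four outer edges remain. If the triangles are stored as coordinates, the points would first need to be mapped to vertex indices, for example via the third output of `unique(..., 'rows')` on the stacked point list. The sorted edge list `E` could equally be passed to the accepted answer's code in place of `A`.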


More Answers (2)

Azzi Abdelmalek on 4 May 2016
Edited: Azzi Abdelmalek on 4 May 2016
A=randi(5,10^5,3);
tic
A=unique(A,'rows');
toc
The result
Elapsed time is 0.171778 seconds.
3 comments
Azzi Abdelmalek on 4 May 2016
Edited: Azzi Abdelmalek on 4 May 2016
You said that the unique function will leave a copy of duplicate rows. With this example, I show you that there are no duplicate rows stored! It also doesn't take much time.
Mitsu on 3 Aug 2021
I reckon your answer does not address OP's question because running the following:
A=[1 1 1;1 1 1;1 1 0];
tic
A=unique(A,'rows');
toc
Will yield:
A =
     1     1     0
     1     1     1
Therefore, A still contains one instance of each row that was duplicated. I believe Michael wanted all instances of each row that appears multiple times to be removed.



GeeTwo on 16 Aug 2022
% Here's a much cleaner way to do it with R2019a or later!
[B,BG]=groupcounts(A);
A_reduced=BG(B==1); % or just A if you want the results in the same variable.
