How can I remove inverted repeat pairs of strings from a table?
Paul Jimenez
on 6 May 2024
Commented: Voss
on 7 May 2024
Hi, I want to extract the inverted repeat pairs of strings from a 650x2 table. Let's say I have the following pairs in a table:
A123.B123 B123.C123
A456.B456 B456.C456
A789.B789 B789.C789
B123.C123 A123.B123
B456.C456 A456.B456
. .
. .
. .
As you can see, some pairs become identical to another pair when their order is inverted, for example the first pair and the fourth pair. I want to extract those inverted repeated pairs from my table, but I don't know how to do it. I tried the "unique" function, but that doesn't seem to work for inverted repeats. Any suggestions?
3 comments
Dyuman Joshi
on 7 May 2024
Edited: Dyuman Joshi
on 7 May 2024
@Paul Jimenez, there are no inverted string pairs in the data you have:
readtable('table.csv')
Accepted Answer
Voss
on 7 May 2024
Edited: Voss
on 7 May 2024
T = readtable('table.csv')
Here's one way to find pairs of reversed rows:
temp = string(T.(1)) == string(T.(2)).';
[r2,r1] = find(temp & temp.');
r = [r1 r2];
disp(r)
That says row 1 is a reversed copy of row 104, row 2 is a reversed copy of row 33, and so on.
Checking the first few, they do seem to be reversed pairs of rows:
T{r(1,:),:} % rows 1 and 104
T{r(2,:),:} % rows 2 and 33
T{r(3,:),:} % rows 3 and 69
I'm not sure exactly what you want to do with this information.
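To make the comparison-matrix idea above easier to follow, here is a minimal sketch that runs the same three lines on the five sample rows from the question instead of table.csv. The variable names C1 and C2 and the small table are assumptions made for illustration; they are not the contents of the attached file.

```matlab
% Sample rows copied from the question (not the actual table.csv)
C1 = ["A123.B123"; "A456.B456"; "A789.B789"; "B123.C123"; "B456.C456"];
C2 = ["B123.C123"; "B456.C456"; "B789.C789"; "A123.B123"; "A456.B456"];
T = table(C1, C2);

% temp(i,j) is true when column 1 of row i equals column 2 of row j
temp = string(T.(1)) == string(T.(2)).';

% (temp & temp.') is true at (i,j) only when rows i and j are
% reversed copies of each other
[r2, r1] = find(temp & temp.');
r = [r1 r2];
disp(r)
```

On this sample, r lists rows 1 and 4 and rows 2 and 5 as reversed pairs, each reported twice (once in each direction). Row 3 has no reversed partner, so it does not appear.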
1 comment
Voss
on 7 May 2024
Here's a slight modification that's useful for removing one of each pair of reversed rows from the table:
T = readtable('table.csv')
temp = string(T.(1)) == string(T.(2)).';
idx = tril(temp & temp.');
idx(1:size(T,1)+1:end) = false; % to avoid removing a row that is the reverse of itself,
% set elements of idx along the diagonal to false
[r,~] = find(idx);
T(r,:) = []
Checking again for reversed pairs of rows confirms that the only ones left are the reverse of themselves:
temp = string(T.(1)) == string(T.(2)).';
[r,~] = find(tril(temp & temp.'))
T(r,:)
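Since the question mentions trying "unique", here is a sketch of a different route to a similar result: sort the two strings within each row so that a row and its reverse become identical, then call unique on a combined key. This is an alternative technique, not the method used above. It assumes the separator "|" never occurs in the data, and the sample rows are taken from the question rather than from table.csv. Unlike the tril approach, it also drops exact duplicate rows, not only reversed ones.

```matlab
% Sample rows copied from the question (not the actual table.csv)
C1 = ["A123.B123"; "A456.B456"; "A789.B789"; "B123.C123"; "B456.C456"];
C2 = ["B123.C123"; "B456.C456"; "B789.C789"; "A123.B123"; "A456.B456"];
T = table(C1, C2);

% Sort the two strings within each row, so [x y] and [y x] give the same result
S = sort([string(T.(1)) string(T.(2))], 2);

% Build one key per row; "|" is an assumed separator that is not in the data
key = S(:,1) + "|" + S(:,2);

% Keep the first occurrence of each unordered pair, preserving row order
[~, keep] = unique(key, 'stable');
T = T(keep, :)
```

On the sample rows this keeps rows 1 to 3 and removes rows 4 and 5, the later member of each reversed pair.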
More Answers (1)
Mathieu NOE
on 6 May 2024
Hello Paul,
This would be my suggestion.
Attached is your data, simply pasted into a text file.
Hope it helps.
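clc
data = readcell('data.txt');
first_col = data(:,1);
second_col = data(:,2);
% main loop
n = 0;
ind1 = []; % row indices in the first column
ind2 = []; % matching row indices in the second column
for k = 1:numel(first_col)
    % rows of the second column equal to the k-th string of the first column
    tf = strcmp(first_col{k},second_col);
    if any(tf)
        n = n + 1; % increase counter
        ind1(n,1) = k;
        ind2(n,1) = find(tf,1); % first match only; find(tf) errors here if there are several
    end
end
% all matching pairs
out = [ind1 ind2]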
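For comparison, here is a loop-free sketch that matches whole rows against reversed rows with ismember, so that both strings of a pair have to agree. It is a variant, not the code from this answer. The separator "|" is an assumption (any text that does not occur in the data would do), and the sample rows come from the question rather than from the attached file.

```matlab
% Sample rows copied from the question (not the attached data file)
C1 = ["A123.B123"; "A456.B456"; "A789.B789"; "B123.C123"; "B456.C456"];
C2 = ["B123.C123"; "B456.C456"; "B789.C789"; "A123.B123"; "A456.B456"];

% One key per row in forward order and one in reversed order;
% "|" is an assumed separator that does not occur in the data
fwd = C1 + "|" + C2;
rev = C2 + "|" + C1;

% loc(k) is the lowest row index whose reversed key equals row k, or 0 if none
[tf, loc] = ismember(fwd, rev);

% all matching pairs: [row, row that is its reverse]
pairs = [find(tf) loc(tf)]
```

Because ismember returns only the lowest matching index, a row whose reverse occurs several times is paired with the first of them.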
2 comments
Mathieu NOE
on 7 May 2024
Hello again,
It seems that in the CSV file each column contains duplicate strings,
so I simply ran the same process using only the unique strings of each column. Of course, that is not the same list as your original file.
Is this what you wanted or not?
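data = readcell('table.csv');
first_col = unique(data(:,1));
second_col = unique(data(:,2));
% main loop
n = 0;
ind1 = []; % reset so results from an earlier run are not carried over
ind2 = [];
for k = 1:numel(first_col)
    tf = strcmp(first_col{k},second_col);
    if any(tf)
        n = n + 1; % increase counter
        ind1(n,1) = k;
        ind2(n,1) = find(tf); % at most one match, since second_col is unique
    end
end
% all matching pairs (indices into the unique lists, not into the original rows)
out = [ind1 ind2]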