Borrar filtros
Borrar filtros

How can the following code be optimized / vectorized?

2 visualizaciones (últimos 30 días)
L'O.G.
L'O.G. el 16 de Mzo. de 2022
Comentada: L'O.G. el 16 de Mzo. de 2022
How can you optimize the following? It is painfully slow and takes around 5 minutes to run on my computer. Here I am calcuatling the distances between atoms in molecules while taking account periodic boundary conditions (which mean an atom on the edge of the box might actually belong to a molecule that wraps around, so to speak). I imagine this could be done in a far more efficient and speedy way than my Fortran-esque implementation.
A % an array where column 1 is the atom ID, column 2 is the molecule ID, and columns 3-5 are the positions in x, y, and z
N = 200; % molecules in system
Sites = 150; % sites in each molecule
length = 120; % length of box on each side
SitesInSystem = N*Sites;
B = zeros(SitesInSystem,SitesInSystem);
% center box at origin
A(:,3) = A(:,3) - length/2;
A(:,4) = A(:,4) - length/2;
A(:,5) = A(:,5) - length/2;
for ii = 1:SitesInSystem
for jj = 1:SitesInSystem
if ii ~= jj % if not the same site
dx = A(jj,3) - A(ii,3);
dy = A(jj,4) - A(ii,4);
dz = A(jj,5) - A(ii,5);
% taking into account periodic boundaries
if (dx > length * 0.5)
dx = dx - length;
elseif (dx <= -length * 0.5)
dx = dx + length;
end
if (dy > length * 0.5)
dy = dy - length;
elseif (dy <= -length * 0.5)
dy = dy + length;
end
if (dz > length * 0.5)
dz = dz - length;
elseif (dz <= -length * 0.5)
dz = dz + length;
end
% calculate distances
B(ii,jj) = sqrt(dx^2+dy^2+dz^2);
end
end
end
B = triu(B); % to avoid double counting
Perhaps this also can be combined with the following snippet, which calculates something based on the values of the matrix B:
for ii = 1:N % molecule
for jj = 1:Sites % site
for kk = 1:N
for mm = 1:Sites
if ii ~= kk % if not the same molecule
if (B(ii*jj,kk*mm) < 1.75) && (B(ii*jj,kk*mm) > 0)
% do a calculation, i.e., add edge ii-kk to a graph
% with N nodes
end
end
end
end
end
end
  4 comentarios
Stephen23
Stephen23 el 16 de Mzo. de 2022
"Here you can read the data into MATLAB and use readtable and table2array to turn the data into a MATLAB array."
READMATRIX would be simpler and more efficient: https://www.mathworks.com/help/matlab/ref/readmatrix.html
L'O.G.
L'O.G. el 16 de Mzo. de 2022
@Stephen Thanks. I didn't know about that function!

Iniciar sesión para comentar.

Respuesta aceptada

Bruno Luong
Bruno Luong el 16 de Mzo. de 2022
Editada: Bruno Luong el 16 de Mzo. de 2022
It takes about 25 seconds only PC
Ac35 = A(:,3:5);
Ai = reshape(Ac35,[],1,3);
B = zeros(SitesInSystem,SitesInSystem);
for j=1:SitesInSystem
Aj = reshape(Ac35(j,:),1,[],3);
dxyz = Ai-Aj;
dxyz = mod(dxyz+length/2,length)-length/2;
B(:,j) = sqrt(sum(dxyz.^2,3));
end

Más respuestas (2)

KSSV
KSSV el 16 de Mzo. de 2022
I think pdist2, this inbuilt function will work for you.
T = readtable('Sample.xlsx') ;
A = table2array(T) ;
A = A(:,3:5) ; % pick coordinates (x,y,z)
d = pdist2(A,A) ;
B = triu(d) ; % you may check this with you answer. Use isequal to check.
  3 comentarios
KSSV
KSSV el 16 de Mzo. de 2022
I have checked your code and pdist2 for the provided 274*5 data. Both the methods, gave same solution.
L'O.G.
L'O.G. el 16 de Mzo. de 2022
Editada: L'O.G. el 16 de Mzo. de 2022
@KSSV Right, but for the full case (not the sample file I gave) I would need to consider that some atoms might be at the edges of the box, so that periodic boundaries as applied to that molecule must be taken into account. Sorry I didn't make that clearer. I mentioned it in my initial query and coded it using the if statements nested in the orignal set of for loops, but I guess not very clearly.

Iniciar sesión para comentar.


Walter Roberson
Walter Roberson el 16 de Mzo. de 2022
You can see that the rem() based expression I use here can be used to vectorize the calculation.
format bank
Length = 5;
x = -3.5 : .5 : 3.5
x = 1×15
-3.50 -3.00 -2.50 -2.00 -1.50 -1.00 -0.50 0 0.50 1.00 1.50 2.00 2.50 3.00 3.50
for K = 1 : length(x)
dx = x(K);
if (dx > Length * 0.5)
dx = dx - Length;
elseif (dx <= -Length * 0.5)
dx = dx + Length;
end
newx_if(K) = dx;
end
newx_mod = rem((x - Length/2), Length) + Length/2;
newx_if
newx_if = 1×15
1.50 2.00 2.50 -2.00 -1.50 -1.00 -0.50 0 0.50 1.00 1.50 2.00 2.50 -2.00 -1.50
newx_mod
newx_mod = 1×15
1.50 2.00 2.50 -2.00 -1.50 -1.00 -0.50 0 0.50 1.00 1.50 2.00 2.50 3.00 3.50
newx_if - newx_mod
ans = 1×15
0 0 0 0 0 0 0 0 0 0 0 0 0 -5.00 -5.00
  3 comentarios
Walter Roberson
Walter Roberson el 16 de Mzo. de 2022
newxyz_mod = rem((A - Length/2), Length) + Length/2; % for x, y, z
looks fine, as long as Length is the same in all three directions.
You would follow it with
B = sqrt( sum(newxyz_mod.^2, 2) );
L'O.G.
L'O.G. el 16 de Mzo. de 2022
Editada: L'O.G. el 16 de Mzo. de 2022
@Walter Roberson Got it. Thank you! And about my other question, the differences -- rem() should use the differences, correct? Not the positions themselves? That's my understanding, but I don't know how to calculate that efficiently. As I mentioned in my previous comment, something like pdist(A(:,1),@(x,y) x-y); generates a huge vector and is slow, and that's just for x! I think there will be (N*Sites)^2 / 2 possible distances which will fill an upper triangular matrix. I'm not sure how to code that efficiently here, though.

Iniciar sesión para comentar.

Categorías

Más información sobre Matrix Indexing en Help Center y File Exchange.

Productos


Versión

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by