How to speed up pairwise difference calculation between vectors under certain condition?

Question

Raouf Amara el 26 de Mayo de 2019

0
Enlazar

Enlace directo a esta pregunta

https://la.mathworks.com/matlabcentral/answers/464048-how-to-speed-up-pairwise-difference-calculation-between-vectors-under-certain-condition

Respondida: Jan el 8 de Jun. de 2019

I have two double arrays A (size: Nx1) and B (Mx1) both with large increasing values (from ~1e9 to ~1e13; not sure this is relevant).

I am trying to compute the pairwise differences between the two vectors but only for pairs that respect a certain condition, that is the absolute value of the difference between the two values is smaller than t.

I first tried to brute force calculate all the pairwise differences but as you would expect a NxM array cannot be stored on a regular computer; it is around a few hundred of terabytes.

So far I have implemented a rather straightforward solution which works fine with small N and M.

Low = 1e9;    % lower bound for values in A & B
Upp = 1e13;   % upper bound for values in A & B
N = 1e4;      % number of elements in A and B (assuming they are equal, which is not the case)
A = sort((Upp-Low)*rand(N,1)+Low,'ascend');
B = sort((Upp-Low)*rand(N,1)+Low,'ascend');
maxdif = 1000000;    % maximum difference
PrWD = cell(1,N);
tic
for i = 1:N
    % pairwise difference if |B-A(i)|<=maxdif
    PrWD{i} = A(i) - B(abs(B-A(i))<=maxdif)';
end
toc

However, I have A and B arrays which are both very large (N and M is the order of a few tens of millions) such that this operation takes a few hours in practice.

I tried arrayfun or bsxfun on small subsets (N=M=1e5) but it does not speed up the process.

At this point, I understand that this is extremely demanding as calculating all the pairwise differences or even the indices verifying the condition seem to scale with N*M.

Is there any (Matlab) way to speed up this process? Maybe taking into acccount the fact that values are increasing, or that not all differences need to be calculated?

1 comentario
Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos

Jan el 5 de Jun. de 2019

Editada: Jan el 5 de Jun. de 2019

You can improve the abs(B-A(i))<=maxdif part by locating the lower and upper indices iteratively. I will provide some code soon.

Dummy comment just to let me find this thread again: Jans magic message

Iniciar sesión para comentar.

Iniciar sesión para responder a esta pregunta.

Answer 1

Jan el 8 de Jun. de 2019

0
Enlazar

Enlace directo a esta respuesta

https://la.mathworks.com/matlabcentral/answers/464048-how-to-speed-up-pairwise-difference-calculation-between-vectors-under-certain-condition#answer_378398

Abrir en MATLAB Online

Instead of creating the huge index matrix abs(B-A(i))<=maxdif in each iteration, it is cheaper to search the lower and upper limits of the matching elements iteratively:

tic
PrWD2 = cell(1,N);
ini = 1;
fin = ini;
for i = 1:N
   % Find initial and final index:
   while B(ini) < A(i) - maxdif && ini < N
      ini = ini + 1;
   end
   while B(fin) <= A(i) + maxdif && fin < N
      fin = fin + 1;
   end
   
   if ini <= fin - 1
      PrWD2{i} = A(i) - B(ini:fin - 1).';
   end
end
toc
% PrWD2(cellfun('isempty', PrWD2)) = {zeros(1,0)};
% isequal(PrWD, PrWD2)

With your example and N=1e5 this is about 1100 times faster than the original code.

This code replies the empty matrix [], if no element is matching, while the original code creates a [1 x 0] matrix. For a comparison of the results, the empty matrices are adjusted, but this is not needed in the productive code most likely.

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Iniciar sesión para comentar.

How to speed up pairwise difference calculation between vectors under certain condition?

1 comentario
Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos

Respuesta aceptada

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Más respuestas (0)

Ver también

Categorías

Etiquetas

Productos

Versión

Community Treasure Hunt

How to speed up pairwise difference calculation between vectors under certain condition?

1 comentario Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos

Respuesta aceptada

0 comentarios Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Más respuestas (0)

Ver también

Categorías

Etiquetas

Productos

Versión

Community Treasure Hunt

1 comentario
Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos