Vectorizing nonlinear matrix operation on many small matrices
Adam Shaw
on 18 Dec 2020
Commented: Matt J
on 19 Dec 2020
I am trying to optimize the following generic matrix operation:
m = 3; % small number in general
n = 2^20; % large power of 2 in general
A = rand(m,n);
B = zeros(m^2,m^2);
for ii = 1:size(A,2)
    a = A(:,ii);
    r = a*a';
    B = B + kron(r,r);
end
% return B
On my computer the above takes ~7 s. Compiling to a MEX file with MATLAB Coder improves this by ~15x. I have also tried compiling to CUDA with GPU Coder, but the result seems to be quite inefficient.
I think the difficulty comes from two different sources:
1) I do not know of an efficient way to vectorize the creation of the "r" matrices from the columns of the A matrix, so I have to resort to the outer for loop.
2) I think the Kronecker product is inefficient to implement on the GPU because the matrices are so small.
The speedup from compiling to MEX is nice, but I have the feeling that I am still doing something quite inefficiently. I would appreciate any ideas on how to optimize the above calculation, either along the lines of the two difficulties outlined above or via a different approach.
2 Comments
David Goodmanson
on 19 Dec 2020
Hi Adam,
if you replace
B = B + kron(r,r);
with
r = r(:);
BB = BB + r*r';
(where BB is initialized to zeros(m^2,m^2) before the loop, like B), the loop runs about 5 times faster. The substitution by itself is faster than that, but the unchanged steps in the loop of course still count toward the total time.
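The substitution gives the same result because, for r = a*a', the column-stacked r(:) equals kron(a,a), so both kron(r,r) and r(:)*r(:)' equal kron(a,a)*kron(a,a)'. A minimal sketch that checks this numerically, assuming a small random vector; the names K, v, V and maxDiff and the fixed seed are illustrative and not from the thread:

```matlab
m = 3;
rng(1);                      % fixed seed, chosen arbitrarily for repeatability
a = rand(m, 1);
r = a * a.';                 % m-by-m outer product, as in the loop

K = kron(r, r);              % original update term, m^2-by-m^2
v = r(:);                    % column-stacked r, equal to kron(a, a)
V = v * v.';                 % replacement update term

% The two terms should agree up to floating-point rounding
maxDiff = max(abs(K(:) - V(:)));
fprintf('max abs difference: %.2e\n', maxDiff);
```

Any difference should be at the level of rounding error, since the two forms multiply the same four entries of a in a different order.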
Matt J
on 19 Dec 2020
@Adam,
It may be important to know what you plan to do with B, once you've computed it.
Accepted Answer
Matt J
on 19 Dec 2020
Edited: Matt J
on 19 Dec 2020
m = 3; % small number in general
n = 2^20; % large power of 2 in general
A = rand(m,n);
tic;
B = zeros(m^2,m^2);
for ii = 1:size(A,2)
    a = A(:,ii);
    r = a*a';
    B = B + kron(r,r);
end
toc;
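tic;
% All outer products at once via implicit expansion: C(:,:,k) = A(:,k)*A(:,k).'
C = reshape(A,m,1,n).*reshape(A,1,m,n);
% Flatten each m-by-m outer product into a column of length m^2
C = reshape(C,m^2,n);
% The sum of kron(r,r) over all columns equals C*C.'
B = C*C.';
toc;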
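To confirm that the two timed blocks above compute the same matrix, the results can be kept in separate variables and compared. A minimal sketch, assuming a smaller n than in the thread so the loop finishes quickly; Bloop, Bvec, relErr and the fixed seed are illustrative names and choices, not part of the original answer:

```matlab
m = 3;
n = 2^10;                    % smaller than 2^20, for a quick check only
rng(0);                      % fixed seed, chosen arbitrarily for repeatability
A = rand(m, n);

% Reference: explicit loop with kron, as in the question
Bloop = zeros(m^2, m^2);
for ii = 1:n
    a = A(:, ii);
    r = a * a.';
    Bloop = Bloop + kron(r, r);
end

% Vectorized: all outer products at once, then one matrix product
C = reshape(A, m, 1, n) .* reshape(A, 1, m, n);   % m-by-m-by-n
C = reshape(C, m^2, n);                           % one flattened outer product per column
Bvec = C * C.';

% Relative difference in the Frobenius norm
relErr = norm(Bloop - Bvec, 'fro') / norm(Bloop, 'fro');
fprintf('relative difference: %.2e\n', relErr);
```

The elementwise product of the two reshaped arrays relies on implicit expansion, which needs R2016b or later; on older releases bsxfun(@times, ...) on the same two arrays should give the same m-by-m-by-n result. Note that C holds m^2*n doubles, so for very large n it may be worth processing the columns of A in blocks and accumulating the partial products.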