Memory cost of multiplying sparse matrices
Mostrar comentarios más antiguos
What is the memory cost for multiplying sparse matrices? It seems to be much larger than the memory used by either of the matrices being multiplied:
>> A = sprand(5e9,2, 1e-7); B = sparse(eye(2));
whos
Name Size Bytes Class Attributes
A 5000000000x2 16024 double sparse
B 2x2 56 double sparse
>> A*B;
Error using *
Out of memory. Type HELP MEMORY for your options.
As you can see in the example above, the sparse matrices A and B are not taking up much memory, but computing A*B still results in an out of memory error. Why does this happen, and is there a way to avoid it?
8 comentarios
James Tursa
el 15 de Sept. de 2020
Editada: James Tursa
el 15 de Sept. de 2020
It appears MATLAB may be using large intermediate arrays because it doesn't know ahead of time the sparsity of the result. Is it feasible to chunk up this calculation in your situation? What are the actual sizes you are using? If the 2nd dimension is not too large, maybe a mex routine would work for you to cut down on this memory usage.
Bruno Luong
el 18 de Sept. de 2020
Waoh someone must give us a proper explanation, since it might be a hidden bug.
Bruno Luong
el 18 de Sept. de 2020
R2018b, 2020a, 2020b.
Actually it seems related to RAM. It throws error on 16 Gb RAM PC like this one
>> memory
Maximum possible array: 11293 MB (1.184e+10 bytes) *
Memory available for all arrays: 11293 MB (1.184e+10 bytes) *
Memory used by MATLAB: 1834 MB (1.923e+09 bytes)
Physical Memory (RAM): 16198 MB (1.699e+10 bytes)
* Limited by System Memory (physical + swap file) available.
and not on PC with 32 Gb
>> memory
Maximum possible array: 30203 MB (3.167e+10 bytes) *
Memory available for all arrays: 30203 MB (3.167e+10 bytes) *
Memory used by MATLAB: 1698 MB (1.780e+09 bytes)
Physical Memory (RAM): 32670 MB (3.426e+10 bytes)
* Limited by System Memory (physical + swap file) available.
>> A = sprand(5e9,2, 1e-7); B = sparse(eye(2));
>> A*B;
What I suspect is Maximum possible array must be larger than this
>> RequiredMbRAM = round(size(A,1)*4/1024^2) % What I suspect
RequiredMbRAM =
19073
AS
el 18 de Sept. de 2020
Bruno Luong
el 18 de Sept. de 2020
Editada: Bruno Luong
el 18 de Sept. de 2020
Agress that task manager could miss it. I don't see any spike on my 32 Gb PC while
AB = A*B
is being carried out sous MATLAB

Respuesta aceptada
Más respuestas (2)
Bruno Luong
el 15 de Sept. de 2020
Editada: Bruno Luong
el 15 de Sept. de 2020
I guess MATLAB creates a temporary buffer of length equals to the number of rows of A when A*B is invoked. The exact detail only TMW employees who can acces to the source code can answer.
Here is what I suggest to multiply A*B for very long tall A
[iA, jA, a] = find(A);
m = size(A,1);
n = size(B,2);
p = numel(jA)*n; % Guess of size of I, J, S
% Preallocate
I = zeros(p,1,'uint32');
J = zeros(p,1,'uint32');
S = zeros(p,1);
p = 0;
for k=1:n
[jB, ~, b] = find(B(:,k));
[i, l] = ismember(jA,jB);
q = nnz(i);
idx = p+(1:q);
I(idx) = iA(i);
J(idx) = k;
S(idx) = a(i).*b(l(i));
p = p+q;
end
idx = 1:p;
AB = sparse(I(idx), J(idx), S(idx), m, n);
1 comentario
Bruno Luong
el 15 de Sept. de 2020
Editada: Bruno Luong
el 15 de Sept. de 2020
A variant
[iA, jA, a] = find(A);
m = size(A,1);
n = size(B,2);
p = numel(jA)*n; % Guess of size of I, J, S
% Preallocate
I = zeros(p,1,'uint32');
J = zeros(p,1,'uint32');
S = zeros(p,1);
p = 0;
for k=1:n
Bk = B(:,k);
jB = find(Bk);
i = ismembc(jA,jB); % undocumented stock function, too bad it's doesn't return second argument of ISMEMBER
q = nnz(i);
idx = p+(1:q);
I(idx) = iA(i);
J(idx) = k;
S(idx) = a(i).*Bk(jA(i));
p = p+q;
end
idx = 1:p;
AB = sparse(I(idx), J(idx), S(idx), m, n);
It doesn't seem to be faster than the first method when I test with tic/toc, but the tests I conducted are far from cover all the cases.
Here's another customized multiplication routine for tall A. I do not know how it compares to Bruno's in terms of speed, but it is loop-free.
A = sprand(5e9,2, 1e-7); B = speye(2);
tic
m=size(A,1);
n=size(B,2);
Ia=find(any(A,2));
Jb=find(any(B,1));
C=A(Ia,:)*B(:,Jb);
[Ic,Jc,S]=find(C);
AB=sparse( Ia(Ic) , Jb(Jc) , S , m,n); %equal to A*B
toc%Elapsed time is 0.001254 seconds.
Categorías
Más información sobre Resizing and Reshaping Matrices en Centro de ayuda y File Exchange.
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!
