Why is subscripted assignment so inefficient?

3 visualizaciones (últimos 30 días)
Kenneth Johnson
Kenneth Johnson el 5 de Feb. de 2022
Comentada: Walter Roberson el 6 de Feb. de 2022
Why is MATLAB's subscripted assignment so abysmally inefficient with more than 2 subscripts? e.g.,
>> clear, a = []; tic, for j = 1e3:-1:1, for k = 1e3:-1:1, a(j,k,1) = 1; end, end, toc
Elapsed time is 0.193743 seconds.
>> clear, a = []; tic, for j = 1e3:-1:1, for k = 1e3:-1:1, a(j,k) = 1; end, end, toc
Elapsed time is 0.004241 seconds.
  1 comentario
Walter Roberson
Walter Roberson el 5 de Feb. de 2022
Interesting. I thought for a moment that it might be due to the extra multiplication and addition needed each time, but when I rewrite the code in terms of linear indexing as-if a 3D array, then I do not see nearly that much performance drop.

Iniciar sesión para comentar.

Respuesta aceptada

Matt J
Matt J el 5 de Feb. de 2022
Editada: Matt J el 5 de Feb. de 2022
I would guess it's only because you are using 3 subscripts on an array that only has 2 dimensions. This forces Matlab to to do extra work, namely running a check to make sure that in a(j,k,m) the 3rd subscript is m=1 and simplifying the subscripts from (j,k,1) to (j,k). When a actualy does have 3 dimensions, the times are much more comparable:
a=zeros(1e3,1e3,2);
tic,
for j = 1e3:-1:1
for k = 1e3:-1:1
a(j,k,1) = 1;
end
end
toc
Elapsed time is 0.019444 seconds.
a=zeros(1e3,1e3);
tic,
for j = 1e3:-1:1
for k = 1e3:-1:1
a(j,k) = 1;
end
end
toc
Elapsed time is 0.016890 seconds.
  7 comentarios
Walter Roberson
Walter Roberson el 6 de Feb. de 2022
Unfortunately I am running into time limits a lot in exploring the timing of allocations. With linear increasing allocations, if we assume approximately linear time, then we run into the 1t+2t+3t+4t+...n*t = t*n*(n+1)/2 problem, that the time needed to explore the suspected linear relation in a linear manner, requires squared time. I think I might need to switch to exponential increases in memory sizes to probe more closely.
N = 4;
G = 6;
start = datetime('now');
%timing pushed into a function to allow for potential JIT
t = run_timing(N, G);
stop = datetime('now');
elapsed = seconds(stop - start)
elapsed = 51.7211
plot(t);
xlabel('try #'); ylabel('seconds')
legend((1:G) + "G")
title('absolute time')
tmean = mean(t,1);
plot(tmean)
xlabel(' gigabytes'); ylabel('mean seconds')
p = polyfit(1:G, tmean, 1);
fprintf('roughly %g seconds + %g seconds per gigabyte\n', p(2), p(1))
roughly -0.00468463 seconds + 0.616974 seconds per gigabyte
function t = run_timing(N,G)
t = zeros(N,G);
for g = 1 : G
for n = 1 : N
tic();
initgig(g);
t(n,g) = toc;
end
end
end
%gigabyte 1073741824
%megabyte 1048576
function initgig(M)
a = ones(1,1073741824*M,'uint8');
end
Walter Roberson
Walter Roberson el 6 de Feb. de 2022
There is a lot of noise in these measurements, and it is difficult to tell what is meaningful or not.
I skip the first few runs in the plot because in nearly all runs there is a bad peak early on.
The second plot here is the more interesting one, showing the log of how the time to allocate increases with amount requested. Noise affects it a lot, so I can only describe trends that I encountered a fair bit:
  • time to allocate 1 byte starts off a bit higher
  • time to allocate a quite small number of bytes but more than 1, falls a bit, then rises a bit.
  • there does seem to be a peak at 256 bytes
  • and then it does seem to fall a bit
  • until at 2^14 there seems to be a distinct rise. This would correspond to 16384 bytes. I suspect that either 4096 or 16384 bytes is the size of an entry in the "small block pool"
  • in some of the plots, there was a leveling from 2^14 to 2^16; the more fine-grained I draw the less I see this, so it might have been a statistical artificat
  • from 2^16 onward, the log grows pretty much linearly as log2() of number of bytes increases. To phrase that a different way: from 2^16 onward, the allocation time grows linearly with number of bytes allocated. The actual boundary might possibly be 2^14
  • so, below 2^14, there appear to be at least two different allocation strategies, but the boundaries between them are a bit difficult to find. If there were only one allocation strategy for that range, using a free block pool with entries of size 2^14, then you would expect the time to be constant up to that point, but there do seem to be peaks.
  • I seem to recall that 256 bytes is the limit below which for some constant allocations and some colon expressions, that MATLAB keeps a copy around as an optimization, to be handed out when encountering another request with the same spelling (spacing even being important!). Perhaps that is why we see a blip at 256??
If someone cares to do a study of how small block allocation is done in MATLAB, I suggest 2^14 as the upper limit to study (and remember to take into account those new hidden copies that MATLAB recently starting taking for small constants.)
N = 50;
G = 2^24;
start = datetime('now');
%timing pushed into a function to allow for potential JIT
[t, gvals] = run_timing(N, G);
stop = datetime('now');
elapsed = seconds(stop - start);
display(elapsed)
elapsed = 0.5941
figure(1)
%tfilt = filloutliers(t, 'center','movmedian', 3, 1);
startat = 6;
semilogy(startat:N, t(startat:end,:));
xlabel('try #'); ylabel('seconds')
title('absolute time')
%legend("2^{"+string(gvals)+"}", 'NumColumns',4);
figure(2)
tmean = mean(t(startat:end,:),1);
semilogy(gvals, tmean)
xt = 1:ceil(max(gvals));
xticks(xt)
xticklabels("2^{"+string(xt)+"}");
xtickangle(45)
xlabel('bytes')
ylabel('mean seconds')
plin = polyfit(2.^gvals, tmean, 1);
%plog = polyfit(log2(gvals), log2(tmean),1);
fprintf('linear: roughly %g seconds + %g seconds per gigabyte\n', plin(2), plin(1)*2^30);
linear: roughly -4.90643e-08 seconds + 0.0490081 seconds per gigabyte
%fprintf('log: roughly %g seconds + %g seconds per gigabyte\n', 2.^plog(2), plog(1));
function [t,gvals] = run_timing(N,G)
%wb = waitbar(0, '0G');
%cleanMe = onCleanup(@()delete(wb));
gvals = 0:.1:ceil(log2(G));
ng = length(gvals);
t = zeros(N,ng);
for g = 1 : ng
gp = gvals(g);
gv = round(2.^gp);
%waitbar(0, wb, "2^{"+string(gp)+"}");
for n = 1 : N
tic();
initmem(gv);
t(n,g) = toc;
%waitbar(n/N, wb);
end
end
end
%gigabyte 1073741824
%megabyte 1048576
function initmem(M)
ma = ones(1,M,'uint8');
end

Iniciar sesión para comentar.

Más respuestas (0)

Categorías

Más información sobre MATLAB en Help Center y File Exchange.

Productos


Versión

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by