Comparing the performance of colon, loop, and cell indexing

Question

Benjamin D'Anjou el 21 de Mayo de 2020

1
Enlazar

Enlace directo a esta pregunta

https://la.mathworks.com/matlabcentral/answers/528938-comparing-the-performance-of-colon-loop-and-cell-indexing

Comentada: Benjamin D'Anjou el 21 de Mayo de 2020

Respuesta aceptada: Stephen23

Abrir en MATLAB Online

I will first state the problem and then ask my question.

PROBLEM

I have this code where repeatedly multiply a vector by a square matrix and store the resulting vector. The vector has a dimension dim and it is multiplied by the matrix numSteps times.

I wrote three variations of this code. First, I store the vector using colon indexing. Second, I store the vector using loop indexing. Third, I store the vector in a cell array using cell indexing.

The three functions are the following. They are identical except for the way the data is stored.

Colon indexing:

function output = funcColon(dim,numSteps)
%FUNCCOLON Function that stores vectors using colon indexing
% Initialize the output and the vector
output = zeros(numSteps,dim);
vec0 = ones(dim,1);
% Choose a matrix
mat = rand(dim); mat = mat./sum(mat,1);
% Do repeated matrix multiplications
for k=1:numSteps
    vec = mat*vec0;
    vec0 = vec;
    output(k,:) = vec; % Use colon indexing to store the results
end
end

Loop indexing:

function output = funcLoop(dim,numSteps)
%FUNCLOOP Function that stores vectors using a loop
% Initialize the output and the vector
output = zeros(numSteps,dim);
vec0 = ones(dim,1);
% Choose a matrix
mat = rand(dim); mat = mat./sum(mat,1);
% Do repeated matrix multiplications
for k=1:numSteps
    vec = mat*vec0;
    vec0 = vec;
    for j=1:dim
        output(k,j) = vec(j); % Use a loop to store the results
    end
end
end

Cell indexing:

function output = funcCell(dim,numSteps)
%FUNCCELL Function that stores vectors using a cell array
% Initialize the output and the vector
output = cell(numSteps,1);
vec0 = ones(dim,1);
% Choose a matrix
mat = rand(dim); mat = mat./sum(mat,1);
% Do repeated matrix multiplications
for k=1:numSteps
    vec = mat*vec0;
    vec0 = vec;
    output{k} = vec; % Store the vector in a cell array
end
end

I first set numSteps=100. I then ran each function for various values of dim and I timed them. I did this multiple times to accumulate statistics on the average runtime.

The results are shown in the attached figure. I take the average runtime of the colon indexing as a point of reference, i.e., the two curves show the ratio of the average runtimes of funcLoop and funcCell to the average runtime of funcColon.

There are basically two regimes.

First, there is the regime dim > 1. There, we have the hierarchy:

runtime loop < runtime cell < runtime colon

Second, there is the regime dim < 1. There, we have the hierarchy:

runtime loop < runtime colon < runtime cell

Note that in the limit of large dim, all functions tend to the same runtime because the runtime becomes limited by the matrix multiplication. But for small dim, the runtime is limited by the indexing. I checked that it is indeed the case with the Run & Time feature.

QUESTION

I'm quite confused by this result. I would have assumed that colon indexing would always be faster because it is a syntax designed specifically for matlab arrays. What is colon indexing doing that is taking so much additional runtime? Even cell array indexing is more efficient than colon indexing for dim > 1.

This seems to imply that I could rewrite an overloaded custom colon indexing function using a for loop and have it be more performant than MATLAB's native colon indexing.

Am I missing something? What are your suggestions for best practices with a code like that?

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Iniciar sesión para comentar.

Iniciar sesión para responder a esta pregunta.

Answer 1

Stephen23 el 21 de Mayo de 2020

0
Enlazar

Enlace directo a esta respuesta

https://la.mathworks.com/matlabcentral/answers/528938-comparing-the-performance-of-colon-loop-and-cell-indexing#answer_435288

Editada: Stephen23 el 21 de Mayo de 2020

Abrir en MATLAB Online

"Am I missing something?"

1- accessing data in non-contiguous locations is slow. Consider your indexing output(k,:): because MATLAB stores arrays in column-major order, your indexing refers to lots of separate locations spread out through the entire array. Carefully going through the array, finding those locations, splitting up the input array and copying the parts to each separate location... is not efficient, and gets slower for larger arrays. You data design forces MATLAB to be slow.

If you access only contiguous blocks of memory you would find the code would be much faster: output(:,k)

2- MATLAB (probably) does not actually copy those arrays, so although you write that it "stores" the input array in a cell:

output{k} = vec; % Store the vector in a cell array

actually MATLAB (probably) just stores a pointer to the already existing input array, and creating one small pointer does not take much time at all. The original array (probably) does not move or get copied anywhere, until you write to any location in that array, at which point it will finally get copied... and in your code this never happens, so you are comparing apples with oranges here.

Read about copy-on-write:

https://www.mathworks.com/help/matlab/matlab_prog/avoid-unnecessary-copies-of-data.html

Note that even if/when the input array is "copied" to the cell it would still be one contiguous array in memory!

"What are your suggestions for best practices with a code like that?"

Consider column-major in your design: if efficiency is important store data in contiguous locations.
Consider copy-on-write in your design.

2 comentarios
Mostrar NingunoOcultar Ninguno

Benjamin D'Anjou el 21 de Mayo de 2020

I think I get it. I read the copy-on-write doc before but I don't think I had grasped everything you say.

Just one clarification:

"Note that even if/when the input array is "copied" to the cell it would still be one contiguous array in memory!"

Are you saying that cell arrays store data contiguously by design, i.e., each element of a cell array occupies a contiguous memory location? That would make sense.

Stephen23 el 21 de Mayo de 2020

Editada: Stephen23 el 21 de Mayo de 2020

"each element of a cell array occupies a contiguous memory location?"

Every cell contains a separate array, which are each stored in separate memory locations:

https://www.mathworks.com/matlabcentral/answers/40640-memory-space-for-cell-array

Iniciar sesión para comentar.

Answer 2

Matt J el 21 de Mayo de 2020

2
Enlazar

Enlace directo a esta respuesta

https://la.mathworks.com/matlabcentral/answers/528938-comparing-the-performance-of-colon-loop-and-cell-indexing#answer_435223

Editada: Matt J el 21 de Mayo de 2020

Abrir en MATLAB Online

I checked that it is indeed the case with the Run & Time feature.

I don't know what that is, but the better test in any case is to rewrite the functions so that they contain only the indexing operations.

function output = funcColon(dim,numSteps,output,vec)
%FUNCCOLON Function that stores vectors using colon indexing
    for k=1:numSteps
        output(k,:) = vec(k,:); % Use colon indexing to store the results
    end
    
end
function output = funcLoop(dim,numSteps,output,vec)
%FUNCLOOP Function that stores vectors using a loop
    for k=1:numSteps
        for j=1:dim
            output(k,j) = vec(k,j); % Use a loop to store the results
        end
    end
    
end
function output = funcCell(dim,numSteps,output,vec)
%FUNCCELL Function that stores vectors using a cell array
    for k=1:numSteps
        output{k} = vec(k,:); % Store the vector in a cell array
    end
    
end

The test code then gives me the following, less suprising results:

dim=20;
numSteps=1e5;
   vec = rand(numSteps,dim); 
   output = zeros(numSteps,dim);
   
   
tColon = timeit(@()funcColon(dim,numSteps,output,vec)) %0.0173 seconds
   output = zeros(numSteps,dim);
   
tLoop = timeit(@()funcLoop(dim,numSteps,output,vec)) %0.0207 seconds
  output = cell(numSteps,1);
tCell  = timeit(@()funcCell(dim,numSteps,output,vec)) %0.1120 seconds

3 comentarios
Mostrar 1 comentario más antiguoOcultar 1 comentario más antiguo

Benjamin D'Anjou el 21 de Mayo de 2020

Editada: Benjamin D'Anjou el 21 de Mayo de 2020

Abrir en MATLAB Online

Now I'm quite confused. I wrote your exact code, i.e., I ran the file:

dim=20;
numSteps=1e5;
vec = rand(numSteps,dim);
output = zeros(numSteps,dim);
tColon = timeit(@()funcColon(dim,numSteps,output,vec)) %0.0244 seconds
output = zeros(numSteps,dim);
tLoop = timeit(@()funcLoop(dim,numSteps,output,vec)) %0.0185 seconds
output = cell(numSteps,1);
tCell  = timeit(@()funcCell(dim,numSteps,output,vec)) %0.1253 seconds
function output = funcColon(~,numSteps,output,vec)
%FUNCCOLON Function that stores vectors using colon indexing
for k=1:numSteps
    output(k,:) = vec(k,:); % Use colon indexing to store the results
end
end
function output = funcLoop(dim,numSteps,output,vec)
%FUNCLOOP Function that stores vectors using a loop
for k=1:numSteps
    for j=1:dim
        output(k,j) = vec(k,j); % Use a loop to store the results
    end
end
end
function output = funcCell(~,numSteps,output,vec)
%FUNCCELL Function that stores vectors using a cell array
for k=1:numSteps
    output{k} = vec(k,:); % Store the vector in a cell array
end
end

It's different than before, but the loop is still significantly faster than the colon indexing. In fact, if I use your functions to make the same plot as in my OP this is what I get:

Now, the loop is better at low vector dimensions and the cell array is better at high dimensions. There is never a case where the colon indexing is the best solution... I'm a bit at a loss to understand this.

P.S. The Run & Time feature is a MATLAB widget that breaks down the execution time of each line in the code.

Matt J el 21 de Mayo de 2020

Editada: Matt J el 21 de Mayo de 2020

Abrir en MATLAB Online

To really know what's going on you should probably post your latest test code so that we can run it as well. Below are the results from my own test code which I have improved still further to exclude the memory allocation inherent in vec0(k,:). In almost all cases tLoop is slightly poorer than tColon. It is to be expected that tCell gets better and better because, as Stephen pointed out, it requires essentially no data copying. Be mindful however that if you make the variable output is held as a cell array, it may be relatively quick to build, but the memory access time to pull the data out later can be a lot slower, depending on numSteps.

Test code:

clear all
Dims=1:5:400;
for i=1:numel(Dims)
    dim=Dims(i);
    
   numSteps=3e3;
   vec0 = num2cell(rand(numSteps,dim),2); 
   
   output = zeros(numSteps,dim);
tColon(i) = timeit(@()funcColon(dim,numSteps,output,vec0));
   output = zeros(numSteps,dim);
   
tLoop(i) = timeit(@()funcLoop(dim,numSteps,output,vec0));
  output = cell(numSteps,1);
tCell(i)  = timeit(@()funcCell(dim,numSteps,output,vec0));
end
semilogy(Dims,tLoop./tColon,'r', Dims,tCell./tColon,'b');
legend('tLoop / tColon','tCell / tColon')
function output = funcColon(dim,numSteps,output,vec0)
%FUNCCOLON Function that stores vectors using colon indexing
    for k=1:numSteps
        output(k,:) = vec0{k}; % Use colon indexing to store the results
    end
    
end
function output = funcLoop(dim,numSteps,output,vec0)
%FUNCLOOP Function that stores vectors using a loop
    for k=1:numSteps
         vec=vec0{k};
        for j=1:dim
            output(k,j) = vec(j); % Use a loop to store the results
        end
    end
    
end
function output = funcCell(dim,numSteps,output,vec0)
%FUNCCELL Function that stores vectors using a cell array
    for k=1:numSteps
        output{k} = vec0{k}; % Store the vector in a cell array
    end
    
end

Benjamin D'Anjou el 21 de Mayo de 2020

Thanks. The behavior you observe is similar to the one I observe. I think now I understand better what to prioritize.

I will structure all my data in a 2D matrix and always loop through it column-wise. Then I'll reshape the array in whatever form I need at the end. I think that's probably the safest thing to do since the core of the program is basically a series of loops over the elements of a big array.

Iniciar sesión para comentar.

Comparing the performance of colon, loop, and cell indexing

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Respuesta aceptada

2 comentarios
Mostrar NingunoOcultar Ninguno

Más respuestas (1)

3 comentarios
Mostrar 1 comentario más antiguoOcultar 1 comentario más antiguo

Ver también

Categorías

Etiquetas

Productos

Versión

Community Treasure Hunt

Comparing the performance of colon, loop, and cell indexing

0 comentarios Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Respuesta aceptada

2 comentarios Mostrar NingunoOcultar Ninguno

Más respuestas (1)

3 comentarios Mostrar 1 comentario más antiguoOcultar 1 comentario más antiguo

Ver también

Categorías

Etiquetas

Productos

Versión

Community Treasure Hunt

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

2 comentarios
Mostrar NingunoOcultar Ninguno

3 comentarios
Mostrar 1 comentario más antiguoOcultar 1 comentario más antiguo