# Comparing the performance of colon, loop, and cell indexing

6 views (last 30 days)
Benjamin D'Anjou on 21 May 2020
Commented: Benjamin D'Anjou on 21 May 2020
I will first state the problem and then ask my question.
PROBLEM
I have this code where repeatedly multiply a vector by a square matrix and store the resulting vector. The vector has a dimension dim and it is multiplied by the matrix numSteps times.
I wrote three variations of this code. First, I store the vector using colon indexing. Second, I store the vector using loop indexing. Third, I store the vector in a cell array using cell indexing.
The three functions are the following. They are identical except for the way the data is stored.
Colon indexing:
function output = funcColon(dim,numSteps)
%FUNCCOLON Function that stores vectors using colon indexing
% Initialize the output and the vector
output = zeros(numSteps,dim);
vec0 = ones(dim,1);
% Choose a matrix
mat = rand(dim); mat = mat./sum(mat,1);
% Do repeated matrix multiplications
for k=1:numSteps
vec = mat*vec0;
vec0 = vec;
output(k,:) = vec; % Use colon indexing to store the results
end
end
Loop indexing:
function output = funcLoop(dim,numSteps)
%FUNCLOOP Function that stores vectors using a loop
% Initialize the output and the vector
output = zeros(numSteps,dim);
vec0 = ones(dim,1);
% Choose a matrix
mat = rand(dim); mat = mat./sum(mat,1);
% Do repeated matrix multiplications
for k=1:numSteps
vec = mat*vec0;
vec0 = vec;
for j=1:dim
output(k,j) = vec(j); % Use a loop to store the results
end
end
end
Cell indexing:
function output = funcCell(dim,numSteps)
%FUNCCELL Function that stores vectors using a cell array
% Initialize the output and the vector
output = cell(numSteps,1);
vec0 = ones(dim,1);
% Choose a matrix
mat = rand(dim); mat = mat./sum(mat,1);
% Do repeated matrix multiplications
for k=1:numSteps
vec = mat*vec0;
vec0 = vec;
output{k} = vec; % Store the vector in a cell array
end
end
I first set numSteps=100. I then ran each function for various values of dim and I timed them. I did this multiple times to accumulate statistics on the average runtime.
The results are shown in the attached figure. I take the average runtime of the colon indexing as a point of reference, i.e., the two curves show the ratio of the average runtimes of funcLoop and funcCell to the average runtime of funcColon. There are basically two regimes.
First, there is the regime dim > 1. There, we have the hierarchy:
runtime loop < runtime cell < runtime colon
Second, there is the regime dim < 1. There, we have the hierarchy:
runtime loop < runtime colon < runtime cell
Note that in the limit of large dim, all functions tend to the same runtime because the runtime becomes limited by the matrix multiplication. But for small dim, the runtime is limited by the indexing. I checked that it is indeed the case with the Run & Time feature.
QUESTION
I'm quite confused by this result. I would have assumed that colon indexing would always be faster because it is a syntax designed specifically for matlab arrays. What is colon indexing doing that is taking so much additional runtime? Even cell array indexing is more efficient than colon indexing for dim > 1.
This seems to imply that I could rewrite an overloaded custom colon indexing function using a for loop and have it be more performant than MATLAB's native colon indexing.
Am I missing something? What are your suggestions for best practices with a code like that?

Stephen Cobeldick on 21 May 2020
Edited: Stephen Cobeldick on 21 May 2020
"Am I missing something?"
1- accessing data in non-contiguous locations is slow. Consider your indexing output(k,:): because MATLAB stores arrays in column-major order, your indexing refers to lots of separate locations spread out through the entire array. Carefully going through the array, finding those locations, splitting up the input array and copying the parts to each separate location... is not efficient, and gets slower for larger arrays. You data design forces MATLAB to be slow.
If you access only contiguous blocks of memory you would find the code would be much faster: output(:,k)
2- MATLAB (probably) does not actually copy those arrays, so although you write that it "stores" the input array in a cell:
output{k} = vec; % Store the vector in a cell array
actually MATLAB (probably) just stores a pointer to the already existing input array, and creating one small pointer does not take much time at all. The original array (probably) does not move or get copied anywhere, until you write to any location in that array, at which point it will finally get copied... and in your code this never happens, so you are comparing apples with oranges here.
Note that even if/when the input array is "copied" to the cell it would still be one contiguous array in memory!
"What are your suggestions for best practices with a code like that?"
• Consider column-major in your design: if efficiency is important store data in contiguous locations.
• Consider copy-on-write in your design.

Benjamin D'Anjou on 21 May 2020
I think I get it. I read the copy-on-write doc before but I don't think I had grasped everything you say.
Just one clarification:
"Note that even if/when the input array is "copied" to the cell it would still be one contiguous array in memory!"
Are you saying that cell arrays store data contiguously by design, i.e., each element of a cell array occupies a contiguous memory location? That would make sense.
Stephen Cobeldick on 21 May 2020
"each element of a cell array occupies a contiguous memory location?"
Every cell contains a separate array, which are each stored in separate memory locations:

Matt J on 21 May 2020
Edited: Matt J on 21 May 2020
I checked that it is indeed the case with the Run & Time feature.
I don't know what that is, but the better test in any case is to rewrite the functions so that they contain only the indexing operations.
function output = funcColon(dim,numSteps,output,vec)
%FUNCCOLON Function that stores vectors using colon indexing
for k=1:numSteps
output(k,:) = vec(k,:); % Use colon indexing to store the results
end
end
function output = funcLoop(dim,numSteps,output,vec)
%FUNCLOOP Function that stores vectors using a loop
for k=1:numSteps
for j=1:dim
output(k,j) = vec(k,j); % Use a loop to store the results
end
end
end
function output = funcCell(dim,numSteps,output,vec)
%FUNCCELL Function that stores vectors using a cell array
for k=1:numSteps
output{k} = vec(k,:); % Store the vector in a cell array
end
end
The test code then gives me the following, less suprising results:
dim=20;
numSteps=1e5;
vec = rand(numSteps,dim);
output = zeros(numSteps,dim);
tColon = timeit(@()funcColon(dim,numSteps,output,vec)) %0.0173 seconds
output = zeros(numSteps,dim);
tLoop = timeit(@()funcLoop(dim,numSteps,output,vec)) %0.0207 seconds
output = cell(numSteps,1);
tCell = timeit(@()funcCell(dim,numSteps,output,vec)) %0.1120 seconds

Benjamin D'Anjou on 21 May 2020
Now I'm quite confused. I wrote your exact code, i.e., I ran the file:
dim=20;
numSteps=1e5;
vec = rand(numSteps,dim);
output = zeros(numSteps,dim);
tColon = timeit(@()funcColon(dim,numSteps,output,vec)) %0.0244 seconds
output = zeros(numSteps,dim);
tLoop = timeit(@()funcLoop(dim,numSteps,output,vec)) %0.0185 seconds
output = cell(numSteps,1);
tCell = timeit(@()funcCell(dim,numSteps,output,vec)) %0.1253 seconds
function output = funcColon(~,numSteps,output,vec)
%FUNCCOLON Function that stores vectors using colon indexing
for k=1:numSteps
output(k,:) = vec(k,:); % Use colon indexing to store the results
end
end
function output = funcLoop(dim,numSteps,output,vec)
%FUNCLOOP Function that stores vectors using a loop
for k=1:numSteps
for j=1:dim
output(k,j) = vec(k,j); % Use a loop to store the results
end
end
end
function output = funcCell(~,numSteps,output,vec)
%FUNCCELL Function that stores vectors using a cell array
for k=1:numSteps
output{k} = vec(k,:); % Store the vector in a cell array
end
end
It's different than before, but the loop is still significantly faster than the colon indexing. In fact, if I use your functions to make the same plot as in my OP this is what I get: Now, the loop is better at low vector dimensions and the cell array is better at high dimensions. There is never a case where the colon indexing is the best solution... I'm a bit at a loss to understand this.
P.S. The Run & Time feature is a MATLAB widget that breaks down the execution time of each line in the code.
Matt J on 21 May 2020
To really know what's going on you should probably post your latest test code so that we can run it as well. Below are the results from my own test code which I have improved still further to exclude the memory allocation inherent in vec0(k,:). In almost all cases tLoop is slightly poorer than tColon. It is to be expected that tCell gets better and better because, as Stephen pointed out, it requires essentially no data copying. Be mindful however that if you make the variable output is held as a cell array, it may be relatively quick to build, but the memory access time to pull the data out later can be a lot slower, depending on numSteps. Test code:
clear all
Dims=1:5:400;
for i=1:numel(Dims)
dim=Dims(i);
numSteps=3e3;
vec0 = num2cell(rand(numSteps,dim),2);
output = zeros(numSteps,dim);
tColon(i) = timeit(@()funcColon(dim,numSteps,output,vec0));
output = zeros(numSteps,dim);
tLoop(i) = timeit(@()funcLoop(dim,numSteps,output,vec0));
output = cell(numSteps,1);
tCell(i) = timeit(@()funcCell(dim,numSteps,output,vec0));
end
semilogy(Dims,tLoop./tColon,'r', Dims,tCell./tColon,'b');
legend('tLoop / tColon','tCell / tColon')
function output = funcColon(dim,numSteps,output,vec0)
%FUNCCOLON Function that stores vectors using colon indexing
for k=1:numSteps
output(k,:) = vec0{k}; % Use colon indexing to store the results
end
end
function output = funcLoop(dim,numSteps,output,vec0)
%FUNCLOOP Function that stores vectors using a loop
for k=1:numSteps
vec=vec0{k};
for j=1:dim
output(k,j) = vec(j); % Use a loop to store the results
end
end
end
function output = funcCell(dim,numSteps,output,vec0)
%FUNCCELL Function that stores vectors using a cell array
for k=1:numSteps
output{k} = vec0{k}; % Store the vector in a cell array
end
end
Benjamin D'Anjou on 21 May 2020
Thanks. The behavior you observe is similar to the one I observe. I think now I understand better what to prioritize.
I will structure all my data in a 2D matrix and always loop through it column-wise. Then I'll reshape the array in whatever form I need at the end. I think that's probably the safest thing to do since the core of the program is basically a series of loops over the elements of a big array.