- Consider column-major in your design: if efficiency is important store data in contiguous locations.
- Consider copy-on-write in your design.

6 views (last 30 days)

I will first state the problem and then ask my question.

PROBLEM

I have this code where repeatedly multiply a vector by a square matrix and store the resulting vector. The vector has a dimension dim and it is multiplied by the matrix numSteps times.

I wrote three variations of this code. First, I store the vector using colon indexing. Second, I store the vector using loop indexing. Third, I store the vector in a cell array using cell indexing.

The three functions are the following. They are identical except for the way the data is stored.

Colon indexing:

function output = funcColon(dim,numSteps)

%FUNCCOLON Function that stores vectors using colon indexing

% Initialize the output and the vector

output = zeros(numSteps,dim);

vec0 = ones(dim,1);

% Choose a matrix

mat = rand(dim); mat = mat./sum(mat,1);

% Do repeated matrix multiplications

for k=1:numSteps

vec = mat*vec0;

vec0 = vec;

output(k,:) = vec; % Use colon indexing to store the results

end

end

Loop indexing:

function output = funcLoop(dim,numSteps)

%FUNCLOOP Function that stores vectors using a loop

% Initialize the output and the vector

output = zeros(numSteps,dim);

vec0 = ones(dim,1);

% Choose a matrix

mat = rand(dim); mat = mat./sum(mat,1);

% Do repeated matrix multiplications

for k=1:numSteps

vec = mat*vec0;

vec0 = vec;

for j=1:dim

output(k,j) = vec(j); % Use a loop to store the results

end

end

end

Cell indexing:

function output = funcCell(dim,numSteps)

%FUNCCELL Function that stores vectors using a cell array

% Initialize the output and the vector

output = cell(numSteps,1);

vec0 = ones(dim,1);

% Choose a matrix

mat = rand(dim); mat = mat./sum(mat,1);

% Do repeated matrix multiplications

for k=1:numSteps

vec = mat*vec0;

vec0 = vec;

output{k} = vec; % Store the vector in a cell array

end

end

I first set numSteps=100. I then ran each function for various values of dim and I timed them. I did this multiple times to accumulate statistics on the average runtime.

The results are shown in the attached figure. I take the average runtime of the colon indexing as a point of reference, i.e., the two curves show the ratio of the average runtimes of funcLoop and funcCell to the average runtime of funcColon.

There are basically two regimes.

First, there is the regime dim > 1. There, we have the hierarchy:

runtime loop < runtime cell < runtime colon

Second, there is the regime dim < 1. There, we have the hierarchy:

runtime loop < runtime colon < runtime cell

Note that in the limit of large dim, all functions tend to the same runtime because the runtime becomes limited by the matrix multiplication. But for small dim, the runtime is limited by the indexing. I checked that it is indeed the case with the Run & Time feature.

QUESTION

I'm quite confused by this result. I would have assumed that colon indexing would always be faster because it is a syntax designed specifically for matlab arrays. What is colon indexing doing that is taking so much additional runtime? Even cell array indexing is more efficient than colon indexing for dim > 1.

This seems to imply that I could rewrite an overloaded custom colon indexing function using a for loop and have it be more performant than MATLAB's native colon indexing.

Am I missing something? What are your suggestions for best practices with a code like that?

Stephen Cobeldick
on 21 May 2020

Edited: Stephen Cobeldick
on 21 May 2020

"Am I missing something?"

1- accessing data in non-contiguous locations is slow. Consider your indexing output(k,:): because MATLAB stores arrays in column-major order, your indexing refers to lots of separate locations spread out through the entire array. Carefully going through the array, finding those locations, splitting up the input array and copying the parts to each separate location... is not efficient, and gets slower for larger arrays. You data design forces MATLAB to be slow.

If you access only contiguous blocks of memory you would find the code would be much faster: output(:,k)

2- MATLAB (probably) does not actually copy those arrays, so although you write that it "stores" the input array in a cell:

output{k} = vec; % Store the vector in a cell array

actually MATLAB (probably) just stores a pointer to the already existing input array, and creating one small pointer does not take much time at all. The original array (probably) does not move or get copied anywhere, until you write to any location in that array, at which point it will finally get copied... and in your code this never happens, so you are comparing apples with oranges here.

Read about copy-on-write:

Note that even if/when the input array is "copied" to the cell it would still be one contiguous array in memory!

"What are your suggestions for best practices with a code like that?"

- Consider column-major in your design: if efficiency is important store data in contiguous locations.
- Consider copy-on-write in your design.

Stephen Cobeldick
on 21 May 2020

"each element of a cell array occupies a contiguous memory location?"

Every cell contains a separate array, which are each stored in separate memory locations:

Matt J
on 21 May 2020

Edited: Matt J
on 21 May 2020

I checked that it is indeed the case with the Run & Time feature.

I don't know what that is, but the better test in any case is to rewrite the functions so that they contain only the indexing operations.

function output = funcColon(dim,numSteps,output,vec)

%FUNCCOLON Function that stores vectors using colon indexing

for k=1:numSteps

output(k,:) = vec(k,:); % Use colon indexing to store the results

end

end

function output = funcLoop(dim,numSteps,output,vec)

%FUNCLOOP Function that stores vectors using a loop

for k=1:numSteps

for j=1:dim

output(k,j) = vec(k,j); % Use a loop to store the results

end

end

end

function output = funcCell(dim,numSteps,output,vec)

%FUNCCELL Function that stores vectors using a cell array

for k=1:numSteps

output{k} = vec(k,:); % Store the vector in a cell array

end

end

The test code then gives me the following, less suprising results:

dim=20;

numSteps=1e5;

vec = rand(numSteps,dim);

output = zeros(numSteps,dim);

tColon = timeit(@()funcColon(dim,numSteps,output,vec)) %0.0173 seconds

output = zeros(numSteps,dim);

tLoop = timeit(@()funcLoop(dim,numSteps,output,vec)) %0.0207 seconds

output = cell(numSteps,1);

tCell = timeit(@()funcCell(dim,numSteps,output,vec)) %0.1120 seconds

Matt J
on 21 May 2020

To really know what's going on you should probably post your latest test code so that we can run it as well. Below are the results from my own test code which I have improved still further to exclude the memory allocation inherent in vec0(k,:). In almost all cases tLoop is slightly poorer than tColon. It is to be expected that tCell gets better and better because, as Stephen pointed out, it requires essentially no data copying. Be mindful however that if you make the variable output is held as a cell array, it may be relatively quick to build, but the memory access time to pull the data out later can be a lot slower, depending on numSteps.

Test code:

clear all

Dims=1:5:400;

for i=1:numel(Dims)

dim=Dims(i);

numSteps=3e3;

vec0 = num2cell(rand(numSteps,dim),2);

output = zeros(numSteps,dim);

tColon(i) = timeit(@()funcColon(dim,numSteps,output,vec0));

output = zeros(numSteps,dim);

tLoop(i) = timeit(@()funcLoop(dim,numSteps,output,vec0));

output = cell(numSteps,1);

tCell(i) = timeit(@()funcCell(dim,numSteps,output,vec0));

end

semilogy(Dims,tLoop./tColon,'r', Dims,tCell./tColon,'b');

legend('tLoop / tColon','tCell / tColon')

function output = funcColon(dim,numSteps,output,vec0)

%FUNCCOLON Function that stores vectors using colon indexing

for k=1:numSteps

output(k,:) = vec0{k}; % Use colon indexing to store the results

end

end

function output = funcLoop(dim,numSteps,output,vec0)

%FUNCLOOP Function that stores vectors using a loop

for k=1:numSteps

vec=vec0{k};

for j=1:dim

output(k,j) = vec(j); % Use a loop to store the results

end

end

end

function output = funcCell(dim,numSteps,output,vec0)

%FUNCCELL Function that stores vectors using a cell array

for k=1:numSteps

output{k} = vec0{k}; % Store the vector in a cell array

end

end

Opportunities for recent engineering grads.

Apply Today
## 0 Comments

Sign in to comment.