Usage of structure as input of a function. Is it efficient ?

Hello Mathworks community,
I have a general questioning about the efficiency / relevance of using structures as function inputs. I'll try to be clear...
I'm working on a code that has around 300 variables. I find very convenient to gather / sort these variables as fields of a few (~5) structures to sort them by "type". That clears my workspace and allow me to make shorter function calls. For example, I have a structure Model that contains fields .param1 .param2 ... param40 (all different, can be scalars, small vectors or small 3D matrices).
So, when I want to make some computation, I just do :
[output] = some_function(Model, other_input) ;
Furthermore, I use in my code a lot of functions / sub-functions / sub-sub-functions... I know that this might not be the most relevant in MatLab, but I think I have a good reason to do so. First that simplifies the structure of my code since I have a lot of sequential actions, so having small divided tasks is way more easy to manage. And secondly I intend to translate the code in C using MatLab Coder to run on an embedded platform.
That scheme was fine for me, until I started working on optimization / computing time and realized that there was a big difference between the 2 solutions below:
function [output] = some_function(Model, other_input)
% [...]
% Solution 1
% (tic)
% (for i = 1:1e6)
output = some_sub_function(Model.param1, Model.param2, Model.param3, other_input) ;
% (end)
% (toc) ==> ~4 seconds
% Solution 2
P1 = Model.param1 ;
P2 = Model.param2 ;
P3 = Model.param3 ;
% (tic)
% (for i = 1:1e6)
output = some_sub_function(P1, P2, P3, other_input) ;
% (end)
% (toc) ==> ~0.4 seconds, 10 times faster !
% [...]
end
Since computing time is critical for my work, I start to doubt about the choices I made. The thing is that if I stop using structures, I'll have gigantic function calls to write. And if I stop using functions, I'll have a gigantic code to naviguate into...
I've struggled to find ressources online about that matter (maybe I didn't find the right keywords), so I am curious about advices or feedback you could have ?
Thanks !

1 comentario

Rik
Rik el 10 de Ag. de 2020
Do you have that many iterations in your real code? It looks to me like simply retrieving the fields and storing them in temporary variables for you function calls might already be the most important factor here.

Iniciar sesión para comentar.

 Respuesta aceptada

Bruno Luong
Bruno Luong el 10 de Ag. de 2020
Editada: Bruno Luong el 10 de Ag. de 2020
"I've struggled to find ressources online about that matter (maybe I didn't find the right keywords"
No, it's not your fault, TMW rarely makes any comment about such thing. They thing's it's not programmer business to understand how things work. I just dislike the way they hide the details for programmers.
I worked with MATLAB for about 20 years, and I did a lot of timings trying to understand how things work. Mex programming also helps since it exposes me to mxArray data structure and that gives the hints what behind MATLAB statements.
In your case, accessing the structure fields, MATLAB store the fieldnames as char-arrays and everytime you call
s.param1
it must goes throught all the list of fielnames of the structure, doing string comparison until it matches the string 'param1', then it returns the corresponding mxArray (structure field). That explains why it is not fast to use it within the for loop. If you move the out
p1 = s.param1
the fieldname matching is done outside the loop and p1 strores the mxArray ready to be used.
Personally I do a lot of function arguments like you, meaning few structures as input arguments (usulay one nested structure) and single output as output. Just do need to be careful with for-loop and prepare the body of for-loop to have very simple operations, and avoid calling functions, accessing structure fields, table, multiple-level indexing, etc.... and you'll be fine. You need also to pay attention to function/statement that make memory-copies of your data, this also kill the performance of MATLAB. Using object programming is also slow, so avoid them especially if they don't deal with large/complex data and comes with array. Most of the time that comes in the purest form as "vectorization" and the for-loop disappears entirely.

21 comentarios

James Tursa
James Tursa el 10 de Ag. de 2020
Editada: James Tursa el 10 de Ag. de 2020
s.param1
it must goes throught all the list of fielnames of the structure, doing string comparison until it matches the string 'param1', then it returns the mxArray corrsponding.
Unfortunately, it is more work than just that. The expression s.param1 creates a shared data copy of the param1 field variable. So an entire temporary mxArray structure (~120 bytes) is created behind the scenes each time this expression is encountered, possibly involving memory allocation. Same applies to cell array access. It is best to avoid these types of expressions in a loop if possible.
Bruno Luong
Bruno Luong el 10 de Ag. de 2020
Editada: Bruno Luong el 10 de Ag. de 2020
Are you sure James? I just test it and it seems the mxArray is not duplicated.
I enclose the mex and mfile codes I use to check the address of s.param1 and the mxArrayTag of field #0 and they are identical, so I conclude there is not data copying (R2020a)
>> s=struct('param1', [])
s =
struct with fields:
param1: []
>> PrintRhs0Addr(s.param1) % Address of s.param1
0xd77d4dc0
>> PrintStructFieldAddr(s) % (prhs[0]->data.struct_array.Data[0])
0xd77d4dc0
>>
James Tursa
James Tursa el 10 de Ag. de 2020
Editada: James Tursa el 10 de Ag. de 2020
I should have qualified that it can depend on MATLAB version and the size of the variable involved and how the expression is used. The rules for when to create shared data copies or reference copies for assignments and function arguments is not published (as you already know) and has changed multiple times in the past, and it is hard to keep up. An example test in Win64 R2017b:
>> format debug
>> s.param1 = 0
s =
struct with fields:
param1: 0
>> s.param1
ans =
Structure address = 47c590c0
m = 1
n = 1
pr = 1b08152a0
pi = 0
0
>> printAddress(s.param1)
0000000019DE8FE0
>> printStructFieldAddress(s)
0000000047C59670
So we have
47c590c0 = mxArray address of s.param1 at the m-file level
19DE8FE0 = mxArray address of s.param1 as passed into the mex function
47C59670 = mxArray address of s.param1 picked off of the data area of s directly
You can see that the original address of the s.param1 variable, 47C59670, does not match the other two addresses. So either a shared data copy or deep copy was made (I didn't check). It can be the case that MATLAB will make deep copies of scalars in some cases where otherwise it would make shared data copies.
And yet running a different example with a non-scalar field element yields:
>> s.param1 = 1:5
s =
struct with fields:
param1: [1 2 3 4 5]
>> s.param1
ans =
Structure address = 47c59910
m = 1
n = 5
pr = f6066ea0
pi = 0
1 2 3 4 5
>> printAddress(s.param1)
0000000047C592F0
>> printStructFieldAddress(s)
0000000047C59910
So we have
47c59910 = mxArray address of s.param1 at the m-file level
47C592F0 = mxArray address of s.param1 as passed into the mex function
47C59910 = mxArray address of s.param1 picked off of the data area of s directly
So the s.param1 expression at the m-file level produces the original address picked off the s data area directly, but when used as a function argument to a mex routine the address does not match indicating that a shared data copy was made. This behaviour of making shared data copies when field or cell elements are used in expressions or as function arguments is what I have observed in the past.
In a related note, it used to be that top-level workspace variables were passed as shared data copies to m-file function but passed as original addresses to mex functions. Some years ago (R2015b) even that changed so that shared data copies were passed to mex functions also. Details of when this happened can be found in my matlab_version FEX submission. See this link:
Complications in later versions of MATLAB can be that simple assignments at the m-file level can produce reference copies of variables instead of shared data copies in some cases when the variable is "small". Again, rules for when this happens are not published.
The mex functions:
// printAddress.c
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
if( nrhs ) {
mexPrintf("%p\n",prhs[0]);
}
}
and
// printStructFieldAddress.c
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
mxArray **vp;
if( nrhs && (vp = mxGetData(prhs[0])) ) {
mexPrintf("%p\n",*vp);
}
}
James Tursa
James Tursa el 10 de Ag. de 2020
Editada: James Tursa el 10 de Ag. de 2020
Similar results with R2019a (except for the bizarre [0 0 0 0 0] output!!! Maybe I should download the update)
>> format debug
>> s.param1 = 1:5
s =
struct with fields:
param1: [0 0 0 0 0]
>> s.param1
ans =
Structure address = ffaf3900
m = 1
n = 5
pr = 15d912860
1 2 3 4 5
>> printAddress(s.param1)
00000000FFA9FF90
>> printStructFieldAddress(s)
00000000FFAF3900
So we have
ffaf3900 = mxArray address of s.param1 at the m-file level
FFA9FF90 = mxArray address of s.param1 as passed into the mex function
FFAF3900 = mxArray address of s.param1 picked off of the data area of s directly
So the m-file expression by itself picked off the original address, but a copy gets passed into the mex routine.
And are those results same if you run the test inside a function?
James Tursa
James Tursa el 10 de Ag. de 2020
Editada: James Tursa el 10 de Ag. de 2020
Yes. E.g., code R2017b again
function junk44
format debug
disp('s.param1 = 0')
s.param1 = 0
disp('s.param1')
s.param1
disp('printAddress(s.param1)')
printAddress(s.param1)
disp('printStructFieldAddress(s)')
printStructFieldAddress(s)
disp('s.param1 = 1:5')
s.param1 = 1:5
disp('s.param1')
s.param1
disp('printAddress(s.param1)')
printAddress(s.param1)
disp('printStructFieldAddress(s)')
printStructFieldAddress(s)
end
and output
s.param1 = 0
s =
struct with fields:
param1: 0
s.param1
ans =
Structure address = 5adbeb0
m = 1
n = 1
pr = dbb2040
pi = 0
0
printAddress(s.param1)
0000000019E04B10
printStructFieldAddress(s)
0000000005ADBF20
s.param1 = 1:5
s =
struct with fields:
param1: [1 2 3 4 5]
s.param1
ans =
Structure address = 5adbeb0
m = 1
n = 5
pr = 46e45480
pi = 0
1 2 3 4 5
printAddress(s.param1)
0000000005ADBF20
printStructFieldAddress(s)
0000000005ADBEB0
Your codes and mine in R2020a return the same pointer address
>> s = struct('param1', rand(1,10));
>> PrintStructFieldAddr(s);
0xd77d5d20
>> PrintRhs0Addr(s.param1);
0xd77d5d20
>> JTursa_printAddress(s.param1);
0000023BD77D5D20
>> JTursa_printStructFieldAddress(s);
0000023BD77D5D20
Like I said ... it's hard to keep up with the rules changes! Looks like they have gone back to the pre-R2015b mex argument rules. You should check what happens with m-file functions also. E.g.,
s = struct('param1', rand(1,10));
PrintStructFieldAddr(s);
mfile_function(s.param1)
:
function mfile_function(x)
PrintRhs0Addr(x);
end
Same address printed inside mfile_function
Huh? How can that be? How can the mfile function do a copy-on-write for x if it is passed the original variable instead of a shared data copy?
This is exactly what I run
function testfieldaddress
s = struct('param1', rand(1,10));
PrintStructFieldAddr(s);
PrintRhs0Addr(s.param1);
JTursa_printAddress(s.param1);
JTursa_printStructFieldAddress(s);
mfile_function(s.param1);
end % testfieldaddress
function mfile_function(x)
PrintRhs0Addr(x);
end
and what I get
>> clear all
>> testfieldaddress
0xd77d6920
0xd77d6920
0000023BD77D6920
0000023BD77D6920
0xd77d6920 % <= this is PrintRhs0Addr(x) inside mfile_function(x)
>>
Maybe the parser is smart enough to recognize there are no changes. What if you change x inside the mfile function?
function mfile_function(x)
PrintRhs0Addr(x);
x(1) = 0;
PrintRhs0Addr(x);
end
Bruno Luong
Bruno Luong el 10 de Ag. de 2020
Editada: Bruno Luong el 10 de Ag. de 2020
Without runing I'm pretty sure it changes (copy-on-write). As you said the parser is smart. I confirm with you in 10 seconds.
EDIT after 10 seconds:
>> testfieldaddress
0xbb76a380
0xbb76a380
0000023BBB76A380
0000023BBB76A380
0xbb76a380
0xbb7683a0 % <= Yeah it changes.
James Tursa
James Tursa el 10 de Ag. de 2020
Editada: James Tursa el 10 de Ag. de 2020
But this is a fundamental change from how they used to do it. It used to be that they always passed in shared data copies of variables into mfile functions (or deep copies of scalars). Then there was no special logic needed inside the function ... the usual copy-on-write rules applied same as always. But now it looks like they are passing in the original addresses and keeping track off to the side whether it is an input argument. So the copy-on-write gets triggered because it is an input argument, not because it is a shared data copy of another variable. So, effectively MATLAB is treating the variable as a shared data copy off to the side. There is nothing in the variable itself (i.e., the CrossLink list) that would tell you this, making it impossible for a mex routine to detect even simple variable sharing just by looking at the mxArray details. I think I am going to have to download R2020a and experiment. Maybe arguments are now passed in as reference copies instead of shared data copies. Makes in place variable changes in a mex routine even more vulnerable than it used to be.
Bruno Luong
Bruno Luong el 11 de Ag. de 2020
Editada: Bruno Luong el 11 de Ag. de 2020
I have observed the "allocated on use" method implemented in few last versions of MATLAB. I believe there is an intelligent dataflow analysis and decision of creating/duplicating mxArray happens independent to cross-link of the underlying data.
"it must goes throught all the list of fielnames of the structure, doing string comparison until it matches the string 'param1',"
A couple of months ago I did some timing tests that disproved that, and demonstrated that hard-coded field names were more optimized than that. I thought I had posted the code but I am having searching for it. I will search my files when I get back to my desk
Possibly, assuming the structure created in the same workspace than the field reference.
But in OP's case the structure is passed as input argument, I cannot imagine how MATLAB can figure out the field order of "s.param1"
My timing tests were compatible with the hypothesis that the search results were cached for hardcoded field names, but I need to dig the code out and recheck what I was really testing. There were parts of the timing that I was unable to reach a hypothesis about.
James Tursa
James Tursa el 11 de Ag. de 2020
Editada: James Tursa el 12 de Ag. de 2020
A quick update about R2020a:
Function arguments now seem to be passed as Reference Copies instead of Shared Data Copies. So the copy-on-write logic still works as before ... it just works with different types of copies. Passing Reference Copies is more efficient since all you have to do is increment an integer in the mxArray instead of creating/populating a brand new mxArray header.
BUT ... the method of shared data copies has radically changed in the background. The Cross Link lists that were maintained in the mxArray header itself pointing you to all of the copies are no longer used. Looks like that stuff is now hidden somewhere. So maybe no way to track down all the copies like you could in previous versions.
Before the data sharing is at the data pointer under mxArray structure.
Now it seems it go up one level and the sharing is at the mxArray pointer itself.
May be they move up the "super-cross-link" on top root level and localed now somewhere as a master table handled by MATLAB engine and no loger accessible to users?
James Tursa
James Tursa el 12 de Ag. de 2020
Editada: James Tursa el 12 de Ag. de 2020
That's what I suspect. I haven't done exhaustive testing, but here is what I am seeing in the mxArray header:
R2019a
Reverse Cross Llink pointer spot, points to "previous" variable in shared data copy linked list
Cross Link pointer spot, points to "next" variable in shared data copy linked list
R2020a
Reverse Cross Link pointer spot, points to an integer that contains the number of shared data copies in existence.
Cross Link pointer spot, NULL
Not sure if the change happened in R2019b or R2020a ... will have to check this later.
You can still detect that it is a shared data copy by looking at that integer, but you can't find the copies because that linked list is no longer part of the mxArray header that I can see. I guess it is good that you can still detect data sharing in a mex routine (e.g., as a check prior to doing any inplace operation), but I prefer the old way where you can actually find the copies through the linked list.
As a bit of history, many years ago it used to be that shared data copies was always the method used for individual variable sharing and reference copies was the method used for cell and struct element sharing. Then they started using reference copies for individual variable sharing a few years ago. Now it seems they have taken it a step farther and are using reference copies for function arguments. Makes sense because reference copies are more efficient, but functions like reshape( ) and typecast( ) must still use the shared data copy method because stuff in the mxArray header changes.

Iniciar sesión para comentar.

Más respuestas (2)

Matt J
Matt J el 10 de Ag. de 2020
Editada: Matt J el 12 de Ag. de 2020
The thing is that if I stop using structures, I'll have gigantic function calls to write. And if I stop using functions, I'll have a gigantic code to naviguate into...
But if your code is doing such intricate things as to be "gigantic", then surely the struct access timings that you have shown us, whether 4 sec. or 0.4 sec, are going to be small overhead compared to the rest of your actual computation. Why care about a 4 sec. difference if the whole compuation takes 20 minutes or so?
If you really are in a situation where the main work of the function is 10 times faster than the work of unpacking the input from a struct, then I am suspicious if encapsulating those particular commands inside a function is really necessary and worthwhile. It means the function must be doing almost nothing. The overhead of even calling the function would be significant compared to the actual work that it does.

4 comentarios

Thanks for your replies.
"Do you have that many iterations in your real code? It looks to me like simply retrieving the fields and storing them in temporary variables for you function calls might already be the most important factor here."
"If you really are in a situation where the main work of the function is 10 times faster than the work of unpacking the input from a struct, then I am suspicious if encapsulating those particular commands inside a function is really necessary and worthwhile. It means the function must be doing almost nothing. The overhead of even calling the function would be significant compared to the actual work that it does. "
I have quite a lot of iterations in my code yes, since I try to reproduce a real-time algorithm on a large time-scale. For each iteration, I don't have "that many" actions, but still a few.
I can surely reduce a little bit the number of functions I use.
Personally I do a lot of function arguments like you, meaning few structures as input arguments (usulay one nested structure) and single output as output. Just do need to be careful with for-loop and prepare the body of for-loop to have very simple operations, and avoid calling functions, accessing structure fields, table, multiple-level indexing, etc.... and you'll be fine. You need also to pay attention to function/statement that make memory-copies of your data, this also kill the performance of MATLAB. Using object programming is also slow, so avoid them especially if they don't deal with large/complex data and comes with array. Most of the time that comes in the purest form as "vectorization" and the for-loop disappears entirely.
Noted.
I think my difficulty here is to understand when MatLab makes memory-copies and when it doesn't. And to evaluate whether this memory-copy is time consuming regarding to what the function does. That is not easy... The profiler for example is of no help for that.
"I think my difficulty here is to understand when MatLab makes memory-copies and when it doesn't".
Not only you, look at my exchange between James and I, R1017 vs R2020. It looks like MATLAB does more and more sophisticated memory management and it quite hard to follow.
Exactly.
Looking on the documentation I found :
"Copy-on-Write
If a function does not modify an input argument, MATLAB does not make a copy of the values contained in the input variable."
"Structures and Memory
Each structure member is treated as a separate array in MATLAB. This means that if you modify one member of a structure, the other members, which are unchanged, are not copied. It's time for an illustration here."
So this is comforting.
Another thing to consider, is that the overhead you see in Matlab may not be there in the final C/C++ version that you obtain from the Matlab Coder. I would expect struct indexing in C/C++ to be less encumbered, because the compiler has the opportunity to sort out in advance which fields are being accessed, and where the relevant memory is located.

Iniciar sesión para comentar.

Earlier I mentioned that I had done some timing tests. I was not able to find the code for that, so I created some new code.
In the below tests, on my R2020a Mac system, this is a summary of the results:
  • accessing a parameter and accessing a local variable are fastest, and indistinguishable in this test; legends for these are 'parameter' and 'local'. These do not access the structure passed in, and are there to give baselines to compare struct access against
  • Surprisingly, using a constant value is a bit slower and irregular timing. This does not make sense to me at the moment. (This is the first line created below, a blue line in the plot, legend 'constant')
  • Accessing a fixed field name of an input structure is roughly 25% slower than accessing a plain local or parameter, still quite fast. The timings for first (of 1000) fields or last of them in the struct are slightly different but quite close; legend entries 'Fstat' and 'Lstat'. Considering the other timing tests, we have to rule out the possibility that for repeated calls to get the same field that it has to search through the fieldname list every time. We cannot, however, currently rule out the possibility that the JIT is remembering the location after the first iteration
  • Accessing a struct field name dynamically takes roughly twice as long as accessing a static field name. The timings for first field in the struct, legend 'Fdyn', and last field in the struct, legend 'Ldyn', and middle field in the struct, legend 'MDyn', and random field in the struct, legend 'Rdyn', are pretty irregular and not at all well separated. The timings do not rule out the possibility that first (Fdyn) is slightly faster than the others, but the uncertainty is high.
  • If, hypothetically, structure reference required searching the fieldname list linearly each time, then Fstat (first static) and Fdyn (first dynamic) would be nearly indistuinguishable, and Lstat (last static) and Ldyn (last dynamic) would be nearly indistinguishable, and there would be a clear difference between the First and Last cases, but neither of these are true. If, hypothetically, the JIT is merely caching the last fieldname accessed under the hypothesis that it might be used again, with the same hypothetical linear search mechanism being used for both cases, then the large random variation for the dynamic access should be no more than the random variation for the static accesses, but the random variations are clearly quite different. Static fieldname access appears to be implemented through a different mechanism than dynamic fieldname access -- or at least is JIT'd quite differently.
  • In the functions, the assignments to local that appear in all functions are there in all tests in order to eliminate the possibility that creating the local variable was "significantly more work" for the test (legend 'local') that would only be done for that test. On the other hand, it could be the case that the JIT is discarding the initialization if the variable is not used, so a more careful test would make sure to force a "use" of the variable in a way that could not be JIT'd away. Similar logic to force use of the structure input and auxillary input should also be undertaken in a more thorough test
  • An analysis weakness of the current test is that it passes in the same struct each time, so we cannot at present speak to Bruno's point that it cannot know the offset of a particular static field name.
Anyhow: Yes, using a field from a struct that is passed in is pretty efficient compared to many other possibilities. Not as efficient as using a parameter. But a more throughout test would require passing in 1000 parameters and comparing timing to access "first" or "last" or "random" parameter, because hypothetically parameter names are not hard-wired to parameter number references. (The Mathworks documentation on speed of variable access does imply that named parameter access is hard-wired to a stack offset by the JIT, and does imply that parameter access could be faster than local variables.)
N = 100;
NFs = 1000;
S.walla = 1;
for K = 1 : NFs-2
name = char(randi(0+['a', 'z'], 1, 5));
S.(name) = K + 1;
end
S.bingo = NFs;
FN = fieldnames(S);
middle = FN{floor(end/2)};
Fs = {@() time_constant(S, 7), ...
@()time_parameter(S, 7), ...
@()time_local(S,7), ...
@()time_first_static(S, 'walla'), ...
@()time_last_static(S,'bingo'), ...
@()time_dynamic(S, 'walla'), ...
@()time_dynamic(S, 'bingo'), ...
@()time_dynamic(S, middle) };
Labels = {'constant', 'parameter', 'local', 'Fstat', 'Lstat', 'Fdyn', 'Ldyn', 'Mdyn', 'Rdyn'};
NFs = length(Fs);
times = zeros(N, NFs+1);
for J = 1 : NFs; F = Fs{J}; for K = 1 : N; times(K,J) = timeit(F, 0); end; end
for J = NFs+1; for K = 1 : N; lab = FN{randi(NFs)}; F = @()time_dynamic(S,lab); times(K,J) = timeit(F,0); end; end
plot(times);
legend(Labels);
function r = time_constant(S, const) %#ok<INUSD>
local = -123; %#ok<NASGU>
r = -1;
end
function r = time_parameter(S, const) %#ok<INUSL>
local = -123; %#ok<NASGU>
r = const;
end
function r = time_local(S, const) %#ok<INUSD>
local = -123; %#ok<NASGU>
r = local;
end
function r = time_first_static(S, label) %#ok<INUSD>
local = -123; %#ok<NASGU>
r = S.walla;
end
function r = time_last_static(S, label) %#ok<INUSD>
local = -123; %#ok<NASGU>
r = S.bingo;
end
function r = time_dynamic(S, label)
local = -123; %#ok<NASGU>
r = S.(label);
end

Categorías

Más información sobre Scope Variables and Generate Names en Centro de ayuda y File Exchange.

Productos

Versión

R2019b

Preguntada:

el 10 de Ag. de 2020

Editada:

el 12 de Ag. de 2020

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by