Usage of structure as input of a function. Is it efficient ?
Mostrar comentarios más antiguos
Hello Mathworks community,
I have a general questioning about the efficiency / relevance of using structures as function inputs. I'll try to be clear...
I'm working on a code that has around 300 variables. I find very convenient to gather / sort these variables as fields of a few (~5) structures to sort them by "type". That clears my workspace and allow me to make shorter function calls. For example, I have a structure Model that contains fields .param1 .param2 ... param40 (all different, can be scalars, small vectors or small 3D matrices).
So, when I want to make some computation, I just do :
[output] = some_function(Model, other_input) ;
Furthermore, I use in my code a lot of functions / sub-functions / sub-sub-functions... I know that this might not be the most relevant in MatLab, but I think I have a good reason to do so. First that simplifies the structure of my code since I have a lot of sequential actions, so having small divided tasks is way more easy to manage. And secondly I intend to translate the code in C using MatLab Coder to run on an embedded platform.
That scheme was fine for me, until I started working on optimization / computing time and realized that there was a big difference between the 2 solutions below:
function [output] = some_function(Model, other_input)
% [...]
% Solution 1
% (tic)
% (for i = 1:1e6)
output = some_sub_function(Model.param1, Model.param2, Model.param3, other_input) ;
% (end)
% (toc) ==> ~4 seconds
% Solution 2
P1 = Model.param1 ;
P2 = Model.param2 ;
P3 = Model.param3 ;
% (tic)
% (for i = 1:1e6)
output = some_sub_function(P1, P2, P3, other_input) ;
% (end)
% (toc) ==> ~0.4 seconds, 10 times faster !
% [...]
end
Since computing time is critical for my work, I start to doubt about the choices I made. The thing is that if I stop using structures, I'll have gigantic function calls to write. And if I stop using functions, I'll have a gigantic code to naviguate into...
I've struggled to find ressources online about that matter (maybe I didn't find the right keywords), so I am curious about advices or feedback you could have ?
Thanks !
1 comentario
Rik
el 10 de Ag. de 2020
Do you have that many iterations in your real code? It looks to me like simply retrieving the fields and storing them in temporary variables for you function calls might already be the most important factor here.
Respuesta aceptada
Más respuestas (2)
The thing is that if I stop using structures, I'll have gigantic function calls to write. And if I stop using functions, I'll have a gigantic code to naviguate into...
But if your code is doing such intricate things as to be "gigantic", then surely the struct access timings that you have shown us, whether 4 sec. or 0.4 sec, are going to be small overhead compared to the rest of your actual computation. Why care about a 4 sec. difference if the whole compuation takes 20 minutes or so?
If you really are in a situation where the main work of the function is 10 times faster than the work of unpacking the input from a struct, then I am suspicious if encapsulating those particular commands inside a function is really necessary and worthwhile. It means the function must be doing almost nothing. The overhead of even calling the function would be significant compared to the actual work that it does.
4 comentarios
Antoine Laurin
el 11 de Ag. de 2020
Bruno Luong
el 11 de Ag. de 2020
"I think my difficulty here is to understand when MatLab makes memory-copies and when it doesn't".
Not only you, look at my exchange between James and I, R1017 vs R2020. It looks like MATLAB does more and more sophisticated memory management and it quite hard to follow.
Antoine Laurin
el 11 de Ag. de 2020
Matt J
el 11 de Ag. de 2020
Another thing to consider, is that the overhead you see in Matlab may not be there in the final C/C++ version that you obtain from the Matlab Coder. I would expect struct indexing in C/C++ to be less encumbered, because the compiler has the opportunity to sort out in advance which fields are being accessed, and where the relevant memory is located.
Walter Roberson
el 12 de Ag. de 2020
Earlier I mentioned that I had done some timing tests. I was not able to find the code for that, so I created some new code.
In the below tests, on my R2020a Mac system, this is a summary of the results:
- accessing a parameter and accessing a local variable are fastest, and indistinguishable in this test; legends for these are 'parameter' and 'local'. These do not access the structure passed in, and are there to give baselines to compare struct access against
- Surprisingly, using a constant value is a bit slower and irregular timing. This does not make sense to me at the moment. (This is the first line created below, a blue line in the plot, legend 'constant')
- Accessing a fixed field name of an input structure is roughly 25% slower than accessing a plain local or parameter, still quite fast. The timings for first (of 1000) fields or last of them in the struct are slightly different but quite close; legend entries 'Fstat' and 'Lstat'. Considering the other timing tests, we have to rule out the possibility that for repeated calls to get the same field that it has to search through the fieldname list every time. We cannot, however, currently rule out the possibility that the JIT is remembering the location after the first iteration
- Accessing a struct field name dynamically takes roughly twice as long as accessing a static field name. The timings for first field in the struct, legend 'Fdyn', and last field in the struct, legend 'Ldyn', and middle field in the struct, legend 'MDyn', and random field in the struct, legend 'Rdyn', are pretty irregular and not at all well separated. The timings do not rule out the possibility that first (Fdyn) is slightly faster than the others, but the uncertainty is high.
- If, hypothetically, structure reference required searching the fieldname list linearly each time, then Fstat (first static) and Fdyn (first dynamic) would be nearly indistuinguishable, and Lstat (last static) and Ldyn (last dynamic) would be nearly indistinguishable, and there would be a clear difference between the First and Last cases, but neither of these are true. If, hypothetically, the JIT is merely caching the last fieldname accessed under the hypothesis that it might be used again, with the same hypothetical linear search mechanism being used for both cases, then the large random variation for the dynamic access should be no more than the random variation for the static accesses, but the random variations are clearly quite different. Static fieldname access appears to be implemented through a different mechanism than dynamic fieldname access -- or at least is JIT'd quite differently.
- In the functions, the assignments to local that appear in all functions are there in all tests in order to eliminate the possibility that creating the local variable was "significantly more work" for the test (legend 'local') that would only be done for that test. On the other hand, it could be the case that the JIT is discarding the initialization if the variable is not used, so a more careful test would make sure to force a "use" of the variable in a way that could not be JIT'd away. Similar logic to force use of the structure input and auxillary input should also be undertaken in a more thorough test
- An analysis weakness of the current test is that it passes in the same struct each time, so we cannot at present speak to Bruno's point that it cannot know the offset of a particular static field name.
Anyhow: Yes, using a field from a struct that is passed in is pretty efficient compared to many other possibilities. Not as efficient as using a parameter. But a more throughout test would require passing in 1000 parameters and comparing timing to access "first" or "last" or "random" parameter, because hypothetically parameter names are not hard-wired to parameter number references. (The Mathworks documentation on speed of variable access does imply that named parameter access is hard-wired to a stack offset by the JIT, and does imply that parameter access could be faster than local variables.)
N = 100;
NFs = 1000;
S.walla = 1;
for K = 1 : NFs-2
name = char(randi(0+['a', 'z'], 1, 5));
S.(name) = K + 1;
end
S.bingo = NFs;
FN = fieldnames(S);
middle = FN{floor(end/2)};
Fs = {@() time_constant(S, 7), ...
@()time_parameter(S, 7), ...
@()time_local(S,7), ...
@()time_first_static(S, 'walla'), ...
@()time_last_static(S,'bingo'), ...
@()time_dynamic(S, 'walla'), ...
@()time_dynamic(S, 'bingo'), ...
@()time_dynamic(S, middle) };
Labels = {'constant', 'parameter', 'local', 'Fstat', 'Lstat', 'Fdyn', 'Ldyn', 'Mdyn', 'Rdyn'};
NFs = length(Fs);
times = zeros(N, NFs+1);
for J = 1 : NFs; F = Fs{J}; for K = 1 : N; times(K,J) = timeit(F, 0); end; end
for J = NFs+1; for K = 1 : N; lab = FN{randi(NFs)}; F = @()time_dynamic(S,lab); times(K,J) = timeit(F,0); end; end
plot(times);
legend(Labels);
function r = time_constant(S, const) %#ok<INUSD>
local = -123; %#ok<NASGU>
r = -1;
end
function r = time_parameter(S, const) %#ok<INUSL>
local = -123; %#ok<NASGU>
r = const;
end
function r = time_local(S, const) %#ok<INUSD>
local = -123; %#ok<NASGU>
r = local;
end
function r = time_first_static(S, label) %#ok<INUSD>
local = -123; %#ok<NASGU>
r = S.walla;
end
function r = time_last_static(S, label) %#ok<INUSD>
local = -123; %#ok<NASGU>
r = S.bingo;
end
function r = time_dynamic(S, label)
local = -123; %#ok<NASGU>
r = S.(label);
end
Categorías
Más información sobre Scope Variables and Generate Names en Centro de ayuda y File Exchange.
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!