Will splitting a for loop up make code run quicker?

I have some code where I read data from a file and then perform some pretty heavy analysis on it. This is repeated for 650 files, and the whole process takes around 3 hours to run. The general code is:
for i = 1:650
    file_name = sprintf('file%d', i);  % %d formats i as an integer; %f would give 'file1.000000'
    load(file_name);
    .......... %rest of code
end
As the same actions are performed independently on each file, I was wondering if there's any way to split this for loop up so that the code can be run on several files simultaneously instead of having to wait for the previous iteration to finish. Would this even make my code run quicker? I've briefly heard about 'parallel computing' before, but I'm not too familiar with the concept or how to implement it.
Thanks.

9 Comments

Jan on 6 Jan 2022
load() without catching the output in a variable creates variables dynamically. This can slow down the processing by a factor of 100, because it impedes the JIT acceleration in a way comparable to the evil eval(). So prefer data=load(file_name) instead.
Use the profiler to find the bottleneck of your code. If it is the load() command, the disk access needs more time than the processing. In that case parallel processing cannot accelerate the code substantially: it does not matter whether 1 or 8 threads are waiting for the slow disk.
If the import from the disk and the processing take about the same amount of time, parallel processing can save up to roughly 50% of the total time.
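A rough way to check which case applies is to time the two phases separately. The following is only a sketch: the temporary folder, the variable name x, the file count, and the use of svd as a stand-in for the analysis are all assumptions made so the example runs on its own.

```matlab
% Hypothetical setup: write three small MAT-files to a temporary folder
% so the sketch is self-contained. Real data files would replace these.
data_dir = tempname;
mkdir(data_dir);
n_files = 3;
for k = 1:n_files
    x = rand(200);                                   % placeholder payload
    save(fullfile(data_dir, sprintf('file%d.mat', k)), 'x');
end

t_load = 0;   % accumulated time spent reading from disk
t_proc = 0;   % accumulated time spent in the analysis
for k = 1:n_files
    file_name = fullfile(data_dir, sprintf('file%d.mat', k));

    t0 = tic;
    data = load(file_name);      % variables arrive as fields of a struct
    t_load = t_load + toc(t0);

    t0 = tic;
    result = svd(data.x);        % stand-in for the "rest of code"
    t_proc = t_proc + toc(t0);
end

fprintf('load: %.3f s, processing: %.3f s\n', t_load, t_proc);
rmdir(data_dir, 's');            % remove the temporary files
```

If the load time dominates, extra workers will mostly wait on the disk; if the processing time dominates, a parallel loop has more room to help.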
S on 7 Jan 2022
@per isakson thanks, I'll look into it
S on 7 Jan 2022
@Jan Okay thanks, I'll try this.
KSSV on 7 Jan 2022
Points to consider:
  1. You can get all the required files using dir and then use its output in the loop.
  2. Depending on what data the files contain and how it is stored, you need to choose a suitable function to read them.
  3. Any speed-up also depends on what you are doing in the %rest of code. Unless this is known, we cannot comment on it.
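Point 1 might look something like the sketch below. The temporary folder, the 'file*.mat' pattern, the variable name x, and the analyze handle are hypothetical placeholders, not the asker's actual setup.

```matlab
% Hypothetical setup: three small MAT-files in a temporary folder.
data_dir = tempname;
mkdir(data_dir);
for k = 1:3
    x = k * ones(4);
    save(fullfile(data_dir, sprintf('file%d.mat', k)), 'x');
end

% List the files once, instead of rebuilding each name inside the loop.
files = dir(fullfile(data_dir, 'file*.mat'));

% Placeholder for the real analysis: takes the struct returned by load.
analyze = @(s) mean(s.x(:));

results = zeros(numel(files), 1);          % preallocate the output
for k = 1:numel(files)
    full_name = fullfile(data_dir, files(k).name);
    results(k) = analyze(load(full_name));
end

disp(results.');
rmdir(data_dir, 's');                      % remove the temporary files
```

Note that the listing is typically ordered by name as text, so with 650 files 'file10.mat' would come before 'file2.mat'; that only matters if the processing order does.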
S on 8 Jan 2022
@KSSV Thanks, I tried point number 1 and it helped significantly in reducing time
% The first thing you can try:
parfor i = 1:650
    file_name = sprintf('file%d', i);
    data = load(file_name);  % inside parfor, load must return its result in a variable
    .......... %rest of code
end
"Would this even make my code run quicker?"
No. Yes. Maybe. Who knows?
Any potential benefit of using parallel computation depends on many factors that we do not know about your code.
However, it is best to avoid trying to micro-manage MATLAB's JIT optimization: when beginners try to do something very cunning to speed things up, it just gets in the way of the JIT engine (which is written to speed up clearly-written, well-organized code).
Jumping onto the parallel-computation bandwagon is certainly not an alternative to writing better code, for example:
  1. replacing directly LOADing into the workspace with LOADing into an output variable.
  2. not saving your code in the same location as your data files, using absolute/relative filenames instead.
  3. avoiding the variable name i (by default it is MATLAB's imaginary unit).
  4. no doubt many other improvements in the code that you did not show us.
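Points 1 to 3 can be combined with a parallel loop roughly as follows. This is a minimal sketch under assumptions: the temporary data folder, the variable name x, the file count, and the sum used as a stand-in for the analysis are all hypothetical. Without the Parallel Computing Toolbox, parfor simply runs the iterations serially.

```matlab
% Hypothetical setup: four small MAT-files kept in a separate data folder.
data_dir = tempname;
mkdir(data_dir);
n_files = 4;
for k = 1:n_files
    x = k * ones(3);
    save(fullfile(data_dir, sprintf('file%d.mat', k)), 'x');
end

results = zeros(n_files, 1);     % sliced output: one element per iteration
parfor k = 1:n_files             % k instead of i as the loop variable
    % Full path, so the code does not need to sit next to the data.
    file_name = fullfile(data_dir, sprintf('file%d.mat', k));
    data = load(file_name);      % load into an output variable
    results(k) = sum(data.x(:)); % stand-in for the "rest of code"
end

disp(results.');
rmdir(data_dir, 's');            % remove the temporary files
```

Each iteration writes only to its own element of results, which is what lets the iterations run independently.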
"I've briefly heard about 'parallel computing' before, but I'm not too familiar with the concept or how to implement it"
Learning good MATLAB practices would be a much better use of your time, until you find a bottleneck that really can only be solved using parallel computation.
S on 8 Jan 2022
@Stephen Thanks for the advice, I'll take it on board. Unfortunately, I can't share my code as it's for a college project.
I ended up downloading the Parallel Computing Toolbox, and simply changing my 'for' loop into a 'parfor' loop resulted in my program becoming 3 times quicker, so thanks for the recommendation @per isakson

Answers (0)

Asked: S on 6 Jan 2022

Commented: S on 8 Jan 2022
