Will splitting a for loop up make code run quicker?
I have some code where I read data from a file and then perform some pretty heavy analysis on it. This is repeated for 650 files, and the whole process takes around 3 hours to run. The general code is:
for i = 1:650
    file_name = sprintf('file%d', i);   % %d gives 'file1', 'file2', ...; %f would give 'file1.000000'
    load(file_name);
    % ... rest of code
end
As the same actions are performed independently on each file, I was wondering if there's any way to split this for loop up so that the code can be run on several files simultaneously, instead of having to wait for the previous iteration to finish. Would this even make my code run quicker? I've briefly heard about 'parallel computing' before, but I'm not too familiar with the concept or how to implement it.
Thanks.
9 Comments
per isakson
on 6 Jan 2022
Yes, see: Parallel Computing Toolbox
Jan
on 6 Jan 2022
Calling load() without capturing its output in a variable creates variables dynamically. This can slow down the processing by a factor of 100, because it impedes the JIT acceleration in a way comparable to the evil eval(). So prefer data = load(file_name) instead.
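A minimal sketch of the load-into-a-variable pattern described above. The temporary file and the variable name signal are made up so that the snippet runs on its own; the real files will contain different variables.

```matlab
% Create a small hypothetical MAT-file so the sketch is self-contained.
file_name = fullfile(tempdir, 'file1.mat');
signal = rand(1, 100);          % placeholder data; real variable names will differ
save(file_name, 'signal');

% Load into a struct instead of creating variables dynamically in the workspace.
data = load(file_name);

% Each variable stored in the file becomes a field of the struct.
if ~isfield(data, 'signal')
    error('Expected variable "signal" in %s', file_name);
end
result = mean(data.signal);     % stand-in for the real analysis
```

Because every variable now arrives as a field of data, it is also obvious at a glance where each value in the rest of the code comes from.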
Use the profiler to find the bottleneck of your code. If it is the load() command, the disk access needs more time than the processing. Then parallel processing cannot accelerate the code substantially: it does not matter whether 1 or 8 threads are waiting for the slow disk.
If the import from the disk and the processing take the same amount of time, parallel processing can save at most about 50% of the total time.
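As a rough way to check this before reaching for the profiler, the load time and the processing time can be accumulated separately with tic/toc. The sketch below assumes made-up data (small random matrices in a temporary folder) and uses svd as a stand-in for the heavy analysis; only the timing pattern is the point.

```matlab
% Hypothetical setup: write a few small MAT-files to a temporary folder.
data_dir = fullfile(tempdir, 'timing_demo');
if ~exist(data_dir, 'dir'), mkdir(data_dir); end
num_files = 5;
for k = 1:num_files
    signal = rand(200);
    save(fullfile(data_dir, sprintf('file%d.mat', k)), 'signal');
end

t_load = 0;      % accumulated time spent reading from disk
t_process = 0;   % accumulated time spent on the analysis
for k = 1:num_files
    t = tic;
    data = load(fullfile(data_dir, sprintf('file%d.mat', k)));
    t_load = t_load + toc(t);

    t = tic;
    s = svd(data.signal);          % stand-in for the heavy analysis
    t_process = t_process + toc(t);
end

% Share of the run time spent on disk access. If the disk is the shared
% bottleneck, this part is unlikely to shrink with more workers.
load_fraction = t_load / (t_load + t_process);
fprintf('load: %.3f s, processing: %.3f s, load fraction: %.0f%%\n', ...
    t_load, t_process, 100 * load_fraction);
```

A load fraction near 100% suggests that parallel workers would mostly wait for the disk, while a small one suggests that the analysis itself is worth parallelizing or optimizing.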
S
on 7 Jan 2022
S
on 7 Jan 2022
KSSV
on 7 Jan 2022
Points to consider:
- You can get all the required files using dir and then use its output in the loop.
- Depending on what data the files contain and how it is stored, you need to use a suitable function to read it.
- Speeding up also depends on what you are doing in the %rest of code. Unless this is known, we cannot comment on it.
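The first two points could look roughly like the sketch below. The folder, the file pattern, the variable name signal, and the analyse_file handle are all hypothetical placeholders, since the real file contents and analysis were not shown.

```matlab
% Hypothetical setup: a temporary folder with a few MAT-files.
data_dir = fullfile(tempdir, 'dir_demo');
if ~exist(data_dir, 'dir'), mkdir(data_dir); end
for k = 1:3
    signal = k * ones(1, 10);
    save(fullfile(data_dir, sprintf('file%d.mat', k)), 'signal');
end

% Placeholder analysis; the real one depends on what the files contain.
analyse_file = @(data) mean(data.signal);

% List the files once, then loop over the listing.
files = dir(fullfile(data_dir, 'file*.mat'));
results = zeros(numel(files), 1);
for k = 1:numel(files)
    full_name = fullfile(files(k).folder, files(k).name);
    results(k) = analyse_file(load(full_name));
end
```

Note that the listing is typically not in numeric order (file10.mat tends to come before file2.mat), so the file name should be kept alongside each result if the order matters.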
S
on 8 Jan 2022
Chunru
on 8 Jan 2022
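% The first thing you can try:
parfor i = 1:650
    file_name = sprintf('file%d', i);   % %d gives 'file1', 'file2', ...
    data = load(file_name);             % parfor needs load to return into a variable
    % ... rest of code, using the fields of data
end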
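To get results back out of a parfor loop, each iteration can write into its own element of a preallocated array. The following is only a sketch under assumptions: the temporary files, the variable name signal, and the sum used as the analysis are invented for illustration. Without the Parallel Computing Toolbox the loop still runs, just serially.

```matlab
% Hypothetical setup: a temporary folder with a few MAT-files.
data_dir = fullfile(tempdir, 'parfor_demo');
if ~exist(data_dir, 'dir'), mkdir(data_dir); end
num_files = 4;
for k = 1:num_files
    signal = k * ones(1, 10);
    save(fullfile(data_dir, sprintf('file%d.mat', k)), 'signal');
end

results = zeros(num_files, 1);      % one slot per iteration
parfor k = 1:num_files
    file_name = fullfile(data_dir, sprintf('file%d.mat', k));
    data = load(file_name);         % load into a struct; no dynamic variables
    results(k) = sum(data.signal);  % stand-in for the real analysis
end
```

Indexing results only by the loop variable lets MATLAB treat it as a sliced variable, so the iterations stay independent of each other.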
Stephen23
on 8 Jan 2022
"Would this even make my code run quicker?"
No. Yes. Maybe. Who knows?
Any potential benefit of using parallel computation depends on many factors that we do not know about your code.
However, it is best to avoid trying to micro-manage MATLAB's JIT optimization: when beginners try to do something very cunning to speed things up, it just gets in the way of the JIT engine (which is written to speed up clearly written, well-organized code).
Jumping onto the parallel-computation bandwagon is certainly not an alternative to writing better code, for example:
- replacing LOADing directly into the workspace with LOADing into an output variable.
- not saving your code in the same location as your data files, and using absolute/relative filenames to reach the data instead.
- avoiding variable name i.
- no doubt many other improvements in the code that you did not show us.
"I've briefly heard about 'parallel computing' before, but I'm not too familiar with the concept or how to implement it"
Learning good MATLAB practices would be a much better use of your time, until you find a bottleneck that really can only be solved using parallel computation.
S
on 8 Jan 2022
Answers (0)