'parfor' seems to assign jobs of different loopvars to cores beforehand. How may I assign jobs dynamically to speed up calculation?
Eugene Lyu on 15 Dec 2024
Commented: Eugene Lyu on 24 Dec 2024
I am using parfor to speed up a calculation.
Lyaps=zeros(nsample,1);
flags=false(nsample,1);
parfor k=1:nsample
[Lyaps(k),flags(k)] = SympLargestLyap(SamplePoints(:,k),SympFx,opts);
end
For different k, the time needed for SympLargestLyap(SamplePoints(:,k),SympFx,opts) to complete can vary a lot.
I find that once the calculation for most k's has completed,
n = length(find(Lyaps==0));
with n much smaller than nsample, the calculation slows down.
The remaining unfinished k's are the ones for which
SympLargestLyap(SamplePoints(:,k),SympFx,opts)
takes a long time.
However, when the calculation for most k's had completed and I checked my CPU, only three or four cores were occupied, although the parallel pool has 32 workers. It seems the remaining twenty-some workers have finished their assigned iterations and are sitting idle.
Can I assign the jobs to each worker dynamically, so that the majority of them do not sit idle like this?
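A minimal sketch of one way to confirm this (iterTime is an illustrative name, not part of the original code): record the wall time of each iteration inside the loop, so the slow k's can be identified afterwards.
iterTime = zeros(nsample,1);        % per-iteration wall time (illustrative)
parfor k = 1:nsample
    t0 = tic;
    [Lyaps(k),flags(k)] = SympLargestLyap(SamplePoints(:,k),SympFx,opts);
    iterTime(k) = toc(t0);          % time spent on this k
end
% the straggler iterations are then, e.g., find(iterTime > 10*median(iterTime))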
0 Comments
Accepted Answer
Walter Roberson on 15 Dec 2024
Create a parforOptions object with RangePartitionMethod set to 'fixed' and SubrangeSize set to 1, and pass that parforOptions object to parfor().
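A minimal sketch of what that looks like for the loop in the question (parOpts is an illustrative name, chosen so it does not clash with the existing opts argument; a parallel pool is assumed to be open already):
pool = gcp;                                    % current parallel pool (32 workers here)
parOpts = parforOptions(pool, ...
    'RangePartitionMethod','fixed', ...        % fixed-size subranges instead of 'auto'
    'SubrangeSize',1);                         % each worker fetches one iteration at a time

parfor (k = 1:nsample, parOpts)
    [Lyaps(k),flags(k)] = SympLargestLyap(SamplePoints(:,k),SympFx,opts);
end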
This tells parfor to allocate only a single index to each worker at a time; after completing that single iteration, the worker goes back and asks for the next available task.
Normally parfor assigns chunks of indices to each worker, with the first chunk accounting for roughly 2/3 of the iterations, the second chunk accounting for roughly 20% of the iterations, and the remaining roughly 10% assigned as single iterations.
When all of the iterations take roughly the same time, the default 'auto' partitioning works fine, with minimal waiting at the end.
However, there is always the possibility that, by chance, one of the initial chunks contains a group of iterations that take much longer than average. For example, you might be looping over files where most of the files are mostly empty; if the few more substantial files happen to be clustered near each other, the worker that got that range of indices will take a long time while the other workers all finish quickly.
By setting SubrangeSize to 1, you might still end up with individual workers that take a long time on a single iteration, but you will not end up in a situation where multiple long iterations are stuck on the same worker.
More Answers (0)