How to combine parfor and parallel optimization?

nah on 21 Jun 2012
I am new to MATLAB parallel computing.
I am working on an optimization problem using fminsearch. The objective function is complicated and takes a long time to process a long trajectory data set, so I have used parfor to reduce the processing time.
I cut the long trajectory data into many shorter fragments, distribute them via parfor to many workers for the expensive calculations, and then combine the results into a single scalar for fminsearch to work with. This parallelization is working well.
The objective function also has some fixed parameters, and I want to do a parameter sweep over them. How do I achieve this with parallelization?
As I understand it:
  1. There are no nested parfors, i.e., if I run the objective function under an outer parfor, the inner parfor is going to run serially.
  2. fminsearch does not honor the 'UseParallel' option. Even if I use a parallel minimizer, the above problem applies.
So this seems like an insurmountable issue to me. Kindly help with your suggestions.
%%%%%%% Pseudo-code example:
load trajectoryData   % a [30000x2] time series
fixParams = [fp1; fp2; fp3; fp4];   % but I want to do this over a vector of fp3 and fp4
matlabpool open 72
[optParams, fval] = fminsearch(@objFunc, iniGuess, options, trajectoryData, fixParams);
matlabpool close

% objFunc.m
function objVal = objFunc(params, trajectoryData, fixParams)
a = size(trajectoryData, 1);   % the second dimension is always 2
% split into fragments of 1000 points each (or any other way)
numFragments = a / 1000;
fragsMat = reshape(trajectoryData, 1000, numFragments, 2);
costVal = zeros(1, numFragments);
parfor ix = 1:numFragments
    frag = squeeze(fragsMat(:, ix, :));   % one [1000x2] fragment
    % do heavy calculations on frag here, producing costValFrag
    costVal(ix) = costValFrag;
end
objVal = sum(costVal);   % just an example
%%%%%%
My configuration:
I have a cluster with 128 workers (with the Torque manager).
Things I am thinking of:
  • Do I use dfeval?
  • Do I create a batch job for each fixParams set and launch it?
(I don't know how to do either of these.)
Main problem:
  • Do I need to run multiple matlabpools in that case?
Thanks for your help.

Answers (3)

Walter Roberson on 21 Jun 2012
The restriction is on directly nested parfor. You can have a parfor call a function which has a parfor in it.
I have been getting mixed messages from MathWorks staff about whether pools can be nested or not.
Darn, that's the only relevant posting I can come up with at the moment. I know we have had discussions about this in the past, but I cannot seem to locate them :(
  4 comments
Walter Roberson on 22 Jun 2012
PARFOR and SPMD use all *allocated* parallelism immediately. For example, if you have 8 cores and your matlabpool has 5 workers, then PARFOR and SPMD use those 5 but do not use the other 3, even though they are "available" in some sense. Some of the past discussions suggested that if there were unallocated nodes, a worker within a MATLABPOOL could allocate more. I gather from the current discussion that MathWorks is now saying no, that cannot be done.
I recall there have been past discussions about automatic parallelism (e.g., LAPACK called by MATLAB) in workers. The discussions mostly tended to NO but some circumstances were left unclear, and there was a circumstance involving DCS in which (if I recall) it was said that it could happen.
(I recall that about 8 months or so ago I sent one of the Mathworks people email pointing out some conflicting words in discussions, and asking for clarification, but unfortunately no answer was forthcoming. I do remember now whom I sent the email to, so I _might_ be able to dig it up from my mail client to review what it was I found unclear at that time.)
Walter Roberson on 25 Jun 2012
I seem to be finding conflicting information on this topic.
http://www.mathworks.com/help/toolbox/distcomp/brukbnp-9.html#brukbnp-12
"The body of an spmd statement cannot contain another spmd. However, it can call a function that contains another spmd statement. Be sure that your MATLAB pool has enough workers to accommodate such expansion."



Adam Filion on 21 Jun 2012
You cannot directly nest a parfor loop inside of another as the MATLAB parser will catch it and throw an error before attempting to run it. If you nest a parfor loop inside of a function that is called by another parfor loop then it will run without errors.
However, the MATLAB workers are started as single-threaded processes, so they cannot parallelize anything themselves. Note that this applies both to parfor and to built-in multithreaded capabilities such as fft.
So if you nest a second parfor inside of a function then it will run because it doesn't get caught by the MATLAB parser, but it isn't actually running in parallel, it just runs serially as if you had called parfor on your client session without opening a matlabpool.
I believe there is a way to start the workers as multithreaded processes to enable the built-in parallel capabilities, but I can't find it at the moment. However even then workers still couldn't start their own matlabpool so they couldn't run the nested parfor in parallel.
So for your particular use case you are going to have to choose between running the inner portions of the objective function in parallel or running the outer parameter sweep in parallel, but not both. You would need to test it out to determine which helps you more.
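To illustrate the second option (running the outer parameter sweep in parallel while the objective function itself stays serial), here is a minimal sketch. The data, the sweep values, the starting point and the toy least-squares objective are all hypothetical stand-ins rather than the actual problem from the question; only the structure (one fminsearch call per sweep point inside a single parfor) is the point.

```matlab
% Hypothetical stand-in data and sweep values (not from the original post).
trajectoryData = [linspace(0, 1, 300)', linspace(1, 2, 300)'];
fp3Vect = [0.5 1.0 1.5];
fp4Vect = [2.0 3.0];
iniGuess = [1 1];

% Flatten the 2-D sweep so that a single parfor covers every combination.
[fp3Grid, fp4Grid] = ndgrid(fp3Vect, fp4Vect);
fp3Grid = fp3Grid(:);
fp4Grid = fp4Grid(:);
nSweep = numel(fp3Grid);

x = trajectoryData(:, 1);   % the toy objective only uses the first column
optParams = zeros(nSweep, numel(iniGuess));
fvals = zeros(nSweep, 1);

parfor k = 1:nSweep
    fp3 = fp3Grid(k);
    fp4 = fp4Grid(k);
    % Toy objective, evaluated serially on the worker: squared error between
    % the line p(1)*x + p(2) and the target line fp3*x + fp4.
    objFun = @(p) sum((p(1)*x + p(2) - (fp3*x + fp4)).^2);
    [pBest, fBest] = fminsearch(objFun, iniGuess);
    optParams(k, :) = pBest;
    fvals(k) = fBest;
end
```

With this toy objective, each row of optParams should land close to the corresponding [fp3 fp4] pair. Whether this layout beats parallelizing inside the objective function depends on how many sweep points there are relative to the number of workers, so it is worth timing both.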
  4 comments
Walter Roberson on 22 Jun 2012
Adam, is that approach tested or hypothetical?
Adam Filion on 22 Jun 2012
Hi Walter, I don't have enough cores to really test this myself, but the answer we got from our developers a while back was that this should work.



nah on 29 Jun 2012
Since running parfor inside a function called under an outer parfor is no good (it only runs it serially), the solution I have adopted now to implement the parameter sweep is based on the comment Walter pointed to (Jiro's).
It is basically creating multiple jobs each with a given set of input arguments and submitting them.
%%%%%%%%%%%%%%%%%%%%%%%%%
sched = findResource('scheduler','type','torque');
sched.ClusterSize = 144;
sched.HasSharedFilesystem = true;
sched.ClusterMatlabRoot = '/storage/shares/matlabr2011b/';
sched.ResourceTemplate = '-l nodes=1:ppn=12,mem=1gb';
jobStart = tic; counter = 1;
% create one parallel job per (df, dex) combination
for ix = 1:length(dfVect)
    for jx = 1:length(dexVect)
        dfin = dfVect(ix); dexin = dexVect(jx);
        iniGuessInputs = [1.8 6 dfin dexin];
        argsIn = {iniGuessInputs,timeseriesfrag,N,T,Roin,Cmin,mmin};
        job(counter) = createParallelJob(sched);
        set(job(counter),'FileDependencies',{'runOptimizationForGivenIniGuesses.m','obj_func_with_parfor.m'})
        set(job(counter),'MaximumNumberOfWorkers',12,'MinimumNumberOfWorkers',8);
        createTask(job(counter),@runOptimizationForGivenIniGuesses,numArgsOut,argsIn);
        counter = counter + 1;
    end
end
% submit all jobs, then wait for each one and collect its outputs
for id = 1:counter-1
    submit(job(id))
end
timeSubmission = toc(jobStart)
for id = 1:counter-1
    waitForState(job(id), 'finished');
    results{id} = getAllOutputArguments(job(id));
end
timeCompletion = toc(jobStart)
destroy(job);
%%%%%%%%%%%%%%%%
%runOptimizationForGivenIniGuesses.m
%sets up a fminsearch optimization that uses obj_func_with_parfor.m as the cost/objective function and returns the RESULTS (for given parameters)
% obj_func_with_parfor.m % takes a given longer time_series data & uses parfor to calculate costs on the smaller fragments in parallel
%%%%%%%%%%%%%
Please let me know if you find any alternative or have suggestions for improving this. Thanks Walter & Adam for your earlier answers and comments.
  1 comment
nah on 29 Jun 2012
MultiStart from Global Optimization Toolbox is supposed to do exactly this (optimize from different starting points), but my cost function also has some fixed parameters I want to sweep over.
Also, MultiStart is restricted to a few solvers (such as fmincon), whereas the above method will work for any trivially parallel optimization problem (or, for that matter, to parallelize any function that already uses parfor inside it).
So I think of this as a general solution for:
1) nested parfor
2) parallelization of functions that uses parfor inside them.
3) parallelizing Optimization problems that needs parallelization also inside their objective functions. (Parallel Optimization with parfor)
