# Multistart and lsqnonlin - Parallelization doesn't seem to provide any benefit.

Soumitra Bhoyar on 19 Jun 2020
Commented: Soumitra Bhoyar on 19 Jun 2020
I'm using MultiStart and lsqnonlin to do curve fitting and obtain parameters.
A quick overview of the problem: I have a numerical DiffEq solver (in the form of a mex file) that is operated through MATLAB via a function blackboxfunction(A,t,X). The DiffEq solver provides my 'model' data (from theory).
We perform experiments with known experimental values 'A' and for a known time from 0 to t.
X are parameters that are unknown, but can be obtained by inverse fitting the 'model' data to the 'experimental' data. Both experimental and model data are plotted in the form of curves.
This is essentially a non-convex optimization problem: What are the parameters (X) such that the function @(X)blackboxfunction(A,t,X) - y_experimental(t) is minimized.
In other words: what parameters 'X' get the experimental and model curves to overlap perfectly?
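Written out as a bounded least-squares problem, the fit looks roughly as follows. This is only a sketch: the symbols $f$ (standing for blackboxfunction), $y_{\text{exp}}$ (the experimental curve), $t_i$ (the sampled time points), $N$ (their count), and $lb$, $ub$ (the parameter bounds) are notation introduced here for illustration.

$$
\min_{lb \,\le\, X \,\le\, ub} \; \sum_{i=1}^{N} \bigl( f(A, t_i, X) - y_{\text{exp}}(t_i) \bigr)^2
$$

lsqnonlin takes the vector of residuals $f(A, t_i, X) - y_{\text{exp}}(t_i)$ and forms the sum of squares itself.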
I have a system with 16 nodes.
```matlab
%% An excerpt of my code
% Function definition: residual between model and experimental data
fun = @(X) blackboxfunction( ...
    A, ...
    time_start, time_stop, ...
    X) ...
    - Experimental;

% Set up lsqnonlin options
options = optimoptions(@lsqnonlin, ...
    'Algorithm', 'trust-region-reflective', ...
    'Display', 'iter', ...
    'UseParallel', false, ...
    'StepTolerance', 5e-6, ...      % Step-size stopping criterion
    'FunctionTolerance', 1e-6, ...  % Function stopping criterion
    'TypicalX', TypicalX, ...
    'FiniteDifferenceStepSize', 1e-4, ... % This seems to work fine, and faster
    'DiffMinChange', 1e-6);

% Create optimization problem
problem = createOptimProblem('lsqnonlin', ...
    'objective', fun, ...
    'x0', initial, ...
    'lb', lowlimit, ...
    'ub', uplimit, ...
    'options', options);

% Create and run MultiStart object
ms = MultiStart('FunctionTolerance', 2e-4, 'XTolerance', 5e-3, ...
    'StartPointsToRun', 'bounds-ineqs', 'Display', 'iter', 'UseParallel', true);
[ms_params, ms_fval, ms_eflag, ms_output, ms_manymins] = run(ms, problem, 10)
```
My questions are:
1. Should UseParallel be 'true' for MultiStart alone, or for both MultiStart and the optimoptions object?
2. Neither option above seems to make the parallelization work. I don't have tic/toc data right now, but it's going no faster than when I ran lsqnonlin for a single starting point. Is MultiStart with 'n' start points supposed to take as much time as n lsqnonlin runs with one start point each? (I suppose that makes sense, but I thought I'd still ask.)
3. What is the meaning of the error below:
[Error: idasErrorHandler::183] In function 'IDASolve' of module 'IDAS', error code 'IDA_ILL_INPUT':
At t = 14.3214, , mxstep steps taken before reaching tout.
[Error: integrate::1378] IDASolve returned IDA_TOO_MUCH_WORK at t = 14.3214

Alan Weiss on 19 Jun 2020
I have to ask: do you have Parallel Computing Toolbox installed? It is required for MultiStart to run in parallel.
I do not understand the error that you show, but it seems to be an error thrown from your ODE solver. Is the ODE being solved for the time interval you specify?
If you read parfor Characteristics and Caveats, you will see that you do not have to set parallel computing for your local optimizer (lsqnonlin in your case). It doesn't matter whether you do or not, because MultiStart takes the outer parallel loop, and this disables parallel lsqnonlin.
Alan Weiss
MATLAB mathematical toolbox documentation
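
One way to check whether parallel MultiStart helps is to time the same problem with 'UseParallel' off and on. The following is a minimal sketch, not the original setup: the two-parameter residual `toyFun`, its bounds, and `nStarts` are placeholders chosen for illustration. The toy residual is cheap, so communication overhead may dominate here; substitute the real objective to get a meaningful comparison.

```matlab
% Open a pool first so its startup time is not counted in the timings
% (requires Parallel Computing Toolbox).
if isempty(gcp('nocreate'))
    parpool;
end

% Placeholder residual (assumption): a Rosenbrock-type least-squares problem.
toyFun = @(x) [10*(x(2) - x(1)^2); 1 - x(1)];

opts = optimoptions(@lsqnonlin, 'Display', 'off');
problem = createOptimProblem('lsqnonlin', ...
    'objective', toyFun, ...
    'x0', [-1.2; 1], ...
    'lb', [-5; -5], ...
    'ub', [5; 5], ...
    'options', opts);

nStarts = 8;  % placeholder number of start points

% Serial MultiStart
rng default   % reset the seed so both runs draw from the same random stream
msSerial = MultiStart('Display', 'off', 'UseParallel', false);
tic
[xSerial, fSerial] = run(msSerial, problem, nStarts);
tSerial = toc;

% Parallel MultiStart
rng default
msPar = MultiStart('Display', 'off', 'UseParallel', true);
tic
[xPar, fPar] = run(msPar, problem, nStarts);
tPar = toc;

fprintf('MultiStart serial: %.2f s, parallel: %.2f s\n', tSerial, tPar);
```

If the parallel run is not faster with the real objective, it is worth confirming that a pool is actually open and how many workers it has.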

Soumitra Bhoyar on 19 Jun 2020
I'm not sure what's going wrong in the previous comment, but I can't seem to edit it properly.
```matlab
parfor j = 1:size(paramspace,1)
    % Parallelize the paramspace entry for each 'j'
    % PARAMSPACE_PAR = 'X', or parameter
    paramspace_par = [paramspace(j,:,:), param_4]
    % Solve the model
    protein_model = Langmuir_run(...
        paramspace_par); % THIS LINE CONTAINS THE POINT IN THE PARAMETER SPACE X(j)
    % THERE IS NO OPTIMIZATION INVOLVED, we're just exploring the entire
    % parameter space and plotting an output curve.
    % EVERYTHING BELOW IS SOME FORM OF OUTPUT
    % Get sum of squares
    sum_of_sq(j,i) = sum_of_sq_par;
    % Plot the solved model
    figure
    plot(...
    % Figure Name
    figname = ....
end
```
Alan Weiss on 19 Jun 2020
Good job testing the parallel configuration. That is exactly the kind of test that shows the evaluations can occur in parallel.
FYI, when you run lsqnonlin in parallel, the thing that is parallelized is the gradient estimation, which is done by finite difference steps. Depending on the situation, this could be faster or slower than evaluating serially, because there can be significant communication overhead in parceling out the evaluation points and other data. However, in your case, where each evaluation takes a long time, it might be beneficial to parallelize this calculation even with no MultiStart involved.
MultiStart is usually beneficial to run in parallel because local optimizations usually take some time, and having multiple local optimizations running in parallel usually saves time. I still find it mysterious that you see no benefit to running MultiStart in parallel, unless perhaps you were running lsqnonlin in parallel, and that saved a lot of time, and then when you switched to parallel MultiStart the lsqnonlin calculation was no longer in parallel, and the two just happened to balance out. But I really don't know. Perhaps you could test to see the benefit from running lsqnonlin in parallel and not in parallel on small cases.
For more details on what the solvers do when running in parallel, see What Is Parallel Computing in Optimization Toolbox? and MultiStart.
I hope this helps.
Alan Weiss
MATLAB mathematical toolbox documentation
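
The small-case test suggested above could look something like this. It is a minimal sketch under stated assumptions: `toyFun`, the start point, and the bounds are placeholders, not the original model. With a residual this cheap the parallel finite-difference gradient may well be slower, so the comparison only becomes informative once the real objective is substituted.

```matlab
% Open a pool first so its startup time is not counted in the timings
% (requires Parallel Computing Toolbox).
if isempty(gcp('nocreate'))
    parpool;
end

% Placeholder residual and bounds (assumptions for illustration only).
toyFun = @(x) [10*(x(2) - x(1)^2); 1 - x(1)];
x0 = [-1.2; 1];
lb = [-5; -5];
ub = [5; 5];

% Same options except for UseParallel, which controls whether the
% finite-difference gradient estimation is distributed to workers.
optsSerial = optimoptions(@lsqnonlin, 'Display', 'off', 'UseParallel', false);
optsPar    = optimoptions(optsSerial, 'UseParallel', true);

tic
[xS, resnormS] = lsqnonlin(toyFun, x0, lb, ub, optsSerial);
tS = toc;

tic
[xP, resnormP] = lsqnonlin(toyFun, x0, lb, ub, optsPar);
tP = toc;

fprintf('lsqnonlin serial: %.2f s, parallel: %.2f s\n', tS, tP);
```

Comparing these two numbers with the serial and parallel MultiStart timings shows whether the two effects really do cancel out.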
Soumitra Bhoyar on 19 Jun 2020
Thanks for the discussion!
>>Unless perhaps you were running lsqnonlin in parallel, and that saved a lot of time, and then when you switched to parallel MultiStart the lsqnonlin calculation is no longer in parallel, and the two just happened to balance out.
Indeed I was running lsqnonlin in parallel. I will check this!
Also, thank you for your other answers on MathWorks; by now I've read a good dozen of them.
In fact, I was thinking of trying out genetic algorithms when I encountered a comment of yours somewhere advising MultiStart instead.
What I shall do is evaluate what you said in the italic text above. Then, if I see no time benefit from MultiStart (despite everything working as it should), I'll give the Genetic Algorithm a try.
Best,
SB.