
fmincon with additional parameters + user-defined gradient + user-defined Hessian

5 views (last 30 days)
Hey guys, I would like to know if there is a way to work with this combination in an optimization problem: multiple parameters (the independent variable plus other constants) together with a user-defined gradient and Hessian returned as additional outputs of the objective function.
Please do not focus on the problem equations or their details; from my perspective this is an I/O, syntax-level problem. The gradient and Hessian are calculated inside the function, which looks like this:
function [fobj, grad, hess] = optimization_nlc(x, pexp, n)
fobj = 0;
grad = zeros(n, 1);
hess = sparse(n, 1);
for i = 1:n
    fobj = fobj - polyval(pexp{i}, x(i));
    grad(i) = -polyval(polyder(pexp{i}), x(i));            % first derivative of p_i (negated, since fobj is the negated sum)
    hess(i) = -polyval(polyder(polyder(pexp{i})), x(i));   % second derivative: apply polyder twice; polyder(p, 2) would differentiate the product 2*p
end
hess = diag(hess);                                         % sparse diagonal Hessian
Just to give you an idea, pexp is a cell array of polynomial coefficient vectors. The cell array changes for every problem (number of terms, number of polynomials, ...), and it is not possible to read it from a .mat file inside the function.
The objective function is a sum of polynomials, and each polynomial depends on only one variable (p1 -> f(x(1)), p2 -> f(x(2)), and so on). This is why the gradient and Hessian are formulated this way.
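For instance, with hypothetical coefficients (not my real data), pexp might look like this:
pexp = {[1 -2 0 3]; [0.5 0 -1 4]};   % one coefficient vector per variable, highest power first, as polyval expects
n = numel(pexp);
f1 = polyval(pexp{1}, 2.0);          % evaluates p1 at x(1) = 2.0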
I am trying the anonymous-function approach, but it does not work:
function [x, fval, flag, elapsedTime] = optimization_nlcf(pexp, n, Qdlm, options)
if ~isempty(Qdlm)
    A = ones(1, n);          % linear constraint: sum(x) <= Qdlm
    b = Qdlm;
else
    A = [];
    b = [];
end
x0 = 50*ones(n, 1);          % starting point
lb = zeros(n, 1);            % non-negativity bounds
fobj = @(x) optimization_nlc(x, pexp, n);
tic
[x, fval, flag] = fmincon(fobj, x0, A, b, [], [], lb, [], [], options);
x = sum(x);
elapsedTime = toc;
Any ideas? I would appreciate your comments. I am using the trust-region-reflective algorithm, since as far as I can tell I cannot use interior-point with a user-defined gradient/Hessian.
The optimization problem should be fairly simple.
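For reference, the derivatives are wired differently for the two algorithms; a rough sketch using the current option names (myhess is a hypothetical placeholder, not part of my code):
% trust-region-reflective: the Hessian is the third output of the objective
opts_tr = optimoptions('fmincon', 'Algorithm', 'trust-region-reflective', ...
    'SpecifyObjectiveGradient', true, 'HessianFcn', 'objective');
% interior-point: the Hessian comes from a separate function of (x, lambda)
opts_ip = optimoptions('fmincon', 'Algorithm', 'interior-point', ...
    'SpecifyObjectiveGradient', true, 'HessianFcn', @(x, lambda) myhess(x, lambda));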

Accepted Answer

Ruben Ensalzado
Ruben Ensalzado on 27 Jun 2016
Using the information from the previous comments, the solution was as follows.
The problem is an optimization application with a separable objective function: a sum of functions, each depending on just one variable. Each of these functions can be approximated by a polynomial of d-th order. The problem data are stored in an m-by-2n matrix as sets of (x, y) points, where n is the number of separable functions and m the number of points.
In my application, n = 100 and d = 3. The problem includes one linear constraint, sum(x) <= Qdlm, which is applied only if Qdlm is given. The lower bound is x_i = 0 (non-negative values), with a starting value of 50 for each variable.
The gradient and Hessian are given by the polynomial derivatives.
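Explicitly, with f(x) = -(p1(x(1)) + ... + pn(x(n))), the separable structure makes everything diagonal:
grad(i) = -pi'(x(i)),    hess = diag(-p1''(x(1)), ..., -pn''(x(n))),
which is why the functions below only fill an n-by-1 vector and then call diag.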
These are my functions for this application. First, fitting a polynomial to the data of each separable function (M is the m-by-2n data matrix described above):
d = 3;                  % polynomial order
n = 100;                % number of separable functions
pexp = cell(n, 1);
for i = 1:n
    % fit a d-th order polynomial to the data columns of function i
    pexp{i} = polyfit(M(:, 2*i), M(:, 2*i - 1), d);
end
The auxiliary optimization function:
function [x, tx, fval, exitflag, elapsedTime] = optimization_nlcf(pexp, n, Qdlm)
if ~isempty(Qdlm)
    A = ones(1, n);          % linear constraint: sum(x) <= Qdlm
    b = Qdlm;
else
    A = [];
    b = [];
end
x0 = 50*ones(n, 1);          % starting point
lb = zeros(n, 1);            % non-negativity bounds
fobj = @(x) optimization_nlc(x, pexp, n);
fhes = @(x, lambda) hessian_nlc(x, lambda, pexp, n);
options = optimoptions('fmincon', 'Algorithm', 'interior-point', ...
    'GradObj', 'on', 'Hessian', 'user-supplied', 'HessFcn', fhes, ...
    'TolFun', 1e-3, 'TolX', 1e-3, 'Display', 'iter-detailed');
tic
[x, fval, exitflag] = fmincon(fobj, x0, A, b, [], [], lb, [], [], options);
tx = sum(x);
elapsedTime = toc;
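A hypothetical call (the limit 5000 is an example value, not my real data):
[x, tx, fval, flag, t] = optimization_nlcf(pexp, n, 5000);
fprintf('sum(x) = %.2f, fval = %.4f, solved in %.2f s\n', tx, fval, t);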
My optimization function is:
function [fobj, grad] = optimization_nlc(x, pexp, n)
fobj = 0;
for i = 1:n
    fobj = fobj - polyval(pexp{i}, x(i));
end
if nargout > 1               % fmincon requests the gradient only when it needs it
    grad = zeros(n, 1);
    for i = 1:n
        grad(i) = -polyval(polyder(pexp{i}), x(i));   % first derivative of p_i
    end
end
My Hessian function (the lambda argument is ignored because all the constraints are linear, so the Hessian of the Lagrangian equals the Hessian of the objective):
function hess = hessian_nlc(x, ~, pexp, n)
hess = sparse(n, 1);
for i = 1:n
    hess(i) = -polyval(polyder(polyder(pexp{i})), x(i));   % second derivative of p_i
end
hess = diag(hess);           % sparse diagonal Hessian
I hope this simple application helps others implement their own objective functions.

More Answers (2)

Walter Roberson
Walter Roberson on 16 May 2016
Edited: Walter Roberson on 16 May 2016
Your objective has to test the number of output arguments. Here is a utility I put together: it takes x and the function handles for the objective, gradient, and Hessian, and returns the appropriate outputs:
function [F, g, h] = FunGrad(x, Fun, Grad, Hess)
F = Fun(x);
if nargout > 1     % gradient requested
    g = Grad(x);
end
if nargout > 2     % Hessian requested
    h = Hess(x);
end
end
Any of the function handles you pass can be parameterized.
You would use @(x) FunGrad(x, obj, grad, hess) as the first parameter to fmincon, where obj, grad, hess are appropriate function handles.
It would be perfectly valid to instead use a single function that calculates all three parts.
Example:
Assuming Frv, Frg, and Frh are function handles:
Frm = @(x) FunGrad(x, Frv, Frg, Frh);
options = optimoptions('fmincon', 'Algorithm', 'trust-region-reflective', ...
    'SpecifyObjectiveGradient', true, 'HessianFcn', 'objective', ...
    'StepTolerance', 1e-20, 'FunctionTolerance', 1e-5, ...
    'MaxIterations', 5000, 'MaxFunctionEvaluations', 5000);
problem = createOptimProblem('fmincon', 'objective', Frm, ...
    'x0', [1000 -500 -4000 -2500 -4000 -1 -2000 -1], 'options', options);
[x, f] = fmincon(problem);   % the problem structure was created for fmincon, not fminunc
  1 Comment
Ruben Ensalzado
Ruben Ensalzado on 27 Jun 2016
Walter, thanks for your comments. I finished my thesis and this information was pretty useful. I will publish the application details at the end of the thread.



Alan Weiss
Alan Weiss on 17 May 2016
In general, you might want to use the 'interior-point' algorithm, because it can handle a wider variety of problem types than the 'trust-region-reflective' algorithm. You must supply the Hessian differently for the two algorithms, as the documentation describes.
It seems to me that the issue of including gradients and Hessians is unrelated to the issue of passing extra parameters. I suggest that you first get your derivatives in order, including all the requisite options, and then figure out how to include the extra parameters; see the sketch below. I suppose that you have read Passing Extra Parameters. It is hard for me to add to that explanation, though I realize that things can sometimes look more complicated. But once you take an abstract enough point of view (what are the inputs? what are the extra parameters?), it usually becomes clear quite quickly.
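For example, a minimal sketch of the extra-parameter pattern using the names from your question (the anonymous function captures pexp and n at the moment it is created, so fmincon only ever sees a function of x):
fobj = @(x) optimization_nlc(x, pexp, n);   % pexp and n are fixed here
[x, fval] = fmincon(fobj, x0, A, b, [], [], lb, [], [], options);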
Good luck,
Alan Weiss
MATLAB mathematical toolbox documentation
