Find the zero of a function with the fewest function calls

Hi there
I have a function that takes a very long time to calculate (it can run for several hours), and I need to find where it becomes zero, with a tolerance on x of around 1e-5 and a tolerance on the zero itself of maybe 1e-7. Typical values are x between 1 and 6, and y between -0.01 and 0.01.
I can use fzero to do what I want (and with quite few function calls, I might add - 13 in one test, 8 when I reduced TolX), but I wonder if there exists a function that can do this with even fewer function calls than fzero? I think it will be well worth spending some time implementing this to save hours in the end, so it does not have to be a simple solution (although simplicity is preferred, of course!)
I do not believe the function can be optimized significantly to save computation time - it requires a lot of calculations, and I have colleagues who do similar things in Fortran and C whose calculations also take hours.
Thanks in advance
Henrik
EDIT: changed iterations to function calls as pointed out by Matt and Mohammad. There is more clarification in my comments below.
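For reference, a typical call with the tolerances described above might look like the sketch below. Note that `expensiveFun` is a hypothetical stand-in for the actual hours-long computation, not a function from this thread; the fourth output of fzero reports how many function calls were actually made.

```matlab
% Sketch only: expensiveFun is a placeholder for the real, expensive function.
opts = optimset('TolX',1e-5);                 % tolerance on x, as in the question
[xz,fz,flag,output] = fzero(@expensiveFun, 2, opts);
output.funcCount                              % number of expensive evaluations
```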

9 comments

Without knowing more about your function it’s impossible to say for sure, but two options I would suggest are polyfit and its friends (including roots), and possibly interp1.
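The polyfit/roots idea could be sketched roughly as follows (placeholder data; `expensiveFun` is a hypothetical stand-in for the expensive function): sample it at a handful of points, fit a low-order polynomial, and take a real root inside the sampled interval as a cheap first estimate of the zero.

```matlab
% Sketch of the polyfit/roots suggestion; expensiveFun is a placeholder.
xs = 1:0.5:4;                                % a handful of expensive samples
ys = arrayfun(@expensiveFun, xs);
p  = polyfit(xs, ys, 3);                     % low-order polynomial fit
r  = roots(p);
r  = r(imag(r)==0 & r>min(xs) & r<max(xs));  % keep real roots inside the range
```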
Henrik on 11 Oct 2014
OK, thanks, I will take a look at those.
I could upload the function if you want, but I doubt it will aid much. In short, I calculate the eigenvalues of a matrix, where each term in the matrix is the result of a matrix product of a few other matrices and some sums of several larger matrices. This all depends on around 10 parameters, and I need to know when the lowest eigenvalue of the matrix is zero for one of the parameters.
As I understand your Comment, you are iterating over parameters and then calculating eigenvalues at each value of the parameters.
You could consider this as a linear or nonlinear multiple regression problem with the parameters as the independent variable and the eigenvalues as the dependent variable, then solve the inverse of the regression for zero to give you the estimated values of the parameters that would produce one or more zero eigenvalues.
That requires several assumptions (continuity among them), the possibility of having to account for complex eigenvalues, and in the nonlinear situation an objective function that describes the eigenvalues as a function of the parameters. Those would determine your approach. I assume you have already plotted each of the eigenvalues as functions of the parameters, so you know their essential trends.
If you decide that a regression is appropriate, for the multiple linear regression I would use the Statistics Toolbox regress function, and for the nonlinear regression, either nlinfit or the Optimization Toolbox function lsqcurvefit.
I am only suggesting an approach to your problem. I cannot be more specific or provide example code, so I am not entering this as an Answer.
Henrik on 11 Oct 2014
Thanks a lot, but I am not sure it can be used. Here's a more complete description of the function, although I still left out some complications. Roughly the following is done in a loop over a parameter around 250 times, let's call this parameter q.
Generate two matrices (typically 16x16) from q and x and around 10 user-defined parameters that stay constant throughout the calculations.
Find the eigenvectors of these matrices, let's call them u and v (they are matrices).
Create a new matrix, M, where the (i,j,n,m)'th element is u(i,n)*v(j,m)
Multiply this matrix by another matrix of the same size (which also depends on q, x, and all the other parameters) and sum over n and m, to get a matrix, f.
Generate a new matrix, N=diag(1)-x*f, find the eigenvalues of N, E_N, and determine the lowest eigenvalue of N, E(q)=min(E_N). Now repeat the above for all ~250 values of q, and determine the lowest value of E(q), E_min=min(E(q)).
For which value of x is E_min=0?
As you see, it's such a convoluted calculation that it's very hard to analytically figure out what the correspondence is between the input, x, and the output, E_min.
x is a real, positive number, and the eigenvalues are real as well. I know that E_min>0 for small x (x<1), and for intermediate x (somewhere around x=2-3) E_min becomes negative, and for some larger x, E_min can become positive again (around x=8). I'm looking for the point in the intermediate range where E_min=0.
fzero can do this surprisingly fast, with only around 8-13 iterations in my tests so far, but this can still take half a day, so shaving off a few iterations would be very nice. That's why I'm wondering if there is a function that does the same thing, but with fewer iterations. I don't care if the function itself is more complicated than fzero, or takes much longer to run than fzero, if it means it has to call my function fewer times.
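Given the sign structure described above (E_min > 0 for small x, negative somewhere around x = 2-3), one way to avoid the calls fzero spends hunting for a sign change is to pass it a bracketing interval directly. A sketch, where `Emin_of_x` is a hypothetical wrapper around the full calculation described in this comment:

```matlab
% Sketch: give fzero a known sign-change bracket so it skips the search
% phase entirely. Emin_of_x is a placeholder for the calculation above.
bracket = [1 3];                             % E_min > 0 at x=1, < 0 by x ~ 3
xz = fzero(@Emin_of_x, bracket, optimset('TolX',1e-5));
```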
I apologise for oversimplifying what is quite a complicated procedure. The fzero function may be the best option available for your problem.
The only other options that I’m aware of are the other Optimization Toolbox functions (like fsolve) or Global Optimization Toolbox functions, but I can’t say that they would be faster or more efficient. I suggest you consider them if you haven’t. I still don’t understand your problem well enough to suggest a particular function.
I hope others here see this and can offer a more effective solution.
Henrik on 11 Oct 2014
Thanks a lot for the suggestions, they are much appreciated. I will update if I find something that works better than fzero.
My pleasure!
Matt J on 12 Oct 2014
Edited: Matt J on 12 Oct 2014
I don't think the best strategy is to try to cut down on number of iterations. For any iterative method, that number will be dependent on the initial guess anyway, and would be very hard to control. The better strategy would be to see if you can cut down on the amount of computation per iteration.
For example, I see no reason to be computing the matrix N explicitly. Getting the minimum eigenvalue of N to equal 0 is the same as getting the maximum eigenvalue of f to equal 1/x. So you shouldn't have to compute further than f.
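This equivalence is easy to check numerically: the eigenvalues of N = I - x*f are 1 - x*lambda_i, so for x > 0 the minimum eigenvalue of N is 1 - x*max(eig(f)). A small sketch with a random placeholder matrix (symmetric, so the spectrum is real, matching the problem as described):

```matlab
% Check of the reformulation: min(eig(I - x*f)) equals 1 - x*max(eig(f))
% for x > 0 and a real spectrum. f is a random symmetric placeholder.
n = 16; x = 2.5;
f = rand(n); f = (f + f')/2;                 % placeholder with real eigenvalues
Emin  = min(eig(eye(n) - x*f));              % forming N explicitly
check = 1 - x*max(eig(f));                   % same value, no N needed
```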
Also, M is a Kronecker product, so the multiplication you describe should be doable more efficiently without explicitly computing M.
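As an illustration of that point: in the special case where the other array depends only on (n,m), say G(n,m), the product-and-sum over n and m collapses to two ordinary matrix products, so the 4-D array M never needs to be formed (placeholder data):

```matlab
% sum over n,m of u(i,n)*G(n,m)*v(j,m) is just (u*G*v.')(i,j), so the
% Kronecker-structured 4-D array M need not be built explicitly.
n = 16;
u = rand(n); v = rand(n); G = rand(n);       % placeholders
slow = zeros(n);                             % explicit element-by-element sum
for i = 1:n, for j = 1:n
    slow(i,j) = sum(sum( (u(i,:).' * v(j,:)) .* G ));
end, end
fast = u*G*v.';                              % same result, two matrix products
```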
These are just examples, of course. It doesn't look like the computations you've listed should take very long for the data sizes you've mentioned. My computer can do 250 eigendecompositions of a 256x256 matrix in about 11 seconds. So, presumably the computation bottlenecks are in steps you haven't mentioned. It might be worth expanding on those so that the community can explore ways to do them faster.
As for the choice of algorithm, it might be worth exploring the Global Optimization Toolbox as Star Strider suggests. I have my doubts about fsolve and other algorithms in the (non-global) Optimization Toolbox, however, since your problem does not look smooth.
Thanks for the clarification, of course it is the number of function calls that is relevant. The numbers 8-13 that I quoted in the original post were function calls, not iterations.
There are some more loops that I did not mention. The bulk of the computation time (85% if I recall correctly) is spent in the bsxfun call shown below, with the sum in second place. Here, typically N=16 and M=50, and the matrices F and L have already been calculated; they are of size (2*N, 2*N, N, N) and (2*N, 2*N, M), respectively.
f0=bsxfun(@times,reshape(F,[2*N 2*N N N 1]),reshape(L,[2*N 2*N 1 1 M]));
f2=sum(sum(f0));
In a typical calculation, this needs to be done 3*10^5 times for different values of the matrices F and L. The numbers here are typical values, but they may vary by almost an order of magnitude. This means that I can't really vectorize the code any more than I already have, as I run out of memory.


Accepted Answer

Matt J on 14 Oct 2014
Edited: Matt J on 14 Oct 2014
The bulk of the computation time (85% if I recall correctly) is spent at the bsxfun call shown below,
I see a speed-up factor of about 30x when I re-implement that as a matrix multiplication, as below.
Ftr=reshape(F,4*N^2,[]).';
tic;
Lr=reshape(L,4*N^2,M);
f2=Ftr*Lr;
f2=reshape(f2,[N,N,M]);
toc
I didn't include the transpose .' operation in my timing tests, because you should be able to avoid that altogether, just by revising the order in which your data is generated.

6 comments

Henrik on 14 Oct 2014
Edited: Henrik on 14 Oct 2014
...Wow!
Matt, that is absolutely incredible. I did a quick implementation and just gained about a factor of 25 in speed.
Would you mind taking a look at the new bottlenecks in my code? Calculation of F takes around 50% of the time now, followed by 30% of the time from the product and sum you just sped up, and around 15% of the time from calculating L. Here's a working example of the code.
I can also make this a new question if you want?
% Test data
N=16;
M=50;
FEU=rand(2*N,1);
FED=rand(2*N,1);
EU=rand(2*N,1);
ED=rand(2*N,1);
Om=rand(M,1);
eta=1e-3;
v_ustar=rand(2*N,N,N);
vstar_u=rand(2*N,N,N);
u_ustar=rand(2*N,N,N);
vstar_v=rand(2*N,N,N);
% Calculate L
FEU=reshape(FEU,[2*N 1 1]);
FED=reshape(FED,[1 2*N 1 ]);
EU=reshape(EU,[2*N 1 1]);
ED=reshape(ED,[1 2*N 1]);
Om=reshape(Om,[1 1 M 1]);
L=(repmat(FEU,[1 2*N M])+repmat(FED,[2*N 1 M])-1)./...
(repmat(Om,[2*N 2*N 1])+1i*eta-repmat(EU,[1 2*N M])-repmat(ED,[2*N 1 M]));
L=reshape(L,4*N^2,M);
% Calculate F
F=...
repmat(reshape(v_ustar,[2*N 1 N N]),[1 2*N 1 1]).*...
repmat(reshape(conj(vstar_u), [1 2*N N N]),[2*N 1 1 1])-...
repmat(reshape(u_ustar,[2*N 1 N N]),[1 2*N 1 1]).*...
repmat(reshape(conj(vstar_v), [1 2*N N N]),[2*N 1 1 1]);
F=reshape(F,4*N^2,[]).';
% And calculate f0
f2=F*L; %magic happens
f0=-1/N^2*reshape(f2,[N,N,M]);
EDIT: I accepted the answer because it's so awesome that my code is so much faster, but I'm still interested in reducing the number of function calls to find where the function is zero.
I see about a factor of 2x speed-up in the following. It requires MTIMESX (available on the File Exchange), which may not support R2014 yet.
bsx=@(a,b) bsxfun(@minus,a,b); %define only once
r=@(x) reshape(x,2*N,1,N,N);
v_ustar=r(v_ustar);
vstar_u=r(vstar_u);
u_ustar=r(u_ustar);
vstar_v=r(vstar_v);
tic;
% Calculate L
L=bsxfun(@rdivide, bsx(FEU,1-FED), bsx( bsx(Om,EU-1i*eta), ED ) );
L=reshape(L,4*N^2,M);
% Calculate F
F=mtimesx([v_ustar,u_ustar], [vstar_u,-vstar_v],'c');
F=reshape(F,4*N^2,[]).';
toc;
Henrik on 15 Oct 2014
Edited: Henrik on 15 Oct 2014
Thanks a lot, the calculation of L is now 2 times faster!
However, I see about a factor 2-4 increase in computation time with MTIMESX. Maybe it has something to do with me using Ubuntu and MATLAB R2014a? You wouldn't happen to know of another way to speed up the calculation of F?
Anyway, I appreciate the effort!
Henrik on 15 Oct 2014
Hi again.
I forgot to mention this earlier, but for some initial calculations, I will have Om=0 and thus M=1. In this case, there is hardly any gain in speed. Could you perhaps suggest how to improve the code in this particular case?
Matt J on 15 Oct 2014
Edited: Matt J on 15 Oct 2014
However, I see about a factor 2-4 increase in computation time with MTIMESX. Maybe it has something to do with me using Ubuntu and MATLAB R2014a?
My understanding was that the mtimesx_build tool that comes with mtimesx wasn't working for R2014. I'd be interested to know how you got it to compile. Other than that, I don't know what's happening. I definitely see a factor of 2 speed-up in the computation of F. Maybe try initializing mtimesx explicitly with
>> mtimesx SPEED
As for Om=0, M=1 I can't see how the computation of L would be the bottleneck when M=1. However, you can compute it instead as
L=bsxfun(@rdivide, bsx(FEU,1-FED), bsx( +1i*eta-EU , ED ) );
in that case.
I didn't do much work to make it compile. I downloaded BLAS from http://gcc.gnu.org/wiki/GfortranBuild, compiled it
gfortran -shared -O2 *.f -o libblas.so -fPIC
and in MATLAB I used mex('-DDEFINEUNIX','mtimesx.c','path_to_libblas/libblas.so')
I do get the following warning:
Warning: You are using gcc version '4.8.2'. The version of gcc is not supported. The version currently supported with MEX is '4.7.x'. For a list of currently supported compilers see: http://www.mathworks.com/support/compilers/current_release.
Warning: You are using gcc version '4.8.2-19ubuntu1)'. The version of gcc is not supported. The version currently supported with MEX is '4.7.x'. For a list of currently supported compilers see: http://www.mathworks.com/support/compilers/current_release.
my_path/mtimesx/mtimesx.c: In function ‘mexFunction’:
my_path/mtimesx/mtimesx.c:592:10: warning: assignment discards ‘const’ qualifier from pointer target type [enabled by default] ans = mexGetVariablePtr("caller","OMP_SET_NUM_THREADS");
^
MEX completed successfully.
mtimesx SPEED returns SPEED, but it is still much slower than the other method.
You are right, L is not the bottleneck. I was just surprised that the speed-up compared to my old code was basically zero for M=1, when it's around 20x faster for M=50.


More Answers (0)


Asked: 11 Oct 2014
Commented: 15 Oct 2014
