How do I choose the initial values for non-linear curve/function fitting?

Question

0 votes

I have some data from an optics experiment, consisting of applied voltage (X-axis) and intensity (Y-axis). The mathematical relation between them is quite complicated; here it is in MATLAB syntax:

F= cos((a(1).^2.*(a(2) - (a(2).*a(3))./(a(2).^2.*cos(a(4) - 2.*atan(exp(-(V- a(5))./a(6)))).^2 + a(3).^2.*sin(a(4) - 2.*atan(exp(-(V - a(5))./a(6)))).^2).^(1/2)).^2 + a(7).^2).^(1/2)).^2

Here F is the theoretical expression for the intensity and V is the voltage and 'a' are fit parameters that I need to find. Since this is highly nonlinear, it seems hard to find the fit parameters such that the curve fits to the data well.

My Previous Attempts and Observations:

I noticed that the fit is extremely sensitive to both the initial starting points and the lower and upper bounds of the parameters.
I got a 'somewhat' decent fit after many attempts, as shown below, but cannot seem to improve on it:

I also tried to constrain some of the parameters on physical grounds, but still can't seem to do any better. For example, a(2) and a(3) should lie around 1.5-1.7 in most cases, a(1) should be around 38750, and a(5) around 80. I'm not very sure about a(7) and a(4), but they should likely be between -5 and 5.

Is there some way to choose the initial values so that I get a good fit? I feel like the optimizer gets stuck at some local optimum and doesn't reach the global optimum.

My raw data is attached here. Let me know if I need to attach more information.

1 comment

Mathieu NOE on 17 Jul 2023

It would definitely help if you shared your data and code as well.

As usual, the more we know about the model and the underlying physics, the better we can make the fit converge.

I wonder how the theoretical equation was constructed, and what ensures that the experiment actually follows this model.

In other words, we need to use as much a priori information as possible to bound the search interval (or, even better, to fix some parameters).

Answer 1

John D'Errico on 17 Jul 2023

Edited: John D'Errico on 17 Jul 2023

1 vote

Yes, there is an easy way to do it automatically, but it's a SECRET! They would have built it into the code, but it is such a big secret that it could not be given to just anybody, and you might figure it out from the code.

Yeah. Right. I'm sorry, but there is no magic way to automatically determine good starting values.

Optimizations with many parameters often have locally optimal but globally sub-optimal "solutions", where the optimizer gets stuck IF you start in a bad place. (Your model qualifies as having many parameters.) Call those points stationary points, where the optimizer cannot find a better place to look from there.

a = sym('a',[1,7]);

syms V

F= cos((a(1).^2.*(a(2) - (a(2).*a(3))./(a(2).^2.*cos(a(4) - 2.*atan(exp(-(V- a(5))./a(6)))).^2 + a(3).^2.*sin(a(4) - 2.*atan(exp(-(V - a(5))./a(6)))).^2).^(1/2)).^2 + a(7).^2).^(1/2)).^2

Models with trig components ALWAYS seem to have problems with multiple solutions. This is just a natural consequence of periodic functions. Powers and roots of parameters also cause problems because, again, they introduce multiple solutions.

Again, I'm sorry, but this is just a bullet you need to bite. You need to understand the model you have proposed. After all, if you chose to build that model, you should have done so for a reason. So spend the time to learn about how a parameter impacts the model. In this model, at least a5 and a6 are trivial to understand, as shift and scale parameters. The others can probably have some interpretations, if you spend some time, but I won't go into that rabbit hole.
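
To make the shift-and-scale reading of a5 and a6 concrete, the angle that appears inside the cos and sin terms of the model can be written out. This is only a worked sketch: it assumes a6 > 0, and the symbol θ is introduced here for illustration and does not appear in the original post.

```latex
\theta(V) = a_4 - 2\arctan\!\left(e^{-(V - a_5)/a_6}\right), \qquad
\theta \to a_4 - \pi \ \ (V \to -\infty), \quad
\theta(a_5) = a_4 - \tfrac{\pi}{2}, \quad
\theta \to a_4 \ \ (V \to +\infty)
```

Under that assumption, a5 is the voltage at the midpoint of the transition in θ and a6 sets how wide the transition is, which may help in picking rough starting values for those two parameters.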

Your best solution is probably to use a multi-start method, or perhaps a tool like GA. In either case, they are designed to be LESS sensitive to problems of this sort, but expect it to be a difficult problem.
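
A minimal sketch of the multi-start idea is shown below, assuming the Optimization Toolbox and Global Optimization Toolbox are available. The helper names theta and neff, the placeholder data, and the bounds on a(1), a(5) and a(6) are illustrative assumptions, not values from the thread; only the ranges for a(2), a(3), a(4) and a(7) follow the question.

```matlab
rng(0);   % reproducible noise and random start points

% Model from the question, split into helper handles (names are hypothetical).
theta = @(a,V) a(4) - 2.*atan(exp(-(V - a(5))./a(6)));
neff  = @(a,V) (a(2).*a(3))./sqrt(a(2).^2.*cos(theta(a,V)).^2 + ...
                                  a(3).^2.*sin(theta(a,V)).^2);
model = @(a,V) cos(sqrt(a(1).^2.*(a(2) - neff(a,V)).^2 + a(7).^2)).^2;

% Placeholder data so the script runs on its own; replace V and Iobs
% with the measured voltage and intensity columns.
V    = linspace(0, 160, 200).';
Iobs = model([38750 1.6 1.55 1 80 10 2], V) + 0.01*randn(size(V));

% Bounds: a(2), a(3), a(4), a(7) follow the ranges given in the question;
% the bounds on a(1), a(5) and a(6) are illustrative guesses.
lb = [35000 1.5 1.5 -5  60  1 -5];
ub = [42000 1.7 1.7  5 100 50  5];
a0 = (lb + ub)/2;   % nominal start point; MultiStart adds random ones

opts = optimoptions('lsqcurvefit', 'Display', 'off');
problem = createOptimProblem('lsqcurvefit', 'objective', model, ...
    'x0', a0, 'lb', lb, 'ub', ub, 'xdata', V, 'ydata', Iobs, ...
    'options', opts);

% Run 50 local least-squares fits from different start points inside the
% bounds and keep the one with the smallest residual norm.
ms = MultiStart('Display', 'off');
[abest, resnorm] = run(ms, problem, 50);
```

With a model this oscillatory, the best of 50 starts may still be a local solution; increasing the number of starts or tightening the bounds using the physics are the usual levers.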

6 comments

Akaash Srikanth on 24 Jul 2023

Edited: Akaash Srikanth on 25 Jul 2023

@Alex Sha,

Thank you for the help. I realized that there was a slight error in my mathematical model, which is probably why the fitted values were not within the ranges I expected. I have corrected it, and we are now trying to fit another function to a slightly different set of data. (The new files have been posted here; x_axis is volt and y_axis is phase.) Would it be possible to predict some good values for the fit in that case? My fit results are slightly better, but I would still prefer a much better fit. Here is the best I could get:

a = sym('a',[1,8]);
syms V
F= @(a,V) a(1).*(a(2) + a(3) - a(4) - (a(2).*a(5))./(a(2).^2.*cos(a(6) - 2.*atan(exp(-(V - a(7))./a(8)))).^2 + a(5).^2.*sin(a(6) - 2.*atan(exp(-(V - a(7))./a(8)))).^2).^(1/2))
F = function_handle with value:
    @(a,V)a(1).*(a(2)+a(3)-a(4)-(a(2).*a(5))./(a(2).^2.*cos(a(6)-2.*atan(exp(-(V-a(7))./a(8)))).^2+a(5).^2.*sin(a(6)-2.*atan(exp(-(V-a(7))./a(8)))).^2).^(1/2))

Again, some constraints on the parameters:

a(1): about 20000 to 80000. This is the ratio of the thickness of a crystal to the wavelength of the laser we are using.
a(2) and a(5): about 1.4 to 1.8. Both are refractive indices.
a(3) and a(4): no strict range.
a(7): around 0.3 to 0.9 (can go slightly above 0.9). This is a voltage in V.
a(8): around 0.3 to 2. Again, a voltage in V.
a(6): no strict range on this either.

If needed, I could post this as a new question, as I am not very familiar with the MATLAB community guidelines.

Alex Sha on 26 Jul 2023

Taking the parameter ranges as a1=[20000, 80000], a2=[1.4, 1.8], a5=[1.4, 1.8], a7=[0.3, 0.9], a8=[0.3, 2], with a3, a4 and a6 unbounded, the fit gives:

Sum Squared Error (SSE): 0.913188311071197
Root of Mean Square Error (RMSE): 0.0825520329387615
Correlation Coef. (R): 0.999569920445517
R-Square: 0.999140025859457
Parameter	Best Estimate    
---------	-------------    
a1       	20000.0000113322 
a2       	1.40000693995549 
a3       	125.071659909581 
a4       	125.071307157303 
a5       	1.40050427357894 
a6       	1.24015985413439 
a7       	0.300000000000583
a8       	0.796932506770127

It is easy to see that parameters a1, a2, a5 and a7 all sit at or very near the lower bounds of their ranges, so if the ranges are changed to a1=[10000, 80000], a2=[0, 1.8], a5=[0, 1.8], a7=[0.0, 0.9], a8=[0.0, 2], with a3, a4 and a6 still unbounded,

the result will be a little better:

Sum Squared Error (SSE): 0.392117528292817
Root of Mean Square Error (RMSE): 0.054094826103246
Correlation Coef. (R): 0.999815349109032
R-Square: 0.999630732314016
Parameter	Best Estimate       
---------	-------------       
a1       	10000.0637696401    
a2       	0.00172350448796937 
a3       	-279.422101867729   
a4       	-279.421813550937   
a5       	0.000794456392713509
a6       	-0.260580870966713  
a7       	0.256682315830075   
a8       	1.33579711975512
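
One thing both result tables have in common is that a3 and a4 are large and nearly equal. In the model as posted, these two parameters appear only in the combination a(3) - a(4), so the data can determine their difference but not the individual values. The sketch below illustrates this, assuming the model is exactly as written in the comment above; the names F8, F7, d and the voltage grid are hypothetical.

```matlab
% 8-parameter model as posted in the comment above.
F8 = @(a,V) a(1).*(a(2) + a(3) - a(4) - (a(2).*a(5))./sqrt( ...
        a(2).^2.*cos(a(6) - 2.*atan(exp(-(V - a(7))./a(8)))).^2 + ...
        a(5).^2.*sin(a(6) - 2.*atan(exp(-(V - a(7))./a(8)))).^2));

% Reduced 7-parameter form: b = [a1 a2 d a5 a6 a7 a8] with d = a3 - a4.
F7 = @(b,V) F8([b(1) b(2) b(3) 0 b(4) b(5) b(6) b(7)], V);

% First set of best estimates reported above.
a = [20000.0000113322 1.40000693995549 125.071659909581 125.071307157303 ...
     1.40050427357894 1.24015985413439 0.300000000000583 0.796932506770127];
d = a(3) - a(4);
b = [a(1) a(2) d a(5) a(6) a(7) a(8)];

% Placeholder voltage grid; the measured voltages would be used instead.
V = linspace(0, 3, 50).';

% The two forms should agree to within floating-point rounding.
maxdiff = max(abs(F8(a,V) - F7(b,V)));
```

Fitting the reduced form removes one direction along which the optimizer can drift without changing the residual, which may make the remaining parameters better conditioned.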
