solve a non-linear least squares problem
I want to fit data with my custom function to calculate the parameters of the model. The x and y data are attached and plotted in the figure. The custom function is:
y=a+(-((4/3)*3.14*((60e-10)^3-((60e-10)-x).^3).*b*(c-1096)/c+4*3.14*(60e-10)^2.*(d.*((60e-10)-x).^2/(60e-10)^2+e+(f-e-d.*((60e-10)-x).^2/(60e-10)^2).*exp(-x./g)))/(1e3*1.38e-23*1090))
The parameters have the following bounds and initial values:
-inf<a<inf, a=2e4
1e11<b<1e12, b=7.8e11
1120<c<1300, c=1200
70<d<130, d=127
300<e<700, e=680
400<f<900, f=850
1e-10<g<3e-10, g=2.5e-10
However, I tried both MATLAB and Origin to fit the data with the model, and both failed to find a good fit. I would appreciate any suggestions. I understand that there are too many parameters, and I also tried fixing parameters b, d, e and g while leaving the others free, but still got no good results.

Answers (2)
dpb
on 6 Jul 2014
With your constraints as they are, the cubic term dominates -- a single quadratic would appear likely to fit the data quite nicely. Try
[b,s,mu]=polyfit(x,y,2);
Note that the above uses the internal scaling to condition the matrix numerically.
Or, figure out some other set of bounds that essentially turn the cubic portion off or fit some other correlation that has the essence of a quadratic. It's not feasible to turn the answer into something different with the constraints you've placed in both the form of the equation and the bounds on the coefficients.
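A minimal sketch of the scaled quadratic fit suggested above. The attached data are not reproduced in this thread, so the x and y below are made-up stand-ins (an assumption, not the poster's data), and p is used for the coefficients to avoid clashing with the model parameter b:

```matlab
% Hypothetical stand-in data; replace with the attached x and y.
x = linspace(1e-10, 5e-9, 50).';
y = 3 - 2e9*x + 4e17*x.^2;            % made-up quadratic, for illustration only

% Requesting mu makes polyfit fit in xhat = (x - mu(1))/mu(2),
% where mu(1) = mean(x) and mu(2) = std(x).
[p, S, mu] = polyfit(x, y, 2);
yfit = polyval(p, x, S, mu);          % pass S and mu back so x is rescaled the same way
rmse = sqrt(mean((y - yfit).^2));

% Compare the conditioning of the design matrix with and without scaling.
xhat       = (x - mu(1))/mu(2);
condRaw    = cond([x.^2, x, ones(size(x))]);
condScaled = cond([xhat.^2, xhat, ones(size(xhat))]);
fprintf('RMSE = %.3g, cond raw = %.3g, cond scaled = %.3g\n', ...
    rmse, condRaw, condScaled);
```

With x on the order of 1e-10 to 1e-9, the unscaled columns x.^2, x and 1 differ by many orders of magnitude, which is why the scaled form is the one to use here.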
8 Comments
John D'Errico
on 6 Jul 2014
Edited: John D'Errico
on 6 Jul 2014
Excellent advice. In fact, after centering and scaling x, a simple quadratic fit is nearly perfect, with an RMSE on the order of 7e-13. To ask for any more than that is wild overkill.
It is numerically impossible to estimate the coefficients of that complex model from this data.
Cong
on 7 Jul 2014
Matlab isn't the problem; the problem is your data don't fit the model worth a hooey (or more precisely, the model doesn't reflect the data).
It won't matter what estimation technique or software package you use; you aren't going to fit those data with that model; just ain't a-gonna' happen.
Go find some data that reflect the model if that's the purpose; but those you have showed here ain't them.
I used lsqcurvefit which, while it complained some, did its darndest, but as noted, with the higher order terms in the model and nothing in the data but one inflection point, there just is no hope of a decent fit.
Cong
on 8 Jul 2014
John D'Errico
on 8 Jul 2014
Edited: John D'Errico
on 8 Jul 2014
CFTOOL is good enough, IF your data supports the model. You don't appreciate that you CANNOT estimate that model. NOTHING you will do in double precision arithmetic will suffice to estimate those parameters, and certainly nothing will give you any degree of confidence in the result. The information content is simply not there to support estimation of those model coefficients. And whether or not you agree is not pertinent. Since you clearly know relatively little about parameter estimation and numerical methods, how much does your opinion really matter?
Just wanting to do something that is numerically impossible is not sufficient. I believe the old saying was something like "If wishes were horses, beggars would ride."
...that the model cannot fit the data, can you convince me about it?
Probably not, given your predisposition to believe what should be an obviously false premise, but I'll give at least one additional comment on it.
Rearranging your functional (and writing pi where you had 3.14), I get something like --
C=60e-10;
D=4*pi;
E=(1e3*1.38e-23*1090);
z=C-x;
y=a - (D/3*(C^3-z.^3).*b*(c-1096)/c + ...
D*C^2.*(d.*z.^2/C^2 + ...
e + (f-e-d.*z.^2/C^2).*exp(-x./g)))/E;
which shows a cubic term in z plus two quadratic terms, one heavily weighted by an exponential. The agglomeration of constants and the aliasing of coefficients in terms such as f-e-d.*z.^2/C^2 make the effects of the combined coefficients impossible to estimate independently. Add to that, as noted before, that there is only a single inflection point in the input data and nothing at all approaching the appearance of an exponential, and the chances of being able to estimate the coefficients such that only the quadratic would dominate are essentially zero just by inspection.
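The aliasing can be checked numerically. In the rearranged form, b and c enter only through the product b*(c-1096)/c, and shifting e and f by the same amount changes the model by a constant that a can absorb. The sketch below evaluates the model at the question's initial values and at two altered parameter sets that stay inside the stated bounds. The x grid is hypothetical (an assumption; the attached data are not available here):

```matlab
% Rearranged model; p = [a b c d e f g].
C = 60e-10;  D = 4*pi;  E = 1e3*1.38e-23*1090;
model = @(p, x) p(1) - (D/3*(C^3 - (C - x).^3).*p(2)*(p(3) - 1096)/p(3) + ...
    D*C^2.*(p(4).*(C - x).^2/C^2 + p(5) + ...
    (p(6) - p(5) - p(4).*(C - x).^2/C^2).*exp(-x./p(7))))/E;

x  = linspace(1e-10, 5e-9, 50).';                 % hypothetical x grid
p0 = [2e4, 7.8e11, 1200, 127, 680, 850, 2.5e-10]; % initial values from the question

% Alias 1: shift e and f together, compensate with a.
delta = 10;
p1 = p0;
p1([5 6]) = p0([5 6]) + delta;
p1(1)     = p0(1) + D*C^2*delta/E;

% Alias 2: change c, rescale b so that b*(c-1096)/c is unchanged.
p2 = p0;
p2(3) = 1250;
p2(2) = p0(2)*((p0(3) - 1096)/p0(3))/((p2(3) - 1096)/p2(3));

dAEF = max(abs(model(p1, x) - model(p0, x)));
dBC  = max(abs(model(p2, x) - model(p0, x)));
fprintf('max |dy| for (a,e,f) alias: %.3g, for (b,c) alias: %.3g\n', dAEF, dBC);
```

Both differences should come out at rounding-error level, which means no fitting routine can separate a from e and f, or b from c, using y alone, whatever the data look like.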
When you provide a set of constraints on coefficients, then there's no chance for the algorithm to try to smoosh the coefficient b to very small values to make that term go away. Making c small doesn't help because it's in the divisor as well, so its contribution then grows geometrically, or if it is made large its contribution --> 1. There is, of course, the "magic value" of c = 1096 that will kill the cubic term; if you then made g small compared to x such that exp(-x/g)-->0, you'd get a decent fit for the quadratic portion of
D*C^2.*(d.*z.^2/C^2
that would allow for d and a to be estimated and possibly a decent quadratic to be estimated.
Similar observations can be made on the other parameters' effects on the third term.
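As a rough check of that reduction, the sketch below sets c = 1096 and picks g much smaller than x, then compares the full model with the plain quadratic in z that is left over. The x grid is hypothetical, and both c = 1096 and g = 1e-12 lie outside the bounds given in the question, which is exactly the point being made above:

```matlab
% Rearranged model; p = [a b c d e f g].
C = 60e-10;  D = 4*pi;  E = 1e3*1.38e-23*1090;
model = @(p, x) p(1) - (D/3*(C^3 - (C - x).^3).*p(2)*(p(3) - 1096)/p(3) + ...
    D*C^2.*(p(4).*(C - x).^2/C^2 + p(5) + ...
    (p(6) - p(5) - p(4).*(C - x).^2/C^2).*exp(-x./p(7))))/E;

x = linspace(1e-10, 5e-9, 50).';                 % hypothetical x grid
p = [2e4, 7.8e11, 1096, 127, 680, 850, 1e-12];   % c = 1096, g << x (outside the bounds)

% What remains: a constant plus a quadratic in z = C - x.
z     = C - x;
yQuad = p(1) - D*(p(4)*z.^2 + p(5)*C^2)/E;

maxDiff = max(abs(model(p, x) - yQuad));
fprintf('max |full model - quadratic| = %.3g\n', maxDiff);
```

Note that in this reduced form a and e appear only through the single constant a - D*e*C^2/E, so one of the two would still have to be fixed before the other could be estimated.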
The above is, of course, essentially based on the assumption of nearly infinite precision in double precision, which, of course, is not even close to reality. As John mentioned, in double precision your formulation has other serious issues -- as just a start, consider, for your initial guesses, the first additive terms --
>> e+(f-e-d.*((60e-10)-x).^2/(60e-10)^2).*exp(-x./g)
ans =
738.1804
>> -(4/3)*3.14*((60e-10)^3-((60e-10)-x).^3).*b*(c-1096)/c
ans =
-1.0626e-14
>> ans+783==783
ans =
1
>>
This shows that the one term completely obliterates the other in magnitude, so that the summation might just as well never have taken place -- there just aren't sufficient digits of precision to keep the small values in relation to the larger.
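The same point can be made with eps, which returns the spacing between a double and the next larger one. A small sketch using the two values displayed in the session above (nothing else is assumed):

```matlab
big   = 738.1804;       % first displayed value from the session above
small = -1.0626e-14;    % second displayed value

spacing  = eps(big);                 % gap from big to the next larger double
absorbed = (big + small) == big;     % true if the small term is lost entirely

fprintf('eps(big) = %.4g, |small| = %.4g, absorbed = %d\n', ...
    spacing, abs(small), absorbed);
```

For any value between 512 and 1024 the spacing is 2^-43, about 1.1e-13, so a term of size 1e-14 is less than half a spacing and is rounded away when the two are added. The same holds for the 783 used in the comparison above.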
You say this is supposed to have some physical meaning -- just out of curiosity where does the model come from and what is the meaning of the physical constants?
Cong
on 9 Jul 2014
dpb
on 9 Jul 2014
Can't see the article so not of much help. If they indeed did estimate such, their data surely had to have had much more structure in it than that you presented to have had any chance at all.
Alex Sha
on 31 May 2019
You only need to free the range of parameter "g" while keeping the other parameter ranges as in the original; the result should then be very good:
Root of Mean Square Error (RMSE): 0.00665318927569739
Sum of Squared Residual: 0.0557295437706628
Correlation Coef. (R): 0.99999931888018
R-Square: 0.999998637760824
Adjusted R-Square: 0.999998635591654
Determination Coef. (DC): 0.999998637759301
Chi-Square: -0.00690242716732607
F-Statistic: 153179168.346496
Parameter Best Estimate
---------- -------------
a 11997.8585863498
b 499078186880.248
c 1120
d 70
e 404.459269397904
f 400.807634142111
g -5.39659727990207E-9