Curve Fitting Toolbox can return bogus results for 2-term exponential functions. Is this a bug?
Matt Brown
on 2 Nov 2022
Commented: Walter Roberson
on 3 Nov 2022
I am fitting some test data using the Curve Fitting Toolbox and the built-in 2-term exponential form, f(x) = a*exp(b*x) + c*exp(d*x). For some reason, one of the test fits returns coefficients that cause the function to go to zero everywhere. However, the curve and GOF data that are displayed do not jibe with the reported results. If f(x)=0 everywhere, there is no way that R-square would be anywhere close to 1 given the original data.
Is this a bug that needs to be reported? Bad data? Please help!
The input vectors are as follows:
x=[0,12.5450000000000, 25.0900000000000, 54.3900000000000, 81.6900000000000, 109.090000000000, 138.790000000000, 139.090000000000, 139.090000000000, 163.890000000000, 195.390000000000, 202.390000000000]
y=[0, 0.00337831199999972, 0.688170198000000, 1.39894291200000, 2.42361711600000, 3.96215670600000, 7.38800638200000, 7.38893139600000 ,7.35868746000000, 11.7483615060000, 19.7050504080000, 23.7202546560000]
Accepted Answer
Walter Roberson
on 2 Nov 2022
Sum of exponentials is notoriously difficult to fit.
One of the problems is that you did not put bounds on your variables. When you have a*exp(b*x) + c*exp(d*x), then a and b can change places with c and d and you would have exactly the same sum. If a and c have different signs (not at all uncommon for this kind of fitting), then without bounds the fitting will not be able to tell which of the two is to be positive or negative, so the error bounds will cross the entire range and the fit will be useless. Any time you have terms with identical forms, you need to impose constraints to have any hope of getting a useful fit.
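As a minimal sketch of what such constraints might look like with the toolbox's `fit` function and the data from the question: the bounds below are an assumption chosen only for illustration (a growing first term with nonnegative amplitude, a decaying second term with nonpositive amplitude), and the start point is likewise a guess. With the signs pinned this way, the two terms can no longer trade places.

```matlab
% Data from the question, as column vectors (fit expects columns).
x = [0 12.545 25.09 54.39 81.69 109.09 138.79 139.09 139.09 163.89 195.39 202.39]';
y = [0 0.00337831199999972 0.688170198 1.398942912 2.423617116 3.962156706 ...
     7.388006382 7.388931396 7.35868746 11.748361506 19.705050408 23.720254656]';

% Coefficient order for the library model 'exp2' is [a b c d]
% in f(x) = a*exp(b*x) + c*exp(d*x).
% Assumed constraints (illustrative, not from the original post):
%   a >= 0, b >= 0 (growing term);  c <= 0, d <= 0 (decaying term).
lo = [0    0    -Inf -Inf];
hi = [Inf  Inf   0    0  ];
p0 = [0.5  0.02 -0.1 -0.01];   % guessed start point inside the bounds

[mdl, gof] = fit(x, y, 'exp2', 'Lower', lo, 'Upper', hi, 'StartPoint', p0);
disp(coeffvalues(mdl))
```

Whether these particular sign constraints suit a given data set is a modeling decision; the point is only that each term gets its own region of parameter space.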
But even then, it is quite common that one of b or d goes towards negative infinity times sign(x), so that exp(d*x) goes to 0, nearly removing the term, or that one of the two goes towards 0, making exp(d*x) go to 1 and turning c into nearly an additive constant. Yes, in theory you can do better mathematically, but in practice the error often increases as you move away from one of those two positions. If your starting positions do not happen to fall inside the right range, then the exponential increase in error as you move towards the peak error leads the minimizers to move further from the actual best location. Starting values are crucial for the fitting techniques that are used by the Curve Fitting Toolbox.
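One hedged way to get a starting point for the dominant term is a straight-line fit to log(y), since log(a*exp(b*x)) = log(a) + b*x. The sketch below assumes the data from the question; the 0.01 cutoff and the guessed second term are placeholders of my own, not values from the thread.

```matlab
% Data from the question, as column vectors.
x = [0 12.545 25.09 54.39 81.69 109.09 138.79 139.09 139.09 163.89 195.39 202.39]';
y = [0 0.00337831199999972 0.688170198 1.398942912 2.423617116 3.962156706 ...
     7.388006382 7.388931396 7.35868746 11.748361506 19.705050408 23.720254656]';

% Keep only points where log(y) is usable. The 0.01 cutoff is an assumed
% threshold that drops the first two near-zero samples.
keep = y > 0.01;

% Straight-line fit of log(y) against x:  log(y) ~ log(a) + b*x.
p  = polyfit(x(keep), log(y(keep)), 1);
b0 = p(1);
a0 = exp(p(2));

% Guessed second term (small and decaying); purely a placeholder.
p0 = [a0, b0, -0.1, -0.01];

mdl = fit(x, y, 'exp2', 'StartPoint', p0);
disp(p0)
disp(coeffvalues(mdl))
```

This only seeds the optimizer near a sensible region; it does not guarantee that the second term is identifiable from the data.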
I have read that there exist fitting algorithms specifically for sum of exponentials, that should do a better job, but I have not researched those algorithms. Some kind of transform has to be used, I gather.
4 Comments
Walter Roberson
on 3 Nov 2022
Sorry, there is not much control over the provided apps. Sometimes if you dig into the source code for long enough you can come up with a usable code change, but most of the time it is a change in code that is needed, not a change in some setting. (Occasionally if you dig harder still, it is possible to figure out how to dig into the internals far enough to change some settings without changing the source code.)
More Answers (1)
John D'Errico
on 2 Nov 2022
Edited: John D'Errico
on 2 Nov 2022
Is it a bug? NO.
Is it due to poor starting values? Almost always, yes. At least, unless the curve is simply not well fit by a two term exponential.
You have a dozen data points, and you want to fit 4 parameters? Using exponentials? And you want to see good results? Sigh.
x=[0,12.5450000000000, 25.0900000000000, 54.3900000000000, 81.6900000000000, 109.090000000000, 138.790000000000, 139.090000000000, 139.090000000000, 163.890000000000, 195.390000000000, 202.390000000000];
y=[0, 0.00337831199999972, 0.688170198000000, 1.39894291200000, 2.42361711600000, 3.96215670600000, 7.38800638200000, 7.38893139600000 ,7.35868746000000, 11.7483615060000, 19.7050504080000, 23.7202546560000];
numel(x)
plot(x,y,'o')
When I look at that curve, I might bet that a single exponential will fit entirely reasonably. As such, if this next plot is a straight line, then it will be.
semilogy(x,y,'o')
So except for the VERY first data point, it virtually IS a straight line. And that means to fit a second term in that exponential fit, you have only ONE piece of data, maybe two, to support estimating that pair of coefficients.
Should you be even remotely surprised the two term fit looks strange to you, in the sense that one of those exponentials seemed to be nonsense? NO!!!!!
[mdl1,G1] = fit(x',y','exp1')
mdl1
G1
plot(mdl1,x,y,'ro')
So am I even REMOTELY surprised that R^2 is very near 1? WHY? THE FIT LOOKS QUITE GOOD, even for a 1-term exponential. Far too many people seem to be ruled by R^2. In my opinion, R^2 is slightly more valuable than a pile of rubbish, but not by a lot. If the curve appears to fit well when you plot it, don't worry about R^2.
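One way to act on that advice is to look at the residuals next to the fitted curve instead of relying on a single summary number. A minimal sketch, assuming the same data and the single-term `exp1` model; the variable name `res` is illustrative:

```matlab
% Data from the question, as column vectors.
x = [0 12.545 25.09 54.39 81.69 109.09 138.79 139.09 139.09 163.89 195.39 202.39]';
y = [0 0.00337831199999972 0.688170198 1.398942912 2.423617116 3.962156706 ...
     7.388006382 7.388931396 7.35868746 11.748361506 19.705050408 23.720254656]';

[mdl1, G1] = fit(x, y, 'exp1');

% Evaluating the fit object at x gives the fitted values.
res = y - mdl1(x);

subplot(2,1,1)
plot(mdl1, x, y, 'ro')          % data with the fitted curve
title('exp1 fit')

subplot(2,1,2)
stem(x, res)                    % residuals: look for structure, not just size
xlabel('x'); ylabel('residual')

fprintf('R^2 = %.4f, RMSE = %.4f\n', G1.rsquare, G1.rmse);
```

A clear pattern in the residuals would suggest model misfit even when R^2 is close to 1.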
This is not a bug in the curve fitting toolbox. It is a problem in your understanding of modeling and curve fitting. Can we try to fit a 2 term exponential? POSSIBLY. But ONLY if we use good starting values would there be much chance. And even then, again, you have WAY too little data. At least we have decent starting values for the main term in the model, so I will use them, and then guess at the other term. (I tried a couple of times before I was satisfied with the results.) Your data is pretty much useless for that model, yet your expectations are really high. Double sigh.
[mdl2,G2] = fit(x',y','exp2','StartPoint',[0.54, 0.018, -0.1, -0.01])
plot(mdl2,x,y,'ro')
Is that result meaningful? I doubt it is worth much, since that second exponential term is literally based on about 1 data point. Note the width of the confidence bounds on parameters c and d. Do you see that even the sign of that second rate parameter is in question?
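The confidence bounds can also be pulled out programmatically with `confint`, which makes it easy to check whether an interval includes zero. This is a sketch assuming the same data and start point as the two-term fit above; the loop and printout are illustrative only.

```matlab
% Data from the question, as column vectors.
x = [0 12.545 25.09 54.39 81.69 109.09 138.79 139.09 139.09 163.89 195.39 202.39]';
y = [0 0.00337831199999972 0.688170198 1.398942912 2.423617116 3.962156706 ...
     7.388006382 7.388931396 7.35868746 11.748361506 19.705050408 23.720254656]';

[mdl2, G2] = fit(x, y, 'exp2', 'StartPoint', [0.54 0.018 -0.1 -0.01]);

% 95% confidence bounds: row 1 is the lower bound, row 2 the upper bound,
% one column per coefficient in the order returned by coeffnames.
ci    = confint(mdl2, 0.95);
names = coeffnames(mdl2);

for k = 1:numel(names)
    includesZero = ci(1,k) < 0 && ci(2,k) > 0;
    fprintf('%s: [%g, %g]  includes zero: %d\n', ...
            names{k}, ci(1,k), ci(2,k), includesZero);
end
```

An interval that includes zero means the data cannot even settle the sign of that coefficient.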
Why is it that every time someone sees something they don't understand, it must be a bug? This just requires experience in curve fitting.