Three terms exponential fit without initial guess

26 views (last 30 days)
Is there a possibility to fit data (enough data points, thousands of data points) with 3-term exponential fit without initial guesses ? If not, is there a way of having an idea about the initial guesses, mathematically, without prior knowledge on the physical model (just to be safe and unbiased by what we want to get).
Many thanks!
  2 Comments
Alex Sha
Alex Sha on 17 Aug 2022
Theoretically, the use of global optimization algorithms will not rely on the guessing of initial value to obtain a global optimal solution, such as Matlab's genetic algorithm toolbox, of course, unfortunately, the actual results are far from what was expected,at least so far.

Sign in to comment.

Accepted Answer

John D'Errico
John D'Errico on 17 Aug 2022
Edited: John D'Errico on 17 Aug 2022
There are a lot of misconceptions here.
  1. Can you do a fit without ANY initial guesses? No. Any nonlinear fit will use initial guesses. The ones that do not explicitly require a start point, like fit, still use guesses, but they supply random numbers as guesses. And that can be actively a bad thing. (I'll discuss that later.)
  2. You CAN use tools like GA. But even there you will need to have some intelligent set of bounds for the parameters, else the exponentials will just arbitrarily overflow or underflow. Either case is an actively bad thing in a fit.
  3. You can use tools like a partitioned least squares solver. (I'll show an example of that too.) These tools are more robust to bad starting values, they converge more stably,, etc. But they will still require some starting values.
A nonlinear least squares fit is just a search routine. You need to start it looking in some intelligent place. And exponentials are notorious for being nasty things, IF you fail to treat them properly. Again, remember that a fit is JUST a search. Think of it as putting a blind person down on the face of the earth, and tasking them with finding the point of lowest elevation. Give them a cane, to figure out which way is down hill. Also an altimeter (with spoken output), to figure out how high they are. And finally, give the poor fellow some scuba gear, as they will need it.
The point is, a nonlinear fit is little more intelligent that that. Start the fellow out in the vicinity of the Dead Sea, and expect he will get wet, but not that he will find a point in the bottom of the Pacific ocean.
Since you have supplied no data, I'll need to make up some data.
x = 1 + rand(500,1)*100;
y = 200 ./ x + randn(size(x));
plot(x,y,'.')
Yes. I know that data is NOT in fact a sum of three exponentials. But it has the necessary shape. And most of the time people are just guessing how many terms they need in such a model anyway. Much of the time, they don't even really know the true model. Too bad. This is my data. It looks a lot like any other exponential model data.
So let me see if I can fit this data as a sum of three exponentials. We can see some problems you might run into.
mdl = fittype('a*exp(-b*x) + c*exp(-d*x) + e*exp(-f*x)','indep','x')
mdl =
General model: mdl(a,b,c,d,e,f,x) = a*exp(-b*x) + c*exp(-d*x) + e*exp(-f*x)
First, look at the data. remember that exponential data often looks like that. The problem is, the points at the left end will carry the most influence, dragging the rest of the curve around. Any error there will have a huge impact on the fit.
Next, remember that the magnitude of your independent variable will be hugely important. If x is too large, then the exponential rate parameters can easily be large enough, or small enough in comparison, that you will see overflows in the exponential or underflows. And then the fit will become complete crap. For example, the data I gave here has x on the order of 0-100. Say an average value of 50. If, given a model with terms like
a*exp(-b*x)
if the starting estimate for b is on the order of 100, then ALL of your data will underflow at that start point. Your fit will probably fail to converge.
As well, you NEED to know the sign of the rate constants. Is this a positive or negative exponential model? The problem is that the optimization routine will have a great deal of difficulty in switching the signs of the rate constants. So it you provide a model as
a*exp(b*x)
and you give it the wrong sign for b as a starting value, the solver will usually fail to converge to something reasonable. Here, the problem is that exp(-x) and exp(x) have fundamentally different shapes. But since solvers don't really understand mathematics, they just fail to converge.
fittedmdl = fit(x,y,mdl)
Warning: Start point not provided, choosing random start point.
fittedmdl =
General model: fittedmdl(x) = a*exp(-b*x) + c*exp(-d*x) + e*exp(-f*x) Coefficients (with 95% confidence bounds): a = 346.6 (329.2, 364) b = 1.058 (0.9984, 1.118) c = 11.61 (10.58, 12.64) d = 0.01996 (0.01824, 0.02168) e = 69.76 (64.57, 74.95) f = 0.1848 (0.1712, 0.1984)
plot(fittedmdl)
hold on
plot(x,y,'b.')
hold off
I tested out this fit multiple times, allowing fit to choose its own starting values. It actually did surprisingly well on this specific data, even though the data is not at all a sum of exponentials. Sometimes it failed to converge. That happened a little less than half the time here in my play as I was writing this.
Another problem is that sums of exponentials are themselves difficult problems to estimate. I saw that the Curve Fitting Toolbox did not have an 'exp3' model built in as an option. That is a good thing, because sums of exponentials often resut in poorly posed problems in the optimization. One would see that in the form of multiple singular or nearly singular matrix warnings, telling you the solver is having grave numerical problems.
Ok, can you find good starting values for such a model? SOMETIMES this is doable. Doing it automatically is not easy though. You could do things like take the log of your data. Since the log of an exponential model is now linear in the unknown coefficients, you could see if the result is a inear fit. That is if we have
y = a*exp(b*x)
then
log(y) = log(a) + b*x
Essentially, we now would hope to see a straight line fit, and the slope of that line would be a good estimate of your rate constant. Typically, I do this using a semi-logy plot. If the curve looks nice and liner, then an exponential model will be a good idea. Sadly, this fails for a sum of exponentials, since the log of a sum is nothing special.
semilogy(x,y,'.')
Warning: Negative data ignored
What does that tell me? It tells me that a pure exponential fit must fail to fit well, since the curve is not a straight line. But maybe, if we have a sum of several terms it may work. The semi-logy plot also points out the problem with trying to model process with additive noise, and then taking the log.
Finally, you can use a partitioned nonlinear least squares solver. I have one of them posted on the file exchange, in the form of fminspleas. This tool needs to iterate on only the nonlinear parameters in the problem, so given a model like:
a*exp(-b*x) + c*exp(-d*x) + e*exp(-f*x)
the solver works differently on the intrinsically nonlinear parameters b,d, and f, cpompared to the conditionally linear parameters a,c, and e. Essentially, the solver needs to search only in a 3-dimensional parameter space instead of a 6-dimensional parameter space. And this makes the solve more efficient, more robustly convergent, etc. You can find fminspleas on the File Exchange for free download.
In the end though, a nonlinear least squares fit is often no better than the quality of the starting values you can provide.
  7 Comments
ChemPhysicist
ChemPhysicist on 20 Aug 2022
Edited: ChemPhysicist on 20 Aug 2022
Well, I can say almost yes. The model should normally be a 3-exponential terms, but as you can see the real signal is actually a bit more complex where it contains an effect of the impulse response.
If we ignore the that "spike" in the first 4000 data, what would be the best fitting algorithm to that gives the best fit ? I tried using the model you gave me first, but needed to run several times untill I got roughy an acceptable fit in its shape, then I used the outputs to rewrite the script with upper and lower options, so with this I can get immediately the best fit.
% model:
mdl = fittype('a*exp(-b*x) + c*exp(-d*x) + e*exp(-f*x)','indep','x')
% options obtained after several runs till a convergence to an acceptable shape of the fit
options_Exp = fitoptions(mdl);
options_Exp.Lower = [-0.02544 0.2425 -63.85 1.273 0.01355 0.001544];
options_Exp.Upper = [-0.02425 0.2468 -27.06 1.378 0.01356 0.001571];
% It would have been probably better to give start points instead of upper and lower limits ?
fittedmdl = fit(x,y,mdl,options_Exp)
figure
plot(fittedmdl)
hold on
plot(x,y,'b.')
hold off
So I thought whether your algorithm fminspleas or some MonteCarlo simullations can be a better alternative to this process (manual runs for several times).
Then last, if you could refer me to some questions useful for fitting and convolutting with the impulse signal (response of the detector) that would be great and would ve very grateful. (I'm also wondering whether it can be better to try deconvolve the signal first then fit 3-terms exponential, then reconvolve to fit the real signal.

Sign in to comment.

More Answers (1)

Sulaymon Eshkabilov
Sulaymon Eshkabilov on 17 Aug 2022
Yes, it is possible to get such fit. Here is an initial draft code syntax:
Xdata = ...
Ydata = ...
MODEL = fittype( @(a,b,c,d,f, g, x) (a*exp(b*x)+c*exp(d*x)+f*exp(g*x)), ...
'indep','x', 'coeff', {'a', 'b', 'c', 'd', 'f', 'g'} );
[FITRESULTS, GF] = fit(Xdata, Ydata, MODEL);
plot(FITRESULTS, Xdata, Ydata)
  1 Comment
John D'Errico
John D'Errico on 17 Aug 2022
I'm sorry, but that still uses initial guesses, but just that fit supplies comletely random guesses of its own. And that is often a bad idea.

Sign in to comment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by