Unable to resolve the warning about an ill-conditioned Jacobian

I want to fit my data (1st column: x, 2nd column: y, given in the attached text file) to a sigmoidal function using the attached function file (sigm_fit_base_e.m). The function is the standard File Exchange function, which I modified from base 10 to base e; I also increased MaxIter to 10000. My initial guess parameters are [0 0.2845 9.88 -1], and there are no fixed parameters.
The relevant code line is:
fPar = sigm_fit_base_e(x,y,[],[0 0.2845 9.88 -1],0);
I get the following warning:
Warning: The Jacobian at the solution is ill-conditioned, and some model parameters may not be estimated well (they are not identifiable). Use caution in making predictions.
> In nlinfit (line 384)
In sigm_fit_base_e (line 130)
I checked all the related answers in the MATLAB community and tried to play around by modifying the initial guess parameters and fixing the 2nd parameter, but the warning still persists. Could you please help me fix this warning?
M = importdata('datai2j3.txt');
x = M(:,1);
y = M(:,2);
fPar = sigm_fit_base_e(x,y,[],[0 0.2845 9.88 -1],1)
f = function_handle with value:
@(param,xval)param(1)+(param(2)-param(1))./(1+exp((param(3)-xval)*param(4)))
ans = 1x4
0 0.2845 9.8800 -1.0000
Warning: The Jacobian at the solution is ill-conditioned, and some model parameters may not be estimated well (they are not identifiable). Use caution in making predictions.
fPar = 1x4
-176.4593 0.2818 11.5553 -5.6354
function [param,stat]=sigm_fit_base_e(x,y,fixed_params,initial_params,plot_flag)
% Optimization of parameters of the sigmoid function
%
% Syntax:
% [param]=sigm_fit(x,y)
%
% that is the same that
% [param]=sigm_fit(x,y,[],[],[]) % no fixed_params, automatic initial_params
%
% [param]=sigm_fit(x,y,fixed_params) % automatic initial_params
% [param]=sigm_fit(x,y,[],initial_params) % use it when the estimation is poor
% [param]=sigm_fit(x,y,fixed_params,initial_params,plot_flag)
%
% param = [min, max, x50, slope]
%
% if fixed_params=[NaN, NaN , NaN , NaN] % or fixed_params=[]
% optimization of "min", "max", "x50" and "slope" (default)
%
% if fixed_params=[0, 1 , NaN , NaN]
% optimization of x50 and slope of a sigmoid of ranging from 0 to 1
%
%
% Additional information in the second output, STAT
% [param,stat]=sigm_fit(x,y,fixed_params,initial_params,plot_flag)
%
%
% Example:
% %% generate data vectors (x and y)
% fsigm = @(param,xval) param(1)+(param(2)-param(1))./(1+exp((param(3)-xval)*param(4)))
% param=[0 1 5 1]; % "min", "max", "x50", "slope"
% x=0:0.1:10;
% y=fsigm(param,x) + 0.1*randn(size(x));
%
% %% standard parameter estimation
% [estimated_params]=sigm_fit(x,y)
%
% %% parameter estimation with forced 0.5 fixed min
% [estimated_params]=sigm_fit(x,y,[0.5 NaN NaN NaN])
%
% %% parameter estimation without plotting
% [estimated_params]=sigm_fit(x,y,[],[],0)
%
%
% Doubts, bugs: rpavao@gmail.com
% Downloaded from http://www.mathworks.com/matlabcentral/fileexchange/42641-sigmoid-logistic-curve-fit
% warning off
x=x(:);
y=y(:);
if nargin<=1 %fail
fprintf('');
help sigm_fit
return
end
automatic_initial_params=[quantile(y,0.05) quantile(y,0.95) NaN 1];
if sum(y==quantile(y,0.5))==0
temp=x(y==quantile(y(2:end),0.5));
else
temp=x(y==quantile(y,0.5));
end
automatic_initial_params(3)=temp(1);
if nargin==2 %simplest valid input
fixed_params=NaN(1,4);
initial_params=automatic_initial_params;
plot_flag=1;
end
if nargin==3
initial_params=automatic_initial_params;
plot_flag=1;
end
if nargin==4
plot_flag=1;
end
if exist('fixed_params','var')
if isempty(fixed_params)
fixed_params=NaN(1,4);
end
end
if exist('initial_params','var')
if isempty(initial_params)
initial_params=automatic_initial_params;
end
end
if exist('plot_flag','var')
if isempty(plot_flag)
plot_flag=1;
end
end
%p(1)=min; p(2)=max-min; p(3)=x50; p(4)=slope
%f = @(p,x) p(1) + (p(2)-p(1)) ./ (1 + exp((p(3)-x)*p(4))); % Modified function form with base e
f_str='f = @(param,xval)';
free_param_count=0;
bool_vec=NaN(1,4);
for i=1:4;
if isnan(fixed_params(i))
free_param_count=free_param_count+1;
f_str=[f_str ' param(' num2str(free_param_count) ')'];
bool_vec(i)=1;
else
f_str=[f_str ' ' num2str(fixed_params(i))];
bool_vec(i)=0;
end
if i==1; f_str=[f_str ' + (']; end
if i==2;
if isnan(fixed_params(1))
f_str=[f_str '-param(1) )./ ( 1 + exp( (']; % Modified here for base e
else
f_str=[f_str '-' num2str(fixed_params(1)) ')./ (1 + exp((']; % Modified here for base e
end
end
if i==3; f_str=[f_str ' - xval ) *']; end
if i==4; f_str=[f_str ' ) );']; end
end
eval(f_str)
%------added by me--------
% Set optimization options
options = statset('nlinfit');
options.MaxIter = 10000; % Set the maximum number of iterations here
%-------------------------
f
initial_params(bool_vec==1)
[BETA,RESID,J,COVB,MSE] = nlinfit(x,y,f,initial_params(bool_vec==1),options);
stat.param=BETA';
% confidence interval of the parameters
stat.paramCI = nlparci(BETA,RESID,'Jacobian',J);
% confidence interval of the estimation
[stat.ypred,delta] = nlpredci(f,x,BETA,RESID,'Covar',COVB);
stat.ypredlowerCI = stat.ypred - delta;
stat.ypredupperCI = stat.ypred + delta;
% plot(x,y,'ko') % observed data
% hold on
% plot(x,ypred,'k','LineWidth',2)
% plot(x,[lower,upper],'r--','LineWidth',1.5)
free_param_count=0;
for i=1:4;
if isnan(fixed_params(i))
free_param_count=free_param_count+1;
param(i)=BETA(free_param_count);
else
param(i)=fixed_params(i);
end
end
if plot_flag==1
x_vector=min(x):(max(x)-min(x))/100:max(x);
plot(x,y,'k.',x_vector,f(param(isnan(fixed_params)),x_vector),'r-')
xlim([min(x) max(x)])
end
end

4 comments

If you get param(1) = -176.45 for a function that is almost constant over the x-values, you should conclude that your fitting function is not adequate.
Akankshya Majhi on 8 Aug 2024
Edited: Akankshya Majhi on 12 Aug 2024
@Torsten: the function is not constant over x-values. Also, visually the function seems to fit the data quite well, so I am not able to resolve the warning.
@dpb: the fitted parameters in your case are also outside the x-values, especially the x50. Do you know what might be the reason? After fitting, I use the x50 and the last parameter to calculate xlag (similar to tlag in aggregation studies).
In this data, I can only fix the max; all other parameters are unknown. I have a few questions:
  1. If I understand correctly, you fixed the min to 0? If yes, then it is incorrect for the given data, as it does not end at zero but somewhere close to 0.2. I already tried to give an initial guess for min of 0.2, but that didn't resolve the warning. Used this code line: fPar = sigm_fit_base_e(M(:,1),M(:,2),[], [0.2 0.2845 9.88 -1],0)
  2. By estimating/fixing one of the parameters, are you suggesting doing something like this in the code line:
fPar = sigm_fit_base_e(M(:,1),M(:,2),[nan 0.2845 nan nan], [0 0.2845 9.88 -1],0)
With this, I got fPar = [-982.629 0.284531 12.50458 -4.15507]
  3. I have also tried to optimise the initial guess for max as the average of the first 8 datapoints, fix this as the max value, and put the initial guess for min as 0.2, and I got
fPar = [ -46.647 0.281424 11.26924 -5.87087 ]
Torsten on 8 Aug 2024
Edited: Torsten on 12 Aug 2024
the function is not constant over x-values. Also, visually the function seems to fit the data quite well, so I am not able to resolve the warning.
Read @John D'Errico 's answer for an explanation. Your data are simply not suited for this fitting function (and vice versa).
Your fitting function assumes that you have data for a sigmoid, but all you have are data for a "one-sided" sigmoid. The asymptote of your data for x -> +Inf is unknown.
@Torsten: Thank you. Yes, makes sense. Do you have any idea if there is no other suitable model to fit the data and get the relevant parameter (drop off point)?

Sign in to comment.

Answers (2)

dpb on 8 Aug 2024
Edited: dpb on 8 Aug 2024
M = readmatrix('datai2j3.txt');
%fPar = sigm_fit_base_e(M(:,1),M(:,2),[],[0 0.2845 9.88 -1],1)
fPar = sigm_fit_base_e(M(:,1),M(:,2),[0 nan nan nan])
f = function_handle with value:
@(param,xval)0+(param(1)-0)./(1+exp((param(2)-xval)*param(3)))
fPar = 1x4
0 0.2817 10.3666 -6.0641
% (function sigm_fit_base_e as listed in full in the question above)
The model is over-parameterized -- eliminating the attempt to estimate the first parameter seems to remove the numerical instability issues...
The output is somewhat confusing; the function returns all four possible parameters in the resulting array, but the printed model only references the estimated ones. You'll have to fix up the function to use the results, but it looks like at least a reasonable fit given the noise in the data.

1 comment

John D'Errico on 8 Aug 2024
Edited: John D'Errico on 8 Aug 2024
I would disagree about the statement of being over-parameterized.
A good example of a classically over-parameterized model is something like
y = (a+b)*x
where you cannot distinguish between the parameters a and b. All you can ever estimate there is the sum of a and b. And that is not the case here.
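The degeneracy in that classic example can be seen directly from the least-squares design matrix; a two-line check (illustrative only, not from the thread):

```matlab
% For y = (a+b)*x, the design matrix has two identical columns,
% so only the sum a+b is identifiable:
x = (1:10)';
X = [x x];      % one column for a, one for b
rank(X)         % returns 1, not 2: a and b cannot be separated
```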
In the case of the OP model, param(1) and param(2) have very well defined meanings. And if the data were sufficient, they could indeed be estimated. The fundamental problem is that you cannot estimate the asymptote as x -> +inf. There simply is no information content in the data out there to show when the curve will start to turn around and flatten out on the right end.
This is why the model ends up with a singularity. It also explains why, if you removed one of the parameters, then the other parameter becomes sufficient to converge.
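One way to see this numerically is to build a finite-difference Jacobian of the model at the reported solution, over the data's x-range. A sketch (the parameter values are the ones printed in the question; the step size is an assumption):

```matlab
% Central-difference Jacobian of the sigmoid model at the reported fit,
% over an x-range that never approaches the right-hand asymptote.
f = @(p,x) p(1) + (p(2)-p(1))./(1 + exp((p(3)-x).*p(4)));
p = [-176.4593 0.2818 11.5553 -5.6354];  % fitted values from the question
x = (9:0.05:10)';                        % data range with no right-hand roll-over
J = zeros(numel(x),4);
for k = 1:4
    h  = 1e-6*max(1,abs(p(k)));          % assumed finite-difference step
    dp = zeros(1,4); dp(k) = h;
    J(:,k) = (f(p+dp,x) - f(p-dp,x))/(2*h);
end
cond(J)  % an enormous condition number: param(1) is effectively unidentifiable
```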

Sign in to comment.

John D'Errico on 8 Aug 2024
Edited: John D'Errico on 8 Aug 2024
A not uncommon misunderstanding about modeling.
First, look carefully at your data.
M = importdata('datai2j3.txt');
x = M(:,1);
y = M(:,2);
plot(x,y,'o')
AND think about the model you have posed. This is your model:
f = function_handle with value:
@(param,xval)param(1)+(param(2)-param(1))./(1+exp((param(3)-xval)*param(4)))
What happens in that model at plus infinity? That is, what is the limit assuming param(4) is negative? We don't really care what the value is, as long as param(4) is negative. The exponential will itself go to infinity. That means the model reduces to
y(+inf) = param(1)
when x gets large.
Similarly, we can look at what happens for large negative x. Now the exponential goes to zero, and the model reduces to
y(-inf) = param(1) +(param(2) - param(1))/(1 + 0)
so we will have
y(-inf) = param(2)
at minus infinity.
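A quick numeric spot-check of those two limits, using the initial guess from the question (which has a negative param(4)):

```matlab
f = @(p,x) p(1) + (p(2)-p(1))./(1 + exp((p(3)-x).*p(4)));
p = [0 0.2845 9.88 -1];  % initial guess from the question
f(p, 1e3)   % approaches p(1) = 0 as x -> +inf
f(p,-1e3)   % approaches p(2) = 0.2845 as x -> -inf
```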
Ok, NOW GO BACK AND LOOK AT YOUR DATA!!!!!!!!!! THINK ABOUT IT!!!!!!!
The left asymptote is well defined. At minus infinity, I'd guess the left asymptote will be somewhere around 0.3, plus or minus some random crap.
What happens at x = PLUS infinity? Is there ANY information about the right asymptote? NOOOOOOOOOO!!!!!!!
There is no information content in your data to estimate the parameters in your model.
Now, finally, go back and read the message you got.
Warning: The Jacobian at the solution is ill-conditioned, and some model parameters may not be estimated well (they are not identifiable). Use caution in making predictions.
There is no meaningful value that can be assigned to param(1), BASED ON THIS DATA. There simply is insufficient information.
If the data were different, where the curve is starting to roll over on the right end, then you would find the model estimation process becomes well-posed. The Jacobian will no longer be singular, and you would have been happy.

12 comments

The REAL mathematician speaks! :)
I admit all I did was look at the two additive parameters in the beginning of the model and remove one of those...I didn't look at the details at all--mea culpa!
John D'Errico on 8 Aug 2024
Edited: John D'Errico on 9 Aug 2024
I'm just a numerical analyst.
Actually, this reminds me of my VERY early days. My mentor wanted me to write code to extrapolate long term behavior of a series, based on only the very beginning of a rise. Some impressive mathematics/statistics behind it, at least it sounded good.
Implemented entirely by a kid (me, pre-grad school) with literally no coding experience. The user interface was non-existent. On an old PDP series computer, over a phone link. And it was slow as molasses in the winter.
Any guesses as to how badly it failed? At least I learned a few lessons about what not to do.
Akankshya Majhi on 12 Aug 2024
Edited: Akankshya Majhi on 12 Aug 2024
Thank you for the nice explanation @John D'Errico. Now, I am curious (a bit confused) how the model fits all the other datasets (datai2j1 and datai2j7), which are similar to the dataset in the original question; or perhaps I am doing something entirely incorrect.
dataseti2j1: fPar1 = sigm_fit_base_e(M1(:,1), M1(:,2),[],[0 0.3502 9.88 -1],0); gives
fPar1 = [0.2731 0.3483 10.0183 -9.5138]
dataseti2j7: fPar7 = sigm_fit_base_e(M7(:,1), M7(:,2),[],[0 0.3502 9.88 -1],0); gives
fPar7 = [-0.1238 0.1395 10.4285 -7.1417]; fPar(1) is still not completely accurate.
Attached I have the plot with all the datasets (alldataplots.png); here, I am trying to fit the model only with the datapoints from x=9 to x=10, i.e. the left half of each of the plots. All I am trying to do with these fits is to get a parameter xlag which corresponds to a 5% drop of the amplitude of the transition (ymax-ymin) from the maximum plateau value (ymax). Attached is an example (example_fit.png) where I have marked the xlag location (pink circle) from the curve fit. As you notice, the fit indeed does not have a right asymptote, but at least it follows the curve up to x=10 and a bit beyond that.
Would fitting the model with the full dataset (attached datai2j3_full.txt for the dataset in the original question) improve the model parameter estimation and give the correct xlag position that I am interested in? Do you suggest any other more efficient method to get the point where the curve deviates from the initial plateau value? Is there a better model to fit all these datasets?
Any suggestions would be helpful.
You clearly did not think about what I wrote. I looked at the first one of the sets you showed.
M = importdata('datai2j1.txt');
x = M(:,1);
y = M(:,2);
plot(x,y,'o')
Do you see that the very last data point on the right appears to be just starting to roll over? Maybe even the last two points?
Let me focus on only the last 4 points on that curve to make it clear.
p1 = fit(x(end-3:end),y(end-3:end),'poly1')
p1 =
     Linear model Poly1:
     p1(x) = p1*x + p2
     Coefficients (with 95% confidence bounds):
       p1 = -0.1716  (-0.1938, -0.1493)
       p2 =  2.029   (1.808, 2.251)
p2 = fit(x(end-3:end),y(end-3:end),'poly2')
p2 =
     Linear model Poly2:
     p2(x) = p1*x^2 + p2*x + p3
     Coefficients (with 95% confidence bounds):
       p1 =  0.1672  (0.1114, 0.2229)
       p2 = -3.502   (-4.613, -2.391)
       p3 =  18.62   (13.08, 24.15)
plot(p1,'r')
hold on
plot(p2,'b')
plot(x(end-3:end),y(end-3:end),'ks')
grid on
xlim([9.85 10.1])
hold off
Look Closely. Think!!
Do you see that the points miss a straight line fit through those last 4 points, in a way that hints at a positive curvature?
Now look carefully at the blue curve. Do you see this is a much better fit to those points? AND do you see that this region appears to be one of positive curvature? Now look at the regression coefficients for p2. Do you see that the confidence interval on the x^2 coefficient does not include 0 in the p2 fit? As I said in my visual impression from the plot alone, the curve appears to be just starting to roll over. And that is sufficient to not fail.
This gives the regression just enough of a hint as to the long term shape. It is enough of a hint that you will not see a singularity. In fact, if you try to put confidence limits on that parameter, they would be insanely wide. But the limits would be FINITE. When a singularity exists, the confidence limits are essentially infinitely wide.
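Since sigm_fit_base_e already returns the nlparci confidence intervals in its second output, the width of the interval for the first parameter makes this visible directly. A sketch, assuming the data file and initial guess from the question:

```matlab
% The second output, stat, carries the parameter confidence intervals
% (an N-by-2 matrix of lower/upper bounds from nlparci):
M = importdata('datai2j3.txt');
x = M(:,1);  y = M(:,2);
[fPar,stat] = sigm_fit_base_e(x,y,[],[0 0.2845 9.88 -1],0);
ciWidth = diff(stat.paramCI,1,2)  % expect an enormous width for the first parameter
```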
Now, go back and look at the data set I plotted. Nothing has started to roll over in the curve I plotted in my original answer.
You need to understand the difference between no information at all, and a tiny bit of information. It means that one estimation fails due to a singularity, while the other "succeeds" even though success in this case will still be a terribly poor estimate.
Now go back and read my original answer, CAREFULLY. Think about what I said. I explained quite clearly what is happening in that answer, and why it failed.
Akankshya Majhi on 14 Aug 2024
Edited: John Kelly on 15 Aug 2024
@John D'Errico Thank you for your reply and your detailed explanation.
I understood what you said in your original answer and the follow up answer but I am looking for a solution/a better model to fit my data sets (all of them) to get the drop off point. I am not sure if you looked carefully at all the attachments that I added to my follow up question (not that you have to check it further, you already have given a detailed explanation, thanks!).
Thank you once again for all the help!
How is the "drop off point" defined? Can you analytically deduce it from the fitted function? Or why else do you think fitting your data is necessary to determine this point?
The drop off point is the point which corresponds to a 5% drop of the amplitude of the transition (ymax-ymin) from the maximum plateau value (ymax), similar to the tlag which is common in protein aggregation studies (ThT assay, I think). I have shown an example in the attachment (example_fit.png in the previous answer) where I have marked this point (pink circle) from the curve fit. This point has a physical meaning for my measurement data.
Yes, I analytically deduce it from the fitted function. I do the following: I fit the function to my data and then calculate the x-coordinate of this drop-off position [formula shown as an attached image].
If the fitting is good enough, then this x-coordinate would be between x=9 and x=10. Due to the bad fitting, I get a random drop-off point which does not make physical sense for the experimental data.
I believe that fitting my data to a specific function and deriving this point for all the different datasets (across various materials, flow conditions, etc.) would provide a more robust and consistent approach. However, if you have any other suggestions or see a different approach, I'd appreciate your input.
The model you want to use is meaningless when the data does not support it. One piece of information in that model CANNOT (please focus on the word CANNOT) be estimated. Do you understand what I mean when I say that? The result is a singularity. When you try to fit that model to the original data you posted, it will fail.
The followup cases you showed will be sufficient to estimate the model you posed. and so you will not see a singularity, even though two of the parameter estimates will be meaningless garbage. By this, I mean, in this model, with SOME (please focus on the word SOME) of your sets of data
f = function_handle with value:
@(param,xval)param(1)+(param(2)-param(1))./(1+exp((param(3)-xval)*param(4)))
here param(1) and param(2) will be hopelessly correlated when there is no information provided in the data about the long term behavior. You cannot untangle them from each other, in that model, with the original data.
An important point is that you cannot fit any arbitrary model to any arbitrary set of data. Just wanting it to happen is not sufficient. Unlike in the movies, you cannot just click the ruby slippers together and make something happen.
Some of your data sets do not contain sufficient information to infer the long term shape of that curve. That will generally be the case when your data does not extend into the region of that curve where it has started to show a roll-over, thus where it begins to show positive curvature. It may also happen if your data is too noisy, especially on the right end, where any information from curvature is crucially important.
Ok, in your comment, you state that you want to find a drop-off point, where the curve has dropped off by 5%. But 5% of WHAT? For example, in this data...
M3 = importdata('datai2j3.txt');
x3 = M3(:,1);
y3 = M3(:,2);
plot(x3,y3,'o')
Lets assume the left asymptote is approximately 0.285. Then a 5% decrease in that value will happen roughly at
0.95*0.285
ans = 0.2707
So we would be looking for the point where your curve crosses
yline(0.95*0.285,'r')
That is easy to do.
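That calculation needs no model at all. A minimal sketch (the 8-point averaging window for the plateau is an assumption, not from the thread):

```matlab
M = importdata('datai2j3.txt');
x = M(:,1);  y = M(:,2);
ymax  = mean(y(1:8));        % estimate of the left-hand plateau (assumed window)
ydrop = 0.95*ymax;           % the 5% drop level
k = find(y < ydrop, 1);      % first sample below the threshold
% linearly interpolate between the bracketing samples for the crossing point
xlag = interp1(y(k-1:k), x(k-1:k), ydrop)
```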
OR, do you mean that you want to find the point where 5% of the total transition has occurred? The problem is, you CANNOT estimate the 5% point there, because you cannot infer the total distance from max to min. So if by the words drop-off, you need to infer that value, then there is nothing you can do. (Even on those curves where there is SOME evidence of roll-over, but very little, then that 5% point will be very poorly estimated.)
If all you want is to estimate the left hand asymptote, and then find a point that is lower by 5% of that value, then again the problem is trivial.
You seem to be willing to accept other models than the sigmoidal model. You talk about a 5% point. But what you need to understand is that IF you want to infer the distance from max to min and to then find the 5% point, then other models will ALSO fail. You cannot intelligently infer that distance from data that does not support any indication of even rolling over. And even then, on the cases where there is some evidence of a roll-over, the confidence limits on where the right hand asymptote will lie will be extremely wide, so as to make any computation of the 5% point meaningless.
If your goal is merely to find the left hand asymptote, and to then find a point where it has decreased by 5%, then things become simpler, but even then you will find some sets of data that will yield better predictions than others.
So first, you need to define what you mean by a 5% dropoff.
dpb on 14 Aug 2024
Edited: John Kelly on 15 Aug 2024
Would there be any reason to think the data are symmetric, such that one could artificially reflect the data about the end point (L to R and up to down) and then fit the sigmoid?
Otherwise, given @John D'Errico's points about there being no obvious way to define "5% from what?", it would appear to me you would be forced into some heuristic attempt to set the initial asymptotic value; but even that seems pretty nebulous given the apparent slope of the example dataset above, even at the LH extreme...
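The reflection idea can be sketched as follows (hypothetical; it assumes the transition really is symmetric about the last sample, which the thread does not establish):

```matlab
M = importdata('datai2j3.txt');
x = M(:,1);  y = M(:,2);
% reflect the data about the last point, both left-to-right and up-to-down
xr = [x; 2*x(end) - flipud(x(1:end-1))];
yr = [y; 2*y(end) - flipud(y(1:end-1))];
% the doubled data now has both asymptotes, so the 4-parameter fit is better posed
% fPar = sigm_fit_base_e(xr, yr, [], [], 1);
```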
M = importdata('datai2j3.txt');
x = M(:,1);
y = M(:,2);
plot(x,y,'o')
hold on
findchangepts(y,'statistic','mean')
findchangepts(y,'statistic','linear')
findchangepts(y,'statistic','rms')
ix=[findchangepts(y,'statistic','mean');findchangepts(y,'statistic','linear');findchangepts(y,'statistic','rms')];
S={'mean','linear','rms'};for i=1:3,fprintf('Bkpt %s: %d\n',S{i},ix(i)),end
Bkpt mean: 17
Bkpt linear: 16
Bkpt rms: 17
This is one prepackaged way to pick a point...there are probably better, but...

Sign in to comment.

Version: R2022b
Asked: 8 Aug 2024
Edited: 15 Aug 2024