version 2.1.2 (3.47 MB) by milan batista
Estimation of coronavirus COVID-19 epidemic size by the logistic model


Updated 14 Apr 2020

Editor's Note: This file was selected as MATLAB Central Pick of the Week

The function fitVirus03 implements a logistic model for estimating the final size of an epidemic from daily predictions. The model is data-driven, so its forecast is only as good as the data. It also assumes that the epidemic is reasonably described as a single-stage outbreak: if the epidemic evolves into a second phase, the model becomes useless. The model is likewise useless in the initial phase of an epidemic.
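
As a minimal sketch of the model form, assuming the standard three-parameter logistic curve C(t) = K/(1 + A*exp(-r*t)) with the parameter names K, r and A that the program prints for its initial guess. The handle name logisticC and all numeric values are illustrative assumptions, not taken from fitVirus03:

```matlab
% Assumed logistic curve: C(t) = K / (1 + A*exp(-r*t))
%   K - final epidemic size (upper asymptote)
%   r - infection rate
%   A - constant set by the initial number of cases, A = K/C0 - 1
logisticC = @(t, K, r, A) K ./ (1 + A .* exp(-r .* t));

% Illustrative parameter values (not fitted to any real data set)
K = 1000; r = 0.25; A = 99;

t = (0:60)';                         % days since the first data point
C = logisticC(t, K, r, A);           % cumulative cases given by the model

C0    = C(1);                        % cases at t = 0, equals K/(1 + A)
tPeak = log(A) / r;                  % inflection point: daily cases peak here
Cpeak = logisticC(tPeak, K, r, A);   % equals K/2 at the inflection point

fprintf('C0 = %g, peak day = %.1f, C(peak) = %g, C(60) = %.1f\n', ...
        C0, tPeak, Cpeak, C(end));
```

The curve approaches K only asymptotically, which is why an estimate of K made before the inflection point (the initial phase) is poorly constrained by the data.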

The contribution contains coronavirus data for Austria, Belgium, China, Croatia, Denmark, Germany, Hungary, France, Iran, Italy, Lombardia, Norway, Netherlands, NY State, Portugal, Slovenia, South Korea, Spain, Switzerland, UK, USA, and for outside of China (up to 24.Mar.2020).

The regression may fail to converge for a poor initial guess or a small data set. Therefore the method does not apply to the early stages of an epidemic. Also, results are useless if the regression statistics do not meet minimum criteria, say R^2 > 0.8 and p-value < 0.05.
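
The R^2 part of that check can be sketched in base MATLAB as follows. The handle name rsquared, the data, and the perturbation are made up for illustration; the p-value criterion is not covered here, since it needs the coefficient statistics reported by a toolbox fit such as fitnlm:

```matlab
% Coefficient of determination for a fitted curve (base MATLAB, no toolbox).
%   y    - observed cumulative cases
%   yfit - model values at the same times
rsquared = @(y, yfit) 1 - sum((y - yfit).^2) / sum((y - mean(y)).^2);

% Illustrative data: a logistic curve plus a fixed, made-up perturbation
t    = (0:9)';
yfit = 1000 ./ (1 + 99 .* exp(-0.5 .* t));
y    = yfit + [3; -2; 4; -5; 6; -4; 5; -3; 2; -1];

R2    = rsquared(y, yfit);
minR2 = 0.8;                         % threshold quoted in the description
if R2 > minR2
    fprintf('R^2 = %.4f: fit passes the R^2 criterion\n', R2);
else
    fprintf('R^2 = %.4f: fit should be treated as unreliable\n', R2);
end
```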

On the epidemic evaluation graph, colored regions separate the epidemic phases (these are not standard but arbitrarily chosen for convenience):
red - fast growth phase
yellow - transition to steady-state phase
green - ending phase (plateau stage)

The second figure produced shows the daily estimates of the epidemic size. If these values do not converge to a constant, the epidemic is probably not yet stable.
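
One way to turn that visual check into a number is sketched below. The window length nLast, the tolerance tol, and the values in Kdaily are all assumptions chosen for illustration; the program itself does not define such a criterion in this description:

```matlab
% Hypothetical successive estimates of the final size K, one per day,
% as would be obtained by refitting the model after each new data point.
Kdaily = [5200; 3900; 3300; 3080; 3010; 2995; 3002; 2998];

nLast = 4;        % assumed window: number of most recent estimates to check
tol   = 0.05;     % assumed tolerance: 5 percent spread relative to the mean

recent = Kdaily(end-nLast+1:end);
spread = (max(recent) - min(recent)) / mean(recent);

isStable = spread < tol;
if isStable
    fprintf('Estimates vary by %.2f %%: final size looks stable\n', 100*spread);
else
    fprintf('Estimates vary by %.2f %%: epidemic probably not yet stable\n', 100*spread);
end
```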

A more detailed description can be found in
Examples can be found in

A new version based on SIR model is available at

Data for other countries are available from

DISCLAIMER. Software and data are for education and not for medical or commercial use.

Cite As

milan batista (2020). fitVirus (, MATLAB Central File Exchange. Retrieved .

Comments and Ratings (44)

Rock Interpreter

Hi Milan!
First of all, I must congratulate you on the code. I have a question about the SIR model. As you said, SIR works better than the logistic model and is more robust. I tried your SIR model, but I find it difficult to understand the total population number N, which appears to be very small (about 6,26,000) when analyzing India. Since India has a huge population (1.35 billion), how should the SIR model be applied? Your comment would be highly appreciated. I look forward to your reply.
Shib G

Roberto Parente

Yusuf Kursat Tuncel

Jeta Statovci

Can you add the Kosovo data?

roberto fragoso

Thank you very much. I updated my programs and that fixed the failure.

milan batista

Roberto, do you have the Statistical toolbox installed?

roberto fragoso

Hello, I have a problem trying to run the fitVirus03 function and Matlab presents me with this error:

>> fitVirus03(@getDataGermany);
**** Estimation of epidemy size for Germany
Initial guess K = 126123 r = 0.287012 A = 18645
Error using optimoptions (line 105)
'SpecifyObjectiveGradient' is not an option for LSQCURVEFIT.
A list of options can be found on the LSQCURVEFIT documentation page.

Error in fitVirus03 (line 50)
opts = optimoptions('lsqcurvefit','Display','off',...

Can you help me find out what I'm doing wrong?
Thank you

David Franco

Thank you!

David Franco

Please, update this code with the graphics from fitVirusCOVID19.

David Franco

Thank you!

lue mark


Great, but the model would have a much bigger impact if it also ran in GNU Octave (i.e. optimoptions and nested functions need a compatible version).

Ricardo Pinheiro

Has anyone tried to port the MATLAB code to another solution, like GNU Octave?

I sent Mr. Batista the data from Brazil, so he can add it to the report.

milan batista

To all. The SIR model version has improved convergence and initial guess calculation. I think it works better than the logistic model; at the least, it is more robust.

milan batista

Dear Claudio, thanks for your suggestion. Please keep in mind that the logistic model is very simple. Daily forecasts can be very good. My forecast error for Slovenia was a few percent up to March 19th. But on that day, we had a local outbreak (jump). After such an event, the forecasts are useless for a few days because the daily predicted values are below the actual ones. This situation changes within a few days (as in China on Feb 12). The SIR model has a similar problem.

Adam Hepworth

Claudio Gelmi

Dear Milan, I have been using your function in Chile, and for the last three days, the predictions are quite good. I added a 95% confidence interval for the "next day" prediction. Since you are already using the SML Toolbox, it may be useful for more users. Here are my lines of code:

[betaNL,RNL,JNL] = nlinfit(samplaTime(1:n),sampleC(1:n),@fun,coef);
[Ypred,delta] = nlpredci(@fun,[samplaTime(end)+1]',betaNL,RNL,'Jacobian',JNL);
T = table(samplaTime(end)+2,round(Ypred),round(delta),'VariableNames',{'Day','Prediction','CI'})

Thanks again for sharing.

Diego Roldan

Very cool!

Ivo L

Great job Milan. For Portugal I suggest to check this source (Portugal's health department):

Sebastian Hölz

'fitnlm' requires Statistics and Machine Learning Toolbox, you should update the requirements.

Joshua McGee

For an updated version with condensed code (all in a single .m file) and automatic data retrieval for COVID-19 and each country:

Joshua McGee

Great job milan!

milan batista

Hi, what do you mean by your last sentence?



First of all, thank you for the MATLAB model. It seems to work perfectly. I updated it with the cases for Portugal and it appears to predict those perfectly as well. How can I update the Portuguese numbers?

Andrea Augello

milan batista

Prof. Rolf Boelens provides the data and scripts for the Netherlands and USA.

Morgan Evans

Excellent model. I use it every day. Thank you for the hard work. Any idea when we can expect a USA model?

milan batista

Thank you. The intended goal of the program is to help people evaluate when an epidemic will be over and to estimate if the measures are effective. For now, I publish daily reports at the web address above.

Thank you for the update! Great work!
Can we expect new graphs every other day?
Here or somewhere else?

Maurice Politis

Mike Rudolph

Morgan Evans

Peter Graat

Nice, but requires Optimization Toolbox


I tried the model with updated Italian data. Good job. Thanks for sharing.

milan batista

Successive regressions use MATLAB function lsqcurvefit which has no statistics output. So another fit is made with MATLAB function fitnlm. The results may differ (for the small data set) - I don't know why - therefore the warning just to remind one to be careful with the interpretation of results.

James Gana

Absolutely incredible job and model, just trying a couple of countries through it right now. In Germany, although the regression model seems to fit, I get the following message: "***Warning: results of lsqcurvefit and fitnlm differs significantly.
Knlm/Klsq = 358.476
rnlm/rlsq = 0.998951
Anlm/Alsq = 357.319"

I cannot understand the root cause, as the initial guess is successful...

milan batista

I have no experience with GitHub, but I will try to do what you suggested.

Christian Schröder

@milan I was thinking of perhaps putting the code on Github/Gitlab so others could send pull requests etc.

Claudio Gelmi

Nice job Milan! Thanks for sharing.

milan batista

They can make their own MATLAB contribution and freely add fitVirus script. What do you suggest?

Christian Schröder

This is an excellent idea, and a great opportunity for students to learn a bit about both MATLAB and statistics.

What's the best way of contributing data files for other countries?

Gianmarco Zonta



Change image


Add RMSE to the graph. Update data.


Change the graph layout. Update data.


Suppress invalid forecasting when the actual number of cases is greater than predicted


Correct description


Add data source


Add data for Denmark, Hungary, Norway, NY State


Improve initial guess. Add data for Belgium, Croatia, UK. Update data for the Netherlands (thanks to Rolf Boelens)


add note about fitVirusCOVID19 program


Correct link to new version


Correct iniGuess (thanks to Nikolas Wernecke). Actual daily cases are added to the graph. R2 for total cases and infection rates are added to the summary. Data for Portugal are included.


Add note about a new version of the program (thanks to Joshua McGee)


Update requirements


Data and scripts for the Netherlands and USA added (thanks to Rolf Boelens)


Update description


Add link to examples


Add data for Austria


Correct data


Update data. Add data for France, Switzerland


Change summary


add data for Spain


update description


add data for Germany


Update data. Add summary report to live scripts.


Remove the upper limit.


Update data. Minor changes. R2 is now included in the table.


Minor changes


Change image


Major revision. Remove Weibull regression, remove graph for peak time, combine graph for epidemy evaluation and its rate. Add epidemy duration and end date to report. Replace regressor A with C0. Add data for Slovenia.


Minor corrections


Add data for Iran and out of China. Weibull regression is now optional.


Add data up to 7.Mar.2020


Correct data for South Korea


Add data for 5.Mar.2020


Correct calculation of relative error of daily predictions


Add data for Italy


Correct description


Add data for 4.Mar.2020


Correct description


Add data for 3 Mar 2020


Update description


Update example


Add image

MATLAB Release Compatibility
Created with R2019b
Compatible with any release
Platform Compatibility
Windows macOS Linux