Tobit
Description
Create and analyze a Tobit model object to calculate the exposure at default (EAD) using this workflow:
- Use fitEADModel to create a Tobit model object.
- Use predict to predict the EAD.
- Use modelDiscrimination to return AUROC and ROC data. You can plot the results using modelDiscriminationPlot.
- Use modelCalibration to return the R-squared, RMSE, correlation, and sample mean error of predicted and observed EAD data. You can plot the results using modelCalibrationPlot.
Creation
Description
TobitEADModel = fitEADModel(___,Name=Value) creates a Tobit EAD model object with options specified by one or more name-value arguments. For example, eadModel = fitEADModel(EADData,ModelType,PredictorVars={'UtilizationRate','Age','Marriage'},ConversionMeasure="ccf",DrawnVar='Drawn',LimitVar='Limit',ResponseVar='EAD') creates an eadModel object using a Tobit model type.
Input Arguments
Data for exposure at default, specified as a table.
Data Types: table
Model type, specified as a string with the value of "Tobit" or a character vector with the value of 'Tobit'.
Data Types: char | string
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: eadModel = fitEADModel(EADData,ModelType,PredictorVars={'UtilizationRate','Age','Marriage'},ConversionMeasure="ccf",DrawnVar='Drawn',LimitVar='Limit',ResponseVar='EAD')
User-defined model ID, specified as ModelID and a string or character vector. The software uses the ModelID text to format outputs, so the text is expected to be short.
Data Types: string | char
User-defined description for model, specified as Description and a string or character vector.
Data Types: string | char
Predictor variables, specified as PredictorVars and a string array or cell array of character vectors. PredictorVars indicates which columns in the data input contain the predictor information. By default, PredictorVars is set to all the columns in the data input except for ResponseVar.
Data Types: string | cell
Response variable, specified as ResponseVar and a string or character vector. The response variable contains the EAD data and must be a numeric variable. By default, ResponseVar is set to the last column.
Data Types: string | char
Limit variable, specified as LimitVar and a string or character vector. LimitVar indicates which column in data contains the limit amount. The limit amount value in the data must be a positive numeric value. The limit depends on the type of loan. For a credit card, the limit is the credit limit; for a mortgage, the limit is the initial loan amount. In general, LimitVar is the maximum amount that can be borrowed.
Note
LimitVar is required when ConversionMeasure is 'ccf' or 'lcf'. For more information on CCF and LCF, see Conversion Measure Options.
Data Types: string | char
Drawn variable, specified as DrawnVar and a string or character vector. DrawnVar indicates which column in data contains the drawn amount. The drawn amount is the balance on the account at the time of observation, prior to default, whereas the EAD is the balance at the time of default. The drawn variable value in the data can be a positive or negative numeric value.
Note
DrawnVar is required when ConversionMeasure is 'ccf'.
If the ConversionMeasure is 'lcf', DrawnVar is not required. In this case, DrawnVar is set to "".
For more information on CCF, see Conversion Measure Options.
Data Types: string | char
Response transform, specified as ConversionMeasure and a character vector or string.
- "ccf": Credit conversion factor (CCF) is the portion of the undrawn amount that will be converted into credit. The undrawn amount is the limit minus the drawn amount. The EAD is therefore the drawn amount plus the CCF times the undrawn amount (EAD = Drawn + CCF*(Limit - Drawn)).
  Note: A Tobit model with "ccf" can be unstable.
- "lcf": Limit conversion factor (LCF) is the fraction of the limit that represents the total exposure. The EAD is then defined as the LCF times the limit (EAD = LCF*Limit).
For more information on CCF and LCF, see Conversion Measure Options.
Data Types: string | char
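As a quick numerical illustration of the two conversion measures, the following sketch applies the CCF and LCF formulas to made-up values. The amounts and variable names are assumptions chosen for illustration and are not taken from EADData.

```matlab
% Illustrative values only (not taken from EADData).
Limit = 10000;   % maximum amount that can be borrowed
Drawn = 4000;    % balance at the time of observation

% CCF: EAD = Drawn + CCF*(Limit - Drawn)
CCF = 0.25;
EADfromCCF = Drawn + CCF*(Limit - Drawn);

% LCF: EAD = LCF*Limit
LCF = 0.55;
EADfromLCF = LCF*Limit;

% Recover the conversion measures from an observed EAD by inverting
% the same two formulas.
EAD = 5500;
ccfObserved = (EAD - Drawn)/(Limit - Drawn);
lcfObserved = EAD/Limit;
```

With these values, both measures describe the same exposure: a CCF of 0.25 on the undrawn 6000 and an LCF of 0.55 on the full limit each correspond to an EAD of 5500.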
Censoring side, specified as CensoringSide and a character vector or string. CensoringSide indicates whether the desired Tobit model is left-censored, right-censored, or censored on both sides.
Data Types: string | char
Left-censoring limit, specified as LeftLimit and a scalar numeric between 0 and 1.
Data Types: double
Right-censoring limit, specified as RightLimit and a scalar numeric between 0 and 1.
Data Types: double
Options for fitting, specified as SolverOptions and an optimoptions object that is created using optimoptions from Optimization Toolbox™. The defaults for the optimoptions object are:
- "Display": "none"
- "Algorithm": "sqp"
- "MaxFunctionEvaluations": 500 × Number of model coefficients
- "MaxIterations"
The number of Tobit model coefficients is determined at run time; it depends on the number of predictors and the number of categories in the categorical predictors.
Note
When using optimoptions with a Tobit model, specify the SolverName as fmincon.
Data Types: object
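A minimal sketch of passing custom solver options, assuming EADData is already loaded (load EADData.mat) and Optimization Toolbox is installed. The option values below are arbitrary choices for illustration, not recommendations.

```matlab
% Create fmincon options; the specific values are illustrative assumptions.
options = optimoptions("fmincon", ...
    Display="iter", ...
    Algorithm="sqp", ...
    MaxFunctionEvaluations=2000, ...
    MaxIterations=500);

% Fit a Tobit EAD model with the LCF conversion measure, which does not
% require DrawnVar, and pass the options through SolverOptions.
eadModel = fitEADModel(EADData,"Tobit", ...
    PredictorVars=["UtilizationRate","Age","Marriage"], ...
    ConversionMeasure="lcf",LimitVar="Limit",ResponseVar="EAD", ...
    SolverOptions=options);
```

Setting Display to "iter" prints the solver progress, which can help when diagnosing a fit that does not converge.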
Properties
User-defined model ID, returned as a string.
Data Types: string
User-defined description, returned as a string.
Data Types: string
This property is read-only.
Underlying statistical model, returned as a Tobit model object. The example on this page displays this property as an instance of the risk.internal.credit.TobitModel class.
Data Types: object
Predictor variables, returned as a string array.
Data Types: string
Response variable, returned as a string.
Data Types: string
Limit variable, returned as a string.
Data Types: string
Drawn variable, returned as a string.
Data Types: string
Response transform, returned as a string.
Data Types: string
This property is read-only.
Censoring side, returned as a string.
Data Types: string
This property is read-only.
Left-censoring limit, returned as a scalar numeric between 0 and 1.
Data Types: double
This property is read-only.
Right-censoring limit, returned as a scalar numeric between 0 and 1.
Data Types: double
Object Functions
| predict | Predict exposure at default | 
| modelDiscrimination | Compute AUROC and ROC data | 
| modelDiscriminationPlot | Plot ROC curve | 
| modelCalibration | Compute R-squared, RMSE, correlation, and sample mean error of predicted and observed EADs | 
| modelCalibrationPlot | Scatter plot of predicted and observed EADs | 
Examples
This example shows how to use fitEADModel to create a Tobit model for exposure at default (EAD). 
Load EAD Data
Load the EAD data.
load EADData.mat
head(EADData)
    UtilizationRate    Age     Marriage        Limit         Drawn          EAD    
    _______________    ___    ___________    __________    __________    __________
        0.24359        25     not married         44776         10907         44740
        0.96946        44     not married    2.1405e+05    2.0751e+05         40678
              0        40     married        1.6581e+05             0    1.6567e+05
        0.53242        38     not married    1.7375e+05         92506        1593.5
         0.2583        30     not married         26258        6782.5        54.175
        0.17039        54     married        1.7357e+05         29575        576.69
        0.18586        27     not married         19590          3641        998.49
        0.85372        42     not married    2.0712e+05    1.7682e+05    1.6454e+05
% Partition the data into training (60%) and test (40%) sets.
rng('default');
NumObs = height(EADData);
c = cvpartition(NumObs,'HoldOut',0.4);
TrainingInd = training(c);
TestInd = test(c);
Select Model Type
Select a model type, either Tobit or Regression.
ModelType = "Tobit";
Select Conversion Measure
Select a conversion measure for the EAD response values.
ConversionMeasure = "LCF";
Create Tobit EAD Model
Use fitEADModel to create a Tobit model using the EADData.
eadModel = fitEADModel(EADData,ModelType,PredictorVars={'UtilizationRate','Age','Marriage'}, ...
    ConversionMeasure=ConversionMeasure,DrawnVar="Drawn",LimitVar="Limit",ResponseVar="EAD");
disp(eadModel);
  Tobit with properties:
        CensoringSide: "both"
            LeftLimit: 0
           RightLimit: 1
              ModelID: "Tobit"
          Description: ""
      UnderlyingModel: [1×1 risk.internal.credit.TobitModel]
        PredictorVars: ["UtilizationRate"    "Age"    "Marriage"]
          ResponseVar: "EAD"
             LimitVar: "Limit"
             DrawnVar: "Drawn"
    ConversionMeasure: "lcf"
Display the underlying model. The underlying model's response variable is the transformation of the EAD response data. Use the 'LimitVar' and 'DrawnVar' name-value arguments to modify the transformation.
disp(eadModel.UnderlyingModel);
Tobit regression model:
     EAD_lcf = max(0,min(Y*,1))
     Y* ~ 1 + UtilizationRate + Age + Marriage
Estimated coefficients:
                             Estimate         SE         tStat      pValue 
                            __________    __________    ________    _______
    (Intercept)                0.22735      0.026213       8.673          0
    UtilizationRate            0.47364      0.016436      28.818          0
    Age                     -0.0013929    0.00063758     -2.1847    0.02896
    Marriage_not married    -0.0068879      0.012276    -0.56108    0.57477
    (Sigma)                    0.36419     0.0038855       93.73          0
Number of observations: 4378
Number of left-censored observations: 0
Number of uncensored observations: 4377
Number of right-censored observations: 1
Log-likelihood: -1791.06
Predict EAD
EAD prediction operates on the underlying compact statistical model and then transforms the predicted values back to the EAD scale. You can call the predict function with different options for the 'ModelLevel' name-value argument.
predictedEAD = predict(eadModel,EADData(TestInd,:),ModelLevel="ead");
predictedConversion = predict(eadModel,EADData(TestInd,:),ModelLevel="ConversionMeasure");
Validate EAD Model
For model validation, use modelDiscrimination, modelDiscriminationPlot, modelCalibration, and modelCalibrationPlot. 
Use modelDiscrimination and then modelDiscriminationPlot to plot the ROC curve.
ModelLevel = "ConversionMeasure";
[DiscMeasure1,DiscData1] = modelDiscrimination(eadModel,EADData(TestInd,:),ModelLevel=ModelLevel);
modelDiscriminationPlot(eadModel,EADData(TestInd,:),ModelLevel=ModelLevel,SegmentBy="Marriage");

Use modelCalibration and then modelCalibrationPlot to show a scatter plot of the predictions.
YData = "Observed";
[CalMeasure1,CalData1] = modelCalibration(eadModel,EADData(TestInd,:),ModelLevel=ModelLevel);
modelCalibrationPlot(eadModel,EADData(TestInd,:),ModelLevel=ModelLevel,YData=YData);
Plot histograms of the observed and predicted values returned by modelCalibration.
figure;
histogram(CalData1.Observed);
hold on;
histogram(CalData1.(('Predicted_' + ModelType)));
legend('Observed','Predicted');

More About
The exposure at default (EAD) Tobit models fit a Tobit model to EAD data.
Tobit models are "censored" regression models. Tobit models assume that the response variable can be observed only within certain limits, and no value outside the limits can be observed. Using ModelLevel, you can set the Tobit model level to EAD or to the CCF or LCF conversion measure. The EAD model level has no range restriction, the CCF conversion measure has a range of -Inf to 1, and the LCF conversion measure has a range of 0 to 1. A distribution of response values with a high frequency of observations at the limits is consistent with the model assumptions.
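The Tobit model combines the following two formulas:
$$Y = \min(\max(L, Y^*), R)$$
$$Y^* = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p + \sigma\epsilon$$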
where
- Y is the observed response variable, the observed EAD data for an EAD model.
- L is the left limit, the lower bound for the response values, typically 0 for EAD models.
- R is the right limit, the upper bound for the response values, typically 1 for EAD models.
- Y* is a latent, unobserved variable.
- βj is the coefficient of the jth predictor (or the intercept for j = 0).
- σ is the standard deviation of the error term.
- ϵ is the error term, assumed to follow a standard normal distribution.
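The first formula above is written using min and max operators and is equivalent to
$$Y = \begin{cases} L & \text{if } Y^* \le L \\ Y^* & \text{if } L < Y^* < R \\ R & \text{if } Y^* \ge R \end{cases}$$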
The standard deviation of the error is explicitly indicated in the formulas. Unlike traditional regression least-squares estimation, where the standard deviation of the error can be inferred from the residuals, for Tobit models the estimation is via maximum likelihood and the standard deviation needs to be handled explicitly during the estimation. If there are p predictor variables, the Tobit model estimates p+2 coefficients, namely, one coefficient for each predictor, plus an intercept, plus a standard deviation.
Three censoring side options are supported in the Tobit EAD models with the CensoringSide name-value argument:
- 'both': This option is the default, with censoring on both sides. The estimation uses the left and right limits.
- 'left': The left-censored version of the model has no right limit (or R = ∞). The relationship between Y and Y* is Y = max{L,Y*}.
- 'right': The right-censored version of the model has no left limit (or L = -∞). The relationship between Y and Y* is Y = min{Y*,R}.
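The effect of each censoring option on the observed response can be illustrated in base MATLAB by applying max and min to a vector of latent values. The values of Ystar below are hypothetical and chosen only so that some of them fall outside the limits.

```matlab
% Hypothetical latent values Y*; the first and last fall outside [L, R].
Ystar = [-0.3 0.1 0.5 0.9 1.4];
L = 0;   % left limit
R = 1;   % right limit

% 'both': values below L are set to L and values above R are set to R.
Yboth = min(max(L, Ystar), R);

% 'left': only the lower limit applies (R = Inf).
Yleft = max(L, Ystar);

% 'right': only the upper limit applies (L = -Inf).
Yright = min(Ystar, R);

disp([Ystar; Yboth; Yleft; Yright])
```

Each row of the displayed matrix shows the same latent values under a different censoring rule, so the columns that change identify the censored observations.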
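The parameters of the Tobit model are estimated using maximum likelihood. For observation i = 1,...,n, with linear predictor $X_i\beta = \beta_0 + \beta_1 X_{i1} + \cdots + \beta_p X_{ip}$, the likelihood function is
$$L_i(\beta,\sigma) = \begin{cases} \Phi(L;\, X_i\beta,\, \sigma) & \text{if } Y_i \le L \\ \varphi(Y_i;\, X_i\beta,\, \sigma) & \text{if } L < Y_i < R \\ 1 - \Phi(R;\, X_i\beta,\, \sigma) & \text{if } Y_i \ge R \end{cases}$$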
where
- Φ(x;m,s) is the cumulative normal distribution with mean m and standard deviation s. 
- φ(x;m,s) is the normal density function with mean m and standard deviation s. 
This likelihood function is for models censored on both sides. For left-censored models, the right limit has no effect, and the likelihood function has two cases only (R = ∞); likewise for right-censored models (L = -∞).
The log-likelihood function is the sum of the logarithm of the likelihood functions for individual observations
$$\log L(\beta,\sigma) = \sum_{i=1}^{n} \log L_i(\beta,\sigma)$$
The parameters are estimated by maximizing the log-likelihood function. The only constraint is that the σ parameter must be positive.
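The following is a minimal sketch of a log-likelihood for a model censored on both sides, written with base MATLAB functions only. It assumes the usual three-case Tobit likelihood: the normal CDF at L for observations at or below the left limit, the normal density for observations strictly between the limits, and one minus the normal CDF at R for observations at or above the right limit. The data, the helper names (normCdf, normPdf, lik, logL), and the trial parameter values are assumptions for illustration; this is not the toolbox's internal implementation.

```matlab
% Normal CDF and PDF with mean m and standard deviation s.
normCdf = @(x,m,s) 0.5*erfc(-(x - m)./(s*sqrt(2)));
normPdf = @(x,m,s) exp(-0.5*((x - m)./s).^2)./(s*sqrt(2*pi));

% Hypothetical data: intercept column plus one predictor, response in [0,1].
X = [ones(6,1), [0.1; 0.2; 0.4; 0.6; 0.8; 0.9]];
Y = [0; 0.15; 0.35; 0.7; 1; 1];
L = 0;
R = 1;

% Per-observation likelihood: one term per case, selected by logical masks.
lik = @(beta,sigma) ...
    (Y <= L).*normCdf(L, X*beta, sigma) + ...
    (Y > L & Y < R).*normPdf(Y, X*beta, sigma) + ...
    (Y >= R).*(1 - normCdf(R, X*beta, sigma));

% Log-likelihood: sum of the logs of the individual likelihoods.
logL = @(beta,sigma) sum(log(lik(beta,sigma)));

% Evaluate at a trial parameter vector (sigma must be positive).
beta0 = [0; 1];
sigma0 = 0.2;
disp(logL(beta0, sigma0))
```

A solver such as fmincon could then minimize the negative of logL subject to sigma > 0; for extreme parameter values the likelihood terms can underflow, so a production implementation would typically work with log-probabilities directly.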
To predict an EAD value, Tobit EAD models return the unconditional expected value of the response, given the predictor values
$$E[Y \mid X] = E[\min(\max(L, Y^*), R) \mid X]$$
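The expression for the expected value can be separated into the cases
$$E[Y \mid X] = P(Y = L \mid X)\,L + P(L < Y < R \mid X)\,E[Y \mid X,\, L < Y < R] + P(Y = R \mid X)\,R$$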
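Using the previous expression and the properties of the (truncated) normal distribution, it follows that
$$E[Y \mid X] = \Phi(a)\,L + \big(\Phi(b) - \Phi(a)\big)\left(X\beta + \sigma\,\frac{\varphi(a) - \varphi(b)}{\Phi(b) - \Phi(a)}\right) + \big(1 - \Phi(b)\big)\,R$$
where
$$a = \frac{L - X\beta}{\sigma}, \qquad b = \frac{R - X\beta}{\sigma}$$
and Φ and φ here denote the cumulative distribution function and the density function of the standard normal distribution.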
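This expression applies to the models censored on both sides. For models censored on one side only, the corresponding expressions can be derived from here. For example, for left-censored models, let the R limit in the expression above go to infinity, and the resulting expression is
$$E[Y \mid X] = \Phi(a)\,L + \big(1 - \Phi(a)\big)\left(X\beta + \sigma\,\frac{\varphi(a)}{1 - \Phi(a)}\right), \qquad a = \frac{L - X\beta}{\sigma}$$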
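Similarly, for right-censored models, the L limit is decreased to minus infinity to get
$$E[Y \mid X] = \Phi(b)\left(X\beta - \sigma\,\frac{\varphi(b)}{\Phi(b)}\right) + \big(1 - \Phi(b)\big)\,R, \qquad b = \frac{R - X\beta}{\sigma}$$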
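The expected values for the three censoring cases can be evaluated in closed form from the standard normal CDF and PDF. The sketch below assumes standardized limits a = (L - Xβ)/σ and b = (R - Xβ)/σ and uses base MATLAB only. The linear predictor values xb and the value of sigma are hypothetical, and the products are expanded so that no division by Φ(b) - Φ(a) is needed.

```matlab
% Standard normal CDF and PDF.
Phi = @(z) 0.5*erfc(-z/sqrt(2));
phi = @(z) exp(-0.5*z.^2)/sqrt(2*pi);

% Hypothetical linear predictor values X*beta and error standard deviation.
xb = [-0.2; 0.5; 1.3];
sigma = 0.35;
L = 0;
R = 1;

% Standardized limits.
a = (L - xb)/sigma;
b = (R - xb)/sigma;

% Censored on both sides:
% Phi(a)*L + (Phi(b)-Phi(a))*xb + sigma*(phi(a)-phi(b)) + (1-Phi(b))*R
EYboth = Phi(a).*L + (Phi(b) - Phi(a)).*xb + sigma*(phi(a) - phi(b)) ...
    + (1 - Phi(b)).*R;

% Left-censored only (R -> Inf).
EYleft = Phi(a).*L + (1 - Phi(a)).*xb + sigma*phi(a);

% Right-censored only (L -> -Inf).
EYright = Phi(b).*xb - sigma*phi(b) + (1 - Phi(b)).*R;

disp(table(xb, EYboth, EYleft, EYright))
```

When the linear predictor sits at the midpoint of the limits (xb = 0.5 here), the two-sided expected value equals 0.5 by symmetry, which is a convenient sanity check.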
You can relate the EAD to a scaling variable and derive conversion measures like credit conversion factor (CCF) and limit conversion factor (LCF) using the 'ccf' or 'lcf' options for the ConversionMeasure name-value argument.
The following table summarizes the supported transformations for the 'ccf' and 'lcf' options:
| Measure | EAD Formula | Lower Bound | Upper Bound | Inverse Transformation |
|---|---|---|---|---|
| CCF | EAD = Drawn + CCF × (Limit - Drawn) | -Inf | 1 | CCF = 1 - e^(-CCFt) |
| LCF | EAD = LCF × Limit | 0 | 1 | LCF = e^(LCFt) / (1 + e^(LCFt)) |
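The inverse transformations in the table can be checked numerically. In the sketch below, the forward transformations (ccfFwd, lcfFwd) are not stated in the table; they are obtained here by solving each inverse for the transformed value, and all function names and sample values are assumptions for illustration.

```matlab
% Inverse transformations from the table (transformed scale -> measure).
ccfInv = @(ccft) 1 - exp(-ccft);
lcfInv = @(lcft) exp(lcft)./(1 + exp(lcft));

% Forward transformations, derived by solving the inverses above.
ccfFwd = @(ccf) -log(1 - ccf);         % defined for ccf < 1
lcfFwd = @(lcf) log(lcf./(1 - lcf));   % defined for 0 < lcf < 1

% Sample conversion measures inside the bounds listed in the table.
ccf = [-2 0 0.5 0.9];
lcf = [0.1 0.5 0.9];

% Round trip: transform, then invert; the originals should be recovered.
roundTripCcf = ccfInv(ccfFwd(ccf));
roundTripLcf = lcfInv(lcfFwd(lcf));

disp(max(abs(roundTripCcf - ccf)))
disp(max(abs(roundTripLcf - lcf)))
```

The inverse for CCF maps any real value into (-Inf, 1) and the inverse for LCF maps any real value into (0, 1), which matches the bounds in the table.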
Version History
Introduced in R2021b
The modelAccuracy object function is renamed to modelCalibration. The use of modelAccuracy is discouraged; use modelCalibration instead.
The modelAccuracyPlot object function is renamed to modelCalibrationPlot. The use of modelAccuracyPlot is discouraged; use modelCalibrationPlot instead.