Basic Lifetime PD Model Validation
This example shows how to perform basic model validation on a lifetime probability of default (PD) model by viewing the fitted model, estimated coefficients, and p-values. For more information on model validation, see modelDiscrimination and modelCalibration.
Load Data
Load the portfolio data.
load RetailCreditPanelData.mat
data = join(data,dataMacro);
disp(head(data))
    ID    ScoreGroup    YOB    Default    Year     GDP     Market
    __    __________    ___    _______    ____    _____    ______
    1      Low Risk      1        0       1997     2.72      7.61
    1      Low Risk      2        0       1998     3.57     26.24
    1      Low Risk      3        0       1999     2.86      18.1
    1      Low Risk      4        0       2000     2.43      3.19
    1      Low Risk      5        0       2001     1.26    -10.51
    1      Low Risk      6        0       2002    -0.59    -22.95
    1      Low Risk      7        0       2003     0.63      2.78
    1      Low Risk      8        0       2004     1.85      9.48
Fit Model and Review Model Goodness of Fit
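Create training and test datasets to perform a basic model validation. The split is made by ID, holding out 40% of the IDs for testing, so that all rows for a given ID fall in the same dataset.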
nIDs = max(data.ID);
uniqueIDs = unique(data.ID);
rng('default'); % for reproducibility
c = cvpartition(nIDs,'HoldOut',0.4);
TrainIDInd = training(c);
TestIDInd = test(c);
TrainDataInd = ismember(data.ID,uniqueIDs(TrainIDInd));
TestDataInd = ismember(data.ID,uniqueIDs(TestIDInd));
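The ID-level split can be checked on a small toy panel. The following is a minimal sketch, not part of the original example: the toy ID vector and the variable names trainRows and testRows are made up for illustration, and the partition is sized with numel(uniqueIDs) rather than max(data.ID), which gives the same count only when the IDs run from 1 to N without gaps.

```matlab
% Toy panel (hypothetical): 5 IDs with 3 yearly rows each.
ID = repelem((1:5)', 3);
uniqueIDs = unique(ID);

rng('default'); % for reproducibility
% Partition the IDs, not the rows, so that an ID's history stays together.
c = cvpartition(numel(uniqueIDs), 'HoldOut', 0.4);

% Map the ID-level partition back to row-level logical indices.
trainRows = ismember(ID, uniqueIDs(training(c)));
testRows  = ismember(ID, uniqueIDs(test(c)));

% Every row belongs to exactly one set, and no ID appears in both sets.
assert(all(xor(trainRows, testRows)));
assert(isempty(intersect(ID(trainRows), ID(testRows))));
```

Partitioning rows directly would let observations from the same ID leak into both sets, which tends to make test-set performance look better than it is.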
Fit the model using fitLifetimePDModel, which supports Logistic, Probit, and Cox models. This example fits a Probit model.
ModelType = "probit";
pdModel = fitLifetimePDModel(data(TrainDataInd,:),ModelType,...
    'AgeVar','YOB',...
    'IDVar','ID',...
    'LoanVars','ScoreGroup',...
    'MacroVars',{'GDP','Market'},...
    'ResponseVar','Default');
disp(pdModel)
Probit with properties:
ModelID: "Probit"
Description: ""
UnderlyingModel: [1×1 classreg.regr.CompactGeneralizedLinearModel]
IDVar: "ID"
AgeVar: "YOB"
LoanVars: "ScoreGroup"
MacroVars: ["GDP" "Market"]
ResponseVar: "Default"
WeightsVar: ""
TimeInterval: 1
Display the underlying model and review the fit statistics, such as the coefficient p-values.
disp(pdModel.UnderlyingModel)
Compact generalized linear regression model:
    probit(Default) ~ 1 + ScoreGroup + YOB + GDP + Market
    Distribution = Binomial

Estimated Coefficients:
                               Estimate        SE          tStat       pValue
                              __________    _________    _______    ___________
    (Intercept)                  -1.6267      0.03811    -42.685              0
    ScoreGroup_Medium Risk      -0.26542      0.01419    -18.704     4.5503e-78
    ScoreGroup_Low Risk         -0.46794     0.016364    -28.595     7.775e-180
    YOB                         -0.11421    0.0049724    -22.969    9.6208e-117
    GDP                        -0.041537     0.014807    -2.8052      0.0050291
    Market                    -0.0029609    0.0010618    -2.7885      0.0052954

388097 observations, 388091 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 1.85e+03, p-value = 0
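To see how the fitted formula turns predictor values into a PD, the coefficients can be applied by hand. This is a rough sketch under stated assumptions: it takes the rounded estimates from the display above, uses the predictor values from the first row of the data preview (Low Risk, YOB 1, GDP 2.72, Market 7.61), and assumes the omitted ScoreGroup level is the reference category with a zero coefficient. The names eta, stdNormCdf, and pdValue are illustrative only; in practice the predict function listed under See Also would normally be used instead.

```matlab
% Rounded coefficient estimates copied from the model display (probit scale).
bIntercept = -1.6267;
bLowRisk   = -0.46794;
bYOB       = -0.11421;
bGDP       = -0.041537;
bMarket    = -0.0029609;

% Predictor values from the first row of the data preview:
% ScoreGroup = Low Risk, YOB = 1, GDP = 2.72, Market = 7.61.
yob = 1; gdp = 2.72; market = 7.61;

% Linear predictor: intercept, Low Risk dummy, and the three numeric terms.
eta = bIntercept + bLowRisk + bYOB*yob + bGDP*gdp + bMarket*market;

% The probit link maps the linear predictor through the standard normal CDF.
% erfc is used here so that no toolbox function is needed.
stdNormCdf = @(z) 0.5*erfc(-z/sqrt(2));
pdValue = stdNormCdf(eta);

fprintf('Linear predictor: %.4f, conditional PD: %.4f\n', eta, pdValue);
```

Because every coefficient is negative, a better score group, a longer time on books, and stronger macroeconomic readings each push the linear predictor down and so lower the PD.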
pdModel.UnderlyingModel.Coefficients
ans=6×4 table
                               Estimate        SE          tStat       pValue
                              __________    _________    _______    ___________
    (Intercept)                  -1.6267      0.03811    -42.685              0
    ScoreGroup_Medium Risk      -0.26542      0.01419    -18.704     4.5503e-78
    ScoreGroup_Low Risk         -0.46794     0.016364    -28.595     7.775e-180
    YOB                         -0.11421    0.0049724    -22.969    9.6208e-117
    GDP                        -0.041537     0.014807    -2.8052      0.0050291
    Market                    -0.0029609    0.0010618    -2.7885      0.0052954
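Having the coefficients in a table makes it straightforward to screen them programmatically. The sketch below rebuilds the table from the rounded values shown above so that it runs on its own; with the fitted model in the workspace, pdModel.UnderlyingModel.Coefficients could be used directly. The 5% significance level alpha and the added columns tStatCheck and Significant are assumptions made for illustration, not part of the original example.

```matlab
% Rebuild the coefficient table from the displayed (rounded) values.
names = {'(Intercept)','ScoreGroup_Medium Risk','ScoreGroup_Low Risk', ...
         'YOB','GDP','Market'};
Estimate = [-1.6267; -0.26542; -0.46794; -0.11421; -0.041537; -0.0029609];
SE       = [0.03811; 0.01419; 0.016364; 0.0049724; 0.014807; 0.0010618];
pValue   = [0; 4.5503e-78; 7.775e-180; 9.6208e-117; 0.0050291; 0.0052954];
coeffs = table(Estimate, SE, pValue, 'RowNames', names);

% The reported tStat is the estimate divided by its standard error.
coeffs.tStatCheck = coeffs.Estimate ./ coeffs.SE;

% Flag coefficients that are significant at an assumed 5% level.
alpha = 0.05;
coeffs.Significant = coeffs.pValue < alpha;
disp(coeffs)

% Identify the predictor with the weakest evidence (largest p-value).
[~, idx] = max(coeffs.pValue);
weakest = coeffs.Properties.RowNames{idx};
fprintf('Largest p-value: %s (p = %.4g)\n', weakest, coeffs.pValue(idx));
```

For this fit all six coefficients fall below the 5% level; the two macroeconomic variables, GDP and Market, have the largest p-values.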
See Also
fitLifetimePDModel | predict | predictLifetime | modelDiscrimination | modelCalibration | modelCalibrationPlot | Logistic | Probit | Cox
