Estimation of Transition Probabilities
Introduction
Credit ratings rank borrowers according to their creditworthiness. Although this ranking is useful in itself, institutions also want to know how likely it is that borrowers in a particular rating category will be upgraded or downgraded to a different rating and, especially, how likely it is that they will default.
Transition probabilities offer one way to characterize the past changes in credit quality of obligors (typically firms), and are key inputs to many risk management applications. Financial Toolbox™ software supports the estimation of transition probabilities with both cohort and duration (also known as hazard rate or intensity) approaches through transprob and related functions.
Note
The sample dataset used throughout this section is simulated using a single transition matrix. No attempt is made to match historical trends in transition rates.
Estimate Transition Probabilities
The Data_TransProb.mat file contains sample credit ratings data.
load Data_TransProb
data(1:10,:)
ans =
        ID             Date         Rating
    __________    _____________    ______
    '00010283'    '10-Nov-1984'    'CCC'
    '00010283'    '12-May-1986'    'B'
    '00010283'    '29-Jun-1988'    'CCC'
    '00010283'    '12-Dec-1991'    'D'
    '00013326'    '09-Feb-1985'    'A'
    '00013326'    '24-Feb-1994'    'AA'
    '00013326'    '10-Nov-2000'    'BBB'
    '00014413'    '23-Dec-1982'    'B'
    '00014413'    '20-Apr-1988'    'BB'
    '00014413'    '16-Jan-1998'    'B'
The sample data is formatted as a cell array with three columns. Each row contains an ID (column 1), a date (column 2), and a credit rating (column 3). The assigned credit rating corresponds to the associated ID on the associated date. All information corresponding to the same ID must be stored in contiguous rows. In this example, IDs, dates, and ratings are stored in character vector format, but you also can enter them in numeric format.
In this example, the simplest calling syntax for transprob passes the nRecords-by-3 cell array as the only input argument. The default startDate and endDate are the earliest and latest dates in the data. The default estimation algorithm is the duration method, and one-year transition probabilities are estimated:
transMat0 = transprob(data)
transMat0 =
   93.1170    5.8428    0.8232    0.1763    0.0376    0.0012    0.0001    0.0017
    1.6166   93.1518    4.3632    0.6602    0.1626    0.0055    0.0004    0.0396
    0.1237    2.9003   92.2197    4.0756    0.5365    0.0661    0.0028    0.0753
    0.0236    0.2312    5.0059   90.1846    3.7979    0.4733    0.0642    0.2193
    0.0216    0.1134    0.6357    5.7960   88.9866    3.4497    0.2919    0.7050
    0.0010    0.0062    0.1081    0.8697    7.3366   86.7215    2.5169    2.4399
    0.0002    0.0011    0.0120    0.2582    1.4294    4.2898   81.2927   12.7167
         0         0         0         0         0         0         0  100.0000
Provide explicit start and end dates; otherwise, the estimation window for two different datasets can differ, and the estimates might not be comparable. From this point, assume that the time window of interest is the five-year period from the end of 1995 to the end of 2000. For comparison, compute the estimates for this time window, first with the duration algorithm (the default option) and then with the cohort algorithm set explicitly.
startDate = '31-Dec-1995';
endDate = '31-Dec-2000';
transMat1 = transprob(data,'startDate',startDate,'endDate',endDate)
transMat2 = transprob(data,'startDate',startDate,'endDate',endDate,...
    'algorithm','cohort')
transMat1 =
   90.6236    7.9051    1.0314    0.4123    0.0210    0.0020    0.0003    0.0043
    4.4780   89.5558    4.5298    1.1225    0.2284    0.0094    0.0009    0.0754
    0.3983    6.1164   87.0641    5.4801    0.7637    0.0892    0.0050    0.0832
    0.1029    0.8572   10.7918   83.0204    3.9971    0.7001    0.1313    0.3992
    0.1043    0.3745    2.2962   14.0954   78.9840    3.0013    0.0463    1.0980
    0.0113    0.0544    0.7055    3.2925   15.4350   75.5988    1.8166    3.0860
    0.0044    0.0189    0.1903    1.9743    6.2320   10.2334   75.9983    5.3484
         0         0         0         0         0         0         0  100.0000
transMat2 =
   90.1554    8.5492    0.9067    0.3886         0         0         0         0
    4.9512   88.5221    5.1763    1.0503    0.2251         0         0    0.0750
    0.2770    6.6482   86.2188    6.0942    0.6233    0.0693         0    0.0693
    0.0794    0.8737   11.6759   81.6521    4.3685    0.7943    0.1589    0.3971
    0.1002    0.4008    1.9038   15.4309   77.8557    3.4068         0    0.9018
         0         0    0.2262    2.4887   17.4208   74.2081    2.2624    3.3937
         0         0    0.7576    1.5152    6.0606   10.6061   75.0000    6.0606
         0         0         0         0         0         0         0  100.0000
By default, the cohort algorithm internally takes yearly snapshots of the credit ratings, but you can set the number of snapshots per year with the snapsPerYear parameter/value pair. To get the estimates using quarterly snapshots:
transMat3 = transprob(data,'startDate',startDate,'endDate',endDate,...
    'algorithm','cohort','snapsPerYear',4)
transMat3 =
   90.4765    8.0881    1.0072    0.4069    0.0164    0.0015    0.0002    0.0032
    4.5949   89.3216    4.6489    1.1239    0.2276    0.0074    0.0007    0.0751
    0.3747    6.3158   86.7380    5.6344    0.7675    0.0856    0.0040    0.0800
    0.0958    0.7967   11.0441   82.6138    4.1906    0.7230    0.1372    0.3987
    0.1028    0.3571    2.3312   14.4954   78.4276    3.1489    0.0383    1.0987
    0.0084    0.0399    0.6465    3.0962   16.0789   75.1300    1.9044    3.0956
    0.0031    0.0125    0.1445    1.8759    6.2613   10.7022   75.6300    5.3705
         0         0         0         0         0         0         0  100.0000
Both the duration and cohort algorithms compute one-year transition probabilities by default, but you can set the time interval for the transitions with the transInterval parameter/value pair. For example, to get the two-year transition probabilities using the cohort algorithm with the same snapshot periodicity and estimation window:
transMat4 = transprob(data,'startDate',startDate,'endDate',endDate,...
    'algorithm','cohort','snapsPerYear',4,'transInterval',2)
transMat4 =
   82.2358   14.6092    2.2062    0.8543    0.0711    0.0074    0.0011    0.0149
    8.2803   80.4584    8.3606    2.2462    0.4665    0.0316    0.0030    0.1533
    0.9604   11.1975   76.1729    9.7284    1.5322    0.2044    0.0162    0.1879
    0.2483    2.0903   18.8440   69.5145    6.9601    1.2966    0.2329    0.8133
    0.2129    0.8713    5.4893   23.5776   62.6438    4.9464    0.1390    2.1198
    0.0378    0.1895    1.7679    7.2875   24.9444   57.1783    2.8816    5.7132
    0.0154    0.0716    0.6576    4.2157   11.4465   16.3455   57.4078    9.8399
         0         0         0         0         0         0         0  100.0000
Estimate Transition Probabilities for Different Rating Scales
The dataset data from Data_TransProb.mat contains sample credit ratings using the default rating scale {'AAA','AA','A','BBB','BB','B','CCC','D'}. The file also contains the dataset dataIGSG, with ratings investment grade ('IG'), speculative grade ('SG'), and default ('D'). To estimate the transition matrix for this dataset, use the labels argument.
load Data_TransProb
startDate = '31-Dec-1995';
endDate = '31-Dec-2000';
dataIGSG(1:10,:)
transMatIGSG = transprob(dataIGSG,'labels',{'IG','SG','D'},...
    'startDate',startDate,'endDate',endDate)
ans =
    '00011253'    '04-Apr-1983'    'IG'
    '00012751'    '17-Feb-1985'    'SG'
    '00012751'    '19-May-1986'    'D'
    '00014690'    '17-Jan-1983'    'IG'
    '00012144'    '21-Nov-1984'    'IG'
    '00012144'    '25-Mar-1992'    'SG'
    '00012144'    '07-May-1994'    'IG'
    '00012144'    '23-Jan-2000'    'SG'
    '00012144'    '20-Aug-2001'    'IG'
    '00012937'    '07-Feb-1984'    'IG'
transMatIGSG =
   98.1986    1.5179    0.2835
    8.5396   89.4891    1.9713
         0         0  100.0000
There is another dataset, dataIGSGnum, with the same information as dataIGSG, except that the ratings are mapped to a numeric scale where 'IG'=1, 'SG'=2, and 'D'=3. To estimate the transition matrix, use the optional labels argument, specifying the numeric scale as a cell array.
dataIGSGnum(1:10,:)
% Note {1,2,3} and num2cell(1:3) are equivalent; num2cell is convenient
% when the number of ratings is larger
transMatIGSGnum = transprob(dataIGSGnum,'labels',{1,2,3},...
    'startDate',startDate,'endDate',endDate)
ans =
    '00011253'    '04-Apr-1983'    [1]
    '00012751'    '17-Feb-1985'    [2]
    '00012751'    '19-May-1986'    [3]
    '00014690'    '17-Jan-1983'    [1]
    '00012144'    '21-Nov-1984'    [1]
    '00012144'    '25-Mar-1992'    [2]
    '00012144'    '07-May-1994'    [1]
    '00012144'    '23-Jan-2000'    [2]
    '00012144'    '20-Aug-2001'    [1]
    '00012937'    '07-Feb-1984'    [1]
transMatIGSGnum =
   98.1986    1.5179    0.2835
    8.5396   89.4891    1.9713
         0         0  100.0000
Any time the input dataset contains ratings not included in the default rating scale {'AAA','AA','A','BBB','BB','B','CCC','D'}, the full rating scale must be specified using the optional labels argument. For example, if the dataset contains ratings 'AAA', ..., 'CCC', 'D', and 'NR' (not rated), use labels with this cell array: {'AAA','AA','A','BBB','BB','B','CCC','D','NR'}.
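As a rough illustration of the rule above, the following sketch checks which ratings in a dataset fall outside the default scale and builds the extended scale to pass as labels. The ratings column is a made-up example rather than the sample data, dataWithNR is a placeholder name, and the call to transprob appears only as a comment.

```matlab
% Default rating scale, as listed in the text above
defaultLabels = {'AAA','AA','A','BBB','BB','B','CCC','D'};

% Hypothetical ratings column (illustration only, not from Data_TransProb)
ratingsInData = {'AA';'NR';'CCC';'D';'AA'};

% Ratings that the default scale does not cover
extraRatings = setdiff(ratingsInData, defaultLabels);

% Full scale: default ratings followed by the extra ones
fullLabels = [defaultLabels, reshape(extraRatings,1,[])];

% The full scale would then be passed to transprob, for example:
%   transMat = transprob(dataWithNR,'labels',fullLabels);
% (dataWithNR is a placeholder for a dataset that contains 'NR')

% Numeric scales are specified the same way; these two forms are equal
sameScale = isequal({1,2,3}, num2cell(1:3));
```

Appending the extra ratings at the end is only one possible ordering; the order of labels defines the rating scale, so place additional ratings wherever they belong in your own scale.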
Working with a Transition Matrix Containing NR Rating
This example demonstrates how transprob handles 'NR' (not rated) ratings, and how to get transition matrices that use the 'NR' rating information for the estimation but do not show the 'NR' rating in the final transition probabilities.
The dataset data from Data_TransProb.mat contains sample credit ratings using the default rating scale {'AAA','AA','A','BBB','BB','B','CCC','D'}.
load Data_TransProb
head(data,12)
ans = 12×3 table
        ID             Date         Rating
    __________    _____________    ______
    '00010283'    '10-Nov-1984'    'CCC'
    '00010283'    '12-May-1986'    'B'
    '00010283'    '29-Jun-1988'    'CCC'
    '00010283'    '12-Dec-1991'    'D'
    '00013326'    '09-Feb-1985'    'A'
    '00013326'    '24-Feb-1994'    'AA'
    '00013326'    '10-Nov-2000'    'BBB'
    '00014413'    '23-Dec-1982'    'B'
    '00014413'    '20-Apr-1988'    'BB'
    '00014413'    '16-Jan-1998'    'B'
    '00014413'    '25-Nov-1999'    'BB'
    '00012126'    '17-Feb-1985'    'CCC'
Replace the transition to 'B' for the first company, and the transition to 'BBB' for the second company, with transitions to 'NR'. Note that, for the first company, there is a subsequent transition from 'NR' to 'CCC'.
dataNR = data;
dataNR.Rating{2} = 'NR';
dataNR.Rating{7} = 'NR';
head(dataNR,12)
ans = 12×3 table
        ID             Date         Rating
    __________    _____________    ______
    '00010283'    '10-Nov-1984'    'CCC'
    '00010283'    '12-May-1986'    'NR'
    '00010283'    '29-Jun-1988'    'CCC'
    '00010283'    '12-Dec-1991'    'D'
    '00013326'    '09-Feb-1985'    'A'
    '00013326'    '24-Feb-1994'    'AA'
    '00013326'    '10-Nov-2000'    'NR'
    '00014413'    '23-Dec-1982'    'B'
    '00014413'    '20-Apr-1988'    'BB'
    '00014413'    '16-Jan-1998'    'B'
    '00014413'    '25-Nov-1999'    'BB'
    '00012126'    '17-Feb-1985'    'CCC'
'NR' is treated as another rating, so the transition matrix shows the estimated probability of transitioning into and out of 'NR'. In this example, the transprob function uses the 'cohort' algorithm. The same behavior exists when using the transprob function with the 'duration' algorithm.
RatingsLabelsNR = {'AAA','AA','A','BBB','BB','B','CCC','D','NR'};
[MatrixNRCohort,TotalsNRCohort] = transprob(dataNR,...
    'Labels',RatingsLabelsNR,...
    'Algorithm','cohort');
fprintf('Transition probability, cohort, including NR:\n')
disp(array2table(MatrixNRCohort,'VariableNames',RatingsLabelsNR,...
    'RowNames',RatingsLabelsNR))
fprintf('Total transitions out of given rating, including 6 out of NR (5 NR->NR, 1 NR->CCC):\n')
disp(array2table(TotalsNRCohort.totalsVec,'VariableNames',RatingsLabelsNR))
Transition probability, cohort, including NR:
           AAA        AA         A          BBB        BB         B          CCC        D          NR
           ________   _______    ________   _______    ________   ________   ________   ________   ________
    AAA    93.135     5.9335     0.74557    0.15533    0.031066   0          0          0          0
    AA     1.7359     92.92      4.5446     0.58514    0.15604    0          0          0.039009   0.019505
    A      0.12683    2.9716     91.991     4.3124     0.4711     0.054358   0          0.072477   0
    BBB    0.021048   0.37887    5.0726     89.771     4.0413     0.46306    0.042096   0.21048    0
    BB     0.022099   0.1105     0.68508    6.232      88.376     3.6464     0.28729    0.64088    0
    B      0          0          0.076161   0.72353    7.997      86.215     2.7037     2.2848     0
    CCC    0          0          0          0.30936    1.8561     4.4857     80.897     12.374     0.07734
    D      0          0          0          0          0          0          0          100        0
    NR     0          0          0          0          0          0          16.667     0          83.333
Total transitions out of given rating, including 6 out of NR (5 NR->NR, 1 NR->CCC):
    AAA     AA      A       BBB     BB      B       CCC     D       NR
    ____    ____    ____    ____    ____    ____    ____    ____    __
    3219    5127    5519    4751    4525    2626    1293    4050    6
To remove transitions to 'NR' from the transition matrix, use the optional 'excludeLabels' name-value input argument to transprob. The 'labels' input to transprob may or may not include the label that needs to be excluded. In the following example, the 'NR' rating is removed from the labels for display purposes, but passing RatingsLabelsNR to transprob would also work.
RatingsLabels = {'AAA','AA','A','BBB','BB','B','CCC','D'};
[MatrixCohort,TotalsCohort] = transprob(dataNR,'Labels',RatingsLabels,...
    'ExcludeLabels','NR','Algorithm','cohort');
fprintf('Transition probability, cohort, after postprocessing to remove NR:\n')
disp(array2table(MatrixCohort,'VariableNames',RatingsLabels,...
    'RowNames',RatingsLabels))
fprintf('Total transitions out of given rating, AA and CCC have one less than before:\n')
disp(array2table(TotalsCohort.totalsVec,'VariableNames',RatingsLabels))
Transition probability, cohort, after postprocessing to remove NR:
           AAA        AA         A          BBB        BB         B          CCC        D
           ________   _______    ________   _______    ________   ________   ________   ________
    AAA    93.135     5.9335     0.74557    0.15533    0.031066   0          0          0
    AA     1.7362     92.938     4.5455     0.58525    0.15607    0          0          0.039017
    A      0.12683    2.9716     91.991     4.3124     0.4711     0.054358   0          0.072477
    BBB    0.021048   0.37887    5.0726     89.771     4.0413     0.46306    0.042096   0.21048
    BB     0.022099   0.1105     0.68508    6.232      88.376     3.6464     0.28729    0.64088
    B      0          0          0.076161   0.72353    7.997      86.215     2.7037     2.2848
    CCC    0          0          0          0.3096     1.8576     4.4892     80.96      12.384
    D      0          0          0          0          0          0          0          100
Total transitions out of given rating, AA and CCC have one less than before:
    AAA     AA      A       BBB     BB      B       CCC     D
    ____    ____    ____    ____    ____    ____    ____    ____
    3219    5126    5519    4751    4525    2626    1292    4050
All transitions involving 'NR' are removed from the sample, but all other transitions are still used to estimate the transition probabilities. In this example, the transition from 'NR' to 'CCC' has been removed, as well as the transitions from 'CCC' and 'AA' to 'NR' (and five more transitions from 'NR' to 'NR'). That means the first company still contributes transitions from 'CCC' to 'CCC' to the estimation; only the periods overlapping with the time this company spent in 'NR' have been removed from the sample, and similarly for the other company.
This procedure is different from removing the 'NR' rows from the data itself.
For example, if you remove the 'NR' rows in this example, the first company seems to stay in its initial rating of 'CCC' all the way from the initial date in 1984 to the default event in 1991. With the previous approach, the estimation knows that the company transitioned out of 'CCC' at some point and did not stay at 'CCC' the whole time.
If the 'NR' row is removed for the second company, this company seems to have stayed in the sample as an 'AA' company until the end of the sample. With the previous approach, the estimation knows that this company stopped being an 'AA' earlier.
dataNR2 = dataNR;
dataNR2([2 7],:) = [];
head(dataNR2,12)
ans = 12×3 table
        ID             Date         Rating
    __________    _____________    ______
    '00010283'    '10-Nov-1984'    'CCC'
    '00010283'    '29-Jun-1988'    'CCC'
    '00010283'    '12-Dec-1991'    'D'
    '00013326'    '09-Feb-1985'    'A'
    '00013326'    '24-Feb-1994'    'AA'
    '00014413'    '23-Dec-1982'    'B'
    '00014413'    '20-Apr-1988'    'BB'
    '00014413'    '16-Jan-1998'    'B'
    '00014413'    '25-Nov-1999'    'BB'
    '00012126'    '17-Feb-1985'    'CCC'
    '00012126'    '08-Mar-1989'    'D'
    '00011692'    '11-May-1984'    'BB'
If the 'NR' rows are removed, the transition matrices will be different. The probability of staying at 'CCC' goes slightly up, and so does the probability of staying at 'AA'.
[MatrixCohort2,TotalsCohort2] = transprob(dataNR2,...
    'Labels',RatingsLabels,...
    'Algorithm','cohort');
fprintf('Transition probability, cohort, if NR rows are removed from data:\n')
disp(array2table(MatrixCohort2,'VariableNames',RatingsLabels,...
    'RowNames',RatingsLabels))
fprintf('Total transitions out of given rating, many more out of CCC and AA:\n')
disp(array2table(TotalsCohort2.totalsVec,'VariableNames',RatingsLabels))
Transition probability, cohort, if NR rows are removed from data:
           AAA        AA         A          BBB        BB         B          CCC        D
           ________   _______    ________   _______    ________   ________   ________   ________
    AAA    93.135     5.9335     0.74557    0.15533    0.031066   0          0          0
    AA     1.7346     92.945     4.541      0.58468    0.15592    0          0          0.038979
    A      0.12683    2.9716     91.991     4.3124     0.4711     0.054358   0          0.072477
    BBB    0.021048   0.37887    5.0726     89.771     4.0413     0.46306    0.042096   0.21048
    BB     0.022099   0.1105     0.68508    6.232      88.376     3.6464     0.28729    0.64088
    B      0          0          0.076161   0.72353    7.997      86.215     2.7037     2.2848
    CCC    0          0          0          0.30888    1.8533     4.4788     81.004     12.355
    D      0          0          0          0          0          0          0          100
Total transitions out of given rating, many more out of CCC and AA:
    AAA     AA      A       BBB     BB      B       CCC     D
    ____    ____    ____    ____    ____    ____    ____    ____
    3219    5131    5519    4751    4525    2626    1295    4050
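The difference between excluding 'NR' transitions and deleting 'NR' rows can be seen on a toy example. The sketch below counts yearly snapshot transitions for a single hypothetical company in plain MATLAB; the snapshot sequence and the numeric rating codes are invented for illustration, and the counting is a simplified stand-in for what the cohort algorithm does, not a call to transprob.

```matlab
% Toy rating codes (hypothetical): 1 = 'CCC', 2 = 'D', 3 = 'NR'
snaps = [1 3 3 1 1 2];          % yearly snapshots for one company
from = snaps(1:end-1);
to = snaps(2:end);

% (a) Exclude every transition that starts or ends in NR
keep = (from ~= 3) & (to ~= 3);
countsExcl = accumarray([from(keep)' to(keep)'],1,[2 2]);

% (b) Delete the NR rows: the previous rating is carried forward
filled = snaps;
for k = 2:numel(filled)
    if filled(k) == 3
        filled(k) = filled(k-1);
    end
end
countsDrop = accumarray([filled(1:end-1)' filled(2:end)'],1,[2 2]);

% Estimated probability (percent) of staying at CCC in each case
stayExcl = 100 * countsExcl(1,1) / sum(countsExcl(1,:));   % 1 of 2 transitions
stayDrop = 100 * countsDrop(1,1) / sum(countsDrop(1,:));   % 4 of 5 transitions
```

In this toy case the probability of staying at 'CCC' is 50% when the 'NR' periods are excluded and 80% when the 'NR' rows are deleted, which mirrors the direction of the change in the matrices above.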
Estimate Point-in-Time and Through-the-Cycle Probabilities
Transition probability estimates are sensitive to the length of the estimation window. When the estimation window is small, the estimates only capture recent credit events, and these can change significantly from one year to the next. These are called point-in-time (PIT) estimates. In contrast, a large time window yields fairly stable estimates that average transition rates over a longer period of time. These are called through-the-cycle (TTC) estimates.
The estimation of PIT probabilities requires repeated calls to transprob with a rolling estimation window. Use transprobprep whenever repeated calls to transprob are required. transprobprep performs a preprocessing step on the raw dataset that is independent of the estimation window. The benefits of transprobprep grow as the number of repeated calls to transprob increases. Also, the performance gains from transprobprep are more significant for the cohort algorithm.
load Data_TransProb
prepData = transprobprep(data);
Years = 1991:2000;
nYears = length(Years);
nRatings = length(prepData.ratingsLabels);
transMatPIT = zeros(nRatings,nRatings,nYears);
algorithm = 'duration';
sampleTotals(nYears,1) = struct('totalsVec',[],'totalsMat',[],...
    'algorithm',algorithm);
for t = 1:nYears
    startDate = ['31-Dec-' num2str(Years(t)-1)];
    endDate = ['31-Dec-' num2str(Years(t))];
    [transMatPIT(:,:,t),sampleTotals(t)] = transprob(prepData,...
        'startDate',startDate,'endDate',endDate,'algorithm',algorithm);
end
Here is the PIT transition matrix for 1993. Recall that the sample dataset contains simulated credit migrations so the PIT estimates in this example do not match actual historical transition rates.
transMatPIT(:,:,Years==1993)
ans =
   95.3193    4.5999    0.0802    0.0004    0.0002    0.0000    0.0000    0.0000
    2.0631   94.5931    3.3057    0.0254    0.0126    0.0002    0.0000    0.0000
    0.0237    2.1748   95.5901    1.4700    0.7284    0.0131    0.0000    0.0000
    0.0003    0.0372    3.2585   95.2914    1.3876    0.0250    0.0001    0.0000
    0.0000    0.0005    0.0657    3.8292   92.7474    3.3459    0.0111    0.0001
    0.0000    0.0001    0.0128    0.7977    8.0926   90.4897    0.5958    0.0113
    0.0000    0.0000    0.0005    0.0459    0.5026   11.1621   84.9315    3.3574
         0         0         0         0         0         0         0  100.0000
A structure array stores the optional sampleTotals output from transprob. The sampleTotals structure contains summary information on the total time spent on each rating, and the number of transitions out of each rating, for each year under consideration. For more information on the sampleTotals structure, see transprob.
As an example, the sampleTotals structure for 1993 is used here. The total time spent on each rating is stored in the totalsVec field of the structure. The total transitions out of each rating are stored in the totalsMat field. A third field, algorithm, indicates the algorithm used to generate the structure.
sampleTotals(Years==1993).totalsVec
sampleTotals(Years==1993).totalsMat
sampleTotals(Years==1993).algorithm
ans =
  144.4411  230.0356  262.2438  204.9671  246.1315  147.0767   54.9562  215.1479
ans =
     0     7     0     0     0     0     0     0
     5     0     8     0     0     0     0     0
     0     6     0     4     2     0     0     0
     0     0     7     0     3     0     0     0
     0     0     0    10     0     9     0     0
     0     0     0     1    13     0     1     0
     0     0     0     0     0     7     0     2
     0     0     0     0     0     0     0     0
ans =
duration
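To see how these totals relate to the 1993 PIT matrix, the sketch below rebuilds the matrix in plain MATLAB. It assumes the usual duration (intensity) estimator: the intensity of moving from rating i to rating j is the number of i-to-j transitions divided by the time spent in i, and the one-year matrix is the matrix exponential of the resulting generator. That formula is an assumption about the estimator rather than something stated in this section, although the first-row values it produces agree with the 1993 matrix shown earlier.

```matlab
% Totals for 1993, copied from the output above
totalsVec = [144.4411 230.0356 262.2438 204.9671 246.1315 147.0767 54.9562 215.1479];
totalsMat = [0  7  0  0  0  0  0  0
             5  0  8  0  0  0  0  0
             0  6  0  4  2  0  0  0
             0  0  7  0  3  0  0  0
             0  0  0 10  0  9  0  0
             0  0  0  1 13  0  1  0
             0  0  0  0  0  7  0  2
             0  0  0  0  0  0  0  0];

% Assumed estimator: intensity i -> j = (transitions i -> j) / (time in i)
nRatings = numel(totalsVec);
Lambda = totalsMat ./ repmat(totalsVec(:),1,nRatings);

% Diagonal holds minus the total exit intensity, so each row sums to zero
Lambda(1:nRatings+1:end) = -sum(Lambda,2);

% One-year transition probabilities, in percent
P1993 = 100 * expm(Lambda);
```

The default row has no recorded transitions, so its generator row is zero and the corresponding row of P1993 is 0 everywhere except 100 on the diagonal.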
To get the TTC transition matrix, pass the sampleTotals structure array to transprobbytotals. Internally, transprobbytotals aggregates the information in the sampleTotals structures to get the total time spent on each rating over the 10 years considered in this example, and the total number of transitions out of each rating during the same period. transprobbytotals uses the aggregated information to get the TTC matrix, or average one-year transition matrix.
transMatTTC = transprobbytotals(sampleTotals)
transMatTTC =
   92.8544    6.1068    0.7463    0.2761    0.0123    0.0009    0.0001    0.0032
    2.9399   92.2329    3.8394    0.7349    0.1676    0.0050    0.0004    0.0799
    0.2410    4.5963   90.3468    3.9572    0.6909    0.0521    0.0025    0.1133
    0.0530    0.4729    7.9221   87.2751    3.5075    0.4650    0.0791    0.2254
    0.0460    0.1636    1.1873    9.3442   85.4305    2.9520    0.1150    0.7615
    0.0031    0.0152    0.2608    1.5563   10.4468   83.8525    1.9771    1.8882
    0.0009    0.0041    0.0542    0.8378    2.9996    7.3614   82.4758    6.2662
         0         0         0         0         0         0         0  100.0000
The same TTC matrix could be obtained with a direct call to transprob, setting the estimation window to the 10 years under consideration. But it is much more efficient to use the sampleTotals structures whenever they are available. (Note that, for the duration algorithm, these alternative workflows can result in small numerical differences in the estimates whenever leap years are part of the sample.)
In Estimate Transition Probabilities, a 1-year transition matrix is estimated using the 5-year time window from 1996 through 2000. This is another example of a TTC matrix, and it can also be computed using the sampleTotals structure array.
transprobbytotals(sampleTotals(Years>=1996&Years<=2000))
ans =
   90.6239    7.9048    1.0313    0.4123    0.0210    0.0020    0.0003    0.0043
    4.4776   89.5565    4.5294    1.1224    0.2283    0.0094    0.0009    0.0754
    0.3982    6.1159   87.0651    5.4797    0.7636    0.0892    0.0050    0.0832
    0.1029    0.8571   10.7909   83.0218    3.9968    0.7001    0.1313    0.3991
    0.1043    0.3744    2.2960   14.0947   78.9851    3.0012    0.0463    1.0980
    0.0113    0.0544    0.7054    3.2922   15.4341   75.6004    1.8165    3.0858
    0.0044    0.0189    0.1903    1.9742    6.2318   10.2332   75.9990    5.3482
         0         0         0         0         0         0         0  100.0000
Estimate t-Year Default Probabilities
By varying the start and end dates, the amount of data considered for the estimation is changed, but the output still contains, by default, one-year transition probabilities. You can change the default behavior by specifying the transInterval argument, as illustrated in Estimate Transition Probabilities.
However, when t-year transition probabilities are required for a whole range of values of t, for example, 1-year, 2-year, 3-year, 4-year, and 5-year transition probabilities, it is more efficient to call transprob once to get the optional output sampleTotals. You can use the same sampleTotals structure to get the t-year transition matrix for any transition interval t. Given a sampleTotals structure and a transition interval, you can get the corresponding transition matrix by using transprobbytotals.
load Data_TransProb
startDate = '31-Dec-1995';
endDate = '31-Dec-2000';
[~,sampleTotals] = transprob(data,'startDate', ...
    startDate, 'endDate',endDate);
DefProb = zeros(7,5);
for t = 1:5
    transMatTemp = transprobbytotals(sampleTotals,'transInterval',t);
    DefProb(:,t) = transMatTemp(1:7,8);
end
DefProb
DefProb =
    0.0043    0.0169    0.0377    0.0666    0.1033
    0.0754    0.1542    0.2377    0.3265    0.4213
    0.0832    0.1936    0.3276    0.4819    0.6536
    0.3992    0.8127    1.2336    1.6566    2.0779
    1.0980    2.1189    3.0668    3.9468    4.7644
    3.0860    5.6994    7.9281    9.8418   11.4963
    5.3484    9.8053   13.5320   16.6599   19.2964
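One way to sanity-check the t-year default probabilities is to raise the one-year matrix to the power t, which is what a time-homogeneous Markov chain implies. The sketch below does this in plain MATLAB with the one-year duration matrix for 1996 through 2000 typed in from the earlier output; treating the t-year matrix as a matrix power is an assumption made for illustration, and because the inputs are rounded to four decimals the results should match DefProb only up to rounding.

```matlab
% One-year matrix (percent) for the 1996-2000 window, copied from transMat1
transMat1 = [90.6236  7.9051  1.0314  0.4123  0.0210  0.0020  0.0003   0.0043
              4.4780 89.5558  4.5298  1.1225  0.2284  0.0094  0.0009   0.0754
              0.3983  6.1164 87.0641  5.4801  0.7637  0.0892  0.0050   0.0832
              0.1029  0.8572 10.7918 83.0204  3.9971  0.7001  0.1313   0.3992
              0.1043  0.3745  2.2962 14.0954 78.9840  3.0013  0.0463   1.0980
              0.0113  0.0544  0.7055  3.2925 15.4350 75.5988  1.8166   3.0860
              0.0044  0.0189  0.1903  1.9743  6.2320 10.2334 75.9983   5.3484
              0       0       0       0       0       0       0      100.0000];

P1 = transMat1 / 100;            % probabilities as fractions
nYears = 5;
DefProbCheck = zeros(7,nYears);
for t = 1:nYears
    Pt = P1^t;                   % t-year matrix under the Markov assumption
    DefProbCheck(:,t) = 100 * Pt(1:7,8);   % last column holds default
end
DefProbCheck
```

For example, the two-year default probability for the 'CCC' row comes out at about 9.805, in line with the second column of DefProb.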
Estimate Bootstrap Confidence Intervals
transprob also returns the idTotals structure array, which contains, for each ID, or company, the total time spent on each rating, and the total transitions out of each rating. For more information on the idTotals structure, see transprob. The idTotals structure is similar to the sampleTotals structures (see Estimate Point-in-Time and Through-the-Cycle Probabilities), but idTotals has the information at an ID level. Because most companies migrate between only a few ratings, the numeric arrays in idTotals are stored as sparse arrays to reduce memory requirements.
You can use the idTotals structure array to estimate confidence intervals for the transition probabilities using a bootstrapping procedure, as the following example demonstrates. To do this, call transprob and keep the third output argument, idTotals. The idTotals fields are displayed for the last company in the sample. Within the estimation window, this company spends almost a year as 'AA' and is then upgraded to 'AAA'.
load Data_TransProb
startDate = '31-Dec-1995';
endDate = '31-Dec-2000';
[transMat,~,idTotals] = transprob(data,...
    'startDate',startDate,'endDate',endDate);
% Total time spent on each rating
full(idTotals(end).totalsVec)
% Total transitions out of each rating
full(idTotals(end).totalsMat)
% Algorithm
idTotals(end).algorithm
ans =
    4.0820    0.9180         0         0         0         0         0         0
ans =
     0     0     0     0     0     0     0     0
     1     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0
ans =
duration
Next, use bootstrp from Statistics and Machine Learning Toolbox™ with transprobbytotals as the bootstrap function and idTotals as the data to sample from. Each bootstrap sample corresponds to a dataset made of companies sampled with replacement from the original data. However, you do not have to draw companies from the original data, because a bootstrap idTotals sample contains all the information required to compute the transition probabilities. transprobbytotals aggregates all structures in each bootstrap idTotals sample and finds the corresponding transition matrix.
To estimate 95% confidence intervals for the transition matrix and display the probabilities of default together with their upper and lower confidence bounds:
PD = transMat(1:7,8);
bootstat = bootstrp(100,@(totals)transprobbytotals(totals),idTotals);
ci = prctile(bootstat,[2.5 97.5]); % 95% confidence
CIlower = reshape(ci(1,:),8,8);
CIupper = reshape(ci(2,:),8,8);
PD_LB = CIlower(1:7,8);
PD_UB = CIupper(1:7,8);
[PD_LB PD PD_UB]
ans =
    0.0004    0.0043    0.0106
    0.0028    0.0754    0.2192
    0.0126    0.0832    0.2180
    0.1659    0.3992    0.6617
    0.5703    1.0980    1.7260
    1.7264    3.0860    4.7602
    1.7678    5.3484    9.5055
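The resampling idea behind this procedure can also be sketched without any toolbox functions. The example below uses invented data (one simulated transition per hypothetical company on a two-rating scale plus default), resamples companies with replacement, and takes simple order-statistic percentiles. It illustrates the general bootstrap-by-company approach and does not reproduce bootstrp, prctile, or transprobbytotals.

```matlab
rng(0);                                      % reproducible resampling
nIDs = 300;                                  % hypothetical number of companies
PTrue = [0.90 0.08 0.02; 0.10 0.80 0.10];    % made-up one-period probabilities

% Simulate one observed transition per company: start in rating 1 or 2,
% end in rating 1, 2, or default (coded 3)
startRating = randi(2,nIDs,1);
endRating = zeros(nIDs,1);
for i = 1:nIDs
    cumP = cumsum(PTrue(startRating(i),1:2));
    endRating(i) = 1 + sum(rand > cumP);
end

% Resample companies with replacement and re-estimate the default rates
nBoot = 1000;
bootPD = zeros(nBoot,2);
for b = 1:nBoot
    idx = randi(nIDs,nIDs,1);
    counts = accumarray([startRating(idx) endRating(idx)],1,[2 3]);
    bootPD(b,:) = 100 * (counts(:,3) ./ sum(counts,2))';
end

% Simple percentile interval from the 2.5% and 97.5% order statistics
sortedPD = sort(bootPD,1);
PD_LB = sortedPD(round(0.025*nBoot),:)';
PD_UB = sortedPD(round(0.975*nBoot),:)';
[PD_LB PD_UB]
```

Resampling whole companies, rather than individual transitions, keeps each company's history together, which is the same design choice as sampling from idTotals.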
Group Credit Ratings
Credit rating scales can be more or less granular. For example, there are ratings with qualifiers (such as 'AA+', 'BB-', and so on), whole ratings ('AA', 'BB', and so on), and investment or speculative grade ('IG', 'SG') categories. Given a dataset with credit ratings at a more granular level, transition probabilities for less granular categories can be of interest. For example, you might be interested in a transition matrix for investment and speculative grades given a dataset with whole ratings. Use transprobgrouptotals for this evaluation, as illustrated in the following examples. The sample dataset data has whole credit ratings:
load Data_TransProb
startDate = '31-Dec-1995';
endDate = '31-Dec-2000';
data(1:5,:)
ans =
    '00010283'    '10-Nov-1984'    'CCC'
    '00010283'    '12-May-1986'    'B'
    '00010283'    '29-Jun-1988'    'CCC'
    '00010283'    '12-Dec-1991'    'D'
    '00013326'    '09-Feb-1985'    'A'
A call to transprob returns the transition matrix and totals structures for the eight ('AAA' to 'D') whole credit ratings. The array with the number of transitions out of each credit rating is displayed after the call to transprob:
[transMat,sampleTotals,idTotals] = transprob(data,'startDate',startDate,...
    'endDate',endDate);
sampleTotals.totalsMat
ans =
     0    67     7     3     0     0     0     0
    67     0    68    15     3     0     0     1
     4   101     0    93    11     1     0     1
     1     7   163     0    62    10     2     5
     1     3    16   168     0    37     0    11
     0     0     2    10    83     0    10    14
     0     0     0     2     8    16     0     7
     0     0     0     0     0     0     0     0
Next, use transprobgrouptotals to group whole ratings into investment and speculative grades. This function takes a totals structure as the first argument. The second argument indicates the edges between rating categories. In this case, ratings 1 through 4 ('AAA' through 'BBB') correspond to the first category ('IG'), ratings 5 through 7 ('BB' through 'CCC') to the second category ('SG'), and rating 8 ('D') is a category of its own. transprobgrouptotals adds up the total time spent on ratings that belong to the same category. For example, total times spent on 'AAA' through 'BBB' are added up as the total time spent on 'IG'. transprobgrouptotals also adds up the total number of transitions between any 'IG' rating and any 'SG' rating, for example, a credit migration from 'BBB' to 'BB'.
The grouped totals can then be passed to transprobbytotals to obtain the transition matrix for investment and speculative grades. Both totalsMat and the new transition matrix are 3-by-3, corresponding to the grouped categories 'IG', 'SG', and 'D'.
sampleTotalsIGSG = transprobgrouptotals(sampleTotals,[4 7 8])
transMatIGSG = transprobbytotals(sampleTotalsIGSG)
sampleTotalsIGSG =
    totalsVec: [4.8591e+003 1.5034e+003 1.1621e+003]
    totalsMat: [3x3 double]
    algorithm: 'duration'
transMatIGSG =
   98.1591    1.6798    0.1611
   12.3228   85.6961    1.9811
         0         0  100.0000
When a totals structure array is passed to transprobgrouptotals, this function groups each structure in the array individually and preserves sparsity if the fields in the input structures are sparse. One way to exploit this feature is to compute confidence intervals for the investment grade default rate and the speculative grade default rate (see also Estimate Bootstrap Confidence Intervals).
PDIGSG = transMatIGSG(1:2,3);
idTotalsIGSG = transprobgrouptotals(idTotals,[4 7 8]);
bootstat = bootstrp(100,@(totals)transprobbytotals(totals),idTotalsIGSG);
ci = prctile(bootstat,[2.5 97.5]); % 95% confidence
CIlower = reshape(ci(1,:),3,3);
CIupper = reshape(ci(2,:),3,3);
PDIGSG_LB = CIlower(1:2,3);
PDIGSG_UB = CIupper(1:2,3);
[PDIGSG_LB PDIGSG PDIGSG_UB]
ans =
    0.0603    0.1611    0.2538
    1.3470    1.9811    2.6195
Work with Nonsquare Matrices
Transition probabilities and the number of transitions between ratings are usually reported without the 'D' ('Default') row. For example, a credit report can contain the following table, indicating the number of issuers starting in each rating (first column), and the number of transitions between ratings (remaining columns):
       Initial    AAA     AA      A    BBB     BB      B    CCC      D
AAA         98     88      9      1      0      0      0      0      0
AA         389      0    368     19      2      0      0      0      0
A         1165      1     21   1087     56      0      0      0      0
BBB       1435      0      2     89   1289     45      8      0      2
BB         915      0      0      1     60    776     73      2      3
B          867      0      0      1      7     88    715     39     17
CCC        112      0      0      0      1      3     34     61     13
You can store the information in this table in a totals structure compatible with the cohort algorithm. For more information on the cohort algorithm and the totals structure, see transprob. The totalsMat field is a nonsquare array in this case.
% Define totals structure
totals.totalsVec = [98 389 1165 1435 915 867 112];
totals.totalsMat = [
    88   9    1    0   0   0  0  0;
     0 368   19    2   0   0  0  0;
     1  21 1087   56   0   0  0  0;
     0   2   89 1289  45   8  0  2;
     0   0    1   60 776  73  2  3;
     0   0    1    7  88 715 39 17;
     0   0    0    1   3  34 61 13];
totals.algorithm = 'cohort';
transprobbytotals and transprobgrouptotals accept totals inputs with nonsquare totalsMat fields. To get the transition matrix corresponding to the previous table, and to group ratings into investment and speculative grade with the corresponding matrix:
transMat = transprobbytotals(totals)
% Group into IG/SG and get IG/SG transition matrix
totalsIGSG = transprobgrouptotals(totals,[4 7]);
transMatIGSG = transprobbytotals(totalsIGSG)
transMat =
   89.7959    9.1837    1.0204         0         0         0         0         0
         0   94.6015    4.8843    0.5141         0         0         0         0
    0.0858    1.8026   93.3047    4.8069         0         0         0         0
         0    0.1394    6.2021   89.8258    3.1359    0.5575         0    0.1394
         0         0    0.1093    6.5574   84.8087    7.9781    0.2186    0.3279
         0         0    0.1153    0.8074   10.1499   82.4683    4.4983    1.9608
         0         0         0    0.8929    2.6786   30.3571   54.4643   11.6071
transMatIGSG =
   98.2183    1.7169    0.0648
    3.6959   94.5618    1.7423
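For cohort totals like these, the numbers above can be checked by hand: each probability is the number of transitions divided by the number of issuers starting in that rating, and grouping amounts to summing counts over blocks of ratings. The following plain-MATLAB sketch shows that arithmetic with the counts from the table; the variable names are illustrative, and it is a check of the arithmetic rather than a description of how transprobbytotals or transprobgrouptotals are implemented.

```matlab
% Counts from the credit report table above
totalsVec = [98 389 1165 1435 915 867 112];
totalsMat = [88   9    1    0   0   0  0  0
              0 368   19    2   0   0  0  0
              1  21 1087   56   0   0  0  0
              0   2   89 1289  45   8  0  2
              0   0    1   60 776  73  2  3
              0   0    1    7  88 715 39 17
              0   0    0    1   3  34 61 13];

% Cohort estimate: transitions i -> j divided by issuers starting in i
transMatCheck = 100 * totalsMat ./ repmat(totalsVec(:),1,size(totalsMat,2));

% Group ratings 1-4 into IG and 5-7 into SG; column 8 (D) stays on its own
rowGroups = {1:4, 5:7};
colGroups = {1:4, 5:7, 8};
groupedVec = zeros(2,1);
groupedMat = zeros(2,3);
for r = 1:2
    groupedVec(r) = sum(totalsVec(rowGroups{r}));
    for c = 1:3
        block = totalsMat(rowGroups{r},colGroups{c});
        groupedMat(r,c) = sum(block(:));
    end
end
transMatIGSGCheck = 100 * groupedMat ./ repmat(groupedVec,1,3);
```

For instance, 3032 of the 3087 investment-grade issuers stay in investment grade, which gives the 98.2183 in the first entry of transMatIGSG.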
Remove Outliers
The idTotals output from transprob can also be used to update the transition probability estimates after removing some outlier information. For more information on idTotals, see transprob. For example, if you know that the credit rating migration information for the 4th and 27th companies in the data has problems, you can remove those companies and efficiently update the transition probabilities as follows:
load Data_TransProb
startDate = '31-Dec-1995';
endDate = '31-Dec-2000';
[transMat,~,idTotals] = transprob(data,'startDate', ...
    startDate, 'endDate',endDate);
transMat
transMat =
   90.6236    7.9051    1.0314    0.4123    0.0210    0.0020    0.0003    0.0043
    4.4780   89.5558    4.5298    1.1225    0.2284    0.0094    0.0009    0.0754
    0.3983    6.1164   87.0641    5.4801    0.7637    0.0892    0.0050    0.0832
    0.1029    0.8572   10.7918   83.0204    3.9971    0.7001    0.1313    0.3992
    0.1043    0.3745    2.2962   14.0954   78.9840    3.0013    0.0463    1.0980
    0.0113    0.0544    0.7055    3.2925   15.4350   75.5988    1.8166    3.0860
    0.0044    0.0189    0.1903    1.9743    6.2320   10.2334   75.9983    5.3484
         0         0         0         0         0         0         0  100.0000
nIDs = length(idTotals);
keepInd = setdiff(1:nIDs,[4 27]);
transMatNoOutlier = transprobbytotals(idTotals(keepInd))
transMatNoOutlier =
   90.6241    7.9067    1.0290    0.4124    0.0211    0.0020    0.0003    0.0043
    4.4917   89.5918    4.4779    1.1240    0.2288    0.0094    0.0009    0.0756
    0.3990    6.1220   87.0530    5.4841    0.7643    0.0893    0.0050    0.0833
    0.1030    0.8576   10.7909   83.0207    3.9971    0.7001    0.1313    0.3992
    0.1043    0.3746    2.2960   14.0955   78.9840    3.0013    0.0463    1.0980
    0.0113    0.0544    0.7054    3.2925   15.4350   75.5988    1.8166    3.0860
    0.0044    0.0189    0.1903    1.9743    6.2320   10.2334   75.9983    5.3484
         0         0         0         0         0         0         0  100.0000
Deciding which companies to remove must be done case by case. Reasons to remove a company include a typo in one of the ratings histories, or an unusual migration between ratings whose impact on the transition probability estimates must be measured. transprob does not reorder the companies in any way. The ordering of companies in the input data is the same as the ordering in the idTotals array.
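Because the ordering is preserved, the position of a company in idTotals can be recovered from the ID column of the input data. The following sketch uses a small toy cell array in the same three-column layout as the sample data. The list flaggedIDs is hypothetical, and the sketch assumes that IDs are stored as character vectors.

```matlab
% Toy data in the same layout as the sample data: ID, date, rating.
toyData = {'00010283','10-Nov-1984','CCC';
           '00010283','12-May-1986','B';
           '00013326','09-Feb-1985','A';
           '00013326','24-Feb-1994','AA';
           '00014413','23-Dec-1982','B'};

% IDs in order of first appearance, which is the order of idTotals
% according to the text above.
idsInOrder = unique(toyData(:,1),'stable');

% Hypothetical list of companies flagged as outliers.
flaggedIDs = {'00013326'};

% Translate the flagged IDs into positions, then build the index of
% companies to keep.
[found,removeInd] = ismember(flaggedIDs,idsInOrder);
keepInd = setdiff(1:numel(idsInOrder),removeInd(found));
```

The resulting keepInd plays the same role as setdiff(1:nIDs,[4 27]) in the example above, without having to look up the positions manually.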
Estimate Probabilities for Different Segments
You can use idTotals efficiently to get estimates over different segments of the sample. For more information on idTotals, see transprob. For example, assume that the companies in the example are grouped into three geographic regions and that the data was previously sorted by region, so that the first 340 companies correspond to the first region, the next 572 companies to the second region, and the rest to the third region. You can efficiently get transition probabilities for each region as follows:
load Data_TransProb
startDate = '31-Dec-1995';
endDate = '31-Dec-2000';
[~,~,idTotals] = transprob(data,'startDate', ...
    startDate, 'endDate',endDate);
n1 = 340;
n2 = 572;
transMatG1 = transprobbytotals(idTotals(1:n1))
transMatG2 = transprobbytotals(idTotals(n1+1:n1+n2))
transMatG3 = transprobbytotals(idTotals(n1+n2+1:end))

transMatG1 =
   90.8299    7.6501    0.3178    1.1700    0.0255    0.0044    0.0021    0.0002
    4.3572   89.0262    5.7838    0.8039    0.0245    0.0029    0.0013    0.0001
    0.7066    6.7567   86.6320    5.4950    0.3721    0.0252    0.0101    0.0023
    0.0626    1.3688   10.3895   83.5022    3.6823    0.6466    0.3084    0.0396
    0.0256    0.7884    2.6970   13.7857   78.8321    2.8310    0.0561    0.9842
    0.0026    0.1095    0.4280    3.5204   21.1437   72.9230    1.6456    0.2273
    0.0005    0.0216    0.0730    0.4574    4.9586    4.2821   80.3062    9.9006
         0         0         0         0         0         0         0  100.0000

transMatG2 =
   90.5798    8.4877    0.8202    0.0884    0.0132    0.0011    0.0000    0.0096
    4.1999   90.0371    3.8657    1.4744    0.2144    0.0128    0.0001    0.1956
    0.3022    5.9869   86.7128    5.5526    1.0411    0.1902    0.0015    0.2127
    0.0204    0.5606   10.9342   82.9195    4.0123    0.7398    0.0059    0.8073
    0.0089    0.3338    2.1185   16.6496   76.2395    3.1241    0.0261    1.4995
    0.0013    0.0465    0.6710    2.4731   14.7281   76.7378    1.2993    4.0428
    0.0002    0.0080    0.0681    0.4598    4.1324    8.4380   80.9092    5.9843
         0         0         0         0         0         0         0  100.0000

transMatG3 =
   90.5655    7.5408    1.5288    0.3369    0.0258    0.0015    0.0003    0.0004
    4.8073   89.3842    4.4865    0.9582    0.3509    0.0095    0.0009    0.0025
    0.3153    5.8771   87.6353    5.4101    0.7160    0.0322    0.0052    0.0088
    0.1995    0.8625   10.8682   82.8717    4.1423    0.6903    0.1565    0.2090
    0.2465    0.1091    2.1558   12.0289   81.5803    3.0057    0.0616    0.8122
    0.0227    0.0400    0.9380    4.3175   12.3632   75.9429    2.5766    3.7991
    0.0149    0.0180    0.3414    3.6918    8.1414   13.6010   70.7254    3.4661
         0         0         0         0         0         0         0  100.0000
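The example above relies on the companies being sorted by region. When they are not, one option is to keep a region label per company and build the index for each segment from it. The following is a sketch only: the vector regionOfCompany is hypothetical and must list the companies in the same order as idTotals.

```matlab
% Hypothetical region label for each company, in idTotals order.
regionOfCompany = [2 1 3 1 2 2 3 1];

% Collect, for each region, the positions of its companies.
regions = unique(regionOfCompany);
idxByRegion = cell(size(regions));
for k = 1:numel(regions)
    idxByRegion{k} = find(regionOfCompany == regions(k));
end

% With the toolbox and idTotals available, the estimate for region k
% would then be transprobbytotals(idTotals(idxByRegion{k})).
```

This design keeps the segment definition separate from the order of the data, so the same idTotals array can be reused for other segmentations, such as industry or size.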
Work with Large Datasets
This example shows how to aggregate estimates from two (or more) datasets. Two datasets coming from two different databases might both have to be considered when estimating the transition probabilities. Also, a dataset that is too large to load into memory can be split into two (or more) datasets. In these cases, apply transprob to each individual dataset, and then get the final estimates corresponding to the aggregated data with a single call to transprobbytotals at the end.
In this example, the dataset data is artificially split into two sections. In practice, the two datasets would come from different files or databases. When aggregating multiple datasets, the history of a company cannot be split across datasets. You can verify that this condition is satisfied for the arbitrarily chosen cutoff point by displaying the rows on either side of it.
load Data_TransProb
cutoff = 2099;
data(cutoff-5:cutoff,:)
data(cutoff+1:cutoff+6,:)
ans =
    '00011166'    '24-Aug-1995'    'BBB'
    '00011166'    '25-Jan-1997'    'A'
    '00011166'    '01-Feb-1998'    'AA'
    '00014878'    '15-Mar-1983'    'B'
    '00014878'    '21-Sep-1986'    'BB'
    '00014878'    '17-Jan-1998'    'BBB'

ans =
    '00012043'    '09-Feb-1985'    'BBB'
    '00012043'    '03-Jan-1988'    'A'
    '00012043'    '15-Jan-1994'    'AAA'
    '00011157'    '24-Jun-1984'    'A'
    '00011157'    '09-Dec-1999'    'BBB'
    '00011157'    '28-Mar-2001'    'A'
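The same condition can also be checked programmatically instead of by inspection. The following sketch uses a toy cell array with placeholder IDs and assumes that IDs are stored as character vectors. It tests whether any ID appears on both sides of a candidate cutoff.

```matlab
% Toy data: three companies stored in contiguous rows (ID, date, rating).
toyData = {'A01','15-Mar-1983','B';
           'A01','21-Sep-1986','BB';
           'A01','17-Jan-1998','BBB';
           'B02','09-Feb-1985','BBB';
           'B02','03-Jan-1988','A';
           'C03','24-Jun-1984','A'};

% A cutoff between companies: no ID appears in both parts.
cutoff = 3;
sharedGood = intersect(unique(toyData(1:cutoff,1)), ...
                       unique(toyData(cutoff+1:end,1)));
isCleanSplit = isempty(sharedGood);

% A cutoff in the middle of company B02: its history would be split.
badCutoff = 4;
sharedBad = intersect(unique(toyData(1:badCutoff,1)), ...
                      unique(toyData(badCutoff+1:end,1)));
isBadSplit = ~isempty(sharedBad);
```

When the rows of each company are contiguous, as the data format requires, comparing the IDs in rows cutoff and cutoff+1 is enough. The intersect version also covers datasets that come from separate sources.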
When working with multiple datasets, it is important to set the start and end dates explicitly. Otherwise, the estimation window differs for each dataset because the default start and end dates used by transprob are the earliest and latest dates found in the input data.
startDate = '31-Dec-1995';
endDate = '31-Dec-2000';
In practice, this is the point where you would read in the first dataset. In this example, the data is already loaded. Call transprob with the first dataset and the explicit start and end dates. Keep only the sampleTotals output. For details on sampleTotals, see transprob.
[~,sampleTotals(1)] = transprob(data(1:cutoff,:),...
    'startDate',startDate,'endDate',endDate);
Repeat for the remaining datasets. Note that the different sampleTotals structures are stored in a structure array.
[~,sampleTotals(2)] = transprob(data(cutoff+1:end,:),...
    'startDate',startDate,'endDate',endDate);
To get the transition matrix corresponding to the aggregated dataset, use transprobbytotals. When the totals input is a structure array, transprobbytotals aggregates the information over all structures, and returns a single transition matrix.
transMatAggr = transprobbytotals(sampleTotals)
transMatAggr =
   90.6236    7.9051    1.0314    0.4123    0.0210    0.0020    0.0003    0.0043
    4.4780   89.5558    4.5298    1.1225    0.2284    0.0094    0.0009    0.0754
    0.3983    6.1164   87.0641    5.4801    0.7637    0.0892    0.0050    0.0832
    0.1029    0.8572   10.7918   83.0204    3.9971    0.7001    0.1313    0.3992
    0.1043    0.3745    2.2962   14.0954   78.9840    3.0013    0.0463    1.0980
    0.0113    0.0544    0.7055    3.2925   15.4350   75.5988    1.8166    3.0860
    0.0044    0.0189    0.1903    1.9743    6.2320   10.2334   75.9983    5.3484
         0         0         0         0         0         0         0  100.0000
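To see why the totals are aggregated rather than the transition matrices themselves, consider a toy cohort-style illustration in base MATLAB. It is a sketch under stated assumptions: the counts are made up, the struct array part is hypothetical, and the pooling rule (adding the totalsVec and totalsMat fields across structures) is an assumption about what aggregation amounts to for cohort totals. The example in this section does not specify an algorithm, so its totals fields may have a different meaning.

```matlab
% Made-up cohort-style counts for two datasets: two ratings plus default.
part(1).totalsVec = [10 20];
part(1).totalsMat = [8 1 1; 2 16 2];
part(2).totalsVec = [30 40];
part(2).totalsMat = [27 3 0; 4 30 6];

% Assumed aggregation rule: add the counts field by field, then estimate.
aggVec  = part(1).totalsVec + part(2).totalsVec;
aggMat  = part(1).totalsMat + part(2).totalsMat;
aggProb = 100 * bsxfun(@rdivide, aggMat, aggVec');

% Naive alternative: estimate each part separately and average the results.
p1 = 100 * bsxfun(@rdivide, part(1).totalsMat, part(1).totalsVec');
p2 = 100 * bsxfun(@rdivide, part(2).totalsMat, part(2).totalsVec');
avgProb = (p1 + p2) / 2;
```

In this toy case the pooled estimate for staying in the first rating is 35/40, or 87.5 percent, while the simple average of the two matrices gives 85 percent. The average ignores that the second dataset holds three times as many issuers in that rating, which is why carrying the totals forward is the safer design.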
As a sanity check, for this example you can verify that the aggregation procedure yields the same estimates (up to numerical differences) as estimating the probabilities directly over the entire sample:
transMatWhole = transprob(data,'startDate',startDate,'endDate',endDate)
aggError = max(max(abs(transMatAggr - transMatWhole)))

transMatWhole =
   90.6236    7.9051    1.0314    0.4123    0.0210    0.0020    0.0003    0.0043
    4.4780   89.5558    4.5298    1.1225    0.2284    0.0094    0.0009    0.0754
    0.3983    6.1164   87.0641    5.4801    0.7637    0.0892    0.0050    0.0832
    0.1029    0.8572   10.7918   83.0204    3.9971    0.7001    0.1313    0.3992
    0.1043    0.3745    2.2962   14.0954   78.9840    3.0013    0.0463    1.0980
    0.0113    0.0544    0.7055    3.2925   15.4350   75.5988    1.8166    3.0860
    0.0044    0.0189    0.1903    1.9743    6.2320   10.2334   75.9983    5.3484
         0         0         0         0         0         0         0  100.0000

aggError =
  2.8422e-014
See Also
transprob | transprobprep | transprobbytotals | bootstrp | transprobgrouptotals | transprobtothresholds | transprobfromthresholds