Cox Proportional Hazards Model with Time-Dependent Covariates
This example shows how to convert survival data to counting process form and then construct a Cox proportional hazards model with time-dependent covariates.
Step 1. Compare standard layout and counting process form.
A Cox model with time-dependent covariates requires survival data to be in counting process form and not in standard layout. To see the difference between survival data in standard layout and in counting process form, load the following sample data.
load simplesurvivaldata
This sample data contains two tables: relapseS
and relapseCP
. These two tables represent the same simple survival data in standard layout and in counting process form, respectively.
Display the data in standard layout.
relapseS
relapseS=2×5 table
ID Time Censoring Age StopTreatment
__ ____ _________ ___ _____________
1 5 0 20 NaN
2 20 1 30 12
This data represents two patients whose treatment status changes over time. Patient 1 was not taking treatment for the interval from week 0 to 5 and relapsed at the end of the interval. Patient 2 was taking treatment for the interval from week 0 to 12, but not for the interval from week 12 to 20. Patient 2 did not relapse and left the study after week 20.
Now display the same data in counting process form.
relapseCP
relapseCP=3×6 table
ID tStart tStop Censoring Age TreatmentStatus
__ ______ _____ _________ ___ _______________
1 0 5 0 20 0
2 0 12 1 30 1
2 12 20 1 30 0
In counting process form, each row represents the risk interval (tStart
,tStop
] instead of a single value of an event time. Censoring
is 0 if the event is observed at the end of the risk interval, and 1 if it is not. TreatmentStatus
corresponds to a time-dependent covariate, which represents the same information with StopTreatment
in standard layout. Note that a Cox model assumes time-dependent covariates to be constant in each risk interval.
Step 2. Load sample data.
Next, load sample data to convert.
load survivaldatacp
This sample data contains a table labS
, which is simulated survival data including repeated measurement for each patient in standard layout.
Display the simulated survival data in standard layout.
labS
labS=6×7 table
ID Time Censoring Sex Lab_0 Lab_50 Lab_100
__ ____ _________ ___ _____ ______ _______
1 46 0 1 0.3 NaN NaN
2 138 1 0 0.2 0.23 0.39
3 94 0 1 0.18 0.22 NaN
4 50 0 0 0.21 0.2 NaN
5 106 0 0 0.25 0.21 0.42
6 98 0 0 0.21 0.22 NaN
In standard layout, each row of the table shows information for one patient.
ID
indicates the ID of a patient. You do not includeID
as an input of a Cox model. IncludeID
in a data set to confirm that the data set is correctly converted to counting process form.Time
represents time to event in days, which corresponds to a response variable.Censoring
has the censorship information for each patient, where 1 indicates censored data and 0 indicates that the exact time to event is observed at the end of the observation period.Sex
is a time-independent predictor where 1 indicates female, and 0 indicates male.Lab_0
,Lab_50
, andLab_100
represent three consecutive laboratory results measured at day 0, 50, and 100, which correspond to a time-dependent predictor.
Step 3. Convert survival data to counting process form.
To convert the survival data labS
to counting process form, execute the code below. This code converts Time
to a risk interval (tStart
,tStop
] and combines three vectors of the time-dependent predictor, Lab_0
, Lab_50
, and Lab_100
, into one vector, Lab
.
mTime = [0 50 100]; % Measurement time threeLabs = [labS.Lab_0 labS.Lab_50 labS.Lab_100]; nLabMeasure = sum(sum(~isnan(threeLabs))); % Number of lab measurements data = zeros(nLabMeasure,6); % One row for each observation oID = 0; % Observation ID for i = 1 : size(labS,1) idx = find(mTime <= labS.Time(i)); for j = 1 : length(idx)-1 oID = oID + 1; data(oID,:) = [labS.ID(i) mTime(j:j+1) 1 labS.Sex(i) threeLabs(i,j)]; end oID = oID + 1; data(oID,:) = [labS.ID(i) mTime(length(idx)) labS.Time(i) ... labS.Censoring(i) labS.Sex(i) threeLabs(i,length(idx))]; end labCP = table(data(:,1),data(:,2),data(:,3),data(:,4),data(:,5),data(:,6), ... 'VariableNames', {'ID','tStart','tStop','Censoring','Sex','Lab'});
Display the survival data in counting process form.
labCP
labCP=13×6 table
ID tStart tStop Censoring Sex Lab
__ ______ _____ _________ ___ ____
1 0 46 0 1 0.3
2 0 50 1 0 0.2
2 50 100 1 0 0.23
2 100 138 1 0 0.39
3 0 50 1 1 0.18
3 50 94 0 1 0.22
4 0 50 1 0 0.21
4 50 50 0 0 0.2
5 0 50 1 0 0.25
5 50 100 1 0 0.21
5 100 106 0 0 0.42
6 0 50 1 0 0.21
6 50 98 0 0 0.22
In counting process form, each row of table labCP
shows information of one observation corresponding to one risk interval. Note that a Cox model assumes Lab
to be constant in the risk interval (tStart
,tStop
]. The value in Censoring
is 0 if an event is observed at the end of the risk interval, and 1 if an event is not observed.
For example, patient 3 has two laboratory measurements at day 0 and 50, so there are two rows of data for patient 3 in counting process form. A Cox model assumes the lab results 0.18 and 0.22 to be constant in the interval (0,50] and (50,94], respectively. Censoring
is 1 in (0,50] and 0 in (50,94] because the exact event time of patient 3 is observed at day 94.
Step 4. Adjust zero-length risk interval.
Find a patient who has a zero-length risk interval.
idxInvalid = labCP.ID(find(labCP.tStart == labCP.tStop))
idxInvalid = 4
Review the data for patient 4.
labCP(find(labCP.ID==idxInvalid),:)
ans=2×6 table
ID tStart tStop Censoring Sex Lab
__ ______ _____ _________ ___ ____
4 0 50 1 0 0.21
4 50 50 0 0 0.2
The time to event of patient 4 coincides with the measurement day 50. However, (50,50] is an invalid risk interval for a Cox model because the model does not accept a zero length interval. Adjust the risk interval to be valid. You can choose any value less than the time unit as an adjustment amount. The choice of an adjustment amount is arbitrary, and it does not change the result.
idxAdjust = find(labCP.ID==idxInvalid); labCP.tStop(idxAdjust(1)) = labCP.tStop(idxAdjust(1))-0.5; labCP.tStart(idxAdjust(2)) = labCP.tStart(idxAdjust(2))-0.5; labCP(idxAdjust,:)
ans=2×6 table
ID tStart tStop Censoring Sex Lab
__ ______ _____ _________ ___ ____
4 0 49.5 1 0 0.21
4 49.5 50 0 0 0.2
Step 5. Construct a Cox proportional hazards model.
Fit a Cox proportional hazards model with the time-independent variable Sex
and time-dependent variable Lab
.
X = [labCP.Sex labCP.Lab]; T = [labCP.tStart labCP.tStop]; b = coxphfit(X,T,'Censoring',labCP.Censoring,'Baseline',0)
b = 2×1
2.0054
29.7530
For details on how to assess a Cox proportional hazards model, see Cox Proportional Hazards Model for Censored Data.
See Also
Related Examples
- Hazard and Survivor Functions for Different Groups
- Survivor Functions for Two Groups
- Cox Proportional Hazards Model for Censored Data