1 Introduction and Set-Up

This document provides background information on the R code used in the survival analysis part of the Biostatistics II course.

We start by loading the packages that we will require during these illustrations. These are packages survival, splines, lattice and JM. The first three packages are recommended packages and exist by default in all R installations. Package JM though is an optional package. If you do not have it already installed, you will need to install it. The command to do this is install.packages("JM") and needs to be performed only once.

Assuming that all packages have been installed, the following commands load them into the active R session:

library("survival")
library("splines")
library("lattice")
library("JM")

Next, we load the datasets that we will need for our illustrations, namely the AIDS, PBC, Lung and Stanford datasets introduced in the main slides of the course. These datasets are available as objects aids.id, pbc2.id, lung and stanford2, respectively. The first two from package JM and the last two from package survival. To load the datasets from the packages we use the data() function, i.e.,

data("aids", package = "JM")
data("aids.id", package = "JM")
data("pbc2.id", package = "JM")

For the lung dataset we also need to perform two data management steps, in particular, to specify variable sex as a factor, and to remove the missing values from the variables we will later use. The relevant commands are:

lung$sex <- factor(lung$sex, levels = 1:2, labels = c("male", "female"))
lung <- with(lung, lung[complete.cases(time, status, sex, age, ph.karno), ])

2 Basic Survival Analysis

2.1 Estimators of the Survival Function

A key function for the analysis of survival data in R is function Surv(). This is used to specify the type of survival data that we have, namely, right censored, left censored, interval censored. For our illustrations, we will only consider right censored data. In this case, function Surv() accepts as first argument the observed survival times, and as second the event indicator. Function Surv() is used by many other functions we will see later.

The function that is used to calculate the Kaplan-Meier estimate is survfit(). Its first argument is an R formula. The left-hand side of this formula specifies the survival times information using function Surv(), and the right-hand side is used to specify grouping variables (e.g., if we would like to obtain separate estimates of the survival function per treatment group) – if we put ‘1’ as below, then one survival curve is estimated based on all the data. Argument data specifies the data frame that contains the variables of interest. For the Stanford data, we obtain the Kaplan-Meier estimator using the command:

KM_fit <- survfit(Surv(time, status) ~ 1, data = stanford2)
KM_fit

## Call: survfit(formula = Surv(time, status) ~ 1, data = stanford2)
## 
##        n events median 0.95LCL 0.95UCL
## [1,] 184    113    631     328    1232

The basic output of function survfit() gives the number of subjects, the number of events, the median survival time and its 95% confidence interval. Using the summary() method and its times argument we can obtain the survival probabilities at specific follow-up times. For example, the following code computes the survival probabilities at 1000 and 2000 days:

summary(KM_fit, times = c(1000, 2000))

## Call: survfit(formula = Surv(time, status) ~ 1, data = stanford2)
## 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##  1000     52      94    0.443  0.0396        0.372        0.528
##  2000     15      15    0.274  0.0440        0.200        0.375

The specification of the times argument is in the scale of the original time variable, which, in this case, was in days. Hence, if we would like to obtain the survival probability at one year, we would use summary(KM_fit, times = 365).

The quantile() method computes the corresponding follow-up times at which the survival probability takes a specific value. For example, we want to find at how many days the survival probability equals 0.7 and at how many days it equals 0.6 – the code is:

quantile(KM_fit, probs = 1 - c(0.7, 0.6))

## $quantile
##  30  40 
## 136 274 
## 
## $lower
##  30  40 
##  65 148 
## 
## $upper
##  30  40 
## 227 623

Note that in the probs argument of quantile() we have to specify one minus our target survival probabilities; this is because the function works under the cumulative distribution function (CDF) convention, and the CDF equals one minus survival probability. Moreover, also note that we obtain the lower and upper limits of the 95% confidence intervals for the quantiles we have asked.

The plot() method produces the figure of the estimated curve; by default, the 95% confidence interval is included when we have only one curve:

plot(KM_fit, xlab = "Time to Death (days)", ylab = "Survival Probability", 
    main = "Kaplan-Meier Estimate of S(t) for the Stanford Data")

To calculate the Breslow estimator, we use again function survfit() with the same syntax as for the Kaplan-Meier estimator – the only difference is that we now need to set argument type to "fleming-harrington". For the Stanford data we have:

Br_fit <- survfit(Surv(time, status) ~ 1, data = stanford2, 
    type = "fleming-harrington")
Br_fit

## Call: survfit(formula = Surv(time, status) ~ 1, data = stanford2, type = "fleming-harrington")
## 
##        n events median 0.95LCL 0.95UCL
## [1,] 184    113    729     328    1232

plot(Br_fit, mark.time = FALSE, xlab = "Time to Death (days)", 
    ylab = "Survival Probability", 
    main = "Breslow Estimate of S(t) for the Stanford Data")

The interpretation of the output, as well as functions summary() and quantile() work in the same manner as in the examples we have seen above for the Kaplan-Meier estimator.

2.2 Statistical Tests

We have two standard statistical tests for testing simple hypothesis with one grouping variable (i.e., to test the hypothesis whether the survival functions of different groups of subjects differ statistically significantly). The first and most widely used test is the log-rank test. This test is performed in R using function survdiff(). For example, to test with the log-rank test whether there are differences in the survival rates in the Lung dataset between males and females, we use the code:

logrank <- survdiff(Surv(time, status == 2) ~ sex, data = lung)
logrank

## Call:
## survdiff(formula = Surv(time, status == 2) ~ sex, data = lung)
## 
##              N Observed Expected (O-E)^2/E (O-E)^2/V
## sex=male   137      111     90.9      4.42        10
## sex=female  90       53     73.1      5.51        10
## 
##  Chisq= 10  on 1 degrees of freedom, p= 0.002

We have found a strong indication of a difference between the two groups. To test the same hypothesis but with the Peto & Peto Gehan-Wilcoxon test, we use the survdiff() function again, but now we set argument rho to 1:

peto_peto <- survdiff(Surv(time, status == 2) ~ sex, data = lung, rho = 1)
peto_peto

## Call:
## survdiff(formula = Surv(time, status == 2) ~ sex, data = lung, 
##     rho = 1)
## 
##              N Observed Expected (O-E)^2/E (O-E)^2/V
## sex=male   137     69.7     55.2      3.83      12.3
## sex=female  90     28.8     43.4      4.88      12.3
## 
##  Chisq= 12.3  on 1 degrees of freedom, p= 4e-04

The conclusion remains the same with a p-value of a smaller magnitude.

The log-rank test is the most powerful test when the proportional hazards (PH) assumption is satisfied. To check this assumption, we can plot the cumulative hazard functions for the two groups; when PH is satisfied the two curves will be proportional to each other (i.e., the steadily grow away of each other). The cumulative hazards plot can be produced by the plot() method for survfit() objects using the fun argument, e.g.,

KM_fit <- survfit(Surv(time, status == 2) ~ sex, data = lung)
plot(KM_fit, fun = "cumhaz")

3 Accelerated Failure Time Models

3.1 Model Fitting

The function that fits AFT models from the survival package is survreg(). Its first argument is a formula and has a similar syntax as the function survfit(). The dist argument specifies the distribution for the survival times (Note: the dist argument specifies the distribution of the survival times not the log survival times). The default distribution (i.e., if you do not specify the dist argument yourself) is the Weibull distribution. As with other model fitting functions in R, the summary() function returns a detailed output of the fitted model. As an example, we fit an AFT model assuming the Weibull distribution for the PBC dataset. We control for drug, sex and age. The code is:

fit_weibull <- survreg(Surv(years, status2) ~ drug + sex + age, data = pbc2.id)

summary(fit_weibull)

## 
## Call:
## survreg(formula = Surv(years, status2) ~ drug + sex + age, data = pbc2.id)
##                  Value Std. Error     z      p
## (Intercept)    4.13862    0.48260  8.58 <2e-16
## drugD-penicil  0.13046    0.15576  0.84  0.402
## sexfemale      0.44472    0.20147  2.21  0.027
## age           -0.03874    0.00793 -4.89  1e-06
## Log(scale)    -0.10223    0.07423 -1.38  0.168
## 
## Scale= 0.903 
## 
## Weibull distribution
## Loglik(model)= -494.7   Loglik(intercept only)= -511.8
##  Chisq= 34.23 on 3 degrees of freedom, p= 1.8e-07 
## Number of Newton-Raphson Iterations: 5 
## n= 312

To fit the same model but with the exponential distribution, the code is:

fit_exp <- survreg(Surv(years, status2) ~ drug + sex + age, data = pbc2.id,
                   dist = "exponential")

summary(fit_exp)

## 
## Call:
## survreg(formula = Surv(years, status2) ~ drug + sex + age, data = pbc2.id, 
##     dist = "exponential")
##                 Value Std. Error     z       p
## (Intercept)    4.3282     0.5159  8.39 < 2e-16
## drugD-penicil  0.1438     0.1721  0.84    0.40
## sexfemale      0.4816     0.2217  2.17    0.03
## age           -0.0420     0.0084 -5.00 5.7e-07
## 
## Scale fixed at 1 
## 
## Exponential distribution
## Loglik(model)= -495.6   Loglik(intercept only)= -512.3
##  Chisq= 33.35 on 3 degrees of freedom, p= 2.7e-07 
## Number of Newton-Raphson Iterations: 4 
## n= 312

To fit the same model but with the log-normal distribution, the code is:

fit_lnorm <- survreg(Surv(years, status2) ~ drug + sex + age, data = pbc2.id,
                   dist = "lognormal")

summary(fit_lnorm)

## 
## Call:
## survreg(formula = Surv(years, status2) ~ drug + sex + age, data = pbc2.id, 
##     dist = "lognormal")
##                  Value Std. Error     z       p
## (Intercept)    4.41534    0.59602  7.41 1.3e-13
## drugD-penicil  0.21196    0.18775  1.13    0.26
## sexfemale      0.30242    0.27761  1.09    0.28
## age           -0.04841    0.00944 -5.13 2.9e-07
## Log(scale)     0.34711    0.06473  5.36 8.2e-08
## 
## Scale= 1.41 
## 
## Log Normal distribution
## Loglik(model)= -498.9   Loglik(intercept only)= -515.2
##  Chisq= 32.72 on 3 degrees of freedom, p= 3.7e-07 
## Number of Newton-Raphson Iterations: 3 
## n= 312

To fit the same model but with the log-logistic distribution, the code is:

fit_llogis <- survreg(Surv(years, status2) ~ drug + sex + age, data = pbc2.id,
                   dist = "loglogistic")

summary(fit_llogis)

## 
## Call:
## survreg(formula = Surv(years, status2) ~ drug + sex + age, data = pbc2.id, 
##     dist = "loglogistic")
##                  Value Std. Error     z       p
## (Intercept)    4.00019    0.54926  7.28 3.3e-13
## drugD-penicil  0.14532    0.17387  0.84 0.40325
## sexfemale      0.38446    0.24161  1.59 0.11155
## age           -0.04151    0.00884 -4.69 2.7e-06
## Log(scale)    -0.26129    0.07326 -3.57 0.00036
## 
## Scale= 0.77 
## 
## Log logistic distribution
## Loglik(model)= -497.4   Loglik(intercept only)= -513
##  Chisq= 31.08 on 3 degrees of freedom, p= 8.2e-07 
## Number of Newton-Raphson Iterations: 4 
## n= 312

3.2 Effect Plots

In cases we consider statistical models with complex terms (interactions and nonlinear terms), the regression coefficients we obtain in the output do not have a straightforward interpretation. In these settings, we can use effect plots to communicate the results of the model. As an example, we fit a model in the PBC dataset that contains the effect of drug, the effect of sex, the linear effect of age, the quadratic effect of age, and the interaction effects between sex and the linear and quadratic effects of age, and between drug and the linear and quadratic effects of age. We assume that the survival times follow the Weibull distribution. The code to fit the model is:

fit <- survreg(Surv(years, status2) ~ (drug + sex) * (age + I(age^2)), 
               data = pbc2.id)

summary(fit)

## 
## Call:
## survreg(formula = Surv(years, status2) ~ (drug + sex) * (age + 
##     I(age^2)), data = pbc2.id)
##                            Value Std. Error     z    p
## (Intercept)             8.342101   6.266023  1.33 0.18
## drugD-penicil           3.489874   3.353680  1.04 0.30
## sexfemale              -6.523710   6.189094 -1.05 0.29
## age                    -0.161596   0.217081 -0.74 0.46
## I(age^2)                0.000822   0.001843  0.45 0.66
## drugD-penicil:age      -0.148887   0.127901 -1.16 0.24
## drugD-penicil:I(age^2)  0.001549   0.001197  1.29 0.20
## sexfemale:age           0.240257   0.215642  1.11 0.27
## sexfemale:I(age^2)     -0.002015   0.001847 -1.09 0.28
## Log(scale)             -0.113198   0.074516 -1.52 0.13
## 
## Scale= 0.893 
## 
## Weibull distribution
## Loglik(model)= -492.4   Loglik(intercept only)= -511.8
##  Chisq= 38.81 on 8 degrees of freedom, p= 5.3e-06 
## Number of Newton-Raphson Iterations: 5 
## n= 312

As mentioned above, in this particular example, the linear and quadratic effects of age, and the corresponding interactions effects cannot be easily interpreted. To visualize the model, we will use an effects plot. The first step is to specify a dataset that contains combinations of values for the covariates of the model based on which we will create the plot. In our example, we can create a single figure with all possible combinations of the covariates, i.e., both treatment groups, both sexes and increasing age. To create the dataset that contains all possible combinations of these covariates we use function expand.grid(). The code is:

ND <- with(pbc2.id, expand.grid(
    age = seq(30, 65, length.out = 25),
    sex = levels(sex),
    drug = levels(drug)
))

head(ND, 10)

##         age  sex    drug
## 1  30.00000 male placebo
## 2  31.45833 male placebo
## 3  32.91667 male placebo
## 4  34.37500 male placebo
## 5  35.83333 male placebo
## 6  37.29167 male placebo
## 7  38.75000 male placebo
## 8  40.20833 male placebo
## 9  41.66667 male placebo
## 10 43.12500 male placebo

Function seq() creates a sequence of numbers, in our example from 30 to 65 years old, with the length of the sequence equal to 25. Function levels() extracts from the pbc2.id database the possible values of the sex and drug variables. Function head() prints the first ten rows of this dataset. Note: In the definition of the data frame ND, all variables that appear in the model need to have their values specified. Also, you need to respect the class of these variables, e.g., in the example above and in the original data frame pbc2.id, variable sex is a factor. This is why we define its levels in ND and not, for example, the values 0 and 1.

Next, based on this dataset we compute the predictions from the model using function predict():

prs <- predict(fit, ND, se.fit = TRUE, type = "lp")
ND$pred <- prs[[1]]
ND$se <- prs[[2]]
ND$lo <- ND$pred - 1.96 * ND$se
ND$up <- ND$pred + 1.96 * ND$se

head(ND, 10)

##         age  sex    drug     pred        se       lo       up
## 1  30.00000 male placebo 4.234429 1.4533787 1.385807 7.083051
## 2  31.45833 male placebo 4.072483 1.3055301 1.513644 6.631322
## 3  32.91667 male placebo 3.914035 1.1659479 1.628777 6.199293
## 4  34.37500 male placebo 3.759085 1.0347438 1.730987 5.787183
## 5  35.83333 male placebo 3.607634 0.9120677 1.819981 5.395287
## 6  37.29167 male placebo 3.459681 0.7981221 1.895362 5.024000
## 7  38.75000 male placebo 3.315227 0.6931838 1.956586 4.673867
## 8  40.20833 male placebo 3.174270 0.5976336 2.002909 4.345632
## 9  41.66667 male placebo 3.036812 0.5119939 2.033304 4.040321
## 10 43.12500 male placebo 2.902853 0.4369676 2.046396 3.759309

The output of the predict() function as specified above is the predictions (in the log survival times scale) and the corresponding standard errors. The second and third lines of the code include as extra columns in the dataset ND these predictions and standard errors. The fourth and fifth lines calculate the lower and upper limits of a 95% confidence interval for the log survival times. Finally, we create the plot using function xyplot() from package lattice:

xyplot(pred + lo + up ~ age | drug * sex, data = ND, type = "l", 
    lty = c(1,2,2), col = "black", lwd = 2, xlab = "Age", 
    ylab = "Log Survival Time")

To create the plot for only one level of a categorical covariate, we use function factor() in the definition of the dataset ND in order to denote that this variable has two levels but only one of them is used. As an example, we recreate the plot from above but only for females. The specification of ND becomes:

ND <- with(pbc2.id, expand.grid(
    age = seq(30, 65, length.out = 25),
    sex = factor("female", levels = levels(sex)),
    drug = levels(drug)
))

head(ND, 10)

##         age    sex    drug
## 1  30.00000 female placebo
## 2  31.45833 female placebo
## 3  32.91667 female placebo
## 4  34.37500 female placebo
## 5  35.83333 female placebo
## 6  37.29167 female placebo
## 7  38.75000 female placebo
## 8  40.20833 female placebo
## 9  41.66667 female placebo
## 10 43.12500 female placebo

The rest of the code remains the same; the only extra difference is that we now do the plot in the original timescale and not for the log survival times - note the use of function exp() in the code below, i.e.,

prs <- predict(fit, ND, se.fit = TRUE, type = "lp")
ND$pred <- prs[[1]]
ND$se <- prs[[2]]
ND$lo <- exp(ND$pred - 1.96 * ND$se)
ND$up <- exp(ND$pred + 1.96 * ND$se)
ND$pred <- exp(ND$pred)

xyplot(pred + lo + up ~ age | drug, data = ND, type = "l", 
    lty = c(1,2,2), col = "black", lwd = 2, xlab = "Age", 
    ylab = "Survival Time")

Because the logarithmic transformation we have used is nonlinear, we need to apply a further bias correction to get the average survival times on the original scale. This is achieved with the following syntax:

prs <- predict(fit, ND, se.fit = TRUE, type = "lp")
ND$pred <- prs[[1]]
ND$se <- prs[[2]]
ND$lo <- ND$pred - 1.96 * ND$se
ND$up <- ND$pred + 1.96 * ND$se
var_logT <- fit$scale^2 * pi^2 / 6
bias_adj <- function (x) {
    exp(x) * (1 + 0.5 * var_logT)
}

xyplot(bias_adj(pred) + bias_adj(lo) + bias_adj(up) ~ age | drug, data = ND, 
       type = "l", lty = c(1,2,2), col = "black", lwd = 2, xlab = "Age", 
       ylab = "Survival Time - Bias adjusted")

In case another distribution was used instead of the Weibull, we would need to adjust the specification of var_logT above.

3.3 Hypothesis Testing

A statistical comparison between two nested models can be performed with a likelihood ratio test calculated by function anova(). As an example, we use again the complex model that we used for the effect plot in the previous section, i.e.,

fit_alt <- survreg(Surv(years, status2) ~ (drug + sex) * (age + I(age^2)), data = pbc2.id)

Say that we want to test whether we can drop all the complex terms, that is, the interaction and nonlinear terms. Then, the model above is the model under the alternative hypothesis (i.e., the full model). The model under the null hypothesis is the one that does not contain any of these terms (i.e., the reduced model):

fit_null <- survreg(Surv(years, status2) ~ drug + sex + age, data = pbc2.id)

We compare the two models using anova(), i.e.,

anova(fit_null, fit_alt)

##                             Terms Resid. Df    -2*LL Test Df Deviance  Pr(>Chi)
## 1                drug + sex + age       307 989.4673      NA       NA        NA
## 2 (drug + sex) * (age + I(age^2))       302 984.8867    =  5 4.580586 0.4691742

Note that the user is responsible to supply appropriately nested AFT models such that the LRT to be valid. The result suggests that we could use the simplified model.

3.4 Residuals

In order to check whether the assumed distribution of survival times fits the observed data sufficiently well, we can use the residuals of the model. However, because the residuals are calculated based on the observed event times, they will also be censored. Hence, to investigate whether they follow a particular distribution, we will need to employ a graphical procedure accounting for censoring. One way to achieve this is by using the Kaplan-Meier estimator of the residuals. As an example, we would like to investigate whether the Weibull distribution is an appropriate distribution for the survival times in the AIDS dataset. We first fit the model of interest:

fit_weib <- survreg(Surv(Time, death) ~ drug * gender, data = aids.id)

Next, we construct the residuals based on their definition

fitted_values <- fit_weib$linear.predictors
resids <- (log(fit_weib$y[, 1]) - fitted_values) / fit_weib$scale

We then compute the Kaplan-Meier estimate of these residuals, and we plot it. Lines 3-5 of the code below are used to superimpose in the graph the assumed Weibull distribution.

resKM <- survfit(Surv(resids, death) ~ 1, data = aids.id)
plot(resKM, mark.time = FALSE, xlab = "AFT Residuals", ylab = "Survival Probability")
xx <- seq(min(resids), max(resids), length.out = 35)
yy <- exp(- exp(xx))
lines(xx, yy, col = "red", lwd = 2)
legend("bottomleft", c("KM estimate", "95% CI KM estimate", 
    "Survival function of Extreme Value distribution"), 
    lty = c(1,2,1), col = c(1,1,2), bty = "n")

In this example, we observe that the Weibull distribution provides a good fit to the data. For models that assume other distributions of the survival times, the only thing that we need to change is the line of the code where the yy values are defined. In particular, for AFT models assuming the log-normal or the Gaussian distribution, we use yy <- pnorm(xx, lower.tail = FALSE), and for AFT models assuming the log-logistic or the logistic distribution, you should use yy <- plogis(xx, lower.tail = FALSE).

4 Cox Proportional Hazards Models

4.1 Model Fitting

The function that fits Cox models from the survival package is coxph(). It has similar syntax to survreg() that we saw in the previous section, with only exception that it does not have the dist argument (i.e., the Cox model is a semi-parametric model that does not assume a particular distribution for the survival times). As in the majority of the model fitting functions in R, the summary() method returns a detailed output of the fitted model. As an example, we fit a Cox model to the PBC dataset in which we include as covariates the drug, sex and age; the code is:

fit <- coxph(Surv(years, status2) ~ drug + sex + age, data = pbc2.id)
summary(fit)

## Call:
## coxph(formula = Surv(years, status2) ~ drug + sex + age, data = pbc2.id)
## 
##   n= 312, number of events= 140 
## 
##                    coef exp(coef)  se(coef)      z Pr(>|z|)    
## drugD-penicil -0.146013  0.864146  0.172143 -0.848   0.3963    
## sexfemale     -0.470905  0.624437  0.221785 -2.123   0.0337 *  
## age            0.042842  1.043773  0.008505  5.037 4.72e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##               exp(coef) exp(-coef) lower .95 upper .95
## drugD-penicil    0.8641     1.1572    0.6167    1.2109
## sexfemale        0.6244     1.6014    0.4043    0.9644
## age              1.0438     0.9581    1.0265    1.0613
## 
## Concordance= 0.629  (se = 0.024 )
## Likelihood ratio test= 33.25  on 3 df,   p=3e-07
## Wald test            = 34.87  on 3 df,   p=1e-07
## Score (logrank) test = 35.31  on 3 df,   p=1e-07

The column coef in the output denotes the log hazard ratios, and exp(coef) the corresponding hazard ratios.

4.2 Effect Plots

Similarly to AFT models, when we consider a more complex model that includes interactions and nonlinear terms, it is more useful to communicate the results of the model using an effects plot. The procedure to construct such a plot is the same as in AFT models. For example, we want to produce an effect plot for the following Cox model for the Lung dataset that contains the sex the nonlinear effect of age and its interaction with sex, and the nonlinear effect of ph.karno:

fit <- coxph(Surv(time, status) ~ ns(age, 3) * sex + ns(ph.karno, 3), data = lung)

We first construct the dataset that gives the combinations of covariate values based on which the plot will be constructed. This done with function expand.grid():

ND <- with(lung, expand.grid(
    age = seq(45, 75, length.out = 25),
    sex = levels(sex),
    ph.karno = quantile(ph.karno, c(0.25, 0.5, 0.75))
))
head(ND)

##     age  sex ph.karno
## 1 45.00 male       75
## 2 46.25 male       75
## 3 47.50 male       75
## 4 48.75 male       75
## 5 50.00 male       75
## 6 51.25 male       75

In the definition of ND all variables that are in the model need to have their values specified - see the relevant note also in the corresponding section in the AFT models. As seen above, we will do the plot, for increasing age from 45 to 75 years old, both males and females, and for ph.karno increasing from the 1st quartile, to the median, to the 3rd quartile. Next, we obtain the predictions from the model (these are in the log-hazard scale), and we calculate the 95% confidence interval:

prs <- predict(fit, newdata = ND, type = "lp", se.fit = TRUE)
ND$pred <- prs[[1]]
ND$se <- prs[[2]]
ND$lo <- ND$pred - 1.96 * ND$se
ND$up <- ND$pred + 1.96 * ND$se
ND$ph.karno <- factor(ND$ph.karno, labels = paste("PH Karno =", sort(unique(ND$ph.karno))))

The last line of the code transforms the ph.karno variable to a factor in order to have a more visible label in the plot that is produced with the following code:

xyplot(pred + lo + up ~ age | ph.karno * sex, data = ND, 
       type = "l", col = "black", lwd = 2, lty = c(1, 2, 2),
       abline = list(h = 0, lty = 2, lwd = 2, col = "red"),
       xlab = "Age (years)", ylab = "log Hazard Ratio")

4.3 Hypothesis Testing

When a single categorical covariate is included in the Cox model, the summary() method returns the Wald, score (in this case equivalent to the log-rank test), and likelihood ratio tests. For example, for the PBC data these three tests for the sex variable are:

fit <- coxph(Surv(years, status2) ~ sex, data = pbc2.id)
summary(fit)

## Call:
## coxph(formula = Surv(years, status2) ~ sex, data = pbc2.id)
## 
##   n= 312, number of events= 140 
## 
##              coef exp(coef) se(coef)      z Pr(>|z|)   
## sexfemale -0.6502    0.5219   0.2183 -2.979   0.0029 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##           exp(coef) exp(-coef) lower .95 upper .95
## sexfemale    0.5219      1.916    0.3403    0.8006
## 
## Concordance= 0.537  (se = 0.016 )
## Likelihood ratio test= 7.73  on 1 df,   p=0.005
## Wald test            = 8.87  on 1 df,   p=0.003
## Score (logrank) test = 9.18  on 1 df,   p=0.002

For a more complex hypothesis, we can use the likelihood ratio test by comparing the models under the null and alternative hypothesis. This is done using the anova() function. For example, in the PBC dataset, and for the Cox model that contains drug, age and their interaction, we would like to test for the effect of drug. Hence, the model under the null hypothesis is the one that does not contain drug. The relevant code is:

fit_null <- coxph(Surv(years, status2) ~ age, data = pbc2.id)
fit_alt <- coxph(Surv(years, status2) ~ drug * age, data = pbc2.id)

anova(fit_null, fit_alt)

## Analysis of Deviance Table
##  Cox model: response is  Surv(years, status2)
##  Model 1: ~ age
##  Model 2: ~ drug * age
##    loglik  Chisq Df Pr(>|Chi|)
## 1 -712.41                     
## 2 -711.29 2.2259  2     0.3286

As an additional example, we illustrate how we can test for non-linearity using natural cubic splines. We refit the above model by now allowing the effect of age to be nonlinear using natural cubic splines with 3 degrees of freedom.

fit_alt <- coxph(Surv(years, status2) ~ drug + ns(age, 3), data = pbc2.id)

To test if the effect of age is nonlinear in the log hazard scale, we fit the model under the null hypothesis that assumes a linear effect for age, and we compare the two models using the likelihood ratio test:

fit_null <- coxph(Surv(years, status2) ~ drug + age, data = pbc2.id)

anova(fit_null, fit_alt)

## Analysis of Deviance Table
##  Cox model: response is  Surv(years, status2)
##  Model 1: ~ drug + age
##  Model 2: ~ drug + ns(age, 3)
##    loglik  Chisq Df Pr(>|Chi|)
## 1 -711.97                     
## 2 -711.90 0.1233  2     0.9402

The result shows that there does not seem to be any indication that the effect of age is nonlinear (i.e., the two models do not differ statistically). Hence, we could proceed with the linear model.

4.4 Proportional Hazards Assumption

For categorical covariates, we can check the proportional hazards assumption by appropriately transforming the Kaplan-Meier estimate in the log-log scale. In this scale we would expect to see parallel lines. We illustrate this procedure for the sex variable from the PBC dataset.

KM_fit <- survfit(Surv(years, status2) ~ sex, data = pbc2.id)

par(mfrow = c(1, 2))

plot(KM_fit, xlab = "Years", ylab = "Survival", col = 1:2)
legend("topright", levels(pbc2.id$sex), lty = 1, col = 1:2, bty = "n")

plot(KM_fit, fun = function (s) -log(-log(s)), xlab = "Years", 
    ylab = "-log(- log(Survival))", col = 1:2)

The plot in the left panel of the figure is the classical Kaplan-Meier estimator (i.e., on the y-axis we have survival probabilities). The plot in the right panel has on the y-axis the \(-\log[-\log\{S(t)\}]\) transformation of the survival function \(S(t)\). We observe some strong indications of non-parallel lines. As a possible solution, we could stratify by sex.

For continuous covariates, we can test the proportional hazards assumption using the Schoenfeld residuals. We should note that this procedure also can be used for categorical covariates. We illustrate their use in a Cox model for the PBC dataset that contains sex, age and their interaction. The model is fitted with the code:

fit <- coxph(Surv(years, status2) ~ sex * age, data = pbc2.id)

The function that calculates the Schoenfeld residuals is cox.zph(). The two primary arguments of this function are the fitted Cox model and the transformation of time to be used. The code below does the calculations for the KM scale.

check_PH <- cox.zph(fit, transform = "km")
check_PH

##          chisq df    p
## sex     2.1697  1 0.14
## age     0.0011  1 0.97
## sex:age 1.9059  1 0.17
## GLOBAL  2.1754  3 0.54

The output of the function reports a test for non-proportionality (i.e., the null hypothesis is that the corresponding variable satisfies PH). However, it is more useful to inspect the PH assumption by plotting the Schoenfeld residuals graphically. This figure is produced with the plot() method. We produce the plot separately for each predictor of the model in order also to include the red line corresponding to PH.

plot(check_PH, var = 1)
abline(h = coef(fit)[1], col = "red", lwd = 2)

plot(check_PH, var = 2)
abline(h = coef(fit)[2], col = "red", lwd = 2)

plot(check_PH, var = 3)
abline(h = coef(fit)[3], col = "red", lwd = 2)

We observe that for all variables PH seems to hold.

Notes

There is a contradiction with regard to whether the sex variable satisfies PH between the first method we have seen based on the transformation of the Kaplan-Meier estimator and the second method based on the Schoenfeld residuals. From the two methods, most trustworthy is the latter.
From the plots, we observe that the residuals are quite evenly spaced along the x-axis and therefore there is no need to consider other transformations of the timescale.
Function cox.zph() has a logical argument called terms, which is by default set to TRUE. When terms = TRUE, the cox.zph() combines the residuals of different coefficients that correspond to the same model term, and produces one plot and one test for zero correlation. This comes into play, for example, when you use splines for a continuous predictor. In this case, the default behavior of cox.zph() will combine all spline terms into one plot. If you would like to get the plot per coefficient, you will need to set terms = FALSE in the call to cox.zph().

5 Extensions of the Cox Model

5.1 Survival Probabilities from Cox Models

As we have previously explained, the Cox model does not estimate the baseline hazard, and therefore we cannot directly obtain survival probabilities from it. To achieve that we need to combine it with a non-parametric estimator of the baseline hazard function. The most popular method to do that is to use the Breslow estimator. For a fitted Cox model from package survival these probabilities are calculated by function survfit(). As an illustration, we would like to derive survival probabilities from the following Cox model for the AIDS dataset:

fit <- coxph(Surv(Time, death) ~ CD4 + drug + gender, data = aids.id)

Since this model contains three covariates, we need to specify for which combinations of values of these covariates we want to derive the probabilities. These combinations of values need to be specified in a dataset. As an example, we derive survival probabilities for both males and females, who receive both treatments and have square root CD4 equal to 3. The code to create this data uses the expand.grid() function that we have seen before:

ND <- with(aids.id, expand.grid(CD4 = 3, drug = levels(drug), gender = levels(gender)))
ND

##   CD4 drug gender
## 1   3  ddC female
## 2   3  ddI female
## 3   3  ddC   male
## 4   3  ddI   male

Then we give the fitted model and the data defined above as arguments to survfit():

surv_probs_Cox <- survfit(fit, newdata = ND)
surv_probs_Cox

## Call: survfit(formula = fit, newdata = ND)
## 
##     n events median 0.95LCL 0.95UCL
## 1 467    188   12.2     9.8      NA
## 2 467    188   10.6     8.0    16.6
## 3 467    188   14.1    11.9    17.8
## 4 467    188   11.6    10.4    14.1

The output gives the median survival times and the corresponding 95% confidence intervals for these groups of patients. Similarly to how we worked in Section 2.1, we can derive the survival probabilities at specific follow-up times using the summary() method and the times argument. For example, the following code extracts the survival probabilities for the groups above at eight months of follow-up:

summary(surv_probs_Cox, times = 8)

## Call: survfit(formula = fit, newdata = ND)
## 
##  time n.risk n.event survival1 survival2 survival3 survival4
##     8    379      86       0.7     0.627     0.747     0.683

To obtain the corresponding lower and upper limits of the corresponding 95% confidence intervals, we need to manually extract them from the output of summary(), i.e.,

summ <- summary(surv_probs_Cox, times = 8)
out <- t(rbind(summ$surv, summ$lower, summ$upper))
colnames(out) <- c("Surv_Prob", "Lower", "Upper")
out

##   Surv_Prob     Lower     Upper
## 1 0.6998629 0.5850566 0.8371977
## 2 0.6274689 0.4966114 0.7928075
## 3 0.7471514 0.6894090 0.8097300
## 4 0.6834015 0.6182431 0.7554271

Finally, the plot() method produces the plot with all four survival curves

plot(surv_probs_Cox, col = c("black", "red", "blue", "green"),
     xlab = "Follow-Up Time (months)", ylab = "Survival Probabilities")
legend("bottomleft", paste("drug =", ND$drug, ", gender = ", ND$gender), 
       lty = 1, col = c("black", "red", "blue", "green"), bty = "n")

5.2 Stratification

In cases, we have inherent heterogeneity in the data (e.g., a multi-center trial) or we have a categorical variable that does not satisfy the proportional hazards assumption, we can use stratification. To fit the stratified Cox model, we include the stratifying factor in the formula of the model using the strata() function. For example, the following code fits a stratified Cox model for the PBC dataset with different baseline hazard functions for each sex:

fit <- coxph(Surv(years, status2) ~ age + drug + log(serBilir) + strata(sex), 
             data = pbc2.id)

summary(fit)

## Call:
## coxph(formula = Surv(years, status2) ~ age + drug + log(serBilir) + 
##     strata(sex), data = pbc2.id)
## 
##   n= 312, number of events= 140 
## 
##                    coef exp(coef)  se(coef)      z Pr(>|z|)    
## age            0.048250  1.049433  0.008483  5.688 1.28e-08 ***
## drugD-penicil -0.105392  0.899971  0.177047 -0.595    0.552    
## log(serBilir)  1.084214  2.957114  0.094508 11.472  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##               exp(coef) exp(-coef) lower .95 upper .95
## age               1.049     0.9529    1.0321     1.067
## drugD-penicil     0.900     1.1111    0.6361     1.273
## log(serBilir)     2.957     0.3382    2.4571     3.559
## 
## Concordance= 0.816  (se = 0.021 )
## Likelihood ratio test= 157.3  on 3 df,   p=<2e-16
## Wald test            = 155.7  on 3 df,   p=<2e-16
## Score (logrank) test = 181.6  on 3 df,   p=<2e-16

Note, a feature of stratification is that we correct (in the most general manner) the analysis for sex but we do not obtain any coefficient for sex.

If we would like to allow that the effect of age on the log hazard differs between males and females, we include the interaction term between age and strata(sex), i.e.,

fit_int <- coxph(Surv(years, status2) ~ age * strata(sex) + drug + log(serBilir), 
             data = pbc2.id)

summary(fit_int)

## Call:
## coxph(formula = Surv(years, status2) ~ age * strata(sex) + drug + 
##     log(serBilir), data = pbc2.id)
## 
##   n= 312, number of events= 140 
## 
##                           coef exp(coef) se(coef)      z Pr(>|z|)    
## age                    0.03427   1.03487  0.01788  1.917   0.0552 .  
## drugD-penicil         -0.11869   0.88809  0.17757 -0.668   0.5039    
## log(serBilir)          1.09558   2.99091  0.09526 11.501   <2e-16 ***
## age:strata(sex)female  0.01800   1.01817  0.02035  0.884   0.3764    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                       exp(coef) exp(-coef) lower .95 upper .95
## age                      1.0349     0.9663    0.9992     1.072
## drugD-penicil            0.8881     1.1260    0.6271     1.258
## log(serBilir)            2.9909     0.3343    2.4815     3.605
## age:strata(sex)female    1.0182     0.9822    0.9783     1.060
## 
## Concordance= 0.817  (se = 0.021 )
## Likelihood ratio test= 158.1  on 4 df,   p=<2e-16
## Wald test            = 156.8  on 4 df,   p=<2e-16
## Score (logrank) test = 181.9  on 4 df,   p=<2e-16

When we calculate survival probabilities (see Section 5.1) from a stratified Cox model, we obtain a different curve per stratum. For example, using the last model, we would like to calculate survival probabilities for the patient with the following characteristics:

ND <- with(pbc2.id, expand.grid(
    age = 60, 
    drug = factor("placebo", levels = levels(drug)),
    serBilir = 15,
    sex = factor("male", levels = levels(sex))
))

To compute the survival probabilities we use survfit() and we plot them using the plot() method:

sfit <- survfit(fit_int, newdata = ND)
plot(sfit, col = c("black", "red"))
legend("topright", levels(pbc2.id$sex), lty = 1, lwd = 2, col = c("black", "red"), 
       bty = "n")

We observe that even though we have specified that the patient for whom we want to calculate survival probabilities is a male, we obtain two curves both for males and females. This is because sex is a stratifying factor and not a covariate.

5.2.1 Step Functions for Time-Varying Coefficients

The stratified Cox model can also be used in settings in which a continuous covariate does not satisfy the proportional hazards assumption, and we want to fit instead a Cox model with a time-varying coefficient. We illustrate this for the PBC dataset, the ph.karno variable

fit <- coxph(Surv(time, status) ~ sex + ph.karno, data = lung)

As we have seen before in the plot of the Schoenfeld residuals, i.e.,

# PH for sex
plot(cox.zph(fit), var = 1)
abline(h = coef(fit)[1], lty = 2, col = "red")

# PH for ph.karno
plot(cox.zph(fit), var = 2)
abline(h = coef(fit)[2], lty = 2, col = "red")

there is a very mild violation of PH assumption for ph.karno but not for sex. Hence, we are going to illustrate how we can relax the PH assumption for ph.karno by splitting the follow-up period. Namely, from the figure of the Schoenfeld residuals we see that the time-varying coefficient for ph.karno seems to be nearly constant in the time periods from 0 to 270 days, and from 270 days to the end of the study. Thus, we will consider these two intervals. We split the original lung dataset using the survSplit() function. We will need to provide a formula that includes the covariates used in the formula of the Cox model (additively with no interactions or nonlinear terms even if you have included such terms in the Cox model), provide the original lung data in the data argument, then define in the cut argument the partition of the time scale, i.e., give the vector c(270), and finally also define the name of the variable that will denote the different pieces of the data.

lung2 <- survSplit(Surv(time, status) ~ sex + ph.karno, data = lung,
                   cut = c(270), episode = "time_group")

head(lung2, 10)

##     sex ph.karno tstart time status time_group
## 1  male       90      0  270      0          1
## 2  male       90    270  306      1          2
## 3  male       90      0  270      0          1
## 4  male       90    270  455      1          2
## 5  male       90      0  270      0          1
## 6  male       90    270 1010      0          2
## 7  male       90      0  210      1          1
## 8  male      100      0  270      0          1
## 9  male      100    270  883      1          2
## 10 male       50      0  270      0          1

Next we fit a stratified Cox model using this newly created dataset. Two notes:

In the formula argument of coxph() we will need to include the tstart, time and status variables.
We will need to stratify the model by the time_group variable, and include the interactions of this variable with the covariates for which we want to relax the PH assumption, i.e., here for ph.karno

Hence, the code will be

fit_tv_coef <- coxph(Surv(tstart, time, status) ~ sex + ph.karno:strata(time_group),
                     data = lung2)

summary(fit_tv_coef)

## Call:
## coxph(formula = Surv(tstart, time, status) ~ sex + ph.karno:strata(time_group), 
##     data = lung2)
## 
##   n= 334, number of events= 164 
## 
##                                              coef exp(coef)  se(coef)      z
## sexfemale                               -0.520440  0.594259  0.167571 -3.106
## ph.karno:strata(time_group)time_group=1 -0.032412  0.968108  0.007881 -4.113
## ph.karno:strata(time_group)time_group=2  0.002752  1.002756  0.008629  0.319
##                                         Pr(>|z|)    
## sexfemale                                 0.0019 ** 
## ph.karno:strata(time_group)time_group=1 3.91e-05 ***
## ph.karno:strata(time_group)time_group=2   0.7498    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                                         exp(coef) exp(-coef) lower .95
## sexfemale                                  0.5943     1.6828    0.4279
## ph.karno:strata(time_group)time_group=1    0.9681     1.0329    0.9533
## ph.karno:strata(time_group)time_group=2    1.0028     0.9973    0.9859
##                                         upper .95
## sexfemale                                  0.8253
## ph.karno:strata(time_group)time_group=1    0.9832
## ph.karno:strata(time_group)time_group=2    1.0199
## 
## Concordance= 0.641  (se = 0.025 )
## Likelihood ratio test= 26.3  on 3 df,   p=8e-06
## Wald test            = 27.23  on 3 df,   p=5e-06
## Score (logrank) test = 27.72  on 3 df,   p=4e-06

We obtain two coefficients for ph.karno according to the two strata. These coefficients are the log hazard ratios for a unit increase in ph.karno for patients of the same sex in the two periods, namely, from 0 to 270 days, and from 270 days to the end of the study. We observe that the effect of the ph.karno seems to be limited in the first period.

5.3 Time-Varying Covariates

The association between exogenous time-varying covariates and the risk of an event can be studied using the time-varying Cox model. To fit this model we use the counting process notation that utilizes the intervals created by the time points at which the covariate was recorded. As an illustration, we show how this model can be fitted using the CD4 cell count from the AIDS dataset. We should explicitly note here that this is for illustration purposes since the CD4 cell count is actually an endogenous time-varying covariates. First, let’s have a look at the first rows of the dataset with the start-stop format:

head(aids[c("patient", "start", "stop", "event", "CD4")], 7)

##   patient start  stop event       CD4
## 1       1     0  6.00     0 10.677078
## 2       1     6 12.00     0  8.426150
## 3       1    12 16.97     0  9.433981
## 4       2     0  6.00     0  6.324555
## 5       2     6 12.00     0  8.124038
## 6       2    12 18.00     0  4.582576
## 7       2    18 19.00     0  5.000000

The first patient has three measurements, whereas the second on four. The time-varying Cox model is fitted with the code:

td_Cox <- coxph(Surv(start, stop, event) ~ CD4, data = aids)
summary(td_Cox)

## Call:
## coxph(formula = Surv(start, stop, event) ~ CD4, data = aids)
## 
##   n= 1405, number of events= 188 
## 
##         coef exp(coef) se(coef)      z Pr(>|z|)    
## CD4 -0.19040   0.82663  0.02423 -7.858 3.92e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##     exp(coef) exp(-coef) lower .95 upper .95
## CD4    0.8266       1.21    0.7883    0.8668
## 
## Concordance= 0.691  (se = 0.018 )
## Likelihood ratio test= 90.15  on 1 df,   p=<2e-16
## Wald test            = 61.74  on 1 df,   p=4e-15
## Score (logrank) test = 69.04  on 1 df,   p=<2e-16

To more appropriately account for the endogenous nature of the CD4 cell count, we should fit a joint model. This can be done with function jointModel() from the JM package. This function requires first fitting a linear mixed effects model for the time-varying covariates, a Cox model that may contain other baseline covariates (here we have none), and the we give these two object in the function as main arguments, i.e.,

mixed_model_CD4 <- lme(CD4 ~ obstime, data = aids, random = ~ obstime | patient)
base_Cox <- coxph(Surv(Time, death) ~ 1, data = aids.id, x = TRUE)
joint_model <- jointModel(mixed_model_CD4, base_Cox, timeVar = "obstime",
                          method = "piecewise-PH-aGH")

exp(confint(joint_model, "Event"))

##            2.5 %      est.    97.5 %
## Assoct 0.6993049 0.7506215 0.8057038

We observe that from the Cox model the hazard ratio for a unit increase of the square root CD4 cell count is 0.83 (95% CI: 0.79; 0.87), whereas from the joint model 0.75 (95% CI: 0.70; 0.80). This is a noteworthy difference between the two approaches suggesting that the endogenous nature of the CD4 cell count needs to be taken into account. More information regarding joint models can be found in the end of this document.

5.4 Clustered Event Time Data

We have already seen in Section 5.1 that the stratified Cox model can be used when patients are clustered within groups. However, stratification works satisfactorily when there is sufficient information within each cluster (i.e., sufficient number of events). When this is not the case, we need to utilize different methods. Two general approaches to handle clustered event time data are the marginal approach and the conditional/frailty approach. The first focuses on inferences across clusters. For example, in a multi-center trial, we want to estimate the pooled treatment effect across centers. To account for clustering in a Cox model and obtain the pooled effect, we adjust the standard errors of the resulting estimates using the grouped jackknife variance estimate. This is automatically implemented when we use function cluster(). As an example, for the Lung data, we account for the fact that patients have been coming from different institutions using the code:

marginal_model <- coxph(Surv(time, status) ~ age + cluster(inst), data = lung)

summary(marginal_model)

## Call:
## coxph(formula = Surv(time, status) ~ age, data = lung, cluster = inst)
## 
##   n= 226, number of events= 163 
##    (1 observation deleted due to missingness)
## 
##         coef exp(coef) se(coef) robust se     z Pr(>|z|)   
## age 0.018726  1.018903 0.009239  0.007096 2.639  0.00831 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##     exp(coef) exp(-coef) lower .95 upper .95
## age     1.019     0.9814     1.005     1.033
## 
## Concordance= 0.551  (se = 0.025 )
## Likelihood ratio test= 4.21  on 1 df,   p=0.04
## Wald test            = 6.96  on 1 df,   p=0.008
## Score (logrank) test = 4.12  on 1 df,   p=0.04,   Robust = 3.57  p=0.06
## 
##   (Note: the likelihood ratio and score tests assume independence of
##      observations within a cluster, the Wald and robust score tests do not).

We observe that there is an extra column in the output with the robust (jackknife-corrected) standard errors. The reported p-values are based on those robust standard errors.

When instead the focus is on inferences within the clusters, or we are interested in quantifying the heterogeneity between the clusters, the frailty/conditional approach should be used. In this approach, we assume that there is an unobserved variable which all members within a cluster share. Due to the sharing of this variable correlation is generated. To fit a frailty model for the Lung data, we use function frailty() and the following call to coxph():

frailty_model <- coxph(Surv(time, status) ~ age + frailty(inst), 
    data = lung)

summary(frailty_model)

## Call:
## coxph(formula = Surv(time, status) ~ age + frailty(inst), data = lung)
## 
##   n= 226, number of events= 163 
##    (1 observation deleted due to missingness)
## 
##               coef    se(coef) se2      Chisq DF p    
## age           0.01873 0.009239 0.009239 4.11  1  0.043
## frailty(inst)                           0.00  0  0.920
## 
##     exp(coef) exp(-coef) lower .95 upper .95
## age     1.019     0.9814     1.001     1.038
## 
## Iterations: 6 outer, 20 Newton-Raphson
##      Variance of random effect= 5e-07   I-likelihood = -737.3 
## Degrees of freedom for terms= 1 0 
## Concordance= 0.554  (se = 0.025 )
## Likelihood ratio test= 4.21  on 1 df,   p=0.04

In this particular example, we observe that the heterogeneity between the different institutions (quantified by the variance of the frailty term) is rather minimal.

5.5 Competing Risks

In settings in which we have multiple correlated types of events occurring to the patients, and we are not (only) interested in the composite event, we will need to account for the competing risks problem. In cases of competing risks, the classical Kaplan-Meier estimator (i.e., treating the other event as censored), gives biased estimates of cumulative incidences (cumulative incidence equals one minus the survival probability that we get from the Kaplan-Meier estimator). To obtain unbiased estimates of the cumulative incidence function per type of event, we will need to account for the competition between them. A non-parametric estimator of these functions is obtained using function survfit(). As an example, we illustrate its use in the PBC dataset in which some patients died, and some received a liver transplant. To appropriately account for competing risks we need two event indicators, the indicator of each risk, this is variable status in the pbc2.id dataset, and the indicator of any event. We construct this variable, named status3, with the following line of code:

pbc2.id$status3 <- as.numeric(pbc2.id$status != "alive")

Using these two variables, unbiased estimates of the cumulative incidence functions are obtained with the following call to survfit(),

fit <- survfit(Surv(years, status3) ~ 1, data = pbc2.id, etype = status)

We plot these estimated functions with the plot() method:

plot(fit, fun = "event", col = c("black", "red"), lwd = 2, ylim = c(0, 0.7), 
     xlab = "Follow-Up (years)", ylab = "Cumulative Incidence")
legend("topleft", levels(pbc2.id$status)[2:3], lty = 1, lwd = 2, col = c("black", "red"),
       bty = "n")

Contrary to the Kaplan-Meier estimator, the Cox model and because of the use of the partial likelihood provides correct hazard ratios in the context of competing risks by treating the other event as censored. Though these hazard ratios cannot be readily transformed to cumulative incidences by directly using the Breslow estimator of Section 5.1. Following on the PBC dataset, we fit cause-specific hazard regression models for transplanted and dead patients:

fit1 <- coxph(Surv(years, status == "transplanted") ~ drug + age, 
    data = pbc2.id)
summary(fit1)

## Call:
## coxph(formula = Surv(years, status == "transplanted") ~ drug + 
##     age, data = pbc2.id)
## 
##   n= 312, number of events= 29 
## 
##                   coef exp(coef) se(coef)      z Pr(>|z|)    
## drugD-penicil -0.23680   0.78915  0.37723 -0.628     0.53    
## age           -0.09649   0.90802  0.02265 -4.259 2.05e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##               exp(coef) exp(-coef) lower .95 upper .95
## drugD-penicil    0.7891      1.267    0.3768    1.6530
## age              0.9080      1.101    0.8686    0.9492
## 
## Concordance= 0.73  (se = 0.042 )
## Likelihood ratio test= 22.32  on 2 df,   p=1e-05
## Wald test            = 18.54  on 2 df,   p=9e-05
## Score (logrank) test = 20.1  on 2 df,   p=4e-05

fit2 <- coxph(Surv(years, status == "dead") ~ drug + age, data = pbc2.id)
summary(fit2)

## Call:
## coxph(formula = Surv(years, status == "dead") ~ drug + age, data = pbc2.id)
## 
##   n= 312, number of events= 140 
## 
##                    coef exp(coef)  se(coef)      z Pr(>|z|)    
## drugD-penicil -0.162071  0.850380  0.172501 -0.940    0.347    
## age            0.045718  1.046780  0.008487  5.387 7.16e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##               exp(coef) exp(-coef) lower .95 upper .95
## drugD-penicil    0.8504     1.1759    0.6064     1.192
## age              1.0468     0.9553    1.0295     1.064
## 
## Concordance= 0.629  (se = 0.024 )
## Likelihood ratio test= 29.15  on 2 df,   p=5e-07
## Wald test            = 29.04  on 2 df,   p=5e-07
## Score (logrank) test = 29.47  on 2 df,   p=4e-07

An alternative framework for competing risks analysis that directly gives results on the cumulative incidence functions scale is the Fine-Gray model. This can be fitted with package cmprsk. Additional information also in the package mstate.

Survival Analysis in R Companion

Dimitris Rizopoulos

October 11, 2023