1 Bio-SHiFT Data Analysis

1.1 Descriptive Figures

Sample Means: per longitudinal outcome; the red superimposed line is the loess curve.
Sample Patients: longitudinal trajectories from 25 randomly selected patients; patient numbers with an asterisk denote patients who experienced the composite endpoint.

1.1.1 Sample Means

1.1.2 Sample Patients

1.1.3 Kaplan-Meier Estimate of the Composite Endpoint

1.2 Parameter Estimates and Information Criteria

The four models fitted to the Bio-SHiFT data are:
- jFit_lin_val: linear fixed and random effects structure, current value functional form.
- jFit_lin_area: linear fixed and random effects structure, area/integral functional form.
- jFit_nonlin_val: nonlinear (splines) fixed and random effects structure, current value functional form.
- jFit_nonlin_area: nonlinear (splines) fixed and random effects structure, area/integral functional form.
We show parameter estimates and 95% credible intervals for each fitted joint model. For the survival submodel, hazard ratios (the exponentials of the corresponding regression coefficients) are presented.
Comparison: The Deviance Information Criterion (DIC), the Watanabe-Akaike information criterion (WAIC), and the log pseudo marginal likelihood (LPML) for the fitted joint models. For DIC and WAIC smaller values are better, whereas for LPML larger values are better.
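
As an illustration of the two functional forms listed above, the following minimal sketch evaluates the current value and the area under a subject-specific linear trajectory. The intercept `b0`, slope `b1`, and time point `t` are hypothetical values chosen for illustration, not estimates from the fitted models. Whether the fitting software uses the raw integral or a time-averaged version is left open here as an assumption, so both are shown.

```python
def current_value(b0, b1, t):
    """Value of the linear trajectory b0 + b1 * t at time t."""
    return b0 + b1 * t


def area(b0, b1, t):
    """Closed-form integral of b0 + b1 * s for s from 0 to t."""
    return b0 * t + 0.5 * b1 * t ** 2


def area_trapezoid(b0, b1, t, n=1000):
    """Same integral via the trapezoid rule, as one would do when
    no closed form is available (for example, a spline trajectory)."""
    h = t / n
    values = [current_value(b0, b1, i * h) for i in range(n + 1)]
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


# Hypothetical subject-specific coefficients and evaluation time (years).
b0, b1, t = 5.7, 0.11, 2.0

value_at_t = current_value(b0, b1, t)
raw_area = area(b0, b1, t)
time_averaged_area = raw_area / t  # assumption: one possible normalisation

print(f"value: {value_at_t:.3f}")
print(f"area: {raw_area:.3f}")
print(f"time-averaged area: {time_averaged_area:.3f}")
```

The time-averaged area stays on the same scale as the outcome, which is one reason hazard ratios for the value and area forms can be of similar magnitude.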

1.2.1 Linear Mixed Models - Value Functional Form

##                             HR  2.5%  97.5%
## AGE                      0.993 0.964  1.024
## SEX                      0.944 0.466  1.909
## NYHA123                  1.963 1.392  2.785
## DM                       1.517 0.935  2.482
## IHD                      1.300 0.790  2.172
## Diuretics                3.110 0.830 17.985
## value(eGFR_CKDEPI_per10) 1.151 0.823  1.610
## value(Log2_NGAL_P)       3.054 1.699  5.483
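
To show how the hazard ratios in the table above can be read, the sketch below takes the two association rows, maps them back to the log-hazard scale, and checks whether each 95% credible interval excludes 1. The helper names are hypothetical. Reading one unit of `Log2_NGAL_P` as a doubling of NGAL is an assumption based on the variable name. Taking the logarithm of a reported HR only approximately recovers the posterior summary of the coefficient.

```python
import math

# (HR, 2.5%, 97.5%) copied from the value-form table above.
rows = {
    "value(eGFR_CKDEPI_per10)": (1.151, 0.823, 1.610),
    "value(Log2_NGAL_P)": (3.054, 1.699, 5.483),
}


def to_log_scale(hr, lower, upper):
    """Map a hazard ratio and its interval to the log-hazard scale."""
    return tuple(math.log(x) for x in (hr, lower, upper))


def excludes_one(lower, upper):
    """True when the interval for the hazard ratio does not contain 1."""
    return lower > 1.0 or upper < 1.0


def scale_hr(hr, lower, upper, k):
    """Hazard ratio for a k-unit change in the covariate.
    Interval bounds transform exactly (monotone map of quantiles);
    the point estimate is only approximate if it is a posterior mean."""
    return tuple(x ** k for x in (hr, lower, upper))


for name, (hr, lower, upper) in rows.items():
    log_hr = to_log_scale(hr, lower, upper)[0]
    print(f"{name}: log HR = {log_hr:.3f}, "
          f"interval excludes 1: {excludes_one(lower, upper)}")

# Two units on the log2 scale, i.e. a fourfold increase under the
# log2 assumption stated above.
print(scale_hr(*rows["value(Log2_NGAL_P)"], k=2))
```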

## $eGFR_CKDEPI_per10
##                Mean   2.5%  97.5%
## (Intercept)  10.511  9.452 11.559
## OBSTIME_YEAR  0.049 -0.079  0.175
## AGE          -0.067 -0.081 -0.053
## SEX           0.918  0.532  1.302
## NYHA123      -0.006 -0.238  0.235
## DM           -0.368 -0.736 -0.002
## IHD           0.421  0.060  0.776
## Diuretics    -0.907 -1.465 -0.352
## sigma         1.608  1.553  1.664
## 
## $Log2_NGAL_P
##                Mean   2.5% 97.5%
## (Intercept)   5.660  5.216 6.112
## OBSTIME_YEAR  0.112  0.084 0.140
## AGE           0.019  0.013 0.025
## SEX           0.149 -0.017 0.313
## NYHA123       0.145  0.041 0.246
## DM            0.161  0.003 0.317
## IHD          -0.105 -0.257 0.046
## Diuretics     0.311  0.065 0.553
## sigma         0.393  0.379 0.406

1.2.2 Linear Mixed Models - Area Functional Form

##                            HR  2.5%  97.5%
## AGE                     0.994 0.963  1.028
## SEX                     0.951 0.450  1.999
## NYHA123                 1.985 1.412  2.849
## DM                      1.534 0.944  2.510
## IHD                     1.286 0.758  2.183
## Diuretics               3.203 0.811 20.473
## area(eGFR_CKDEPI_per10) 1.159 0.818  1.680
## area(Log2_NGAL_P)       3.057 1.694  5.576

## $eGFR_CKDEPI_per10
##                Mean   2.5%  97.5%
## (Intercept)  10.509  9.460 11.549
## OBSTIME_YEAR  0.050 -0.078  0.175
## AGE          -0.067 -0.081 -0.053
## SEX           0.913  0.527  1.296
## NYHA123      -0.007 -0.238  0.234
## DM           -0.367 -0.737 -0.003
## IHD           0.417  0.052  0.778
## Diuretics    -0.908 -1.465 -0.345
## sigma         1.609  1.552  1.667
## 
## $Log2_NGAL_P
##                Mean   2.5% 97.5%
## (Intercept)   5.662  5.221 6.101
## OBSTIME_YEAR  0.110  0.084 0.137
## AGE           0.019  0.013 0.025
## SEX           0.149 -0.013 0.311
## NYHA123       0.144  0.042 0.244
## DM            0.162  0.007 0.315
## IHD          -0.105 -0.257 0.045
## Diuretics     0.310  0.070 0.547
## sigma         0.393  0.379 0.407

1.2.3 Nonlinear Mixed Models - Value Functional Form

##                             HR  2.5%  97.5%
## AGE                      0.996 0.967  1.028
## SEX                      0.916 0.448  1.892
## NYHA123                  1.986 1.389  2.848
## DM                       1.547 0.938  2.519
## IHD                      1.291 0.770  2.148
## Diuretics                3.153 0.850 18.374
## value(eGFR_CKDEPI_per10) 1.203 0.860  1.705
## value(Log2_NGAL_P)       3.055 1.712  5.553

## $eGFR_CKDEPI_per10
##                                       Mean   2.5%  97.5%
## (Intercept)                         10.404  9.412 11.425
## ns(OBSTIME_YEAR, 3, B = c(0, 3.2))1  0.595  0.154  1.018
## ns(OBSTIME_YEAR, 3, B = c(0, 3.2))2 -0.471 -1.049  0.117
## ns(OBSTIME_YEAR, 3, B = c(0, 3.2))3 -0.623 -1.327  0.109
## AGE                                 -0.066 -0.080 -0.052
## SEX                                  0.945  0.572  1.314
## NYHA123                              0.016 -0.211  0.252
## DM                                  -0.373 -0.736 -0.009
## IHD                                  0.369  0.020  0.719
## Diuretics                           -0.843 -1.386 -0.304
## sigma                                1.582  1.523  1.641
## 
## $Log2_NGAL_P
##                                       Mean   2.5% 97.5%
## (Intercept)                          5.722  5.283 6.161
## ns(OBSTIME_YEAR, 3, B = c(0, 3.2))1  0.056 -0.045 0.159
## ns(OBSTIME_YEAR, 3, B = c(0, 3.2))2  0.358  0.217 0.511
## ns(OBSTIME_YEAR, 3, B = c(0, 3.2))3  0.529  0.369 0.706
## AGE                                  0.019  0.013 0.025
## SEX                                  0.147 -0.017 0.311
## NYHA123                              0.134  0.034 0.235
## DM                                   0.169  0.010 0.323
## IHD                                 -0.098 -0.247 0.052
## Diuretics                            0.304  0.060 0.544
## sigma                                0.384  0.371 0.399

1.2.4 Nonlinear Mixed Models - Area Functional Form

##                            HR  2.5%  97.5%
## AGE                     0.992 0.964  1.023
## SEX                     1.000 0.492  2.046
## NYHA123                 1.982 1.401  2.843
## DM                      1.520 0.933  2.499
## IHD                     1.313 0.773  2.223
## Diuretics               3.077 0.797 18.751
## area(eGFR_CKDEPI_per10) 1.107 0.801  1.541
## area(Log2_NGAL_P)       2.799 1.575  5.017

## $eGFR_CKDEPI_per10
##                                       Mean   2.5%  97.5%
## (Intercept)                         10.419  9.411 11.460
## ns(OBSTIME_YEAR, 3, B = c(0, 3.2))1  0.589  0.149  1.015
## ns(OBSTIME_YEAR, 3, B = c(0, 3.2))2 -0.498 -1.067  0.105
## ns(OBSTIME_YEAR, 3, B = c(0, 3.2))3 -0.639 -1.321  0.089
## AGE                                 -0.066 -0.080 -0.052
## SEX                                  0.945  0.569  1.321
## NYHA123                              0.013 -0.218  0.245
## DM                                  -0.371 -0.737 -0.009
## IHD                                  0.366  0.027  0.721
## Diuretics                           -0.848 -1.389 -0.313
## sigma                                1.582  1.523  1.642
## 
## $Log2_NGAL_P
##                                       Mean   2.5% 97.5%
## (Intercept)                          5.720  5.277 6.154
## ns(OBSTIME_YEAR, 3, B = c(0, 3.2))1  0.056 -0.049 0.162
## ns(OBSTIME_YEAR, 3, B = c(0, 3.2))2  0.365  0.174 0.489
## ns(OBSTIME_YEAR, 3, B = c(0, 3.2))3  0.531  0.334 0.704
## AGE                                  0.019  0.013 0.025
## SEX                                  0.147 -0.017 0.308
## NYHA123                              0.133  0.035 0.235
## DM                                   0.168  0.010 0.324
## IHD                                 -0.098 -0.247 0.053
## Diuretics                            0.304  0.058 0.542
## sigma                                0.385  0.371 0.399

1.2.5 Comparison

## 
##                        DIC     WAIC      LPML
##  jFit_nonlin_area 10703.00 11023.70 -5612.088
##     jFit_lin_area 11120.25 11108.45 -5584.623
##      jFit_lin_val 11167.66 11120.05 -5588.302
##   jFit_nonlin_val 10704.55 11135.23 -5796.616
## 
## The criteria are calculated based on the marginal log-likelihood.
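
The three criteria need not agree on a single best model. The short sketch below ranks the four models using the values printed in the table above, with smaller-is-better for DIC and WAIC and larger-is-better for LPML. The dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Values copied from the comparison table above.
criteria = {
    "jFit_nonlin_area": {"DIC": 10703.00, "WAIC": 11023.70, "LPML": -5612.088},
    "jFit_lin_area": {"DIC": 11120.25, "WAIC": 11108.45, "LPML": -5584.623},
    "jFit_lin_val": {"DIC": 11167.66, "WAIC": 11120.05, "LPML": -5588.302},
    "jFit_nonlin_val": {"DIC": 10704.55, "WAIC": 11135.23, "LPML": -5796.616},
}

# Direction of preference for each criterion.
smaller_is_better = {"DIC": True, "WAIC": True, "LPML": False}


def rank_models(table, criterion, smaller_better):
    """Return model names ordered from best to worst on one criterion."""
    return sorted(
        table,
        key=lambda model: table[model][criterion],
        reverse=not smaller_better,
    )


best = {
    name: rank_models(criteria, name, direction)[0]
    for name, direction in smaller_is_better.items()
}

for name, model in best.items():
    print(f"best by {name}: {model}")
```

On these values DIC and WAIC favour jFit_nonlin_area, whereas LPML favours jFit_lin_area.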

1.3 Longitudinal Models: Posterior-Posterior Checks

The four tabs show the empirical cumulative distribution, mean, variance, and semi-variogram functions for the two fitted joint models using the current value functional form. The results from the area/integral functional form were very similar.
In each figure and for the two longitudinal outcomes, we compare the observed data metric (black line) with the same metric calculated in 50 simulated datasets from the corresponding model (grey lines).
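
The comparison described above can be sketched for the empirical cumulative distribution function as follows. This is a toy illustration under stated assumptions: the "observed" data and the 50 replicates are both drawn from a hypothetical normal model with fixed parameters, whereas in an actual posterior predictive check each replicate would be simulated from the fitted joint model.

```python
import random
from bisect import bisect_right


def ecdf(sample, grid):
    """Empirical CDF of `sample` evaluated at each point of `grid`."""
    ordered = sorted(sample)
    n = len(ordered)
    return [bisect_right(ordered, x) / n for x in grid]


rng = random.Random(2024)

# Hypothetical observed outcome values (black line in the figures).
observed = [rng.gauss(5.0, 1.0) for _ in range(200)]

# 50 replicated datasets (grey lines); assumption: fixed parameters
# instead of draws from a posterior distribution.
replicates = [[rng.gauss(5.0, 1.0) for _ in range(200)] for _ in range(50)]

grid = [2.0 + 0.1 * i for i in range(61)]
observed_curve = ecdf(observed, grid)
replicate_curves = [ecdf(rep, grid) for rep in replicates]

# Pointwise envelope of the replicated curves.
lower = [min(curve[k] for curve in replicate_curves) for k in range(len(grid))]
upper = [max(curve[k] for curve in replicate_curves) for k in range(len(grid))]

inside = sum(lo <= obs <= up for lo, obs, up in zip(lower, observed_curve, upper))
print(f"{inside} of {len(grid)} grid points lie inside the replicate envelope")
```

An observed curve that leaves the envelope over a range of values would point to misfit in that part of the distribution.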

1.3.1 ECDF

1.3.2 Mean Function

1.3.3 Variance Function

1.3.4 Correlation Structure

1.4 Longitudinal Models: Posterior-Prior Checks

The four tabs show the empirical cumulative distribution, mean, variance, and semi-variogram functions for the two fitted joint models using the current value functional form. The results from the area/integral functional form were very similar.
In each figure and for the two longitudinal outcomes, we compare the observed data metric (black line) with the same metric calculated in 50 simulated datasets from the corresponding model (grey lines).

1.4.1 ECDF

1.4.2 Mean Function

1.4.3 Variance Function

1.4.4 Correlation Structure

1.5 Longitudinal Models: Cross-Validated Posterior-Prior Checks

The four tabs show the empirical cumulative distribution, mean, variance, and semi-variogram functions for the two fitted joint models using the current value functional form. The results from the area/integral functional form were very similar.
In each figure and for the two longitudinal outcomes, we compare the observed data metric (black line) with the same metric calculated in 50 simulated datasets from the corresponding model (grey lines).
The simulated datasets have been generated using the cross-validation procedure described in the main paper using \(V = 10\) folds.
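
A generic way to set up such folds is to partition patients, rather than individual measurements, into \(V\) groups. The sketch below is an assumption about the mechanics only and is not taken from the main paper: the patient identifiers, the seed, and the function name are hypothetical, and the model fitting and simulation steps are omitted.

```python
import random


def make_folds(subject_ids, n_folds, seed=1):
    """Randomly partition subject identifiers into n_folds groups
    whose sizes differ by at most one."""
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)
    return [ids[k::n_folds] for k in range(n_folds)]


subject_ids = list(range(1, 101))  # hypothetical patient identifiers
folds = make_folds(subject_ids, n_folds=10)

splits = []
for test_ids in folds:
    held_out = set(test_ids)
    train_ids = [sid for sid in subject_ids if sid not in held_out]
    # In a cross-validated check, the model would be fitted on train_ids
    # and datasets simulated for test_ids (not shown here).
    splits.append((train_ids, test_ids))

print([len(test_ids) for _, test_ids in splits])
```

Splitting at the patient level keeps all repeated measurements of one patient in the same fold.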

1.5.1 ECDF

1.5.2 Mean Function

1.5.3 Variance Function

1.5.4 Correlation Structure

1.6 Longitudinal Models: Cross-Validated Dynamic-Posterior-Posterior Checks

The four tabs show the empirical cumulative distribution, mean, variance, and semi-variogram functions for the two fitted joint models using the current value functional form. The results from the area/integral functional form were very similar.
In each figure and for the two longitudinal outcomes, we compare the observed data metric (black line) with the same metric calculated in 50 simulated datasets from the corresponding model (grey lines) using longitudinal information up to the landmark time \(t_L = 1\) year.
The simulated datasets have been generated using the cross-validation procedure described in the main paper using \(V = 10\) folds.
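
One plausible reading of "using longitudinal information up to the landmark time" is sketched below: keep only patients still at risk at \(t_L\), and for those patients keep only the measurements taken at or before \(t_L\). The records, follow-up times, and function name are hypothetical, and the exact rule used in the main paper may differ.

```python
def landmark_data(measurements, follow_up, t_landmark):
    """Keep measurements up to t_landmark for subjects whose follow-up
    extends beyond t_landmark (i.e., who are still at risk)."""
    at_risk = {sid for sid, time in follow_up.items() if time > t_landmark}
    return [
        (sid, time, y)
        for sid, time, y in measurements
        if sid in at_risk and time <= t_landmark
    ]


# Hypothetical long-format records: (patient id, time in years, outcome).
measurements = [
    (1, 0.0, 5.1), (1, 0.5, 5.3), (1, 1.5, 5.6),
    (2, 0.0, 6.0), (2, 0.4, 6.2),
    (3, 0.0, 4.9), (3, 1.0, 5.0), (3, 2.0, 5.2),
]
# Hypothetical observed event or censoring times per patient.
follow_up = {1: 2.4, 2: 0.8, 3: 3.0}

kept = landmark_data(measurements, follow_up, t_landmark=1.0)
for record in kept:
    print(record)
```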

1.6.1 ECDF

1.6.2 Mean Function

1.6.3 Variance Function

1.6.4 Correlation Structure

1.7 Longitudinal Models: Individualized Posterior-Posterior Checks

The tabs show the mean function of the two longitudinal outcomes for Patients 180 and 124, based on the two fitted joint models using the current value functional form. The results from the area/integral functional form were very similar. The asterisks denote the observed data points.
Patient 124 experienced the composite endpoint, whereas Patient 180 did not.
In each figure, we compare the observed data metric (black line) with the same metric calculated in 50 simulated datasets from the corresponding model (grey lines).

1.7.1 Patient 180

1.7.2 Patient 124

1.8 Event Time Model: Posterior-Posterior Checks

The two tabs show the empirical cumulative distribution function and the probability integral transform of the subject-specific cumulative distribution functions for the four fitted joint models.
In the title of each figure, ‘Linear/Nonlinear’ refers to the specification of the linear mixed models (i.e., linear time trends or splines in both the fixed and random effects), and ‘Value/Area’ refers to the functional form (i.e., current value or integral/area).
In each figure, we compare the observed data metric (black line) with the same metric calculated in 50 simulated datasets from the corresponding model (grey lines).
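
The idea behind the probability integral transform is that, if the subject-specific distribution function \(F_i\) is correct, then \(F_i(T_i)\) is uniformly distributed on \((0, 1)\). The toy sketch below illustrates this with hypothetical exponential event times and no censoring, both of which are simplifying assumptions that do not reflect the fitted joint models.

```python
import math
import random

rng = random.Random(7)
n = 2000

# Hypothetical subject-specific hazard rates and exponential event times.
rates = [rng.uniform(0.2, 1.5) for _ in range(n)]
times = [rng.expovariate(rate) for rate in rates]


def exponential_cdf(t, rate):
    """CDF of an exponential distribution with the given rate."""
    return 1.0 - math.exp(-rate * t)


# Probability integral transform: each time through its own CDF.
pit = sorted(exponential_cdf(t, rate) for t, rate in zip(times, rates))


def max_distance_from_uniform(sorted_u):
    """Largest gap between the ECDF of sorted_u and the Uniform(0, 1) CDF
    (the Kolmogorov-Smirnov distance)."""
    m = len(sorted_u)
    return max(
        max(abs((i + 1) / m - u), abs(i / m - u))
        for i, u in enumerate(sorted_u)
    )


print(f"distance from uniform: {max_distance_from_uniform(pit):.4f}")
```

With a correctly specified distribution the distance is small; a misspecified event time model would push the transformed values away from the uniform line.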

1.8.1 ECDF

1.8.2 Survival Transform to Uniform

1.9 Event Time Model: Posterior-Prior Checks

The two tabs show the empirical cumulative distribution function and the probability integral transform of the subject-specific cumulative distribution functions for the four fitted joint models.
In the title of each figure, ‘Linear/Nonlinear’ refers to the specification of the linear mixed models (i.e., linear time trends or splines in both the fixed and random effects), and ‘Value/Area’ refers to the functional form (i.e., current value or integral/area).
In each figure, we compare the observed data metric (black line) with the same metric calculated in 50 simulated datasets from the corresponding model (grey lines).

1.9.1 ECDF

1.9.2 Survival Transform to Uniform

1.10 Association: Posterior-Posterior Checks

The two tabs show the concordance statistic for the two longitudinal outcomes based on the four fitted joint models.
In the title of each figure, ‘Linear/Nonlinear’ refers to the specification of the linear mixed models (i.e., linear time trends or splines in both the fixed and random effects), and ‘Value/Area’ refers to the functional form (i.e., current value or integral/area).
In each figure, we compare the observed data metric (black line) with the same metric calculated in 50 simulated datasets from the corresponding model (grey lines).
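
For readers unfamiliar with concordance statistics, the sketch below computes Harrell's C-index for right-censored data. This is a familiar stand-in chosen for illustration; the document does not state which concordance definition is used in the checks, and the toy times, event indicators, and scores are hypothetical. The score could be, for example, a biomarker value for which higher values are expected to go with earlier events.

```python
def harrell_c(times, events, scores):
    """Harrell's concordance index.
    A pair (i, j) is comparable when subject i has an observed event
    strictly before subject j's observed time. It is concordant when
    subject i also has the higher score; tied scores count one half."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical data: observed times, event indicators (1 = event), scores.
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 1, 0, 1]
scores = [4.0, 3.0, 2.0, 1.0]

print(harrell_c(times, events, scores))
```

A value of 0.5 corresponds to no discrimination and 1 to perfect ordering of the comparable pairs.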

1.10.1 eGFR

1.10.2 NGAL

2 Simulation

2.1 Model Comparison

The four models fitted to the simulated dataset are
- jmFit_true: the true model from which the dataset was simulated.
- jmFit_linear: the joint model with a misspecified longitudinal submodel that assumes linear subject-specific trends.
- jmFit_exp: the joint model with a misspecified longitudinal submodel in which the mixed model is fitted to the outcome variable \(y^* = \exp(y)\).
- jmFit_slope: the joint model with a misspecified survival submodel that uses the current slope/velocity as the functional form.
Comparison: The Deviance Information Criterion (DIC), the Watanabe-Akaike information criterion (WAIC), and the log pseudo marginal likelihood (LPML) for the fitted joint models. For DIC and WAIC smaller values are better, whereas for LPML larger values are better.

## 
##                     DIC      WAIC      LPML
##   jmFit_slope  2817.764  5778.053 -4907.249
##    jmFit_true  2841.831  6088.138 -6935.651
##  jmFit_linear 10555.945 10579.751 -5292.838
##     jmFit_exp 14840.774 18343.870 -8201.606
## 
## The criteria are calculated based on the marginal log-likelihood.

2.2 Longitudinal Model

The four tabs show the empirical cumulative distribution, mean, variance, and semi-variogram functions for the four fitted joint models.
In each figure and for the two longitudinal outcomes, we compare the observed data metric (black line) with the same metric calculated in 50 simulated datasets from the corresponding model (grey lines).
The posterior-posterior checks are shown.
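
Of the four metrics, the semi-variogram is perhaps the least familiar. A minimal sketch of a sample semi-variogram is given below, assuming the common definition based on half squared differences between pairs of residuals from the same subject, averaged within bins of the time lag. The records, the bin width, and the function name are hypothetical, and the binning or smoothing used for the actual figures may differ.

```python
from collections import defaultdict
from itertools import combinations


def semivariogram(records, bin_width):
    """Average of 0.5 * (r1 - r2)^2 over within-subject residual pairs,
    grouped by time-lag bin. Bin k covers lags in
    [k * bin_width, (k + 1) * bin_width)."""
    by_subject = defaultdict(list)
    for subject, time, residual in records:
        by_subject[subject].append((time, residual))

    sums = defaultdict(float)
    counts = defaultdict(int)
    for observations in by_subject.values():
        for (t1, r1), (t2, r2) in combinations(observations, 2):
            lag_bin = int(abs(t1 - t2) // bin_width)
            sums[lag_bin] += 0.5 * (r1 - r2) ** 2
            counts[lag_bin] += 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}


# Hypothetical records: (subject id, time in years, residual).
records = [
    (1, 0.0, 0.0), (1, 1.0, 2.0),
    (2, 0.0, 1.0), (2, 0.5, 1.0), (2, 1.0, -1.0),
]

print(semivariogram(records, bin_width=0.5))
```

Comparing this curve between the observed and the simulated datasets indicates whether the model reproduces the within-subject correlation structure.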

2.2.1 ECDF

2.2.2 Mean Function

2.2.3 Variance Function

2.2.4 Correlation Structure

2.3 Event Time Model

The two tabs show the empirical cumulative distribution function and the probability integral transform of the subject-specific cumulative distribution functions for the four fitted joint models.
In each figure, we compare the observed data metric (black line) with the same metric calculated in 50 simulated datasets from the corresponding model (grey lines).
The posterior-posterior checks are shown.

2.3.1 ECDF

2.3.2 Survival Transform to Uniform

2.4 Association

The figures show the concordance statistic for the four fitted joint models.
In each figure, we compare the observed data metric (black line) with the same metric calculated in 50 simulated datasets from the corresponding model (grey lines).
The posterior-posterior checks are shown.

Supplementary Material for Goodness-of-Fit Checks for Joint Models

Dimitris Rizopoulos, Jeremy M.G. Taylor and Isabella Kardys
