Summary

The role of visit-to-visit variability of a biomarker in predicting related disease has been recognized in medical science. Existing measures of biological variability are criticized for being entangled with random variability resulted from measurement error or being unreliable due to limited measurements per individual. In this article, we propose a new measure to quantify the biological variability of a biomarker by evaluating the fluctuation of each individual-specific trajectory behind longitudinal measurements. Given a mixed-effects model for longitudinal data with the mean function over time specified by cubic splines, our proposed variability measure can be mathematically expressed as a quadratic form of random effects. A Cox model is assumed for time-to-event data by incorporating the defined variability as well as the current level of the underlying longitudinal trajectory as covariates, which, together with the longitudinal model, constitutes the joint modeling framework in this article. Asymptotic properties of maximum likelihood estimators are established for the present joint model. Estimation is implemented via an Expectation-Maximization (EM) algorithm with fully exponential Laplace approximation used in E-step to reduce the computation burden due to the increase of the random effects dimension. Simulation studies are conducted to reveal the advantage of the proposed method over the two-stage method, as well as a simpler joint modeling approach which does not take into account biomarker variability. Finally, we apply our model to investigate the effect of systolic blood pressure variability on cardiovascular events in the Medical Research Council elderly trial, which is also the motivating example for this article.

1 Introduction

Our research is motivated by the Medical Research Council (MRC) elderly trial in the United Kingdom which compared diuretic and β-blocker regimens versus placebo, in hypertensive patients aged 65–74 years. The clinical interest lies in the effects of active treatments as well as the impact of systolic blood pressure (SBP) profile on cardiovascular risk. Although it is widely believed in medicine that the underlying usual SBP is of great importance in predicting cardiovascular disease, more and more research has recently provided insight into the role of SBP variability. Instead of dismissing SBP variability as a random fluctuation, Howard and Rothwell (2009) showed that visit-to-visit variability of SBP is reproducible within individuals over time. That is, individuals with high (low) visit-to-visit variability at one time are more likely to possess high (low) variability at another time. This non-random nature of SBP variability is crucial as it warranted the exploration on the prognostic value of visit-to-visit variability. A pioneering piece of research on this topic was conducted by Rothwell and others (2010 b), in which the prognostic significance of visit-to-visit SBP variability was reliably established in predicting stroke and other vascular events. Their conclusion that SBP variability was a strong predictor, even stronger than mean SBP for the cohorts they studied, challenged the widespread belief (the usual blood-pressure hypothesis; Rothwell, 2010) in medicine and justified the incorporation of visit-to-visit SBP variability when analyzing cardiovascular disease. However, existing measures of SBP variability in clinical research (Yano, 2017), e.g., standard deviation (SD), coefficient of variation, and successive variation, are calculated from observed SBP measurements on a person-by-person basis, which are susceptible to the number of measurements involved in calculations. In fact, these sample-based measures tend to provide unreliable estimates of variability if they are calculated from a small number of measurements (Howard and Rothwell, 2009). Further, the association between visit-to-visit variability, when quantified by above measures, and the risk of disease might be underestimated due to regression dilution (Clarke and others, 1999).

To deal with issues involved in naive variability measures, a mixed-effects location scale model was proposed (Lyles and others, 1999), where a random within-individual variance was introduced to characterize the individual-specific variability of the longitudinal biomarker. Meanwhile, the error-free variability was incorporated as a predictor in a survival submodel to assess its effect on the event of interest. Unlike the case in sample-based variability measures, this modeling framework (i) uses the true within-individual variance rather than its estimated version as a predictor to avoid the regression dilution issue; (ii) allows borrowing strength across individuals by assuming that these within-individual variances come from a common population (Barrett and others, 2019; Gao and others, 2011). However, this modeling approach cannot distinguish the biological (systematic) variability with which we are concerned, from the measurement variability (fluctuation due to measurement errors) since the within-individual variance is a combination of both kinds of variability. If a complicated mean function is specified to capture the biomarker trend, the within-individual variance will be dominated by measurement variability. Therefore, the degree to which random within-individual variance reflects real biological variability is uncertain.

In this article, we propose a measure to characterize the inherent biological variability of a biomarker whose underlying trajectory possesses smooth and nonlinear shape. The idea comes from smoothing splines on quantifying the roughness of a curve. Specifically, we use the integrated squared second derivative, t0t{mi(s)}2ds, to capture the cumulative variability of a biomarker trajectory mi(·) for the i-th individual from t0 to current time t. The square root of the integral rather than the integral itself is used as a covariate in the survival submodel to ensure that the scale of our defined variability measure is consistent with that of mi(t). Figure 1 shows the fitted SBP trajectories (left, divided by 30) for six randomly selected patients in the MRC trial and their corresponding square root of cumulative variability (right). It is obvious that individuals with stable SBP level, e.g., individual 129 and 2295, have lower variability quantified by our proposed measure and that individuals with sharp fluctuation in SBP trajectory, e.g., individual 3800 and 1043, possess larger variability. From this perspective, it is reasonable to adopt this measure to characterize the biological variability behind the longitudinal measurements for each individual.

Fitted SBP trajectories (left) and the corresponding square root of cumulative variability (right). Numbers beside curves indicate patient IDs in the MRC trial. Solid circles (solid lines in the right) represent uncensored cases, while hollow circles (dashed lines in the right) represent censored cases.
Fig. 1

Fitted SBP trajectories (left) and the corresponding square root of cumulative variability (right). Numbers beside curves indicate patient IDs in the MRC trial. Solid circles (solid lines in the right) represent uncensored cases, while hollow circles (dashed lines in the right) represent censored cases.

Together with a mixed-effects model for longitudinal measurements, a survival hazard model which incorporates the proposed variability quantity as a predictor, constitutes the joint modeling framework in this article. Random effects are employed in both the mixed-effects model and the survival model to account for the dependence between the two types of outcomes. Based on the standard joint model which associates longitudinal and event outcomes through the current true level of longitudinal biomarker (see Tsiatis and Davidian, 2004; Tsiatis and others, 1995), several extensions have been proposed to characterize possibly more complex associations. Ye and others (2008) included the current slope of the biomarker trajectory when modeling survival hazard. Sylvestre and Abrahamowicz (2009) incorporated the integral of the longitudinal trajectory, representing the cumulative effects of longitudinal history. All of these extended associations share a common parameterization that they can be simplified to linear functions of random effects. Clearly, the association structure in our joint model is complicated by the inclusion of the proposed variability quantity and falls outside the linear forms. For estimation, the likelihood method is the most widely used approach in joint modeling literature (Hsieh and others, 2006). Allowing the baseline function to be totally unspecified, Zeng and Cai (2005) established the asymptotic properties for the semiparametric maximum likelihood estimator (MLE) in which the association structures were constrained to be linear forms of random effects. We here extend their theoretical result to the joint model with our derived association structure.

The remainder of this article is organized as follows. We illustrate the motivation and rationale of quantifying the biomarker variability by our proposed measure in Section 2. In Section sec:mod, we describe the joint modeling framework with longitudinal and survival submodels specified in detail. Section 4 gives the asymptotic properties of maximum likelihood estimates (MLEs) in the proposed joint model. Section 5 focuses on the technical issues in estimation when the expectation-maximization (EM) algorithm is applied, especially on the approximation method in the E-step. In Section 6, we compare the proposed joint modeling method with other existing approaches via simulation studies. An application of our methodology to analyze the MRC trial is presented in Section 7, followed by discussion in Section 8.

2 Motivation

In the context of smoothing splines and penalized splines, a roughness penalty is imposed to control the smoothness of the estimated regression function:
(2.1)
where {(ti,Yi)}i=1m are observations and m(t) is an unknown regression function. For smoothing splines, m(t) is almost totally unspecified with the only constraint that m(t) is a twice continuously differentiable function on [a,b], i.e., m(t)C2[a,b]. The minimizer of (2.1) in C2[a,b] is exactly a natural cubic spline with knots in all unique ti’s. For penalized splines, m(t) is prespecified by spline basis functions whose dimension is relatively large. The inclusion of the roughness penalty prevents overfitting even though a large number of knots are positioned in both contexts. By writing (2.1) in a matrix form
where B is the design matrix determined by the spline bases for m(t) and J is a matrix such that ||Jβ||22=abm(t)2dt, we can see that the penalty term definitely acts on the estimate of the regression coefficients β. Specifically, a small λ results in an estimate of β which closely fits the data (for smoothing splines, the estimate of β, when λ = 0, yields a regression function exactly interpolating all the observations), and a large λ leads to an estimate of β which corresponds to a smooth (less wiggly) fitted regression function. In other words, the penalty term abm(t)2dt, which characterizes the roughness (the degree of fluctuation) nature of m(t) over [a,b], is determined by β given fixed knots of basis functions. This inspired us to use a abmi(t)2dt-based quantity to measure the variability of the subject-specific biomarker trajectory mi(t). We illustrate our idea through a toy example as shown in Figure 2. Two trajectories (solid curves) are specified by the same knot sequence but different vectors of regression coefficients which result in different degrees of fluctuation. This difference, as demonstrated in the right figure, can be successfully captured by 0tmi(s)2ds, which can be interpreted as the cumulative variability of mi(·) up to time t.
Left: simulated trajectories mi(t) for i = 1, 2 (solid curves) and their first (dotted curves) and second (dashed lines) derivative functions. Right: values of ∫0tmi″(s)2ds over time for each trajectory. The dashed vertical lines in both subfigures denote the positions of interior knots.
Fig. 2

Left: simulated trajectories mi(t) for i = 1, 2 (solid curves) and their first (dotted curves) and second (dashed lines) derivative functions. Right: values of 0tmi(s)2ds over time for each trajectory. The dashed vertical lines in both subfigures denote the positions of interior knots.

3 Model

Let Ti* and Ci denote the event and censoring times, respectively, for the ith individual, i=1,,m. We can only observe Ti=min{Ti*,Ci} and δi=I{TiCi} in practice, where I(·) is the indicator function. The longitudinal measurements for the ith individual are denoted by Yi=(Yi1,,Yini), which are intermittently collected with measurement errors at time points (ti1,,tini). Moreover, we denote the hypothetical trajectory behind Yi by Yi(t), which, in this article, is assumed to be a smooth function of t. It is also assumed that Ci is independent of Ti* given covariates.

3.1 Longitudinal submodel

Suppose that the underlying trajectory behind the observed longitudinal measurements is a function which depends on baseline covariates linearly and on time nonparametrically, then we specify the following semiparametric model for Yi(t),
(3.1)
where η is the coefficient vector of baseline covariates, mi(t) is an unknown smooth function of time and might take somewhat different shapes for different individuals. Usually, the observed longitudinal value Yij is assumed to be Yi(tij) plus random error. Here, to accommodate the features of SBP measurements in the MRC trial (see Figure 3(a)), we allow Yij to be affected by some other covariates zij, which, in the MRC trial, is an indicator of whether the jth SBP was measured by a doctor. So we consider the following model
(3.2)
where εijiidN(0,σ2) is the random error and ξ is the regression coefficient of the outside stimulus zij. To characterize the subject-specific trajectories and the within-subject correlation, we model mi(t) using regression splines with multidimensional random effects,
(3.3)
where {Bk(·)}k=1q is a q-dimensional cubic B-spline basis on follow-up time [0,τ] with a fixed knot sequence; β=(β1,,βq) is a vector of fixed effects and bi=(bi1,,biq) is a vector of random effects for the ith individual. It is assumed that bi’s are independent and identically distributed with a common distribution Nq(0,D) and that the bi’s are also independent with the εi’s. Intuitively, {mi(t)}i=1m is a collection of random trajectories varying around a common mean trend k=1qβkBk(t) with random deviations specified by random effects. Moreover, (3.1) and (3.3) suggest that baseline covariates xi only affect the intercept of Yi(t). In fact, we can flexibly allow xi to affect the linear trend of the trajectory by decomposing the space of B-spline basis functions into a linear plane and its orthogonal space. To this end, a proper linear transformation is required to construct the modified B-spline basis functions which are orthogonal to the linear space from the original spline bases in (3.3). But for simplicity, we mainly focus on the longitudinal submodel specified by (3.1), (3.2) and (3.3) in subsequent sections.
Preliminary analysis of SBP observations and cardiovascular events. (a) mean level of SBP observations (divided by 30) by randomized trials. (b) Kaplan-Meier survival curves of cardiovascular events by randomized trials.
Fig. 3

Preliminary analysis of SBP observations and cardiovascular events. (a) mean level of SBP observations (divided by 30) by randomized trials. (b) Kaplan-Meier survival curves of cardiovascular events by randomized trials.

Note that (3.3) is a sieve approximation (Rice and Wu, 2001) to the unknown function of time. The performance of (3.3) depends on the choice for the number of knots and their locations. A pragmatic approach is to place interior knots at locations where there is high curvature in longitudinal responses and where more observations are collected, and then use AIC or BIC (calculated on longitudinal data only) to determine the number of knots (Rizopoulos and others, 2009; Shi and others, 1996).

3.2 Survival submodel

Motivated by the MRC trial, we allow the event time to depend on the true trajectory Yi(t) of longitudinal biomarker and also on its cumulative variability quantified with our proposed measure t0t{mi(s)}2ds. Therefore, the survival model is specified as
(3.4)
where λi(t)=limΔt01ΔtP(tTi<t+Δt|Tit) is the survival hazard at time t, λ0(·) is the unspecified baseline hazard function, γ is a coefficient vector of baseline covariates wi, α1, and α2 are parameters characterizing the effects of the current hypothetical biomarker level and the square root of cumulative variability. The lower limit of the integral, namely t0, is pre-specified in practice. Usually, t0 is set as 0 if we are interested in the cumulative variability from the start of the study to the current time. However, as is the case with the MRC trial in Section 7, t0 can be a time point other than 0 to exclude the early stage after treatment initiation. In this case (t0>0), (3.4) is not applicable to modeling the survival hazard at t<t0 and we assume individuals are not at risk before t0.
Under the formulation of mi(t) in (3.3), the integral t0t{mi(s)}2ds involved in (3.4) can be further simplified. Recall the definition of B-splines via Cox-de Boor recursion formula:
and
where Bk,l(s) is the kth B-spline basis function of order l (degree l – 1) for the given knot sequence {s1,,sk,}. In particular, the cubic B-spline functions in (3.3) is exactly Bk,4(s). Let b˜i=β+bi, a q-dimensional mixed effect vector and B˜(s)=(B3,2(s),,Bq,2(s)), a (q2)-dimensional vector consisting of linear B-spline basis functions. A straightforward calculation gives
(3.5)
where Q is a q×(q2) matrix with entries Qij, for i=1,,q and j=2,,q1, given by
with Qjj=(Qj1,j+Qj+1,j) and Qij = 0 for |ij|2; R(t0,t) is a (q2)×(q2) symmetric matrix with elements Rij(t0,t), for i and j running from 2 to (q1), given by
with Rij(t0,t)=0 for |ij|2 as Bk1,2(s) and Bk2,2(s) are nonoverlapping when |k1k2|2 (de Boor, 1978).
Let K(t0,t)=QR(t0,t)Q. Substituting (3.5) into (3.4) gives the following expression for the survival hazard
(3.6)
where B(t)=(B1(t),,Bq(t)). As b˜iK(t0,t)b˜iλmax||b˜i||2 with λmax denoting the maximum eigenvalue of the positive semidefinite matrix K(t0,t),(b˜iK(t0,t)b˜i)1/2 is therefore of order O(b˜i), the same as that of mi(t). This is the reason we use (b˜iK(t0,t)b˜i)1/2 instead of b˜iK(t0,t)b˜i in the survival hazard model.

4 Inference on semiparametric model

Let θ=(γ,α1,α2,β,η,ξ,σ2,Vec(D)) be the collection of unknown parameters to be estimated. Under the standard assumption that Ti and Yi are conditionally independent given random effects bi, the specification of submodels (3.2) and (3.6) leads to the following likelihood:
(4.1)
where S(t|bi)=exp{0tλi(s|bi)ds} is the survival function.
Let Λ0(t)=0tλ0(s)ds denote the cumulative baseline hazard. The likelihood (4.1) can be arbitrarily large by letting λ0(t) rise higher and higher at points close to the observed event times and vanish otherwise (Aalen and others, 2008, pp. 204–205). To circumvent the issue, Riemann–Stieltjes integral is employed with Λ0(t) being the integrator which is constrained to be a non-negative and nondecreasing function. It’s easy to verify that the likelihood is maximized if Λ0(t) is a step function with jumps at the observed event times {Ti:δi=1}. So instead of maximizing L(θ,λ0) directly, we work on the modified likelihood function (Zeng and Cai, 2005):
(4.2)
where Λ0{Ti} is the jump size of Λ0(·) at time Ti; Xi, Zi and Bi are design matrices corresponding to baseline covariates (Xi is constructed by repeating xi for ni times), external stimuli and B-spline bases with Bi=(B(ti1),,B(tini)).

The maximum likelihood estimate (θ^,Λ^0) is defined as (θ^,Λ^0)=argmaxθΘ,Λ0ZmlogLm(θ,Λ0), where Zm consists of all right-continuous step functions only with positive jumps at {Ti:δi=1}. Let θ denote the true parameter for θ and Λ0(·) be the true cumulative baseline function. Under assumptions given in Section SA of the supplementary material available at Biostatistics online, we establish the consistency and asymptotic normality of (θ^,Λ^0). Moreover, by showing the influence function (each coordinate) of θ^ falls into a linear span of score functions, θ^ is proved to be semiparametrically efficient.

 
Theorem 4.1 (Consistency).
Under conditions specified in Section SA of the supplementary material available at Biostatistics online, the MLE (θ^,Λ^0) is strongly consistent:
where ||·||2 is the Euclidean norm and ||·|| is the supremum norm on [0,τ].

Let l[0,τ] be the metric space of all bounded functions in [0,τ] and let d denote the dimension of θ. The next theorem states the asymptotic distribution of (θ^,Λ^0).

 
Theorem 4.2 (Asymptotic normality and efficiency).

Under conditions specified in Section SA of the supplementary material available at Biostatistics online, m(θ^θ,Λ^0Λ0) converges in distribution to a Gaussian random element in Rd×l[0,τ]. Further, m(θ^θ) weakly converges to a multivariate normal distribution with mean zero and variance equal to the semiparametric efficiency bound.

Proofs of the above theorems are similar in spirit to Zeng and Cai (2005) and can be found in Section SA of the supplementary material available at Biostatistics online. As the analytical form for the asymptotic covariance matrix is not available (Hsieh and others, 2006), we employ a boostrap method to numerically estimate the standard errors of θ^. Basically, we generate J bootstrap samples (each of size m) by sampling with replacement from the original observed data and estimate (θ,Λ0) based on each bootstrap sample. The SD of J bootstrap estimates of θ is then an estimate of the standard error of θ^.

 
Remark 4.3

The asymptotic theorems developed in this section aim to extend the asymptotic properties of the semiparametric MLE in standard join models (with a linear association structure in terms of random effects) to our proposed joint models with a complex association structure in (3.6). However, we should note that (3.3) is an approximation (a regression spline estimator) to an unknown smooth trajectory and the consequence of this approximation behavior has not been taken into account in Theorems 4.1 and 4.2. Consistency of the regression spline estimator definitely requires the dimension of spline bases q increasing to infinity at an appropriate rate of m (q should be written as qm). See Agarwal and Studden (1980) for the order of qm under which the regression spline estimator and its derivatives achieve the optimal global convergence rate. Similar to regression dilution with error-prone covariates, it is expected that the survival parameters, especially α1 and α2, will be estimated with bias. We illustrate this phenomenon in Section 6.3 where the true biomarker trajectory is set to be a smooth function but not exactly a spline function given by (3.3).

5 Estimation with EM algorithm

The MLEs in the joint modeling framework can be typically obtained using the EM algorithm which is implemented by iteratively applying the E-step and M-step, with random effects treated as missing data. For the likelihood given by (4.2), the E-step refers to the calculation of
where (θ(k),Λ0(k)) is the estimated value of (θ,Λ0) obtained from iteration k, and the M-step refers to the maximization of Q(θ,Λ0|θ(k),Λ0(k)) with respect to (θ,Λ0).

5.1 M-step

Let Ei(k)[·] and Vi(k)[·] denote the abbreviations of E[·|Ti,δi,Yi;θ(k),Λ0(k)] and Var[·|Ti,δi,Yi;θ(k),Λ0(k)], respectively, i.e., the expectation and variance with respect to the conditional distribution of bi given (Ti,δi,Yi) and the current value (θ(k),Λ0(k)) of parameters. Denote N=i=1mni as the total number of longitudinal measurements. It is not difficult to find that some components of (θ,Λ0),ϕ2:=(ξ,σ2,D,Λ0), can be updated in closed forms. That is, ϕ2(k+1)=f(ϕ1(k+1)|θ(k),Λ0(k)) for some function f(·), where ϕ1=(γ,α1,α2,η,β) are components with no closed updating form. Specifically, the jump size of Λ0(·) at time point t is updated by
where r(bi,t,ϕ1)=exp[γwi+α1{xiη+b˜iB(t)}+α2{b˜iK(t0,t)b˜i}1/2] is the relative risk for the ith individual at time t. The specific updating forms for ξ,σ2 and D are presented in Section SB of the supplementary material available at Biostatistics online. For ϕ1, the update ϕ1(k+1) is a solution to the equation Sϕ1:=Q(θ,Λ0|θ(k),Λ0(k))/ϕ1|(ϕ1,ϕ2)=(ϕ1,f(ϕ1|θ(k),Λ0(k)))=0. As no closed-form solution is available, we resort to a one-step Newton–Raphson method in which the updating at the (k+1)th iteration is performed as ϕ1(k+1)=ϕ1(k)+νkIϕ1(k)1Sϕ1(k) with the step size νk determined by backtracking line search. For Iϕ1(k),Sϕ1/ϕ1|ϕ1=ϕ1(k) is used when updating γ,α1, and α2, while i=1nSiϕ1(k)Siϕ1(k) is recommended when updating β and η so as to avoid complicated calculations involved in calculating second derivatives (Griffiths and others, 1987).

5.2 E-step

It is worth noting that the expectations involved in the M-step are of the following form
(5.1)
where A(bi) denotes a general function of bi, which may take the forms in Section SB of the supplementary material available at Biostatistics online. In view of the highly nonlinear shape of the longitudinal subject-specific trajectories, the random effects introduced in (3.3) are usually not of low dimension, which makes calculating the above integrals a challenge as the (adaptive) Gauss-Hermite method suffers from the curse of dimensionality. To balance the computation burden and approximation accuracy, we use a fully exponential Laplace approximation method. The idea behind this method is to apply the standard Laplace method in both the numerator and denominator of (5.1), which makes the O(ni1) terms canceled and leads to an O(ni2) approximation (Rizopoulos and others, 2009; Tierney and others, 1989). Specifically, under the fully exponential Laplace approximation, integrals in form (5.1) can be calculated as follows
(5.2)
where b^i=argmaxbilogp(Ti,δi,Yi,bi;θ(k),Λ0(k)) and Ii=Ii(c)|(c,b)=(0,b^i) with

As indicated in (5.2), only the mode of logp(Ti,δi,Yi,bi;θ(k),Λ0(k)), namely b^i, and second derivatives of logp(Ti,δi,Yi,bi;θ(k),Λ0(k)) and A(bi) around b^i are required to achieve a second order approximation of Ei(k)[A(bi)]. Given that in the present model b^i’s cannot be obtained analytically, Newton–Raphson method is employed to find the value numerically. See Section B of the supplementary material available at Biostatistics online for specific calculations.

5.3 Impact on asymptotic properties

Note that the asymptotic properties, including consistency of (θ^,Λ^0) and semiparametric efficiency of θ^, established in Section 4, may be affected due to approximation method used in the E-step. Based on investigations in Vonesh (1996) and Rizopoulos and others (2009), the resulting convergence rate of (θ^,Λ^0) when a fully exponential Laplace approximation is applied becomes (θ^θ,Λ^0Λ0)=Op(max{m1/2,min(ni)2}), where min(ni)2 comes from the approximation error in the E-step. Therefore, (θ^,Λ^0) will be consistent only as both m and ni,i=1,m go to infinity, and the consistency rate will still be Op(m1/2) if min(ni) grows at a rate greater than m1/4. Moreover, if min(ni) grows faster than m1/2, the asymptotic behavior remains the same as shown in Theorem 4.2. This can be illustrated from the difference between QFEL(θ,Λ0|θ(k),Λ0(k)), the fully exponential Laplace based Q function, and Q(θ,Λ0|θ(k),Λ0(k)), the true Q function. Since QFEL(θ,Λ0|θ(k),Λ0(k))=Q(θ,Λ0|θ(k),Λ0(k))+O(mmin(ni)2), the resulting estimate, when O(mmin(ni)2) is negligible, will act as if it is calculated from the true Q function or exact likelihood function. Semiparametric efficiency of θ^ still holds in this case.

6 Simulation studies

6.1 Case 1: General simulation study

We specify a longitudinal submodel with three evenly spaced interior knots (q = 7) within the follow-up time interval [0,10] and observation times tij=(0,1,,20)/2. Regression coefficients are set as β=(6,3,7,1,8,5,4). The variance (covariance matrix) of measurement error and random effects are specified as σ2=1 and D=diag(3,4,4,5,4,3,4), respectively. The baseline covariates xi and outside stimulus zi are not incorporated in the longitudinal model for simplicity. For the survival submodel, we consider a binary covariate w taking values 0 or 1, each with probability 1/2 and its effect is assumed to be γ=2. The baseline hazard function λ0(t) is specified as exp{2} if t > 5 and 0 otherwise. Censoring times Ci,i=1,,m are independently generated from Exp(0.05), leading to a 50% censoring rate. Truncated by event times and censoring times, the number of longitudinal observations ni varies between 11 and 21. The simulation setting described above aims to define a simple case to numerically examine the performance of estimation methods. 500 Monte Carlo (MC) replications are conducted and the sample size is fixed as m = 1000.

Table 1 shows the performance of the proposed joint modeling (JM) method against the two-stage (TS) method (Wu and others, 2012). Both methods are evaluated in terms of bias, Monte Carlo SD, bootstrap SD (SDb) and 95% coverage probability (CP). The estimates of survival parameters, namely, γ,α1, and α2, show larger biases under TS than those obtained from the JM method. In particular, the effects of the longitudinal level and variability on survival hazard are underestimated when the TS method is employed. CPs for the TS estimates of γ and α2 suffer from severe under-coverage due to large biases. Both methods provide good performance in estimating parameters related to the longitudinal submodel. Moreover, SDbs are quite close to SDs, which implies that bootstrap SD performs well in estimating the standard error of θ^ obtained from both JM and TS methods and hence can be used in the real data analysis. For survival parameters, the standard errors of the JM estimates are slightly larger than those of the TS estimates. This is mainly due to the fact that the randomness of bi in the survival submodel is removed when the TS method is used. Additionally, the standard errors of the β^j’s and D^jj’s increase with index j since the latter elements in β and D correspond to the basis functions with support over later time intervals in which longitudinal observations are sparse.

Table 1

Simulation results of the JM and TS methods in case 1. Average θ^ is the average value of parameter estimates in 500 MC replications; SD is the MC standard deviation of the estimates across simulated data sets; SDb is the mean of bootstrap standard deviations calculated on bootstrap samples for each MC sample; CP is the 95% coverage probability defined as the percentage of repetitions in which true value falls into the calculated confidence interval.

JM
TS
Trueθ^SDSDbCPθ^SDSDbCP
Survival
γ=2–1.97860.10110.09900.946–1.82180.08780.08980.458
α1=0.20.19760.03100.03030.9400.16670.02700.02730.760
α2=0.30.29170.03010.02900.9400.23890.02670.02560.364
Longitudinal
β1=65.95840.06740.06630.9125.99660.06700.06630.932
β2=33.09020.07970.07940.9103.01240.07880.07930.938
β3=76.95840.08030.08260.9206.97710.08060.08240.942
β4=10.98240.08750.08630.9481.03850.08730.08620.938
β5=87.95000.11400.12200.9307.88220.11290.12050.820
β6=54.97980.17170.16780.9524.88840.17130.16700.900
β7=44.00110.22380.22850.9444.00360.21970.22590.942
D11=33.01390.18190.18680.9502.99780.18300.18680.950
D22=44.04550.25180.25690.9423.99030.25130.25530.948
D33=43.99460.24180.25170.9483.99330.24330.25370.944
D44=54.99660.27200.27850.9464.99190.27230.27820.948
D55=43.94820.38250.37030.9403.93260.38300.36910.944
D66=32.98120.50570.52200.9422.97050.50230.61960.946
D77=43.89980.73970.73660.9423.89140.73380.73240.946
σ2=10.99830.01410.01420.9400.99980.01420.01430.942
JM
TS
Trueθ^SDSDbCPθ^SDSDbCP
Survival
γ=2–1.97860.10110.09900.946–1.82180.08780.08980.458
α1=0.20.19760.03100.03030.9400.16670.02700.02730.760
α2=0.30.29170.03010.02900.9400.23890.02670.02560.364
Longitudinal
β1=65.95840.06740.06630.9125.99660.06700.06630.932
β2=33.09020.07970.07940.9103.01240.07880.07930.938
β3=76.95840.08030.08260.9206.97710.08060.08240.942
β4=10.98240.08750.08630.9481.03850.08730.08620.938
β5=87.95000.11400.12200.9307.88220.11290.12050.820
β6=54.97980.17170.16780.9524.88840.17130.16700.900
β7=44.00110.22380.22850.9444.00360.21970.22590.942
D11=33.01390.18190.18680.9502.99780.18300.18680.950
D22=44.04550.25180.25690.9423.99030.25130.25530.948
D33=43.99460.24180.25170.9483.99330.24330.25370.944
D44=54.99660.27200.27850.9464.99190.27230.27820.948
D55=43.94820.38250.37030.9403.93260.38300.36910.944
D66=32.98120.50570.52200.9422.97050.50230.61960.946
D77=43.89980.73970.73660.9423.89140.73380.73240.946
σ2=10.99830.01410.01420.9400.99980.01420.01430.942
Table 1

Simulation results of the JM and TS methods in case 1. Average θ^ is the average value of parameter estimates in 500 MC replications; SD is the MC standard deviation of the estimates across simulated data sets; SDb is the mean of bootstrap standard deviations calculated on bootstrap samples for each MC sample; CP is the 95% coverage probability defined as the percentage of repetitions in which true value falls into the calculated confidence interval.

JM
TS
Trueθ^SDSDbCPθ^SDSDbCP
Survival
γ=2–1.97860.10110.09900.946–1.82180.08780.08980.458
α1=0.20.19760.03100.03030.9400.16670.02700.02730.760
α2=0.30.29170.03010.02900.9400.23890.02670.02560.364
Longitudinal
β1=65.95840.06740.06630.9125.99660.06700.06630.932
β2=33.09020.07970.07940.9103.01240.07880.07930.938
β3=76.95840.08030.08260.9206.97710.08060.08240.942
β4=10.98240.08750.08630.9481.03850.08730.08620.938
β5=87.95000.11400.12200.9307.88220.11290.12050.820
β6=54.97980.17170.16780.9524.88840.17130.16700.900
β7=44.00110.22380.22850.9444.00360.21970.22590.942
D11=33.01390.18190.18680.9502.99780.18300.18680.950
D22=44.04550.25180.25690.9423.99030.25130.25530.948
D33=43.99460.24180.25170.9483.99330.24330.25370.944
D44=54.99660.27200.27850.9464.99190.27230.27820.948
D55=43.94820.38250.37030.9403.93260.38300.36910.944
D66=32.98120.50570.52200.9422.97050.50230.61960.946
D77=43.89980.73970.73660.9423.89140.73380.73240.946
σ2=10.99830.01410.01420.9400.99980.01420.01430.942
JM
TS
Trueθ^SDSDbCPθ^SDSDbCP
Survival
γ=2–1.97860.10110.09900.946–1.82180.08780.08980.458
α1=0.20.19760.03100.03030.9400.16670.02700.02730.760
α2=0.30.29170.03010.02900.9400.23890.02670.02560.364
Longitudinal
β1=65.95840.06740.06630.9125.99660.06700.06630.932
β2=33.09020.07970.07940.9103.01240.07880.07930.938
β3=76.95840.08030.08260.9206.97710.08060.08240.942
β4=10.98240.08750.08630.9481.03850.08730.08620.938
β5=87.95000.11400.12200.9307.88220.11290.12050.820
β6=54.97980.17170.16780.9524.88840.17130.16700.900
β7=44.00110.22380.22850.9444.00360.21970.22590.942
D11=33.01390.18190.18680.9502.99780.18300.18680.950
D22=44.04550.25180.25690.9423.99030.25130.25530.948
D33=43.99460.24180.25170.9483.99330.24330.25370.944
D44=54.99660.27200.27850.9464.99190.27230.27820.948
D55=43.94820.38250.37030.9403.93260.38300.36910.944
D66=32.98120.50570.52200.9422.97050.50230.61960.946
D77=43.89980.73970.73660.9423.89140.73380.73240.946
σ2=10.99830.01410.01420.9400.99980.01420.01430.942

6.2 Case 2: Simulation study based on the MRC trial

A second simulation is conducted to investigate the performance of the proposed method in analyzing the MRC trial by generating data similar to the real data. To this end, parameters are set around their estimated values. Covariates and observation schedules for longitudinal and survival outcomes are designed to mimic the real data (see Section SC.1 of the supplementary material available at Biostatistics online for details). Truncated by event times and censoring times, ni varies between 10 and 26 and thus min(ni)>m1/4 with m = 3700. The approximation error in the E-step is expected to have little effect on the JM estimation. Before showing the final estimation result, we first illustrate the validity of the knots selection criteria mentioned in Section 3.1. By calculating AIC and BIC based on longitudinal data only, the true model can be effectively selected as indicated in Table 2.

Table 2

Knots selection criteria with longitudinal data only.

NumberLocationAICBIC
q = 50.2597 06597 285
1101 295101 515
1.5102 593102 813
q = 6(0.25, 0.5)96 27396 529
(0.5, 1)96 03496 290
(0.25, 1)95 65795 913
(0.25, 1.5)95 50895 765
q = 7(0.25, 1, 1.5)95 94796 240
NumberLocationAICBIC
q = 50.2597 06597 285
1101 295101 515
1.5102 593102 813
q = 6(0.25, 0.5)96 27396 529
(0.5, 1)96 03496 290
(0.25, 1)95 65795 913
(0.25, 1.5)95 50895 765
q = 7(0.25, 1, 1.5)95 94796 240
Table 2

Knots selection criteria with longitudinal data only.

NumberLocationAICBIC
q = 50.2597 06597 285
1101 295101 515
1.5102 593102 813
q = 6(0.25, 0.5)96 27396 529
(0.5, 1)96 03496 290
(0.25, 1)95 65795 913
(0.25, 1.5)95 50895 765
q = 7(0.25, 1, 1.5)95 94796 240
NumberLocationAICBIC
q = 50.2597 06597 285
1101 295101 515
1.5102 593102 813
q = 6(0.25, 0.5)96 27396 529
(0.5, 1)96 03496 290
(0.25, 1)95 65795 913
(0.25, 1.5)95 50895 765
q = 7(0.25, 1, 1.5)95 94796 240

Table 3 shows the estimation results in case 2. Due to the high censoring rate, biases and standard errors increase for estimates of survival parameters compared with those in case 1. For parameters of particular interest (α1,α2), estimates obtained from the JM method perform slightly better than those from the TS method in terms of bias and CP, but the improvement is not as significant as that in case 1. An important reason is that the predicted random effects from the longitudinal model, b^iL, are close to bi, as the measurement error involved in longitudinal observations is small (σ2=0.1764) in case 2.

Table 3

Simulation results of the JM and TS methods in case 2. Average θ^ is the average value of parameter estimates in 500 MC replications; SD is the MC standard deviation of the estimates across simulated data sets; SDb is the mean of bootstrap standard deviations calculated on bootstrap samples for each MC sample; CP is the 95% coverage probability defined as the percentage of repetitions in which true value falls into the calculated confidence interval.

JM
TS
Trueθ^SDSDbCPθ^SDSDbCP
Survival
γ1=0.4–0.41900.15640.15490.948–0.42220.15560.15370.936
γ2=0.1–0.11800.13970.14170.944–0.12020.13930.14100.942
γ3=0.710.71950.10160.10130.9460.72880.10150.10120.940
γ4=0.220.20250.13270.13190.9500.19940.13260.13190.948
γ5=0.360.32210.20360.21380.9460.31570.20320.21370.944
α1=0.370.34160.14410.13840.9340.32360.13890.13320.920
α2=0.230.20370.08440.08270.9320.17580.08450.08230.870
Longitudinal
β1=6.26.19320.12410.12420.9466.19440.12420.12400.948
β2=4.54.49540.12120.12210.9464.49470.12110.12210.948
β3=5.04.99460.11990.12230.9404.99500.11990.12220.940
β4=4.74.69510.12310.12260.9444.69360.12310.12280.944
β5=5.04.99500.12600.12640.9444.99070.12620.12620.946
β6=4.54.48600.12780.12760.9464.48510.12780.12770.946
β1(1)=6.86.79880.13010.12760.9426.78870.13010.12750.942
β2(1)=4.24.19390.12140.12170.9404.19350.12140.12160.940
β3(1)=4.54.49350.12120.12220.9364.49390.12120.12140.934
β4(1)=4.14.09760.12340.12400.9484.09680.12340.12380.944
β5(1)=4.54.49110.13070.13160.9444.49220.13080.13150.946
β6(1)=4.24.19550.13530.13570.9484.19500.13520.13540.946
β1(2)=6.46.39590.12770.12780.9506.39690.12780.12770.948
β2(2)=4.54.49510.12090.12180.9343.01240.12090.12180.934
β3(2)=4.64.59250.12080.12170.9364.59310.12080.12170.936
β4(2)=4.14.09720.12460.12430.9404.09610.12460.12450.940
β5(2)=4.54.49010.13310.13300.9424.49150.13320.13290.946
β6(2)=4.34.29200.13310.13350.9364.29140.13310.13350.938
η1=0.05–0.04950.00840.00850.952–0.04960.00840.00850.952
η2=0.030.02970.01240.01110.9480.02970.01240.01120.948
η3=0.090.09070.01720.01750.9420.09080.01720.01750.940
ξ=0.080.08010.00420.00420.9480.07990.00420.00420.978
D22=0.20.19930.00620.00630.9420.19950.00620.00640.946
D33=0.20.19940.00700.00690.9540.19970.00710.00690.958
D44=0.40.39880.01600.01700.9520.39930.01600.01700.952
D55=0.50.49750.02680.02630.9540.49710.02680.02640.950
D66=0.30.29750.03200.03220.9520.29760.03200.03230.952
σ2=0.17640.17640.00100.00100.9300.17640.00100.00100.930
JM
TS
Trueθ^SDSDbCPθ^SDSDbCP
Survival
γ1=0.4–0.41900.15640.15490.948–0.42220.15560.15370.936
γ2=0.1–0.11800.13970.14170.944–0.12020.13930.14100.942
γ3=0.710.71950.10160.10130.9460.72880.10150.10120.940
γ4=0.220.20250.13270.13190.9500.19940.13260.13190.948
γ5=0.360.32210.20360.21380.9460.31570.20320.21370.944
α1=0.370.34160.14410.13840.9340.32360.13890.13320.920
α2=0.230.20370.08440.08270.9320.17580.08450.08230.870
Longitudinal
β1=6.26.19320.12410.12420.9466.19440.12420.12400.948
β2=4.54.49540.12120.12210.9464.49470.12110.12210.948
β3=5.04.99460.11990.12230.9404.99500.11990.12220.940
β4=4.74.69510.12310.12260.9444.69360.12310.12280.944
β5=5.04.99500.12600.12640.9444.99070.12620.12620.946
β6=4.54.48600.12780.12760.9464.48510.12780.12770.946
β1(1)=6.86.79880.13010.12760.9426.78870.13010.12750.942
β2(1)=4.24.19390.12140.12170.9404.19350.12140.12160.940
β3(1)=4.54.49350.12120.12220.9364.49390.12120.12140.934
β4(1)=4.14.09760.12340.12400.9484.09680.12340.12380.944
β5(1)=4.54.49110.13070.13160.9444.49220.13080.13150.946
β6(1)=4.24.19550.13530.13570.9484.19500.13520.13540.946
β1(2)=6.46.39590.12770.12780.9506.39690.12780.12770.948
β2(2)=4.54.49510.12090.12180.9343.01240.12090.12180.934
β3(2)=4.64.59250.12080.12170.9364.59310.12080.12170.936
β4(2)=4.14.09720.12460.12430.9404.09610.12460.12450.940
β5(2)=4.54.49010.13310.13300.9424.49150.13320.13290.946
β6(2)=4.34.29200.13310.13350.9364.29140.13310.13350.938
η1=0.05–0.04950.00840.00850.952–0.04960.00840.00850.952
η2=0.030.02970.01240.01110.9480.02970.01240.01120.948
η3=0.090.09070.01720.01750.9420.09080.01720.01750.940
ξ=0.080.08010.00420.00420.9480.07990.00420.00420.978
D22=0.20.19930.00620.00630.9420.19950.00620.00640.946
D33=0.20.19940.00700.00690.9540.19970.00710.00690.958
D44=0.40.39880.01600.01700.9520.39930.01600.01700.952
D55=0.50.49750.02680.02630.9540.49710.02680.02640.950
D66=0.30.29750.03200.03220.9520.29760.03200.03230.952
σ2=0.17640.17640.00100.00100.9300.17640.00100.00100.930
Table 3

Simulation results of the JM and TS methods in case 2. Average θ^ is the average value of parameter estimates in 500 MC replications; SD is the MC standard deviation of the estimates across simulated data sets; SDb is the mean of bootstrap standard deviations calculated on bootstrap samples for each MC sample; CP is the 95% coverage probability defined as the percentage of repetitions in which true value falls into the calculated confidence interval.

JM
TS
Trueθ^SDSDbCPθ^SDSDbCP
Survival
γ1=0.4–0.41900.15640.15490.948–0.42220.15560.15370.936
γ2=0.1–0.11800.13970.14170.944–0.12020.13930.14100.942
γ3=0.710.71950.10160.10130.9460.72880.10150.10120.940
γ4=0.220.20250.13270.13190.9500.19940.13260.13190.948
γ5=0.360.32210.20360.21380.9460.31570.20320.21370.944
α1=0.370.34160.14410.13840.9340.32360.13890.13320.920
α2=0.230.20370.08440.08270.9320.17580.08450.08230.870
Longitudinal
β1=6.26.19320.12410.12420.9466.19440.12420.12400.948
β2=4.54.49540.12120.12210.9464.49470.12110.12210.948
β3=5.04.99460.11990.12230.9404.99500.11990.12220.940
β4=4.74.69510.12310.12260.9444.69360.12310.12280.944
β5=5.04.99500.12600.12640.9444.99070.12620.12620.946
β6=4.54.48600.12780.12760.9464.48510.12780.12770.946
β1(1)=6.86.79880.13010.12760.9426.78870.13010.12750.942
β2(1)=4.24.19390.12140.12170.9404.19350.12140.12160.940
β3(1)=4.54.49350.12120.12220.9364.49390.12120.12140.934
β4(1)=4.14.09760.12340.12400.9484.09680.12340.12380.944
β5(1)=4.54.49110.13070.13160.9444.49220.13080.13150.946
β6(1)=4.24.19550.13530.13570.9484.19500.13520.13540.946
β1(2)=6.46.39590.12770.12780.9506.39690.12780.12770.948
β2(2)=4.54.49510.12090.12180.9343.01240.12090.12180.934
β3(2)=4.64.59250.12080.12170.9364.59310.12080.12170.936
β4(2)=4.14.09720.12460.12430.9404.09610.12460.12450.940
β5(2)=4.54.49010.13310.13300.9424.49150.13320.13290.946
β6(2)=4.34.29200.13310.13350.9364.29140.13310.13350.938
η1=0.05–0.04950.00840.00850.952–0.04960.00840.00850.952
η2=0.030.02970.01240.01110.9480.02970.01240.01120.948
η3=0.090.09070.01720.01750.9420.09080.01720.01750.940
ξ=0.080.08010.00420.00420.9480.07990.00420.00420.978
D22=0.20.19930.00620.00630.9420.19950.00620.00640.946
D33=0.20.19940.00700.00690.9540.19970.00710.00690.958
D44=0.40.39880.01600.01700.9520.39930.01600.01700.952
D55=0.50.49750.02680.02630.9540.49710.02680.02640.950
D66=0.30.29750.03200.03220.9520.29760.03200.03230.952
σ2=0.17640.17640.00100.00100.9300.17640.00100.00100.930
JM
TS
Trueθ^SDSDbCPθ^SDSDbCP
Survival
γ1=0.4–0.41900.15640.15490.948–0.42220.15560.15370.936
γ2=0.1–0.11800.13970.14170.944–0.12020.13930.14100.942
γ3=0.710.71950.10160.10130.9460.72880.10150.10120.940
γ4=0.220.20250.13270.13190.9500.19940.13260.13190.948
γ5=0.360.32210.20360.21380.9460.31570.20320.21370.944
α1=0.370.34160.14410.13840.9340.32360.13890.13320.920
α2=0.230.20370.08440.08270.9320.17580.08450.08230.870
Longitudinal
β1=6.26.19320.12410.12420.9466.19440.12420.12400.948
β2=4.54.49540.12120.12210.9464.49470.12110.12210.948
β3=5.04.99460.11990.12230.9404.99500.11990.12220.940
β4=4.74.69510.12310.12260.9444.69360.12310.12280.944
β5=5.04.99500.12600.12640.9444.99070.12620.12620.946
β6=4.54.48600.12780.12760.9464.48510.12780.12770.946
β1(1)=6.86.79880.13010.12760.9426.78870.13010.12750.942
β2(1)=4.24.19390.12140.12170.9404.19350.12140.12160.940
β3(1)=4.54.49350.12120.12220.9364.49390.12120.12140.934
β4(1)=4.14.09760.12340.12400.9484.09680.12340.12380.944
β5(1)=4.54.49110.13070.13160.9444.49220.13080.13150.946
β6(1)=4.24.19550.13530.13570.9484.19500.13520.13540.946
β1(2)=6.46.39590.12770.12780.9506.39690.12780.12770.948
β2(2)=4.54.49510.12090.12180.9343.01240.12090.12180.934
β3(2)=4.64.59250.12080.12170.9364.59310.12080.12170.936
β4(2)=4.14.09720.12460.12430.9404.09610.12460.12450.940
β5(2)=4.54.49010.13310.13300.9424.49150.13320.13290.946
β6(2)=4.34.29200.13310.13350.9364.29140.13310.13350.938
η1=0.05–0.04950.00840.00850.952–0.04960.00840.00850.952
η2=0.030.02970.01240.01110.9480.02970.01240.01120.948
η3=0.090.09070.01720.01750.9420.09080.01720.01750.940
ξ=0.080.08010.00420.00420.9480.07990.00420.00420.978
D22=0.20.19930.00620.00630.9420.19950.00620.00640.946
D33=0.20.19940.00700.00690.9540.19970.00710.00690.958
D44=0.40.39880.01600.01700.9520.39930.01600.01700.952
D55=0.50.49750.02680.02630.9540.49710.02680.02640.950
D66=0.30.29750.03200.03220.9520.29760.03200.03230.952
σ2=0.17640.17640.00100.00100.9300.17640.00100.00100.930

6.3 Case 3: Simulation study for a nonparametric setting

In this final simulation study, we examine the performance of our proposed method (the spline based sieve approximation to a biomarker trajectory and the EM algorithm in estimation) when the true biomarker trajectory is not exactly a spline function as specified in (3.3). We set the true longitudinal trajectory for the i-th individual as mi(t)=ν1i{(1/6)(t3)3+(t3)}+ν2i for i=1,,1000, where ν1i and ν2i are independent random parameters generated from U(0.5,2) and U(4, 8), respectively. Obviously, ν1i corresponds to the degree of fluctuation for the i-th trajectory and ν2i is related to the biomarker level. The measurement times are specified as tij={0,0.4,0.8,1.2,1.6,2,2.5,…6} (every 0.5 year after 2 years) during the follow-up interval [0,6] and the measurement error is generated from N(0,0.42), independently of ν1i and ν2i. The baseline covariates xi and outside stimulus zi are not incorporated for simplicity. For the survival submodel, the baseline hazard function λ0(t) is specified as exp{3} if t > 2 and 0 otherwise. We generate a binary baseline covariate wi from Bernoulli distribution with probability 0.5. Censoring times Ci, i=1,,1000 are independently generated from U(3, 10), which, together with the end of study (τ = 6), lead to approximately 40% censoring. Truncated by event times and right censoring, the number of longitudinal observations ni varies between 6 and 14.

We fit the joint model given by (3.1), (3.2), (3.3) and (3.6) to the simulated data. Note that the spline regression model (3.3), in this case, is an approximation or working model. To determine the number and location of interior knots, we first specify several candidate knot patterns based on quantiles of measurement times and the follow-up interval (see Section SC.2 of the supplementary material available at Biostatistics online), and then select the best pattern among them via AIC and BIC, which are calculated by simply fitting the longitudinal submodel. Table 4 shows the estimation result for survival parameters obtained from different methods. In addition to JM and TS mentioned before, the performance of the true model (a benchmark) where the biomarker level and variability are specified from the true mi(t) as in the data generating process, and that of the standard joint model (called “No Variability” in Table 4) without biomaker variability, are also reported in Table 4. It is expected that JM and TS may not perform as well as in case 1 and case 2 due to the extra bias resulting from the approximation in (3.3). Compared with γ and α1, the estimation of α2, in both JM and TS, suffers from significant bias and thus under-coverage of CP. However, if the biomarker variability is removed, as implemented with the standard joint model, both γ and α1 are estimated with substantial bias. In this sense, taking the biomarker variability into account benefits the estimation of effects of the other predictors. Table 5 shows the estimation results when the biomarker variability is not involved in generating event times, i.e., the true value of α2 is zero. These three models produce similar estimates and the null effect of biomarker variability can be successfully detected. A more challenging case (case 4) where mi(t) is no longer a polynomial function but a sine function instead, is considered in Section SC.3 of the supplementary material available at Biostatistics online. JM and TS show very similar performance in this scenario since approximating a sine function via a spline function is the dominant source of bias.

Table 4

Simulation results in case 3 among 500 MC replications. SD is the MC standard deviation of the estimates across simulated data sets; CP is the 95% coverage probability.

JM
TS
No Variability
True model
Trueθ^SDCPθ^SDCPθ^SDCPθ^SDCP
γ=1–0.98840.07780.944–0.96640.07610.920–0.89830.07650.726–0.99780.07600.948
α1=0.30.29180.03190.9360.28050.03030.8940.22600.03190.3560.30200.03140.952
α2=0.30.26740.03010.7200.24380.02750.6640.30090.02820.960
JM
TS
No Variability
True model
Trueθ^SDCPθ^SDCPθ^SDCPθ^SDCP
γ=1–0.98840.07780.944–0.96640.07610.920–0.89830.07650.726–0.99780.07600.948
α1=0.30.29180.03190.9360.28050.03030.8940.22600.03190.3560.30200.03140.952
α2=0.30.26740.03010.7200.24380.02750.6640.30090.02820.960
Table 4

Simulation results in case 3 among 500 MC replications. SD is the MC standard deviation of the estimates across simulated data sets; CP is the 95% coverage probability.

JM
TS
No Variability
True model
Trueθ^SDCPθ^SDCPθ^SDCPθ^SDCP
γ=1–0.98840.07780.944–0.96640.07610.920–0.89830.07650.726–0.99780.07600.948
α1=0.30.29180.03190.9360.28050.03030.8940.22600.03190.3560.30200.03140.952
α2=0.30.26740.03010.7200.24380.02750.6640.30090.02820.960
JM
TS
No Variability
True model
Trueθ^SDCPθ^SDCPθ^SDCPθ^SDCP
γ=1–0.98840.07780.944–0.96640.07610.920–0.89830.07650.726–0.99780.07600.948
α1=0.30.29180.03190.9360.28050.03030.8940.22600.03190.3560.30200.03140.952
α2=0.30.26740.03010.7200.24380.02750.6640.30090.02820.960
Table 5

Simulation results in case 3 among 500 MC replications when the true value of α2 is set as zero. SD is the MC standard deviation of the estimates across simulated data sets; CP is the 95% coverage probability.

JM
TS
No Variability
True model
Trueθ^SDCPθ^SDCPθ^SDCPθ^SDCP
γ=1–0.99950.08930.946–1.00130.08970.946–1.00160.09140.946–1.00490.08970.948
α1=0.30.28980.03920.9260.28270.03740.9240.28750.03820.9240.30190.03400.950
α2=00.00220.03310.948–0.00470.03450.9500.00040.03050.952
JM
TS
No Variability
True model
Trueθ^SDCPθ^SDCPθ^SDCPθ^SDCP
γ=1–0.99950.08930.946–1.00130.08970.946–1.00160.09140.946–1.00490.08970.948
α1=0.30.28980.03920.9260.28270.03740.9240.28750.03820.9240.30190.03400.950
α2=00.00220.03310.948–0.00470.03450.9500.00040.03050.952
Table 5

Simulation results in case 3 among 500 MC replications when the true value of α2 is set as zero. SD is the MC standard deviation of the estimates across simulated data sets; CP is the 95% coverage probability.

JM
TS
No Variability
True model
Trueθ^SDCPθ^SDCPθ^SDCPθ^SDCP
γ=1–0.99950.08930.946–1.00130.08970.946–1.00160.09140.946–1.00490.08970.948
α1=0.30.28980.03920.9260.28270.03740.9240.28750.03820.9240.30190.03400.950
α2=00.00220.03310.948–0.00470.03450.9500.00040.03050.952
JM
TS
No Variability
True model
Trueθ^SDCPθ^SDCPθ^SDCPθ^SDCP
γ=1–0.99950.08930.946–1.00130.08970.946–1.00160.09140.946–1.00490.08970.948
α1=0.30.28980.03920.9260.28270.03740.9240.28750.03820.9240.30190.03400.950
α2=00.00220.03310.948–0.00470.03450.9500.00040.03050.952

7 Real-data analysis

In the MRC trial, 4396 patients aged 65–74 were randomized to receive diuretic, β blocker or placebo. SBP values were measured fornightly for the first month, then monthly to 3 months and every 3 months thereafter. Cardiovascular (CV) events including stroke, whether non-fatal or fatal; and myocardial infarction, whether fatal or nonfatal were recorded during the follow-up. Figure 3(a) illustrates the change in group average of SBP observations (divided by 30) over time. Several points are highlighted here. (i) The average SBP trends are different in different treatment groups and therefore a treatment-dependent longitudinal model is preferred for modeling SBP trajectories. (ii) The regular peaks, occurring at the end of each year (1, 2, 3, 4, 5 years) as well as the entry visit (0 year), correspond to the measurements made by doctors with the other measurements made by nurses. The marked increase at these time points can be explained by the “white coat” effect, a pressor response to the presence of doctors or other kind of stressful circumstances (Lever and Brennan, 1993; Party and others, 1992), which, in our model, is treated as an outside stimulus and is excluded from the inherent SBP trend (Carr and others, 2012). (iii) All groups experienced immediate drop-off after entry. The drop was steepest in the first 2-weeks, then continued, more gently, for about 3 months (0.25 year) followed by a comparatively stable stage. Therefore global polynomials of lower degree may not perform well in capturing the SBP trend over time and so cubic B-spline basis functions are used instead.

In this article, we focus on the relationship between longitudinal SBP measurements and time to the first cardiovascular event, especially on the question whether the cumulative variability of the SBP profile exhibits significant association with the CV hazard. To ensure the approximation method (in the E-step) used in our EM algorithm does not lead to non-ignorable biases, we restrict our analysis to the cohort of 3784 patients who had at least 10 SBP measurements, and 310 patients among this cohort experienced the event during the follow-up period. The longitudinal submodel for SBP observations is given below:
where xi=(Sexi,Smokei,Agei) is a vector of baseline covariates with Agei denoted as original age divided by 10; mi(t) is the subject-specific SBP trend over time with fixed effects being treatment-dependent (tmt=1 or 2 corresponds to diuretic or β blocker treatment); WCij, the “white coat” indicator, takes value 1 if the j-th measurement was measured by a doctor and εijN(0,σ2) is the measurement error independent of random effect bi. In light of SBP profiles (see Figure 3(a)) together with AIC and BIC criteria, we set q = 6 with interior knots located at 0.25 and 1.5. Preliminary analysis based on longitudinal data only (ignoring event times) shows that the variance of bi1 is very close to zero. So bi=(bi2,,biq1)Nq1(0,D) with bi1 removed. To explore the impact of inherent SBP value and our defined SBP variability on the risk of CV, we consider the following survival submodel,
where wi=(Itmti=1,Itmti=2,Sexi,Smokei,Agei) are baseline covariates, which are determined according to prior knowledge and naive analysis of event times (see Figure 3(b)). Usually in medicine, the initial reduction in SBP is attributed to treatment initiation and dose adjustment as clinical trials aim to achieve good early control of blood pressure (Rothwell and others, 2010 a). Therefore, the square root of cumulative variability from the 3-month (0.25-year) visit is considered and incorporated in the survival submodel as a predictor.

Table 6 shows the estimation results from the proposed joint model (JM), the standard joint model (No Variability) and the naive proportional hazard (PH) model. Both the proposed and standard joint models are estimated with the algorithm in Section 5 and the standard errors are obtained via bootstrap. The association between CV and the defined measure of SBP variability (α2), after adjustment for underlying SBP level and baseline covariates, is significant under JM. Specifically, the hazard ratio associated with 1-SD increase in SBP variability (SD for defined SBP variability: 0.535) is 1.14 (95% confidence interval: 1.02–1.27) and 1.20 (95% confidence interval: 1.02–1.40) for underlying SBP level (SD for SBP value: 0.485). The effect of diuretic treatment (γ1), after adjusting for other baseline predictors, SBP level and variability, still appears to be highly significant in decreasing CV risk, while the effect of β blocker (γ2) is weak. Both joint models lead to similar conclusion on this point. Compared with the estimation results given by PH, the treatment effects (γ1 and γ2) are attenuated by incorporating SBP relevant predictors in joint models. It seems that the effects of diuretic and β blocker treatments on CV risk can be better captured (explained) through their impact on SBP level and variability together since the treatment effects diminish more in the proposed model than those in the standard joint model. In terms of model performance, the AIC (BIC) values for the proposed and standard joint models are 121 003.6 (121 221.9) and 121 204.3 (121 416.4), respectively.

Table 6

Estimation results for MRC data under different models. JM refers to the proposed joint model; No Variability refers to the standard joint model with the restriction α2=0 and PH is the proportional hazards model involving baseline covariates only. SDb in both JM and No Variability denote the standard deviation based on 100 bootstrap samples and SD’s in PH are obtained from R function “coxph”.

JM
No Variability
PH
Parametersθ^SDbθ^SDbθ^SD
Survival
γ1–0.40950.1389–0.44250.1386–0.59740.1581
γ2–0.09670.1701–0.12630.1658–0.26790.1386
γ30.70810.12280.71210.12640.68270.1159
γ40.22190.14300.22150.14050.22800.1406
γ50.35570.21420.35840.21100.35810.2092
α10.36930.16590.30150.1372
α20.23780.1035
Longitudinal
η1–0.04660.0106–0.04710.0102
η20.03300.01460.03280.0147
η30.09280.02080.09270.0204
ξ0.07890.00490.07960.0047
D220.21890.00690.22190.0068
D330.21610.00820.22060.0080
D440.40640.01750.40900.0178
D550.53620.02210.53950.0218
D660.30370.01280.30360.0125
σ20.17980.00160.17980.0016
JM
No Variability
PH
Parametersθ^SDbθ^SDbθ^SD
Survival
γ1–0.40950.1389–0.44250.1386–0.59740.1581
γ2–0.09670.1701–0.12630.1658–0.26790.1386
γ30.70810.12280.71210.12640.68270.1159
γ40.22190.14300.22150.14050.22800.1406
γ50.35570.21420.35840.21100.35810.2092
α10.36930.16590.30150.1372
α20.23780.1035
Longitudinal
η1–0.04660.0106–0.04710.0102
η20.03300.01460.03280.0147
η30.09280.02080.09270.0204
ξ0.07890.00490.07960.0047
D220.21890.00690.22190.0068
D330.21610.00820.22060.0080
D440.40640.01750.40900.0178
D550.53620.02210.53950.0218
D660.30370.01280.30360.0125
σ20.17980.00160.17980.0016
Table 6

Estimation results for MRC data under different models. JM refers to the proposed joint model; No Variability refers to the standard joint model with the restriction α2=0 and PH is the proportional hazards model involving baseline covariates only. SDb in both JM and No Variability denote the standard deviation based on 100 bootstrap samples and SD’s in PH are obtained from R function “coxph”.

JM
No Variability
PH
Parametersθ^SDbθ^SDbθ^SD
Survival
γ1–0.40950.1389–0.44250.1386–0.59740.1581
γ2–0.09670.1701–0.12630.1658–0.26790.1386
γ30.70810.12280.71210.12640.68270.1159
γ40.22190.14300.22150.14050.22800.1406
γ50.35570.21420.35840.21100.35810.2092
α10.36930.16590.30150.1372
α20.23780.1035
Longitudinal
η1–0.04660.0106–0.04710.0102
η20.03300.01460.03280.0147
η30.09280.02080.09270.0204
ξ0.07890.00490.07960.0047
D220.21890.00690.22190.0068
D330.21610.00820.22060.0080
D440.40640.01750.40900.0178
D550.53620.02210.53950.0218
D660.30370.01280.30360.0125
σ20.17980.00160.17980.0016
JM
No Variability
PH
Parametersθ^SDbθ^SDbθ^SD
Survival
γ1–0.40950.1389–0.44250.1386–0.59740.1581
γ2–0.09670.1701–0.12630.1658–0.26790.1386
γ30.70810.12280.71210.12640.68270.1159
γ40.22190.14300.22150.14050.22800.1406
γ50.35570.21420.35840.21100.35810.2092
α10.36930.16590.30150.1372
α20.23780.1035
Longitudinal
η1–0.04660.0106–0.04710.0102
η20.03300.01460.03280.0147
η30.09280.02080.09270.0204
ξ0.07890.00490.07960.0047
D220.21890.00690.22190.0068
D330.21610.00820.22060.0080
D440.40640.01750.40900.0178
D550.53620.02210.53950.0218
D660.30370.01280.30360.0125
σ20.17980.00160.17980.0016

Figure 4 shows the fitted SBP trajectory and variability in each treatment group. Both diuretic and β blocker treatments significantly reduce SBP level compared with that in the placebo group. Although the diuretic group exhibits a more immediate drop in SBP value after entry, both active treatment groups reach similar SBP level after two years. However, the square root of cumulative SBP variability appears to be larger in the β blocker group than the other two groups, which concurs with existing findings in medicine that β blockers tend to induce excessive intraindividual SBP variability over time (Carr and others, 2012; Rothwell and others, 2010 a).

Fitted SBP trajectories and group average SBP variability under JM. Left: fitted SBP mean trends in different treatment groups ∑k=1q(β^k+I{tmti=1}β^k(1)+I{tmti=2}β^k(2))Bk(t) and corresponding 95% confidence intervals. Right: group average of estimated SBP variability (∫0.25t{m^i″(s)}2ds)1/2,i=1,…,m by different treatments.
Fig. 4

Fitted SBP trajectories and group average SBP variability under JM. Left: fitted SBP mean trends in different treatment groups k=1q(β^k+I{tmti=1}β^k(1)+I{tmti=2}β^k(2))Bk(t) and corresponding 95% confidence intervals. Right: group average of estimated SBP variability (0.25t{m^i(s)}2ds)1/2,i=1,,m by different treatments.

8 Discussion

This article focuses on how to characterize the biological variability or instability of underlying biomarker trajectories in joint modeling of longitudinal and time-to-event data. Borrowing ideas from the roughness penalty in smoothing splines and penalized splines, we propose a similar quantity to evaluate the fluctuation of subject-specific longitudinal trajectories which are modeled by mixed-effects regression splines. To support the use of spline bases in modeling longitudinal data and further our proposed variability measure, longitudinal observations should present nonlinear profiles and observation times should not be very sparse. To ensure the m convergence rate of MLEs, it is required that min(ni) grows at a rate greater than m1/4. Otherwise, approximation errors resulting from calculating expectations in the EM algorithm will dominate the estimator’s asymptotic behavior.

The performance of the proposed variability measure depends heavily on the number and locations of knots. For example, trajectories in the right-hand graph of Figure 1 tend to flatten out after time 1 due to the fact that no interior knots are placed in the later period (knot locations in this case are 0.25 and 1.5), whereas similar phenomena do not occur in Figure 2 as the interior knots, in the toy example, are evenly spaced. Increasing the number of knots may alleviate the sensitivity to the locations of knots at the price of increasing model complexity. Although standard methods, e.g., AIC and BIC, provide a way to choose the knot sequence, further exploration and more attention should be given to this issue.

In some cases, it might be more reasonable to use the average variability or the variability over a recent period as a predictor. More generally, we can consider a weighted variability (t0tω(ts){mi(s)}2ds)1/2 where the weight function can be specified within a parametric family with unknown parameters or be modeled in a nonparametric way. The time window over which the biomarker variability is more relevant in predicting survival events can be implied from the estimate of ω(ts). Since t0tω(ts){mi(s)}2ds is still a quadratic form of random effects, the methodology developed in this article can be straightforwardly applied to the weighted form.

9 Supplementary Material

R code for the simulations is available at https://github.com/cywang0315/R_simulations. Supplementary material is available at http://biostatistics.oxfordjournals.org.

Acknowledgments

We are grateful to the two anonymous reviewers and the associate editor for their very helpful comments and suggestions. We thank the Medical Research Council (MRC) of the UK for access to the MRC elderly trial data.

Conflict of Interest: The authors have declared no conflict of interest.

Funding

Our work was supported in part by the National Natural Science Foundation of China (12271047), and in part by the Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science, BNU-HKBU United International College (2022B1212010006).

References

Aalen
O.
,
Borgan
O.
,
Gjessing
H.
(
2008
).
Survival and Event History Analysis: A Process Point of View
.
New York
:
Springer Science & Business Media
.

Agarwal
G. G
,
Studden
W. J.
(
1980
).
Asymptotic integrated mean square error using least squares and bias minimizing splines
.
The Annals of Statistics
8
,
1307
1325
.

Barrett
J. K.
,
Huille
R.
,
Parker
R.
,
Yano
Y.
,
Griswold
M.
(
2019
).
Estimating the association between blood pressure variability and cardiovascular disease: an application using the aric study
.
Statistics in Medicine
38
,
1855
1868
.

Carr
M.J.
,
Bao
Y.
,
Pan
J.
,
Cruickshank
K.
,
McNamee
R.
(
2012
).
The predictive ability of blood pressure in elderly trial patients
.
Journal of Hypertension
30
,
1725
1733
.

Clarke
R.
,
Shipley
M.
,
Lewington
S.
,
Youngman
L.
,
Collins
R.
,
Marmot
M.
,
Peto
R.
(
1999
).
Underestimation of risk associations due to regression dilution in long-term follow-up of prospective studies
.
American Journal of Epidemiology
150
,
341
353
.

de Boor
C.
(
1978
).
A Practical Guide to Splines
.
New York
:
Springer-Verlag
.

Gao
F.
,
Miller
J. P.
,
Xiong
C.
,
Beiser
J. A.
,
Gordon
M.
(
2011
).
A joint-modeling approach to assess the impact of biomarker variability on the risk of developing clinical outcome
.
Statistical Methods & Applications
20
,
83
100
.

Griffiths
W. E.
,
Hill
R. C.
,
Pope
P.J.
(
1987
).
Small sample properties of probit model estimators
.
Journal of the American Statistical Association
82
,
929
937
.

Howard
S.C.
,
Rothwell
P.M.
(
2009
).
Reproducibility of measures of visit-to-visit variability in blood pressure after transient ischaemic attack or minor stroke
.
Cerebrovascular Diseases
28
,
331
340
.

Hsieh
F
,
Tseng
Y.K.
,
Wang
J.L.
(
2006
).
Joint modeling of survival and longitudinal data: likelihood approach revisited
.
Biometrics
62
,
1037
1043
.

Lever
A.F.
,
Brennan
P.J.
(
1993
).
MRC trial of treatment in elderly hypertensives
.
Clinical and Experimental Hypertension
15
,
941
952
.

Lyles
R. H.
,
Munõz
A.
,
Xu
J.
,
Taylor
J. M.
,
Chmiel
J. S.
(
1999
).
Adjusting for measurement error to assess health effects of variability in biomarkers
.
Statistics in Medicine
18
,
1069
1086
.

Party, MRC Working
and
others
. (
1992
).
Medical research council trial of treatment of hypertension in older adults: principal results
.
BMJ
304
,
405
412
.

Rice
J. A.
,
Wu
C. O.
(
2001
).
Nonparametric mixed effects models for unequally sampled noisy curves
.
Biometrics
57
,
253
259
.

Rizopoulos
D.
,
Verbeke
G.
,
Lesaffre
E.
(
2009
).
Fully exponential laplace approximations for the joint modeling of survival and longitudinal data
.
Journal of the Royal Statistical Society: Series B (Statistical Methodology)
71
,
637
654
.

Rothwell
P. M.
(
2010
).
Limitations of the usual blood-pressure hypothesis and importance of variability, instability, and episodic hypertension
.
The Lancet
375
,
938
948
.

Rothwell
P. M.
,
Howard
S. C.
,
Dolan
E.
,
O’Brien
E.
,
Dobson
J. E.
,
Dahlöf
B.
,
Poulter
N. R.
,
Sever
P. S.
(
2010
a).
Effects of β blockers and calcium-channel blockers on within-individual variability in blood pressure and risk of stroke
.
The Lancet Neurology
9
,
469
480
.

Rothwell
P. M.
,
Howard
S. C
,
Dolan
E.
,
O’Brien
E.
,
Dobson
J. E.
,
Dahlöf
B.
,
Sever
P. S.
,
Poulter
N. R.
(
2010
b).
Prognostic significance of visit-to-visit variability, maximum systolic blood pressure, and episodic hypertension
.
The Lancet
375
,
895
905
.

Shi
M.
,
Weiss
R. E.
,
Taylor
J. M.
(
1996
).
An analysis of paediatric CD4 counts for acquired immune deficiency syndrome using flexible random curves. Journal of the Royal Statistical Society: Series C
(
Applied Statistics
)
45
,
151
163
.

Sylvestre
M. P.
,
Abrahamowicz
M.
(
2009
).
Flexible modeling of the cumulative effects of time-dependent exposures on the hazard
.
Statistics in Medicine
28
,
3437
3453
.

Tierney
L.
,
Kass
R. E.
,
Kadane
J.B.
(
1989
).
Fully exponential Laplace approximations to expectations and variances of nonpositive functions
.
Journal of the American Statistical Association
84
,
710
716
.

Tsiatis
A.A.
,
Degruttola
V.
,
Wulfsohn
M.S.
(
1995
).
Modeling the relationship of survival to longitudinal data measured with error. Applications to survival and CD4 counts in patients with aids
.
Journal of the American Statistical Association
90
,
27
37
.

Tsiatis
A. A.
,
Davidian
M.
(
2004
).
Joint modeling of longitudinal and time-to-event data: an overview
.
Statistica Sinica
24
,
809
834
.

Vonesh
E. F.
(
1996
).
A note on the use of Laplace’s approximation for nonlinear mixed-effects models
.
Biometrika
83
,
447
452
.

Wu
L.
,
Liu
W.
,
Yi
G. Y.
,
Huang
Y.
(
2012
).
Analysis of longitudinal and survival data: joint modeling, inference methods, and issues
.
Journal of Probability and Statistics
2012
. https://doi.org/10.1155/2012/640153.

Yano
Y.
(
2017
).
Visit-to-visit blood pressure variability—what is the current challenge?
American Journal of Hypertension
30
,
112
114
.

Ye
W.
,
Lin
X.
,
Taylor
J. M.
(
2008
).
Semiparametric modeling of longitudinal measurements and time-to-event data—a two-stage regression calibration approach
.
Biometrics
64
,
1238
1246
.

Zeng
D.
,
Cai
J.
(
2005
).
Asymptotic results for maximum likelihood estimators in joint analysis of repeated measurements and survival time
.
The Annals of Statistics
33
,
2132
2163
.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Supplementary data