Joint modeling and multiple comparisons with the best of data from a SMART with survival outcomes Free

Simulation results assuming design II in Figure 1: estimated log hazard ratios and standard errors (SE) assuming (2.1) and (2.2)

		Response		Death first		Death after 2nd randomization
N	Statistics	\|$\beta_1$\|	\|$\beta_2$\|	\|$\beta_3$\|	\|$\beta_4$\|	\|$\beta_5$\|	\|$\beta_6$\|	\|$\beta_7$\|	\|$\beta_8$\|	\|$\beta_9$\|	\|$\beta_{10}$\|	\|$\beta_{11}$\|
	Truth	\|$-$\|1.5	0.6	\|$-$\|0.5	\|$-$\|1.3	\|$-$\|1.2	\|$-$\|0.7	\|$-$\|1	1.8	\|$-$\|2.4	\|$-$\|0.5	\|$-$\|1
200	Bias	\|$-$\|0.01	0.01	\|$-$\|0.02	\|$-$\|0.02	\|$-$\|0.05	\|$-$\|0.12	\|$-$\|0.04	0.14	\|$-$\|1.12	\|$-$\|0.05	\|$-$\|0.11
	SD(⁠\|$\hat{\beta}$\|⁠)	0.24	0.25	0.25	0.29	0.54	0.60	0.51	0.76	2.73	0.61	0.51
	Avg(⁠\|$\widehat{\rm SE}$\|⁠)	0.24	0.24	0.24	0.30	0.47	0.57	0.48	0.73	5.54	0.55	0.51
	95% CP	94.8	94.2	93.7	95.9	94.5	94.4	94.4	94.0	97.1	95.4	94.8
500	Bias	\|$-$\|0.01	0.01	0.00	\|$-$\|0.01	\|$-$\|0.01	\|$-$\|0.04	\|$-$\|0.02	0.05	\|$-$\|0.17	\|$-$\|0.02	\|$-$\|0.04
	SD(⁠\|$\hat{\beta}$\|⁠)	0.15	0.15	0.15	0.18	0.26	0.35	0.30	0.46	0.86	0.31	0.31
	Avg(⁠\|$\widehat{\rm SE}$\|⁠)	0.15	0.15	0.15	0.19	0.27	0.34	0.29	0.43	0.73	0.32	0.31
	95% CP	95.7	95.3	94.7	95.5	94.7	94.4	93.3	93.2	96.8	94.8	95.0
1000	Bias	0.00	0.01	\|$-$\|0.01	0.00	0.00	\|$-$\|0.03	\|$-$\|0.01	0.02	\|$-$\|0.06	\|$-$\|0.01	\|$-$\|0.02
	SD(⁠\|$\hat{\beta}$\|⁠)	0.11	0.11	0.10	0.13	0.19	0.24	0.21	0.29	0.41	0.22	0.21
	Avg(⁠\|$\widehat{\rm SE}$\|⁠)	0.11	0.11	0.10	0.13	0.19	0.24	0.20	0.30	0.39	0.22	0.22
	95% CP	93.7	95.4	95.1	94.9	95.3	94.7	94.9	96.4	95.4	94.3	95.5

		Response		Death first		Death after 2nd randomization
N	Statistics	\|$\beta_1$\|	\|$\beta_2$\|	\|$\beta_3$\|	\|$\beta_4$\|	\|$\beta_5$\|	\|$\beta_6$\|	\|$\beta_7$\|	\|$\beta_8$\|	\|$\beta_9$\|	\|$\beta_{10}$\|	\|$\beta_{11}$\|
	Truth	\|$-$\|1.5	0.6	\|$-$\|0.5	\|$-$\|1.3	\|$-$\|1.2	\|$-$\|0.7	\|$-$\|1	1.8	\|$-$\|2.4	\|$-$\|0.5	\|$-$\|1
200	Bias	\|$-$\|0.01	0.01	\|$-$\|0.02	\|$-$\|0.02	\|$-$\|0.05	\|$-$\|0.12	\|$-$\|0.04	0.14	\|$-$\|1.12	\|$-$\|0.05	\|$-$\|0.11
	SD(⁠\|$\hat{\beta}$\|⁠)	0.24	0.25	0.25	0.29	0.54	0.60	0.51	0.76	2.73	0.61	0.51
	Avg(⁠\|$\widehat{\rm SE}$\|⁠)	0.24	0.24	0.24	0.30	0.47	0.57	0.48	0.73	5.54	0.55	0.51
	95% CP	94.8	94.2	93.7	95.9	94.5	94.4	94.4	94.0	97.1	95.4	94.8
500	Bias	\|$-$\|0.01	0.01	0.00	\|$-$\|0.01	\|$-$\|0.01	\|$-$\|0.04	\|$-$\|0.02	0.05	\|$-$\|0.17	\|$-$\|0.02	\|$-$\|0.04
	SD(⁠\|$\hat{\beta}$\|⁠)	0.15	0.15	0.15	0.18	0.26	0.35	0.30	0.46	0.86	0.31	0.31
	Avg(⁠\|$\widehat{\rm SE}$\|⁠)	0.15	0.15	0.15	0.19	0.27	0.34	0.29	0.43	0.73	0.32	0.31
	95% CP	95.7	95.3	94.7	95.5	94.7	94.4	93.3	93.2	96.8	94.8	95.0
1000	Bias	0.00	0.01	\|$-$\|0.01	0.00	0.00	\|$-$\|0.03	\|$-$\|0.01	0.02	\|$-$\|0.06	\|$-$\|0.01	\|$-$\|0.02
	SD(⁠\|$\hat{\beta}$\|⁠)	0.11	0.11	0.10	0.13	0.19	0.24	0.21	0.29	0.41	0.22	0.21
	Avg(⁠\|$\widehat{\rm SE}$\|⁠)	0.11	0.11	0.10	0.13	0.19	0.24	0.20	0.30	0.39	0.22	0.22
	95% CP	93.7	95.4	95.1	94.9	95.3	94.7	94.9	96.4	95.4	94.3	95.5

SD(⁠|$\hat{\beta}$|⁠): Empirical standard error of estimated |$\hat{\beta}$| coefficients.

Avg(⁠|$\widehat{\rm SE}$|⁠): Average of 1000 estimated standard errors (using Hessian matrix).

95% CP: 95% coverage probability.

Table 1.

Open in new tab Download slide

Simulation results assuming design II in Figure 1: estimated log hazard ratios and standard errors (SE) assuming (2.1) and (2.2)

		Response		Death first		Death after 2nd randomization
N	Statistics	\|$\beta_1$\|	\|$\beta_2$\|	\|$\beta_3$\|	\|$\beta_4$\|	\|$\beta_5$\|	\|$\beta_6$\|	\|$\beta_7$\|	\|$\beta_8$\|	\|$\beta_9$\|	\|$\beta_{10}$\|	\|$\beta_{11}$\|
	Truth	\|$-$\|1.5	0.6	\|$-$\|0.5	\|$-$\|1.3	\|$-$\|1.2	\|$-$\|0.7	\|$-$\|1	1.8	\|$-$\|2.4	\|$-$\|0.5	\|$-$\|1
200	Bias	\|$-$\|0.01	0.01	\|$-$\|0.02	\|$-$\|0.02	\|$-$\|0.05	\|$-$\|0.12	\|$-$\|0.04	0.14	\|$-$\|1.12	\|$-$\|0.05	\|$-$\|0.11
	SD(⁠\|$\hat{\beta}$\|⁠)	0.24	0.25	0.25	0.29	0.54	0.60	0.51	0.76	2.73	0.61	0.51
	Avg(⁠\|$\widehat{\rm SE}$\|⁠)	0.24	0.24	0.24	0.30	0.47	0.57	0.48	0.73	5.54	0.55	0.51
	95% CP	94.8	94.2	93.7	95.9	94.5	94.4	94.4	94.0	97.1	95.4	94.8
500	Bias	\|$-$\|0.01	0.01	0.00	\|$-$\|0.01	\|$-$\|0.01	\|$-$\|0.04	\|$-$\|0.02	0.05	\|$-$\|0.17	\|$-$\|0.02	\|$-$\|0.04
	SD(⁠\|$\hat{\beta}$\|⁠)	0.15	0.15	0.15	0.18	0.26	0.35	0.30	0.46	0.86	0.31	0.31
	Avg(⁠\|$\widehat{\rm SE}$\|⁠)	0.15	0.15	0.15	0.19	0.27	0.34	0.29	0.43	0.73	0.32	0.31
	95% CP	95.7	95.3	94.7	95.5	94.7	94.4	93.3	93.2	96.8	94.8	95.0
1000	Bias	0.00	0.01	\|$-$\|0.01	0.00	0.00	\|$-$\|0.03	\|$-$\|0.01	0.02	\|$-$\|0.06	\|$-$\|0.01	\|$-$\|0.02
	SD(⁠\|$\hat{\beta}$\|⁠)	0.11	0.11	0.10	0.13	0.19	0.24	0.21	0.29	0.41	0.22	0.21
	Avg(⁠\|$\widehat{\rm SE}$\|⁠)	0.11	0.11	0.10	0.13	0.19	0.24	0.20	0.30	0.39	0.22	0.22
	95% CP	93.7	95.4	95.1	94.9	95.3	94.7	94.9	96.4	95.4	94.3	95.5

		Response		Death first		Death after 2nd randomization
N	Statistics	\|$\beta_1$\|	\|$\beta_2$\|	\|$\beta_3$\|	\|$\beta_4$\|	\|$\beta_5$\|	\|$\beta_6$\|	\|$\beta_7$\|	\|$\beta_8$\|	\|$\beta_9$\|	\|$\beta_{10}$\|	\|$\beta_{11}$\|
	Truth	\|$-$\|1.5	0.6	\|$-$\|0.5	\|$-$\|1.3	\|$-$\|1.2	\|$-$\|0.7	\|$-$\|1	1.8	\|$-$\|2.4	\|$-$\|0.5	\|$-$\|1
200	Bias	\|$-$\|0.01	0.01	\|$-$\|0.02	\|$-$\|0.02	\|$-$\|0.05	\|$-$\|0.12	\|$-$\|0.04	0.14	\|$-$\|1.12	\|$-$\|0.05	\|$-$\|0.11
	SD(⁠\|$\hat{\beta}$\|⁠)	0.24	0.25	0.25	0.29	0.54	0.60	0.51	0.76	2.73	0.61	0.51
	Avg(⁠\|$\widehat{\rm SE}$\|⁠)	0.24	0.24	0.24	0.30	0.47	0.57	0.48	0.73	5.54	0.55	0.51
	95% CP	94.8	94.2	93.7	95.9	94.5	94.4	94.4	94.0	97.1	95.4	94.8
500	Bias	\|$-$\|0.01	0.01	0.00	\|$-$\|0.01	\|$-$\|0.01	\|$-$\|0.04	\|$-$\|0.02	0.05	\|$-$\|0.17	\|$-$\|0.02	\|$-$\|0.04
	SD(⁠\|$\hat{\beta}$\|⁠)	0.15	0.15	0.15	0.18	0.26	0.35	0.30	0.46	0.86	0.31	0.31
	Avg(⁠\|$\widehat{\rm SE}$\|⁠)	0.15	0.15	0.15	0.19	0.27	0.34	0.29	0.43	0.73	0.32	0.31
	95% CP	95.7	95.3	94.7	95.5	94.7	94.4	93.3	93.2	96.8	94.8	95.0
1000	Bias	0.00	0.01	\|$-$\|0.01	0.00	0.00	\|$-$\|0.03	\|$-$\|0.01	0.02	\|$-$\|0.06	\|$-$\|0.01	\|$-$\|0.02
	SD(⁠\|$\hat{\beta}$\|⁠)	0.11	0.11	0.10	0.13	0.19	0.24	0.21	0.29	0.41	0.22	0.21
	Avg(⁠\|$\widehat{\rm SE}$\|⁠)	0.11	0.11	0.10	0.13	0.19	0.24	0.20	0.30	0.39	0.22	0.22
	95% CP	93.7	95.4	95.1	94.9	95.3	94.7	94.9	96.4	95.4	94.3	95.5

SD(⁠|$\hat{\beta}$|⁠): Empirical standard error of estimated |$\hat{\beta}$| coefficients.

Avg(⁠|$\widehat{\rm SE}$|⁠): Average of 1000 estimated standard errors (using Hessian matrix).

95% CP: 95% coverage probability.

In order to examine the robustness of our method under the scenarios with dependent censoring, we repeated the previous simulation study with either response-dependent censoring or covariate-dependent censoring. With response-dependent censoring, for each first-stage responder, we generated a censoring indicator following a Bernoulli distribution with the probability of |$0.15$|⁠, meaning that about 15% of the responders drop out from the study as soon as they respond to the first-stage treatment. The bias of the estimates from this dependent censoring scenario is similar to the bias from the scenarios without dependent censoring mechanism. The standard errors of the estimates from response-dependent censoring scenario slightly increase, which is expected because the observed number of deaths decreases. The results from the response-dependent censoring scenarios are shown in the Table S3 of the Supplementary material available at Biostatistics online.

With covariate-dependent censoring, the censoring time followed a truncated exponential distribution, and its cumulative distribution function can be defined as |$G(t)=\frac{e^{-(\lambda a)}-e^{-(\lambda t)}}{e^{-(\lambda a)}-e^{-(\lambda b)}}$| (⁠|$t\in(a,b)$|⁠), where |$\lambda=0.03+0.03V$|⁠, |$a=3,$| and |$b=20$|⁠. The values of |$a$| and |$b$| define the support of the censoring time, which is comparable to the censoring time from Unif(⁠|$3, 20$|⁠) under the scenarios with independent censoring. The individuals with |$V=1$| were more likely to be censored in this simulation study because the coefficient of |$V$| is positive in the formulation of |$\lambda$|⁠. Therefore, |$37\%$| of individuals with |$V=1$| were censored before death versus |$28\%$| among individuals with |$V=0$|⁠. The patterns of bias and standard error from this covariate-dependent censoring scenario were similar to the patterns from the response-dependent censoring scenario. When sample size was 200, the bias and estimated standard error of |$\beta_9$| were even larger because of the small number of deaths observed from the individuals with |$V=1$|⁠. The results are shown in the Table S4 of the Supplementary material available at Biostatistics online.

The appropriate sample size for this SMART design is determined by many factors, including the randomization probabilities, first-stage response rate, censoring mechanisms, and death rates. In our scenario, where randomization probabilities are equal, first-stage response rate is about 38%, death rates are 33% before time |$t^{nr}$| and 62% among people who received second-stage treatment, the sample size of |$n=400$| can guarantee that the relative biases are less than 10% (see the result in the Table S2 of the Supplementary material available at Biostatistics online). For relative biases less than 3%, |$n=1000$| is necessary.

3.2. Predicted survival rates: comparison of the joint model with existing methods

As noted in Section 1, a DTR includes both responders and non-responders that are consistent with the regimens. In design II, regimen |$\{A_1B_1C_1\}$| includes the |$A_1$| responders who get |$B_1$| and the |$A_1$| non-responders who get |$C_1$|⁠. However, the traditional Kaplan–Meier estimation based on the combined sample of these two subgroups is biased because not all responders received |$B_1$| (some got |$B_2$|⁠), thus naive pooled analysis overweights the non-responders. To correct for this bias, the responders who receives |$B_1$| have to represent themselves and account for those that are re-randomized to |$B_2$|⁠. The same bias has to be corrected in design I if randomization probabilities |$P$| or |$Q$| are not equal (1/2 for each treatment arm).

Lunceford and others (2002) corrected this bias by using the randomization probabilities to assign weight for each patient’s contribution to DTR {|$A_jB_kC_l$|} as |$W_{jkl}={\rm 1}\kern-0.24em{\rm I}(A_j)\left\lbrace\frac{\Delta^r{\rm 1}\kern-0.24em{\rm I}(B_k)}{P(B_k\mid A_j)} +\frac{\Delta^{nr}{\rm 1}\kern-0.24em{\rm I}(C_l)}{P(C_l\mid A_j)}\right\rbrace$|⁠. Patients who died before response were classified as non-responders. Using these inverse probability weights, they proposed a survival estimator for each regimen as

$$\begin{equation} \hat{S}_{jkl}^{\rm IPWE}(t) = 1- n^{-1}\sum_{i=1}^{n}\frac{\Delta_i^dW_{jkli}} {\hat{K}(t)} {\rm 1}\kern-0.24em{\rm I}(T_{2i}\leq t)+\frac{\alpha}{n}\sum_{i=1}^{n}\frac{\Delta_{i}^{d}} {\hat{K}(t)} (W_{jkli}-1), \end{equation}$$

(3.1)

here |$\hat{K}(t)$| is the Kaplan–Meier estimate from censoring time to adjust for censored patients and |$\alpha$| is chosen so that the estimator has minimum variance.

Guo and Tsiatis (2005) made this weighting scheme time-dependent by defining the response indicator |$\Delta_i^r(t)=\Delta^r_i{\rm 1}\kern-0.24em{\rm I}(T_i^r \leq t)$|⁠. This means that before the time of response, responders to |$A_1$| who would eventually be re-randomized to |$B_2$| are also consistent with DTR {|$A_1B_1C_1$|}. Using this weight, they proposed the weighted risk set estimator (WRSE) for the survival of regimen |$A_jB_kC_l$| as follows:

$$\begin{equation} \hat{S}_{jkl}^{\rm WRSE}(t) = \exp\left\lbrace -\int_{0}^{t} \frac{\sum_{i=1}^{n}W_{jkli}(u){\rm d}N_i(u)}{\sum_{i=1}^{n}W_{jkli}(u)Y_i(u)}\right\rbrace\!. \end{equation}$$

(3.2)

It is easy to see that if we ignore weights |$W_{jkl}$| and |$\hat{K}(t)$|⁠, IPWE is the proportion of subjects that have not yet failed before |$t$|⁠, and WRSE is a Nelson–Aalen survival estimator. The disadvantage of both estimates is that they can only adjust for auxiliary covariates, such as |$V$|⁠, via weighting using propensity scores. Our joint model can evaluate the effect modification of |$V$|⁠.

Tang and Wahed (2015) proposed a model based on Cox and IPW that can incorporate baseline covariates. Using their proposed cumulative hazards, we calculate the survival prediction for each DTR and compare against our model in the following simulation. However, this model requires the assumption that the effect of |$V$| is constant across different DTRs (i.e. there are no interactions between |$V$| and DTRs), which limits the use of this model.

The advantage of our proposed joint model is that not only do we have a model to parameterize how auxiliary covariates and treatments in each stage affect time-to-response and time-to-death and their interaction, but also that we can derive closed-form expressions for the marginal survival function (3.3) of each DTR, analogous to (3.1) and (3.2), without having to reconfigure the weighting scheme for different SMART designs. Thus, we specify the joint survival model as:

$$\begin{align} \hat{S}^{\rm joint}(t)&= \begin{cases} \displaystyle \int_{0}^{t}\widehat{f_rS_d}(r,t){\rm d}r+\widehat{S_rS_d}(t,t), & t<t^{nr}\\ \displaystyle \int_{0}^{t^{nr}}\widehat{f_rS_d}(r,t){\rm d}r+\widehat{S_rS_d}(t^{nr},t), & t\geq t^{nr} \end{cases} \nonumber \\ &= \begin{cases} \displaystyle \int_{0}^{t} \widehat{h_{r}}(r)\widehat{\theta} e^{-\widehat{\theta} \widehat{H_{r}}(r)-(\widehat{\gamma}-\widehat{\eta})\widehat{H_{d}}(r)-\widehat{\eta} \widehat{H_{d}}(t)}{\rm d}r+ e^{-\widehat{\theta} \widehat{H_{r}}(t)-\widehat{\gamma} \widehat{H_{d}}(t)}, & t<t^{nr}\\ \displaystyle \int_{0}^{t^{nr}} \widehat{h_{r}}(r)\widehat{\theta} e^{-\widehat{\theta} \widehat{H_{r}}(r)-(\widehat{\gamma}-\widehat{\eta})\widehat{H_{d}}(r)-\widehat{\eta} \widehat{H_{d}}(t)}{\rm d}r + e^{-\widehat{\theta} \widehat{H_{r}}(t^{nr})-\widehat{H_{d}}(t^{nr})(\widehat{\gamma}-\widehat{\mu})-\widehat{H_{d}}(t)\widehat{\mu}}, & t\geq t^{nr}. \end{cases} \end{align}$$

(3.3)

here the |$\hat{\theta}$|⁠, |$\hat{\gamma}$|⁠, |$\hat{\eta,}$| and |$\hat{\mu}$| can be calculated from |$\boldsymbol{\hat{\beta}}$|⁠. |$\boldsymbol{\hat{\beta}}$|⁠, |$\widehat{H_{d}}$|⁠, and |$\widehat{H_{r}}$| are obtained using the profile likelihood approach mentioned previously.

Another limitation of these previous models is that non-responders are assumed not to receive any treatment in the second stage (Tang and Melguizo, 2015). However, in our joint model, the inclusion of |$\beta_{10}$| and |$\beta_{11}$| allows for an additional effect of second-stage treatment. To match this model assumption of the previous approaches in our simulation studies, we set |$\beta_{10}=\beta_{11}=0$|⁠.

Figure 2(a) plots the survival estimates for the four DTRs in Design II using our proposed model and non-parametric estimators from Lunceford and others (2002) and Guo and Tsiatis (2005). The latter two are implemented in R package “DTR” (Tang and Melguizo, 2015). The solid lines (in dark and light gray colors) denote the predicted survival rates given each DTR and covariate |$V$| (3.3) from our proposed model. The dashed and dotted lines denote the IPWE (3.1) and WRSE (3.2), respectively. The latter two estimators do not have a way to explicitly model |$V$| outside the weights, thus the estimators only have an overall prediction for each regimen.

$(a) Predicted survival probabilities for four dynamic treatment regimens (DTRs) in Design II shown in Figure 1: Joint modeling (solid lines, 3.3) compared with the inverse probability of treatment weighting estimator (IPWE, black dashed lines, 3.1) and the weighted risk set estimator (WRSE, black dotted lines, 3.2). (b) Predicted survival rates for four dynamic treatment regimens (DTRs) in Design II when a covariate $V=1$. The left plot shows the survival curves from our proposed joint model and the model from Tang and Wahed (2015). Each pair of the overlapping curves of the DTRs is labeled with {$A_1B_1C_1$}, {$A_1B_2C_1$} and {$A_2B_1C_1$}, respectively. The non-overlapping curves of {$A_2B_2C_1$} are labeled with arrows. The right plot shows the survival curves from our proposed joint model and the subset analysis using WRSE and IPWE (i.e. subset data to $V=1$ before model fit). The labeling is similar to the previous plot.$

Fig. 2.

(a) Predicted survival probabilities for four dynamic treatment regimens (DTRs) in Design II shown in Figure 1: Joint modeling (solid lines, 3.3) compared with the inverse probability of treatment weighting estimator (IPWE, black dashed lines, 3.1) and the weighted risk set estimator (WRSE, black dotted lines, 3.2). (b) Predicted survival rates for four dynamic treatment regimens (DTRs) in Design II when a covariate |$V=1$|⁠. The left plot shows the survival curves from our proposed joint model and the model from Tang and Wahed (2015). Each pair of the overlapping curves of the DTRs is labeled with {|$A_1B_1C_1$|}, {|$A_1B_2C_1$|} and {|$A_2B_1C_1$|}, respectively. The non-overlapping curves of {|$A_2B_2C_1$|} are labeled with arrows. The right plot shows the survival curves from our proposed joint model and the subset analysis using WRSE and IPWE (i.e. subset data to |$V=1$| before model fit). The labeling is similar to the previous plot.

It is easy to see why the ability to adjust for auxiliary covariates such as |$V$| at baseline (and potentially other relevant covariates collected up until second randomization) is important, especially when effect modification of |$V$| and treatments exists. In the top left panel of Figure 2(a), the effect of DTR {|$A_1B_1C_1$|} is identical for both subgroups of |$V$|⁠, thus our survival prediction conditioning on |$V$| overlaps that of IPWE and WRSE. However, the survival curves from our joint model and the curves from IPWE and WRSE no longer overlap for DTRs {|$A_1B_2C_1$|}, {|$A_2B_1C_1$|}, and {|$A_2B_2C_1$|}. We can see that different strata of |$V$| perform differently across the DTRs. Group |$V=0$| has the highest survival rates for DTRs {|$A_1B_2C_1$|} and {|$A_2B_1C_1$|} in general, while group |$V=1$| benefits most from DTR {|$A_2B_2C_1$|}. This information would be lost if we ignored |$V$| and just used IPWE and WRSE. If we average our model’s survival prediction over |$V$|⁠, we get consistent prediction with WRSE. IPWE uses time-independent weight and ignores the information from people who follow {|$A_2B_1C_1$|} (e.g. assigning weight 0) in the survival estimate for {|$A_2B_2C_1$|}. WRSE, on the other hand, assigns non-zero weight for {|$A_2B_1C_1$|} during the period before second randomization (i.e. when only |$A_2$| is administered). Our joint model naturally achieves this advantage of WRSE.

Here, we note that the survival curves from our joint model are determined by both hazards of response and death. We first refer to the curves of {|$A_1B_1C_1$|} from Figure 2(a) to explain this. |$\beta_3=-0.5$| suggests that the people with |$V=1$| have a lower hazard of death, implying that their survival curve should be above the survival curve of |$V=0$|⁠. However, |$\beta_1=-1.5$| also suggests that the people with |$V=1$| have a lower hazard of response, meaning that they tend to respond more slowly to the first-stage treatment than the people with |$V=0$|⁠. Slower response to first-stage treatment also implies that people are more likely to be considered non-responders. As a result, the people with |$V=1$| receive second-stage treatment later than the people with |$V=0$|⁠, which leads to a temporarily higher hazard of death. The two opposite effects cancel out and result in the overlapping curves of |$V=1$| and |$V=0$| from our joint model. We can also observe this phenomenon using the survival curves from DTR {|$A_2B_1C_1$|}. At |$t\leq 3$|⁠, the survival curve of |$V=1$| is above the other curve of |$V=0$| because of the lower hazard of death for |$V=1$|⁠. However, at |$t\geq 3$|⁠, since the people with |$V=0$| tend to receive second-stage treatment |$B_1$| or |$B_2$| earlier, so their survival curve is above that of |$V=1$| despite that the hazard of death of |$V=1$| is lower than that of |$V=0$|⁠. When the interaction of the covariate |$V$| and treatments exists for the DTR {|$A_2B_2C_1$|} (⁠|$\beta_9=-2.4$|⁠), the hazard of death quickly reduces if people respond to the first-stage treatment such that the survival curve of |$V=1$| is above |$V=0$| (Figure 2(a) bottom right plot). This interaction between |$V$| and treatment cancels out the effect of slower response of |$V=1$| on survival probability.

Unlike our joint model, the Cox model proposed by Tang and Wahed (2015) only performs well under certain conditions where there is no interaction between the covariate |$V$| and treatments. In simulations where there is no interaction, the survival prediction from Tang and Wahed (2015) is consistent with our joint model (Figure S1 of the Supplementary material available at Biostatistics online). When there is an interaction, the model in Tang and Wahed (2015) leads to inaccurate survival predictions shown in the left plot of Figure 2(b). The two curves of the DTR {|$A_2B_2C_1$|} from our proposed joint model and Tang and Wahed’s model, marked by the arrows, do not overlap because an interaction between |$V$| and treatments exists.

We also investigated whether the curves from our proposed joint model overlap with the curves from the WRSE and IPWE obtained from subset analysis when |$V=1$| (Figure 2(b) right). The curves from WRSE are consistent with the curves from our joint model for all DTRs, but the curve from the IPWE can deviate from the curve of that from our joint model when a strong second-stage treatment effect exists compared to the first-stage treatment, which can be seen from the expression |$(\beta_{5}+\beta_{6}X+\beta_{7}Z^r+\beta_{8}XZ^r+\beta_{9}XZ^rV)$| in (2.2). In our simulation study, the true values of this expression for four DTRs {|$A_1B_1C_1$|}, {|$A_1B_2C_1$|}, {|$A_2B_1C_1$|}, and {|$A_2B_2C_1$|} when |$V=1$| are |$-1.2$|⁠, |$-1.9$|⁠, |$-2.2$|⁠, |$-2.8$|⁠, respectively, which agrees with the Figure 2(b) right that the IPWE curve of {|$A_2B_2C_1$|} deviates most from the other curves of the same DTR. The underlying reason is that IPWE does not consider the time to the first-stage responses, which results in its weaker ability to account for the additional second-stage treatment effect.

Although subset analyses from the WRSE and IPWE may agree with our proposed model, the survival estimates from our proposed model are more efficient. Using the simulation described in Section 3.1 with |$\beta_1=\beta_3=\beta_9=0$| (as WRSE and IPWE do not adjust for baseline covariate |$V$|⁠) at sample size |$n=500$|⁠, we estimate the variance of survival probabilities from the joint model using 1000 bootstrapped samples and compared them against the variance estimation from Lunceford and others (2002) and Guo and Tsiatis (2005). Table 2 shows the relative efficiency of the proposed joint model compared to the other methods at seven timepoints. The relative efficiency is less than 1 for all 4 DTRs across time |$t$| (a few survive after |$t=6$|⁠), strengthening the use of our proposed model.

Table 2.

Relative efficiency |$\widehat{\rm Var}[\hat{S}_{jkl}^{\rm joint}(t)]/\widehat{\rm Var}[\hat{S}_{jkl}^{\rm WRSE}(t)]$| and |$\widehat{\rm Var}[\hat{S}_{jkl}^{\rm joint}(t)]/\widehat{\rm Var}[\hat{S}_{jkl}^{\rm IPWE}(t)]$| for each DTR {|$A_jB_kC$|}, j,k = 1,2, where joint represents the proposed joint model, WRSE represents the weighted risk set estimator (Guo and Tsiatis, 2005) and IPWE represents the inverse probability of treatment weight estimators (Lunceford and others, 2002) at time |$t$|⁠.

	Joint model vs. WRSE				Joint model vs. IPWE
t	\|$A_1B_1C_1$\|	\|$A_1B_2C_1$\|	\|$A_2B_1C_1$\|	\|$A_2B_2C_1$\|	\|$A_1B_1C_1$\|	\|$A_1B_2C_1$\|	\|$A_2B_1C_1$\|	\|$A_2B_2C_1$\|
1	0.79	0.78	0.97	0.74	0.78	0.77	0.96	0.73
2	0.80	0.76	0.51	0.33	0.78	0.74	0.50	0.33
3	0.81	0.73	0.53	0.33	0.77	0.70	0.52	0.33
4	0.84	0.73	0.62	0.40	0.78	0.69	0.60	0.38
5	0.83	0.78	0.72	0.51	0.79	0.74	0.69	0.48
6	0.74	0.82	0.80	0.69	0.75	0.79	0.77	0.64

	Joint model vs. WRSE				Joint model vs. IPWE
t	\|$A_1B_1C_1$\|	\|$A_1B_2C_1$\|	\|$A_2B_1C_1$\|	\|$A_2B_2C_1$\|	\|$A_1B_1C_1$\|	\|$A_1B_2C_1$\|	\|$A_2B_1C_1$\|	\|$A_2B_2C_1$\|
1	0.79	0.78	0.97	0.74	0.78	0.77	0.96	0.73
2	0.80	0.76	0.51	0.33	0.78	0.74	0.50	0.33
3	0.81	0.73	0.53	0.33	0.77	0.70	0.52	0.33
4	0.84	0.73	0.62	0.40	0.78	0.69	0.60	0.38
5	0.83	0.78	0.72	0.51	0.79	0.74	0.69	0.48
6	0.74	0.82	0.80	0.69	0.75	0.79	0.77	0.64

Table 2.

Open in new tab Download slide

Relative efficiency |$\widehat{\rm Var}[\hat{S}_{jkl}^{\rm joint}(t)]/\widehat{\rm Var}[\hat{S}_{jkl}^{\rm WRSE}(t)]$| and |$\widehat{\rm Var}[\hat{S}_{jkl}^{\rm joint}(t)]/\widehat{\rm Var}[\hat{S}_{jkl}^{\rm IPWE}(t)]$| for each DTR {|$A_jB_kC$|}, j,k = 1,2, where joint represents the proposed joint model, WRSE represents the weighted risk set estimator (Guo and Tsiatis, 2005) and IPWE represents the inverse probability of treatment weight estimators (Lunceford and others, 2002) at time |$t$|⁠.

	Joint model vs. WRSE				Joint model vs. IPWE
t	\|$A_1B_1C_1$\|	\|$A_1B_2C_1$\|	\|$A_2B_1C_1$\|	\|$A_2B_2C_1$\|	\|$A_1B_1C_1$\|	\|$A_1B_2C_1$\|	\|$A_2B_1C_1$\|	\|$A_2B_2C_1$\|
1	0.79	0.78	0.97	0.74	0.78	0.77	0.96	0.73
2	0.80	0.76	0.51	0.33	0.78	0.74	0.50	0.33
3	0.81	0.73	0.53	0.33	0.77	0.70	0.52	0.33
4	0.84	0.73	0.62	0.40	0.78	0.69	0.60	0.38
5	0.83	0.78	0.72	0.51	0.79	0.74	0.69	0.48
6	0.74	0.82	0.80	0.69	0.75	0.79	0.77	0.64

	Joint model vs. WRSE				Joint model vs. IPWE
t	\|$A_1B_1C_1$\|	\|$A_1B_2C_1$\|	\|$A_2B_1C_1$\|	\|$A_2B_2C_1$\|	\|$A_1B_1C_1$\|	\|$A_1B_2C_1$\|	\|$A_2B_1C_1$\|	\|$A_2B_2C_1$\|
1	0.79	0.78	0.97	0.74	0.78	0.77	0.96	0.73
2	0.80	0.76	0.51	0.33	0.78	0.74	0.50	0.33
3	0.81	0.73	0.53	0.33	0.77	0.70	0.52	0.33
4	0.84	0.73	0.62	0.40	0.78	0.69	0.60	0.38
5	0.83	0.78	0.72	0.51	0.79	0.74	0.69	0.48
6	0.74	0.82	0.80	0.69	0.75	0.79	0.77	0.64

4. Multiple comparisons with the best (MCB)

4.1. Methods

In SMART designs, DTRs are embedded within the design and their comparisons are of interest. For example, there are four embedded DTRs in design II which could lead to six pairwise comparisons. Design I has eight DTRs which would require 28 pairwise comparisons. As the number of DTRs increases, standard multiple comparison methods (e.g. Bonferroni correction) begin to lose statistical power. The ultimate goal of a SMART may be to identify the best DTR at time |$t$| (i.e. the DTR with the highest estimated survival) of all embedded DTRs or to find a set of best DTRs if their differences are minimal. Multiple comparisons with the best (MCB) (Hsu, 1996) was introduced to address this issue using SMART data with continuous outcomes (Ertefaie and others, 2016), but this method has not been extended to address survival outcomes in the SMART setting.

For ease of notation, we use |$\hat{S}_m$| to denote |$\hat{S}(t)$| for DTR |$m$|⁠, where |$m=1,\ldots,M$| and |$t$| is the time at which we want to compare all DTR survival probabilities. Let |$\mathfrak{B}$| be the set including the DTR that results in best |$\hat{S}(t)$| and other DTRs that cannot be differentiated from the former at a certain threshold |$c$|⁠. If we plot the survival rate from each DTR at a particular time of interest (Figure S2 of the Supplementary material available at Biostatistics online), regimen |$l$| with highest survival rate should be selected for inclusion in set |$\mathfrak{B}$|⁠. Regimen |$m$| that is very close to the best is also of interest (e.g. it may be less expensive, easier to administer, or have fewer side effects). Therefore, we want to implement MCB to identify DTRs at time |$t$| within a distance |$c$| from the best one.

In Hsu (1996), the threshold |$c_m$| for each DTR |$m$| can be calculated given known covariance matrix |$\Sigma$|⁠. In a SMART, this structure is unknown due to post-baseline variables (response status or another variable available at second-line treatment initiation). This was addressed in Ertefaie and others (2016) Theorem 4.1, justifying using estimated |$\hat{\Sigma}$| in the derivation of |$c_m$|⁠. The steps are as follows: since regimens with different first-line treatments are independent, their covariances are 0. Thus, the vector of survival probabilities at time |$t$| for all regimens |$\boldsymbol{\hat{S}}=(\hat{S}_1(t),\ldots,\hat{S}_M(t))$| has the following covariance matrix

$$\begin{equation} \Sigma= \begin{bmatrix} \Sigma_1 & 0 \\ & \Sigma_2 \\ \end{bmatrix}\!, \end{equation}$$

(4.1)

here |$\Sigma_1$| is the covariance matrix for |$\hat{S}(t)$| from DTRs that start with first-line treatment |$A_1$| and |$\Sigma_2$| is from DTRs that start with |$A_2$|⁠. In design I, |$\Sigma_1$| and |$\Sigma_2$| are |$4 \times 4$| matrices. In design II, they are both |$2 \times 2$|⁠. In design III, |$\Sigma_1$| is |$2 \times 2$| and |$\Sigma_2$| is a scalar. Denote |$\sigma_{m,l}$| as the element of |$\Sigma_1$| and |$\Sigma_2$| (i.e. the covariance between |$\hat{S}_{m}(t)$| and |$\hat{S}_{l}(t)$|⁠).

The set of best DTRs is |$\hat{\mathfrak{B}}=\{m\mid (\hat{S}_{m} - \hat{S}_{l})/\sigma_{ml}^{\star} \geq c_{m},\forall l\neq m\}$|⁠, where

$$\begin{equation} \hat{\sigma}_{ml}^{\star}=\sqrt{\text{var}(\hat{S}_m-\hat{S}_l)}=\sqrt{(\hat{\sigma}_m^2+\hat{\sigma}_l^2-2\hat{\sigma}_{m,l})}. \end{equation}$$

(4.2)

A natural choice for |$c_m$| is a value that satisfies |$\alpha$| Type I error rate (i.e. if the differences between DTRs are 0, each has |$(1-\alpha)$| chance of being included in |$\hat{\mathfrak{B}}$|⁠). This means |$c_m$| satisfies

$$\begin{align*} -\alpha=&P(Z_m \geq Z_l - c_m\sigma_{ml}^{\star}, \forall l\neq m) \\ &P(c_m \geq (Z_l-Z_m)/\sigma_{ml}^{\star}, \forall l\neq m) \\ &\int P\left[c_m \geq \max_{l\neq m}\left(\frac{Z_l-z}{\sigma_{ml}^{\star}}\right)\right]{\rm d}\phi(z), \end{align*}$$

here |$Z_{1},\ldots,Z_{M}$| are |$\text{MVN}(0,\Sigma)$|⁠, and |$\phi(z)$| is the cumulative distribution function of |$Z_m$|⁠. The procedure to find |$c_m$| is:

(1) Estimate the bootstrapped |$\hat{\Sigma}$| between DTR survival rates at any time |$t$| of interest.
(2) Simulate |$B=1000$| samples of |$Z_{1b},\ldots,Z_{Mb}\sim \text{MVN}(0,\hat{\Sigma})$|⁠.
(3) For each |$b=1,\ldots,B$|⁠, calculate |$\max_{l\neq m}\left[(Z_{lb}-Z_{mb})/\sigma_{ml}^{\star}\right]$|⁠, where |$\sigma_{ml}^{\star}$| can be calculated from |$\hat{\Sigma}$| and (4.2).
(4) |$c_m$| is the |$(1-\alpha){\rm th}$| quantile of the |$(1\times B)$| vector found in previous step.

4.2. Simulation study

We applied MCB to simulated data from SMART design II with four possible DTRs. Implementation of MCB for designs I and III is identical.

The simulation study parameters are as follows: sample size of 500; randomization probabilities were set to 0.5 for both first-line |$X$| and second-line treatments |$Z^r$|⁠; censoring time |$C\sim\text{Unif}(3,20)$|⁠. The hazard of response was |$\lambda_{r}(t)=h_{r}(t)\exp\{\delta X\}$|⁠, where |$h_{r}(t)\sim \text{Weibull}(0.75,4)$|⁠. With |$h_{d}(t)\sim \text{Weibull}(3,4)$|⁠, the hazard of death was

$$\lambda_{d}(t)=h_{d}(t)\exp \left\lbrace -\delta X + \rm 1\kern-0.24em{\rm I}(t>t^r<t_{nr})\left(-\frac{\delta}{4}X-\frac{\delta}{2}Z^r\right)\right\rbrace\!.$$

When |$\delta=0$|⁠, all four DTRs were equally effective. As |$\delta$| increased, the individuals following DTR {|$A_2B_2C_1$|} |$(X=1, Z^r=1)$| responded faster and had lower risk of death (both before and after response) compared to those following other DTRs. For each |$\delta$| between 0 and 1, we simulated 1000 datasets and applied MCB to identify the set of best DTRs. Using survival rates at time |$t=4$| as the metric of comparison, the average number of DTRs in |$\hat{\mathfrak{B}}$| over 1000 datasets was recorded.

Figure S3 of the Supplementary material available at Biostatistics online shows the probabilities that each of the four DTRs was chosen to be in the set of “best” DTRs for different |$\delta$| values. The sum of these probabilities is the expected size of set |$\hat{\mathfrak{B}}$| (ESS). As expected, ESS converged to 1 as the treatment effect, |$\delta$|⁠, increased. At |$\delta=0$|⁠, all regimens were equally beneficial; thus all four DTRs were chosen to be included in |$\hat{\mathfrak{B}}$|⁠. As |$\delta$| increased, {|$A_2B_2C_1$|} became the optimal regimen with the lowest risk of death, making |$\hat{\mathfrak{B}}$| a size of set 1. For interest in the comparison of DTRs at time |$t=4$|⁠, the plot shows that the ESS quickly converged to 1 with {|$A_2B_2C_1$|} being the chosen regimen. At larger |$\delta$|⁠, {|$A_2B_2C_1$|} had better survival rates compared to the other three DTRs.

Lastly, we examined the Type I error rate of not including a DTR in the set |$\mathfrak{B}$| when |$\delta=0$| with sample size of 500. A total of 1000 datasets were generated for this simulation study. The results are shown in the Table S5 of the Supplementary material available at Biostatistics online. The Type I error rates are controlled around the nominal |$\alpha=0.05$|⁠. In particular, when the survival probability is close to either 0 or 1 (i.e. when |$t=1$| or |$t>6$| in the simulation), the Type I error rate is close to 0, meaning that all DTRs are very likely to be included in |$\mathfrak{B}$|⁠. Thus, we should avoid comparing the effect of DTRs at very early or late time points.

5. Application to trial

Stone and others (2001) described a SMART following design II (Figure 1) conducted by Cancer and Leukemia Group B Protocol 8923 (CALGB 8923) to examine the initial treatment effect of granulocyte-macrophage colony-stimulating factor (GM-CSF) following standard chemotherapy on acute myeloid leukemia patients and the effect of mitoxantrone on patients who responded to the initial treatment. A total of 388 patients were randomized to receive either therapy with GM-CSF (⁠|$A_1$|⁠) or standard chemotherapy (⁠|$A_2$|⁠). Patients with complete remission and informed consent to second stage were then re-randomized to only cytarabine (⁠|$B_1$|⁠) or cytarabine plus mitoxantrone (⁠|$B_2$|⁠). The patients who did not experience complete remission were considered non-responders and not given any further treatment (⁠|$C_1$|⁠). The patients were considered censored if they dropped out from the trial during treatment or failed to provide informed consent to second-stage treatment assignment.

The goal of our analysis is to compare survival probabilities of the four DTRs {|$A_1B_1C_1$|}, {|$A_2B_1C_1$|}, {|$A_1B_2C_1$|}, and {|$A_2B_2C_1$|}, where for example, {|$A_1B_1C_1$|} stands for “receive therapy with GM-CSF in the first stage, and if patient achieves complete remission, follow with only cytarabine, otherwise there is no further treatment.” Figure 3(a) shows survival probability curves of the four DTRs stratified by sex, where the probabilities were computed using the joint model. Figure 3(b) shows the survival probability curves of four DTRs from all patients using the WRSE by Guo and Tsiatis (2005). In Figure 3(b), the survival probability of the DTR {|$A_2B_2C_1$|} is slightly higher than the rest of the DTRs which is consistent with the results from Wahed and Tsiatis (2004) that the optimal DTR is “receive standard chemotherapy in the first stage, and if patient achieves complete remission, follow with cytarabine plus motoxantrone, otherwise there is no further treatment”. However, with our joint model, we further notice from Figure 3 that the DTR {|$A_2B_2C_1$|} may be the best only in males because the survival curves of the DTRs {|$A_2B_2C_1$|} and {|$A_2B_1C_1$|} overlap in females. The incorporation of baseline covariates provides us with more information about the factors that may impact DTR effects.

Fig. 3.

(a) The predicted survival probabilities of four dynamic treatment regimens (DTRs) stratified by sex using the proposed joint stage model from CALGB 8923 data. (b) The predicted survival probabilities of four dynamic treatment regimens (DTRs) for all patients using the weighted risk set estimator (WRSE) from CALGB 8923 data.

6. Discussion

We proposed a joint modeling framework to evaluate DTRs that include tailored sequences of treatments resulting from a SMART. Many previously proposed methods for data from a SMART with survival outcomes employ inverse probability weights and either treat survival time as a continuous outcome or compare survival rates between DTRs using non-parametric estimators. Our proposed model is based on Cox models with flexible non-parametric baseline hazards for time-to-response and time on second-line treatment. We demonstrated that our model provides equally unbiased survival estimates as the non-parametric methods, while offering many other advantages.

Compared to the non-parametric estimators, our model offers the following features: (i) adjusts for auxiliary covariates that includes not only baseline information but also potentially new covariates at the end of first-line treatment, (ii) has a mechanism to parameterize/measure each treatment’s effect within the DTR, their interactions, and interactions with auxiliary covariates (without assumptions of these effects within or between DTRs), (iii) easy adaptation between different SMART designs by setting certain |$\beta$|’s to 0, and (iv) the ability to estimate conditional survival functions conditioning on not only the particular DTR of interest, but also on what has been observed so far (e.g. |$\hat{S}(t \mid \text{response at }s)$| or |$\hat{S}(t \mid \text{no response by }s)$|⁠).

The limitation of our model is that the non-parametric estimation of baseline hazards requires larger sample sizes (depending on the total number of unique time points from all subjects in the data) and longer computational time compared to other models where a distribution is assumed. In the setting of small samples, we can fit our model assuming some parametric distributions for the baseline hazards (e.g. Weibull). This will decrease the number of parameters needed to be estimated, yet still offers all of the advantages discussed.

Extension of the method can be made in a few directions. First, the proposed model has common covariates for time-to-response and time-to-death. This assumes that their dependence can be explained by the observed covariates. Further extension to this model such as inclusion of a common latent frailty term can be done to account for the unmeasured association between these two time-to-events. Second, the current model allows for new covariates that are available at second-line treatment initiation. Other time-dependent covariates can also be incorporated. Third, DTRs may have more than two stages (e.g. oncology trials) where multiple line of salvage therapies are available. The joint modeling approach has been implemented for 3 stages, so an analogous extension in the SMART setting is feasible.

The issue of multiple comparisons had not yet been fully addressed for survival data in the SMART setting. This may, in part, be due to the limited number of SMARTs having a survival endpoint that have been conducted and the lack of open access data for those that have. SMART designs with survival endpoints may be particularly challenging to implement based on requiring multiple drug treatments, an intermediate tailoring variable, and the time required to provide a sequence of treatments and monitor the time to event. Moreover, clinical collaborators may not fully understand the concept of DTRs and worry about multiple comparison issues in SMART design. It is especially important to motivate SMART design by interest in DTRs and to clearly communicate the definition of a DTR to clinical collaborators. Specifically, to clarify that each DTR includes two pathways in the SMART design, so that interest is not in the estimation and comparison of each individual treatment pathway, but rather in combinations of pathways which allow for more powerful and clinically meaningful comparisons. In order to identify the optimal DTR(s), SMARTs with many embedded DTRs may still require a large number of comparisons resulting in a loss of statistical power. We, therefore, adapted the previously proposed approach using multiple comparisons with the best for continuous outcomes to compare DTRs with survival outcomes which could help address clinical collaborators’ worries about multiple comparisons and facilitate a discussion on identifying the best DTR to inform clinical practice. We compared each DTR’s predicted survival rate at time |$t$| with the best of others and identified a set of DTRs that contains the true best DTR with a certain type I error rate. We outlined the procedure to find rejection rule that would satisfy such type I error and demonstrated the method with simulations. We are interested in developing a similar multiple comparison method to find the set of DTRs with the best survival across time (i.e. not just at time |$t$|⁠), using restricted mean survival, for example. While we cannot address many of the barriers to implement a SMART design with a survival outcome, we hope that the development and dissemination of methods to analyze DTRs like those in this article will bolster the use of SMART design, inform clinical practice, and provide SMART data for future researchers.

Software

R code to simulate a SMART design II with survival outcomes, fit the joint model, and compare against previous methods is available from the GitHub at https://github.com/ycchao/code_Joint_model_SMART.

Supplementary material

Supplementary materials are available online at http://biostatistics.oxfordjournals.org.

Acknowledgments

Conflict of Interest: None declared.

Funding

Cancer Intervention and Surveillance Modeling Network (CISNET) (U01 CA199338); and University of Michigan Prostate Specialized Program of Research Excellence (SPORE) (P50 CA186786), both from the National Institute of Health/National Cancer Institute.

References

Almirall,

D.

,

DiStefano,

C.

,

Chang,

Y.-C.

,

Shire,

S.

,

Kaiser,

A.

,

Lu,

X.

,

Nahum-Shani,

I.

,

Landa,

R.

,

Mathy,

P.

and

Kasari,

C.

(

2016

).

Longitudinal effects of adaptive interventions with a speech-generating device in minimally verbal children with ASD

.

Journal of Clinical Child & Adolescent Psychology

45

,

442

–

456

.

Auyeung,

S. F.

,

Long,

Q.

,

Royster,

E. B.

,

Murthy,

S.

,

McNutt,

M. D.

,

Lawson,

D.

,

Miller,

A.

,

Manatunga,

A.

and

Musselman,

D. L.

(

2009

).

Sequential multiple-assignment randomized trial design of neurobehavioral treatment for patients with metastatic malignant melanoma undergoing high-dose interferon-alpha therapy

.

Clinical Trials

6

,

480

–

490

.

Chen,

Y.-H.

(

2009

).

Weighted Breslow-type and maximum likelihood estimation in semiparametric transformation models

.

Biometrika

96

,

591

–

600

.

Díaz,

I.

,

Savenkov,

O.

and

Ballman,

K.

(

2018

).

Targeted learning ensembles for optimal individualized treatment rules with time-to-event outcomes

.

Biometrika

105

,

723

–

738

.

Ertefaie,

A.

,

Wu,

T.

,

Lynch,

K. G.

and

Nahum-Shani,

I.

(

2016

).

Identifying a set that contains the best dynamic treatment regimes

.

Biostatistics

17

,

135

.

Feng,

W.

and

Wahed,

A. S.

(

2008

).

Supremum weighted log-rank test and sample size for comparing two-stage adaptive treatment strategies

.

Biometrika

95

,

695

–

707

.

Feng,

Wentao

and

Wahed,

Abdus S.

(

2009

).

Sample size for two-stage studies with maintenance therapy

.

Statistics in Medicine

28

,

2028

–

2041

.

Fu,

S. S.

,

Rothman,

A. J.

,

Vock,

D. M.

,

Lindgren,

B.

,

Almirall,

D.

,

Begnaud,

A.

,

Melzer,

A.

,

Schertz,

K.

,

Glaeser,

S.

,

Hammett,

P.

and others. (

2017

).

Program for lung cancer screening and tobacco cessation: study protocol of a sequential, multiple assignment, randomized trial

.

Contemporary Clinical Trials

60

,

86

–

95

.

Guo,

X.

and

Tsiatis,

A.

(

2005

).

A weighted risk set estimator for survival distributions in two-stage randomization designs with censored survival data

.

The International Journal of Biostatistics

1

,

1

–

15

.

Hager,

R.

,

Tsiatis,

A. A.

and

Davidian,

M.

(

2018

).

Optimal two-stage dynamic treatment regimes from a classification perspective with censored survival data

.

Biometrics

74

,

1180

–

1192

.

Hsu,

J.

(

1996

).

Multiple Comparisons: Theory and Methods

.

Chapman and Hall/CRC

,

Boca Raton, Florida, USA

.

Google Preview

Hu,

C.

and

Tsodikov,

A.

(

2014

).

Joint modeling approach for semicompeting risks data with missing nonterminal event status

.

Lifetime Data Analysis

20

,

563

–

583

.

Jiang,

R.

,

Lu,

W.

,

Song,

R.

,

Hudgens,

M. G.

and

Naprvavnik,

S.

(

2017

).

Doubly robust estimation of optimal treatment regimes for survival data-with application to an HIV/AIDS study

.

The Annals of Applied Statistics

11

,

1763

–

1786

.

PubMed

Kelleher,

S. A

,

Dorfman,

C. S.

,

Vilardaga,

J. C. P.

,

Majestic,

C.

,

Winger,

J.

,

Gandhi,

V.

,

Nunez,

C.

,

Van Denburg,

A.

,

Shelby,

R. A

,

Reed,

S. D.

and others. (

2017

).

Optimizing delivery of a behavioral pain intervention in cancer patients using a sequential multiple assignment randomized trial SMART

.

Contemporary Clinical Trials

57

,

51

–

57

.

Kidwell,

K. M.

and

Wahed,

A. S.

(

2013

).

Weighted log-rank statistic to compare shared-path adaptive treatment strategies

.

Biostatistics

14

,

299

–

312

.

Kim,

J. S.

,

Asher,

C. A.

,

Burkhauser,

M.

,

Mesite,

L.

and

Leyva,

D.

(

2019

).

Using a sequential multiple assignment randomized trial (SMART) to develop an adaptive K–2 literacy intervention with personalized print texts and app-based digital activities

.

AERA Open

5

,

2332858419872701

.

Ko,

J. H.

and

Wahed,

A. S.

(

2012

).

Up-front versus sequential randomizations for inference on adaptive treatment strategies

.

Statistics in Medicine

31

,

812

–

830

.

Lavori,

P. W.

and

Dawson,

R.

(

2004

).

Dynamic treatment regimes: practical design considerations

.

Clinical Trials

1

,

9

–

20

.

Lavori,

P. W.

,

Dawson,

R.

and

Rush,

A. J.

(

2000

).

Flexible treatment strategies in chronic disease: clinical and research implications

.

Biological Psychiatry

48

,

605

–

614

.

Li,

Z.

and

Murphy,

S. A.

(

2011

).

Sample size formulae for two-stage randomized trials with survival outcomes

.

Biometrika

98

,

503

–

518

.

Lokhnygina,

Y.

and

Helterbrand,

J. D.

(

2007

).

Cox regression methods for two-stage randomization designs

.

Biometrics

63

,

422

–

428

.

Lunceford,

J. K.

,

Davidian,

M.

and

Tsiatis,

A. A.

(

2002

).

Estimation of survival distributions of treatment policies in two-stage randomization designs in clinical trials

.

Biometrics

58

,

48

–

57

.

Miyahara,

S.

and

Wahed,

A. S.

(

2010

).

Weighted Kaplan–Meier estimators for two-stage treatment regimes

.

Statistics in Medicine

29

,

2581

–

2591

.

Murphy,

S. A.

(

2005

).

An experimental design for the development of adaptive treatment strategies

.

Statistics in Medicine

24

,

1455

–

1481

.

Murphy,

S. A.

,

van der Laan,

M. J.

and

Robins,

J. M.

(

2001

).

Marginal mean models for dynamic regimes

.

Journal of the American Statistical Association

96

,

1410

–

1423

.

Naar,

S.

,

Ellis,

D.

,

Idalski Carcone,

A.

,

Jacques-Tiura,

A. J.

,

Cunningham,

P.

,

Templin,

T.

,

Hartlieb,

K. B.

and

Jen,

K. L. C.

(

2019

).

Outcomes from a sequential multiple assignment randomized trial of weight loss strategies for African American adolescents with obesity

.

Annals of Behavioral Medicine

53

,

928

–

938

.

Ruppert,

A. S.

,

Yin,

J.

,

Davidian,

M.

,

Tsiatis,

A. A.

,

Byrd,

J. C.

,

Woyach,

J. A.

and

Mandrekar,

S. J.

(

2019

).

Application of a sequential multiple assignment randomized trial (SMART) design in older patients with chronic lymphocytic leukemia

.

Annals of Oncology

30

,

542

–

550

.

Schmitz,

J. M.

,

Stotts,

A. L.

,

Vujanovic,

A. A.

,

Weaver,

M. F.

,

Yoon,

J. H.

,

Vincent,

J.

and

Green,

C. E.

(

2018

).

A sequential multiple assignment randomized trial for cocaine cessation and relapse prevention: tailoring treatment to the individual

.

Contemporary Clinical Trials

65

,

109

–

115

.

Stone,

R. M.

,

Berg,

D. T.

,

George,

S. L.

,

Dodge,

R. K.

,

Paciucci,

P. A.

,

Schulman,

P. P.

,

Lee,

E. J.

,

Moore,

J. O.

,

Powell,

B. L.

,

Baer,

M. R.

and others. (

2001

).

Postremission therapy in older patients with de novo acute myeloid leukemia: a randomized trial comparing mitoxantrone and intermediate-dose cytarabine with standard-dose cytarabine

.

Blood

98

,

548

–

553

.

Tang,

X.

and

Melguizo,

M.

(

2015

).

DTR: an R package for estimation and comparison of survival outcomes of dynamic treatment regimes

.

Journal of Statistical Software

65

,

1

–

28

.

Tang,

X.

and

Wahed,

A. S.

(

2015

).

Cumulative hazard ratio estimation for treatment regimes in sequentially randomized clinical trials

.

Statistics in Biosciences

7

,

1

–

18

.

Thall,

P. F.

,

Wooten,

L. H.

,

Logothetis,

C. J.

,

Millikan,

R. E.

and

Tannir,

N. M.

(

2007

).

Bayesian and frequentist two-stage treatment strategies based on sequential failure times subject to interval censoring

.

Statistics in Medicine

26

,

4687

–

4702

.

Tran,

Q.

,

Kidwell,

K. M.

and

Tsodikov,

A.

(

2018

).

A joint model of cancer incidence, metastasis, and mortality

.

Lifetime Data Analysis

24

,

385

–

406

.

Wahed,

Abdus S.

(

2010

).

Inference for two-stage adaptive treatment strategies using mixture distributions

.

Journal of the Royal Statistical Society: Series C (Applied Statistics)

59

,

1

–

18

.

Wahed,

A. S.

and

Thall,

P. F.

(

2013

).

Evaluating joint effects of induction–salvage treatment regimes on overall survival in acute leukaemia

.

Journal of the Royal Statistical Society: Series C (Applied Statistics)

62

,

67

–

83

.

PubMed

Wahed,

A. S.

and

Tsiatis,

A. A.

(

2004

).

Optimal estimator for the survival distribution and related quantities for treatment policies in two-stage randomization designs in clinical trials

.

Biometrics

60

,

124

–

133

.

Yavuz,

I.

,

Chng,

Y.

and

Wahed,

A. S.

(

2018

).

Estimating the cumulative incidence function of dynamic treatment regimes

.

Journal of the Royal Statistical Society: Series A (Statistics in Society)

181

,

85

–

106

.

Zhao,

Y.-Q.

,

Zeng,

D.

,

Laber,

E. B.

,

Song,

R.

,

Yuan,

M.

and

Kosorok,

M. R.

(

2014

).

Doubly robust learning for estimating individualized treatment with censored data

.

Biometrika

102

,

151

–

168

.