Evaluating the surrogacy of multiple vaccine-induced immune response biomarkers in HIV vaccine trials

Gaussian conditional distributions, continuous |$W$|⁠, low main effects

Method	Penalty	Impute	\|${w=1}$\|	\|${\pi}$\|	\|${w=0 }$\|	\|${\pi}$\|	\|${w=1/2}$\|	\|${\pi}$\|	\|${w=1/3}$\|	\|${\pi}$\|	Best	\|${\pi}$\|	Best	\|${\pi}$\|
Method	Penalty	Impute	(Sensitivity)		(Specificity)						Misclas		ModErr
BI-SS	Lasso	Para	1	0.725–0.7	1	0.9–0.85	0.83	0.725	0.82	0.75	0.005	0.775	0.001	0.775
BI-SS	Lasso	Mice	1	0.9–0.6	0.1	0.9	0.55	0.9	0.4	0.9	0.19	0.9	0.117	0.9
Direct	Lasso	Para	1	—	0.14	—	0.57	—	0.43	—	0.008	—	0.006	—
Direct	Lasso	Mice	1	—	0	—	0.5	—	0.33	—	0.26	—	0.025	—
NoSel	—	Para	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—
NoSel	—	Mice	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—

Method	Penalty	Impute	\|${w=1}$\|	\|${\pi}$\|	\|${w=0 }$\|	\|${\pi}$\|	\|${w=1/2}$\|	\|${\pi}$\|	\|${w=1/3}$\|	\|${\pi}$\|	Best	\|${\pi}$\|	Best	\|${\pi}$\|
Method	Penalty	Impute	(Sensitivity)		(Specificity)						Misclas		ModErr
BI-SS	Lasso	Para	1	0.725–0.7	1	0.9–0.85	0.83	0.725	0.82	0.75	0.005	0.775	0.001	0.775
BI-SS	Lasso	Mice	1	0.9–0.6	0.1	0.9	0.55	0.9	0.4	0.9	0.19	0.9	0.117	0.9
Direct	Lasso	Para	1	—	0.14	—	0.57	—	0.43	—	0.008	—	0.006	—
Direct	Lasso	Mice	1	—	0	—	0.5	—	0.33	—	0.26	—	0.025	—
NoSel	—	Para	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—
NoSel	—	Mice	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—

Table 1.

Gaussian conditional distributions, continuous |$W$|⁠, low main effects

Method	Penalty	Impute	\|${w=1}$\|	\|${\pi}$\|	\|${w=0 }$\|	\|${\pi}$\|	\|${w=1/2}$\|	\|${\pi}$\|	\|${w=1/3}$\|	\|${\pi}$\|	Best	\|${\pi}$\|	Best	\|${\pi}$\|
Method	Penalty	Impute	(Sensitivity)		(Specificity)						Misclas		ModErr
BI-SS	Lasso	Para	1	0.725–0.7	1	0.9–0.85	0.83	0.725	0.82	0.75	0.005	0.775	0.001	0.775
BI-SS	Lasso	Mice	1	0.9–0.6	0.1	0.9	0.55	0.9	0.4	0.9	0.19	0.9	0.117	0.9
Direct	Lasso	Para	1	—	0.14	—	0.57	—	0.43	—	0.008	—	0.006	—
Direct	Lasso	Mice	1	—	0	—	0.5	—	0.33	—	0.26	—	0.025	—
NoSel	—	Para	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—
NoSel	—	Mice	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—

Method	Penalty	Impute	\|${w=1}$\|	\|${\pi}$\|	\|${w=0 }$\|	\|${\pi}$\|	\|${w=1/2}$\|	\|${\pi}$\|	\|${w=1/3}$\|	\|${\pi}$\|	Best	\|${\pi}$\|	Best	\|${\pi}$\|
Method	Penalty	Impute	(Sensitivity)		(Specificity)						Misclas		ModErr
BI-SS	Lasso	Para	1	0.725–0.7	1	0.9–0.85	0.83	0.725	0.82	0.75	0.005	0.775	0.001	0.775
BI-SS	Lasso	Mice	1	0.9–0.6	0.1	0.9	0.55	0.9	0.4	0.9	0.19	0.9	0.117	0.9
Direct	Lasso	Para	1	—	0.14	—	0.57	—	0.43	—	0.008	—	0.006	—
Direct	Lasso	Mice	1	—	0	—	0.5	—	0.33	—	0.26	—	0.025	—
NoSel	—	Para	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—
NoSel	—	Mice	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—

Table 2.

Gaussian conditional distributions, continuous |$W$|⁠, medium main effects

Method	Penalty	Impute	\|${w=1}$\|	\|${\pi}$\|	\|${w=0 }$\|	\|${\pi}$\|	\|${w=1/2}$\|	\|${\pi}$\|	\|${w=1/3}$\|	\|${\pi}$\|	Best	\|${\pi}$\|	Best	\|${\pi}$\|
Method	Penalty	Impute	(Sensitivity)		(Specificity)						Misclas		ModErr
BI-SS	Lasso	Para	1	0.675–0.6	1	0.9–0.875	0.88	0.725	0.86	0.725	0.011	0.75	0.007	0.75
BI-SS	Lasso	Mice	1	0.9–0.6	0.04	0.9	0.52	0.9	0.36	0.9	0.021	0.875	0.127	0.875
Direct	Lasso	Para	0.98	—	0.12	—	0.55	—	0.41	—	0.009	—	0.006	—
Direct	Lasso	Mice	1	—	0	—	0.5	—	0.33	—	0.02	—	0.022	—
NoSel	—	Para	1	—	0	—	0.5	—	0.33	—	0.058	—	0.057	—
NoSel	—	Mice	1	—	0	—	0.5	—	0.33	—	0.058	—	0.057	—

Method	Penalty	Impute	\|${w=1}$\|	\|${\pi}$\|	\|${w=0 }$\|	\|${\pi}$\|	\|${w=1/2}$\|	\|${\pi}$\|	\|${w=1/3}$\|	\|${\pi}$\|	Best	\|${\pi}$\|	Best	\|${\pi}$\|
Method	Penalty	Impute	(Sensitivity)		(Specificity)						Misclas		ModErr
BI-SS	Lasso	Para	1	0.675–0.6	1	0.9–0.875	0.88	0.725	0.86	0.725	0.011	0.75	0.007	0.75
BI-SS	Lasso	Mice	1	0.9–0.6	0.04	0.9	0.52	0.9	0.36	0.9	0.021	0.875	0.127	0.875
Direct	Lasso	Para	0.98	—	0.12	—	0.55	—	0.41	—	0.009	—	0.006	—
Direct	Lasso	Mice	1	—	0	—	0.5	—	0.33	—	0.02	—	0.022	—
NoSel	—	Para	1	—	0	—	0.5	—	0.33	—	0.058	—	0.057	—
NoSel	—	Mice	1	—	0	—	0.5	—	0.33	—	0.058	—	0.057	—

Table 2.

Gaussian conditional distributions, continuous |$W$|⁠, medium main effects

Method	Penalty	Impute	\|${w=1}$\|	\|${\pi}$\|	\|${w=0 }$\|	\|${\pi}$\|	\|${w=1/2}$\|	\|${\pi}$\|	\|${w=1/3}$\|	\|${\pi}$\|	Best	\|${\pi}$\|	Best	\|${\pi}$\|
Method	Penalty	Impute	(Sensitivity)		(Specificity)						Misclas		ModErr
BI-SS	Lasso	Para	1	0.675–0.6	1	0.9–0.875	0.88	0.725	0.86	0.725	0.011	0.75	0.007	0.75
BI-SS	Lasso	Mice	1	0.9–0.6	0.04	0.9	0.52	0.9	0.36	0.9	0.021	0.875	0.127	0.875
Direct	Lasso	Para	0.98	—	0.12	—	0.55	—	0.41	—	0.009	—	0.006	—
Direct	Lasso	Mice	1	—	0	—	0.5	—	0.33	—	0.02	—	0.022	—
NoSel	—	Para	1	—	0	—	0.5	—	0.33	—	0.058	—	0.057	—
NoSel	—	Mice	1	—	0	—	0.5	—	0.33	—	0.058	—	0.057	—

Method	Penalty	Impute	\|${w=1}$\|	\|${\pi}$\|	\|${w=0 }$\|	\|${\pi}$\|	\|${w=1/2}$\|	\|${\pi}$\|	\|${w=1/3}$\|	\|${\pi}$\|	Best	\|${\pi}$\|	Best	\|${\pi}$\|
Method	Penalty	Impute	(Sensitivity)		(Specificity)						Misclas		ModErr
BI-SS	Lasso	Para	1	0.675–0.6	1	0.9–0.875	0.88	0.725	0.86	0.725	0.011	0.75	0.007	0.75
BI-SS	Lasso	Mice	1	0.9–0.6	0.04	0.9	0.52	0.9	0.36	0.9	0.021	0.875	0.127	0.875
Direct	Lasso	Para	0.98	—	0.12	—	0.55	—	0.41	—	0.009	—	0.006	—
Direct	Lasso	Mice	1	—	0	—	0.5	—	0.33	—	0.02	—	0.022	—
NoSel	—	Para	1	—	0	—	0.5	—	0.33	—	0.058	—	0.057	—
NoSel	—	Mice	1	—	0	—	0.5	—	0.33	—	0.058	—	0.057	—

Table 3.

Gaussian conditional distributions, discrete |$W$|⁠, medium main effects

Method	Penalty	Impute	\|${w=1}$\|	\|${\pi}$\|	\|${w=0 }$\|	\|${\pi}$\|	\|${w=1/2}$\|	\|${\pi}$\|	\|${w=1/3}$\|	\|${\pi}$\|	Best	\|${\pi}$\|	Best	\|${\pi}$\|
Method	Penalty	Impute	(Sensitivity)		(Specificity)						Misclas		ModErr
BI-SS	Lasso	Para	1	0.675–0.6	1	0.9-0.875	0.89	0.725	0.87	0.725	0.013	0.75	0.011	0.775
		Nonpara	1	0.875–0.6	1	0.9-0.875	1	0.875	1	0.875	0.017	0.9	0.013	0.9
		Mice	1	0.9–0.6	0.14	0.9	0.57	0.9	0.42	0.9	0.015	0.875	0.018	0.875
Direct	Lasso	Para	0.99	—	0.13	—	0.56	—	0.42	—	0.006	—	0.004	—
		Nonpara	1	—	0.11	—	0.55	—	0.41	—	0.004	—	0.003	—
		Mice	1	—	0	—	0.5	—	0.33	—	0.015	—	0.022	—
NoSel	—	Para	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—
		Nonpara	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—
		Mice	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—

Method	Penalty	Impute	\|${w=1}$\|	\|${\pi}$\|	\|${w=0 }$\|	\|${\pi}$\|	\|${w=1/2}$\|	\|${\pi}$\|	\|${w=1/3}$\|	\|${\pi}$\|	Best	\|${\pi}$\|	Best	\|${\pi}$\|
Method	Penalty	Impute	(Sensitivity)		(Specificity)						Misclas		ModErr
BI-SS	Lasso	Para	1	0.675–0.6	1	0.9-0.875	0.89	0.725	0.87	0.725	0.013	0.75	0.011	0.775
		Nonpara	1	0.875–0.6	1	0.9-0.875	1	0.875	1	0.875	0.017	0.9	0.013	0.9
		Mice	1	0.9–0.6	0.14	0.9	0.57	0.9	0.42	0.9	0.015	0.875	0.018	0.875
Direct	Lasso	Para	0.99	—	0.13	—	0.56	—	0.42	—	0.006	—	0.004	—
		Nonpara	1	—	0.11	—	0.55	—	0.41	—	0.004	—	0.003	—
		Mice	1	—	0	—	0.5	—	0.33	—	0.015	—	0.022	—
NoSel	—	Para	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—
		Nonpara	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—
		Mice	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—

Table 3.

Gaussian conditional distributions, discrete |$W$|⁠, medium main effects

Method	Penalty	Impute	\|${w=1}$\|	\|${\pi}$\|	\|${w=0 }$\|	\|${\pi}$\|	\|${w=1/2}$\|	\|${\pi}$\|	\|${w=1/3}$\|	\|${\pi}$\|	Best	\|${\pi}$\|	Best	\|${\pi}$\|
Method	Penalty	Impute	(Sensitivity)		(Specificity)						Misclas		ModErr
BI-SS	Lasso	Para	1	0.675–0.6	1	0.9-0.875	0.89	0.725	0.87	0.725	0.013	0.75	0.011	0.775
		Nonpara	1	0.875–0.6	1	0.9-0.875	1	0.875	1	0.875	0.017	0.9	0.013	0.9
		Mice	1	0.9–0.6	0.14	0.9	0.57	0.9	0.42	0.9	0.015	0.875	0.018	0.875
Direct	Lasso	Para	0.99	—	0.13	—	0.56	—	0.42	—	0.006	—	0.004	—
		Nonpara	1	—	0.11	—	0.55	—	0.41	—	0.004	—	0.003	—
		Mice	1	—	0	—	0.5	—	0.33	—	0.015	—	0.022	—
NoSel	—	Para	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—
		Nonpara	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—
		Mice	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—

Method	Penalty	Impute	\|${w=1}$\|	\|${\pi}$\|	\|${w=0 }$\|	\|${\pi}$\|	\|${w=1/2}$\|	\|${\pi}$\|	\|${w=1/3}$\|	\|${\pi}$\|	Best	\|${\pi}$\|	Best	\|${\pi}$\|
Method	Penalty	Impute	(Sensitivity)		(Specificity)						Misclas		ModErr
BI-SS	Lasso	Para	1	0.675–0.6	1	0.9-0.875	0.89	0.725	0.87	0.725	0.013	0.75	0.011	0.775
		Nonpara	1	0.875–0.6	1	0.9-0.875	1	0.875	1	0.875	0.017	0.9	0.013	0.9
		Mice	1	0.9–0.6	0.14	0.9	0.57	0.9	0.42	0.9	0.015	0.875	0.018	0.875
Direct	Lasso	Para	0.99	—	0.13	—	0.56	—	0.42	—	0.006	—	0.004	—
		Nonpara	1	—	0.11	—	0.55	—	0.41	—	0.004	—	0.003	—
		Mice	1	—	0	—	0.5	—	0.33	—	0.015	—	0.022	—
NoSel	—	Para	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—
		Nonpara	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—
		Mice	1	—	0	—	0.5	—	0.33	—	0.053	—	0.051	—

All variable selection methods show very high sensitivity (⁠|$w=1$|⁠) in retaining the interaction markers in all of the settings, except in some cases (e.g. in Setting 3; see supplementary material Table S1 available at Biostatistics online) for the direct selection method with parametric imputation method.
The BI-SS with LASSO method outperforms the direct selection method in feature selection performance (⁠|$Score_w$| values) such as specificity (when |$w=0$|⁠), and also in the case of other weights |$w<1$| as well. This was seen in every setting (Tables 1–3 and supplementary material Tables S1 and S2 available at Biostatistics online). The no selection method retains every marker in the model and thus has specificity 0 in all settings.
The BI-SS procedure with the parametric imputation method outperforms the BI-SS procedure with imputation using MICE package in every setting, both in feature selection performance |$Score_w$| for all |$w<1$|⁠, as well as for model and misclassification errors (Tables 1–3 and supplementary material Tables S1 and S2 available at Biostatistics online).
The same trend as in three is seen when we compare the parametric imputation method and MICE with the direct selection procedure.
The direct selection method with the parametric imputation method and the corresponding BI-SS procedure consistently achieve the lowest misclassification and model error rates, and their performance remains similar with BI-SS performing slightly better than the direct method in some settings (e.g. Settings 1 and 5; see Table 1 and supplementary material Table S2 available at Biostatistics online), and vice versa in the others (e.g. Settings 2 and 4; see Tables 2 and 3).
The no selection procedure is clearly the worst performing method, with inflated misclassification and model error rates compared to the others, showing the importance of marker selection in this context (Tables 1–3 and supplementary material Tables S1 and S2 available at Biostatistics online).
The nonparametric imputation method, used with BI-SS and the direct selection method in Settings 4 and 5 when |$W$| is discrete, performs quite well: for example, in Setting 4, it achieves better |$Score_w$| performance with BI-SS than the parametric method both when |$w=1/2$| and |$w=1/3$| (see Table 3). However, this method can sometimes show inflated misclassification error, for example, in Setting 4, it achieves the highest misclassification rate out of the three.
In Setting 5, when the biomarker values were exponentially distributed, the parametric imputation of the missing potential marker values were conducted using both (i) the correctly specified parametric model based on exponential distribution (“Exp”), and (ii) the misspecified parametric model based on Gaussian distribution (“Gauss”). All methods (including the one based on the misspecified parametric model) perform quite well with high |$Score_w$| values. The correctly specified parametric model dominates all others in terms of model error rates, and does slightly better than MICE and the misspecified parametric model in misclassification rates, along with the nonparametric method (see supplementary material Table S2 available at Biostatistics online).

6. Real data analysis

We apply our proposed methodology to data from an HIV vaccine trial (RV144). The RV144 trial (n = 16402, randomized with 8200 on placebo and 8202 on canarypox vector vaccine [ALVAC-HIV vCP1521] boosted with gp120 AIDSVAX B/E vaccine) showed an estimated vaccine efficacy of 31.2% for the prevention of HIV-1 infection over 42 months (Rerks-Ngarm and others, 2009). A follow-up immune correlates study assessed vaccine-induced immune responses at peak immunogenicity in 41 vaccinees who acquired HIV-1 infection as compared with 205 vaccinees who did not acquire infection. Among others, six primary variables and eight secondary variables were of particular interest to study (Haynes and others, 2012). However, the RV144 trial does not have a CPV; nor are we aware of any existing trial dataset with a CPV component. Still, to further the discussion about scope of our approach and to plan for future BIP + CPV designs, we have used the available data from RV144 and synthetically produced a CPV component. We consider all six primary and seven secondary (one was removed for collinearity) variables to construct the list of potential biomarkers of interest (see supplementary material Appendix C available at Biostatistics online). Sex (two levels) and baseline self-reported behavioral risk factors (three levels) are used to create |$W$|⁠, a BIP with six levels.

To synthetically create the CPV component, we assume a normal model for |$P(\textbf{S}|Z=0,Y=0,W=W_j)=N_{13}(\mu_{W_j},\Sigma)$|⁠. Since the goal of our proposed method is to identify markers with high surrogacy from a list of many markers, our synthetic CPV component needs to be simulated in a way that allows for one or more of the biomarkers considered in the risk model (supplementary material Appendix C available at Biostatistics online) to be identified as principal surrogates. In Haynes and others (2012), biomarkers “IgA antibodies binding to Env” and “gp70-V1V2 binding” were identified as correlates of risk for HIV, and we generate the synthetic CPV component assuming that these two biomarkers (so, |$J_0=2$|⁠) are also the true correlates of protection that modify the vaccine’s effect on infection. In particular, parameters |$\mu_{W_j}$| for each level |$W:=W_j$| are chosen in a way such that the infection risk conditional on potential biomarker values and intervention status follows a logistic model (as in A4), where nonzero interactions exist only between the above-mentioned biomarkers and the infection status (details can be found in supplementary material Appendix D available at Biostatistics online). The estimation methods follow similarly as before, except that inverse probability weighting (Horvitz and Thompson, 1952) was applied during LASSO to account for the case-control sampling. The chosen risk models for the different approaches (BI-SS, direct selection, and no selection with parametric, nonparametric, and MICE imputation methods) are analyzed and the feature selection performance of the first two (BI-SS and direct) are measured (in their ability to retain the two aforementioned biomarkers and to screen out the rest) at different values of |$\pi$| within the proposed range of 0.6–0.9. A five-fold cross validation is then performed on the observed data, where in each iteration the competing methods are used to estimate the risk model from the four training folds and its predictions are compared with the observed disease levels in the remaining fold to calculate the cross-validated misclassification error of each method at the given values of |$\pi$|⁠. For each measure, we also calculate the 95% bootstrap confidence intervals, based on 100 bootstrap samples of the observed data.

The results of the analysis are presented in Tables 4 (feature selection) and 5 (prediction). For feature selection, we consider IgA and gp70-V1V2 to be the only “true” correlates of protection desired to be selected among the 13 biomarkers. As can be seen from Table 4, the BI-SS procedures with parametric and nonparametric imputation perform quite well in this regard. For |$w=1$|⁠, the BI-SS procedure with each of the three imputation methods yields a perfect (feature selection) score of 1 (meaning both IgA and gp70-V1V2 were correctly chosen by each) for |$\pi=0.6/0.75$|⁠, while only the parametric method achieved a score of 1 for |$\pi=0.9$|⁠. For |$w=0$|⁠, the BI-SS procedure with the nonparametric imputation yields the highest score (0.91), followed by BISS with parametric imputation, with a score 0.82, both for |$\pi=0.9$|⁠. For |$0<w<1$|⁠, the performance of different imputation methods combined with the BI-SS procedure decreases in the following order: parametric, then nonparametric, and then MICE. Also, performance of the BI-SS procedure (with each of the three imputation methods) is never worse than their direct selection counterparts, and for optimal values of |$\pi$| it performs much better the direct method. The cross-validated misclassification error (Table 5) are also the lowest for BI-SS with parametric imputation at |$\pi=0.9$|⁠, followed by at |$\pi=0.75$|⁠. The BI-SS procedures with nonparametric imputation and MICE perform poorly for |$\pi=0.9$| (even worse than when we do not perform any marker selection), most probably because, as their sensitivity scores show in Table 4, they missed picking up one of the two signals more often than not. The No Selection method yields similar results under each imputation method.

Table 4.

Feature selection performance for each method in the RV144 trial (with 95% bootstrap confidence intervals)

Method	Impute	\|${\pi}$\|	\|${w=1}$\|	\|${w=0 }$\|	\|${w=1/2}$\|	\|${w=1/3}$\|
Method	Impute	\|${\pi}$\|	(Sensitivity)	(Specificity)
BI-SS	Para	0.6	1 (0.5–1.0)	0.55 (0.36–0.73)	0.77 (0.48–0.86)	0.70 (0.47–0.82)
		0.75	1 (0.5–1.0)	0.73 (0.55–0.91)	0.86 (0.57–0.95)	0.82 (0.59–0.94)
		0.9	1 (0.0–1.0)	0.82 (0.73–1.00)	0.91 (0.41–1.00)	0.88 (0.55–1.00)
	Nonpara	0.6	1 (0.5–1.0)	0.45 (0.36–0.73)	0.73 (0.43–0.82)	0.64 (0.41–0.76)
		0.75	1 (0.5–1.0)	0.73 (0.55–0.91)	0.86 (0.52–0.93)	0.82 (0.53–0.90)
		0.9	0.5 (0.0–1.0)	0.91 (0.73–1.00)	0.70 (0.41–1.00)	0.77 (0.55–1.00)
	Mice	0.6	1 (0.5–1.0)	0.09 (0.09–0.30)	0.55 (0.53–0.65)	0.39 (0.38–0.53)
		0.75	1 (0.5–1.0)	0.36 (0.19–0.55)	0.68 (0.34–0.77)	0.58 (0.29–0.70)
		0.9	0.5 (0.5–1.0)	0.64 (0.45–0.82)	0.57 (0.48–0.86)	0.59 (0.47–0.82)
Direct	Para	—	0.5 (0.0–0.5)	0.55 (0.45–0.82)	0.53 (0.27–0.66)	0.53 (0.36–0.71)
	Nonpara	—	0.5 (0.0–0.5)	0.45 (0.12–0.64)	0.48 (0.21–0.58)	0.47 (0.23–0.55)
	Mice	—	1 (0.5–1)	0.09 (0.00–0.36)	0.55 (0.25–0.64)	0.39 (0.17–0.51)

Method	Impute	\|${\pi}$\|	\|${w=1}$\|	\|${w=0 }$\|	\|${w=1/2}$\|	\|${w=1/3}$\|
Method	Impute	\|${\pi}$\|	(Sensitivity)	(Specificity)
BI-SS	Para	0.6	1 (0.5–1.0)	0.55 (0.36–0.73)	0.77 (0.48–0.86)	0.70 (0.47–0.82)
		0.75	1 (0.5–1.0)	0.73 (0.55–0.91)	0.86 (0.57–0.95)	0.82 (0.59–0.94)
		0.9	1 (0.0–1.0)	0.82 (0.73–1.00)	0.91 (0.41–1.00)	0.88 (0.55–1.00)
	Nonpara	0.6	1 (0.5–1.0)	0.45 (0.36–0.73)	0.73 (0.43–0.82)	0.64 (0.41–0.76)
		0.75	1 (0.5–1.0)	0.73 (0.55–0.91)	0.86 (0.52–0.93)	0.82 (0.53–0.90)
		0.9	0.5 (0.0–1.0)	0.91 (0.73–1.00)	0.70 (0.41–1.00)	0.77 (0.55–1.00)
	Mice	0.6	1 (0.5–1.0)	0.09 (0.09–0.30)	0.55 (0.53–0.65)	0.39 (0.38–0.53)
		0.75	1 (0.5–1.0)	0.36 (0.19–0.55)	0.68 (0.34–0.77)	0.58 (0.29–0.70)
		0.9	0.5 (0.5–1.0)	0.64 (0.45–0.82)	0.57 (0.48–0.86)	0.59 (0.47–0.82)
Direct	Para	—	0.5 (0.0–0.5)	0.55 (0.45–0.82)	0.53 (0.27–0.66)	0.53 (0.36–0.71)
	Nonpara	—	0.5 (0.0–0.5)	0.45 (0.12–0.64)	0.48 (0.21–0.58)	0.47 (0.23–0.55)
	Mice	—	1 (0.5–1)	0.09 (0.00–0.36)	0.55 (0.25–0.64)	0.39 (0.17–0.51)

Table 4.

Feature selection performance for each method in the RV144 trial (with 95% bootstrap confidence intervals)

Method	Impute	\|${\pi}$\|	\|${w=1}$\|	\|${w=0 }$\|	\|${w=1/2}$\|	\|${w=1/3}$\|
Method	Impute	\|${\pi}$\|	(Sensitivity)	(Specificity)
BI-SS	Para	0.6	1 (0.5–1.0)	0.55 (0.36–0.73)	0.77 (0.48–0.86)	0.70 (0.47–0.82)
		0.75	1 (0.5–1.0)	0.73 (0.55–0.91)	0.86 (0.57–0.95)	0.82 (0.59–0.94)
		0.9	1 (0.0–1.0)	0.82 (0.73–1.00)	0.91 (0.41–1.00)	0.88 (0.55–1.00)
	Nonpara	0.6	1 (0.5–1.0)	0.45 (0.36–0.73)	0.73 (0.43–0.82)	0.64 (0.41–0.76)
		0.75	1 (0.5–1.0)	0.73 (0.55–0.91)	0.86 (0.52–0.93)	0.82 (0.53–0.90)
		0.9	0.5 (0.0–1.0)	0.91 (0.73–1.00)	0.70 (0.41–1.00)	0.77 (0.55–1.00)
	Mice	0.6	1 (0.5–1.0)	0.09 (0.09–0.30)	0.55 (0.53–0.65)	0.39 (0.38–0.53)
		0.75	1 (0.5–1.0)	0.36 (0.19–0.55)	0.68 (0.34–0.77)	0.58 (0.29–0.70)
		0.9	0.5 (0.5–1.0)	0.64 (0.45–0.82)	0.57 (0.48–0.86)	0.59 (0.47–0.82)
Direct	Para	—	0.5 (0.0–0.5)	0.55 (0.45–0.82)	0.53 (0.27–0.66)	0.53 (0.36–0.71)
	Nonpara	—	0.5 (0.0–0.5)	0.45 (0.12–0.64)	0.48 (0.21–0.58)	0.47 (0.23–0.55)
	Mice	—	1 (0.5–1)	0.09 (0.00–0.36)	0.55 (0.25–0.64)	0.39 (0.17–0.51)

Method	Impute	\|${\pi}$\|	\|${w=1}$\|	\|${w=0 }$\|	\|${w=1/2}$\|	\|${w=1/3}$\|
Method	Impute	\|${\pi}$\|	(Sensitivity)	(Specificity)
BI-SS	Para	0.6	1 (0.5–1.0)	0.55 (0.36–0.73)	0.77 (0.48–0.86)	0.70 (0.47–0.82)
		0.75	1 (0.5–1.0)	0.73 (0.55–0.91)	0.86 (0.57–0.95)	0.82 (0.59–0.94)
		0.9	1 (0.0–1.0)	0.82 (0.73–1.00)	0.91 (0.41–1.00)	0.88 (0.55–1.00)
	Nonpara	0.6	1 (0.5–1.0)	0.45 (0.36–0.73)	0.73 (0.43–0.82)	0.64 (0.41–0.76)
		0.75	1 (0.5–1.0)	0.73 (0.55–0.91)	0.86 (0.52–0.93)	0.82 (0.53–0.90)
		0.9	0.5 (0.0–1.0)	0.91 (0.73–1.00)	0.70 (0.41–1.00)	0.77 (0.55–1.00)
	Mice	0.6	1 (0.5–1.0)	0.09 (0.09–0.30)	0.55 (0.53–0.65)	0.39 (0.38–0.53)
		0.75	1 (0.5–1.0)	0.36 (0.19–0.55)	0.68 (0.34–0.77)	0.58 (0.29–0.70)
		0.9	0.5 (0.5–1.0)	0.64 (0.45–0.82)	0.57 (0.48–0.86)	0.59 (0.47–0.82)
Direct	Para	—	0.5 (0.0–0.5)	0.55 (0.45–0.82)	0.53 (0.27–0.66)	0.53 (0.36–0.71)
	Nonpara	—	0.5 (0.0–0.5)	0.45 (0.12–0.64)	0.48 (0.21–0.58)	0.47 (0.23–0.55)
	Mice	—	1 (0.5–1)	0.09 (0.00–0.36)	0.55 (0.25–0.64)	0.39 (0.17–0.51)

Table 5.

Cross-validated misclassification errors in prediction from the risk model estimated by each method in the RV144 trial (with 95% bootstrap confidence intervals)

	Method
Impute	BISS w LASSO			Direct w LASSO	NoSel
Impute	\|${\pi }$\|=0.6	\|${\pi}$\|=0.75	\|${\pi}$\|=0.9	Direct w LASSO	NoSel
Para	0.0061	0.0058	0.0057	0.0065	0.0064
Para	(0.0047–0.0071)	(0.0047–0.0075)	(0.0047–0.0074)	(0.0053–0.0082)	(0.0046–0.0072)
Nonpara	0.0064	0.0062	0.0069	0.0067	0.0064
Nonpara	(0.0050–0.0082)	(0.0050–0.0079)	(0.0051–0.0081)	(0.0051–0.0081)	(0.0045–0.0070)
Mice	0.0068	0.0059	0.0073	0.0064	0.0063
Mice	(0.0049–0.0078)	(0.0047–0.0076)	(0.0051–0.0084)	(0.0050–0.0075)	(0.0046–0.0072)

	Method
Impute	BISS w LASSO			Direct w LASSO	NoSel
Impute	\|${\pi }$\|=0.6	\|${\pi}$\|=0.75	\|${\pi}$\|=0.9	Direct w LASSO	NoSel
Para	0.0061	0.0058	0.0057	0.0065	0.0064
Para	(0.0047–0.0071)	(0.0047–0.0075)	(0.0047–0.0074)	(0.0053–0.0082)	(0.0046–0.0072)
Nonpara	0.0064	0.0062	0.0069	0.0067	0.0064
Nonpara	(0.0050–0.0082)	(0.0050–0.0079)	(0.0051–0.0081)	(0.0051–0.0081)	(0.0045–0.0070)
Mice	0.0068	0.0059	0.0073	0.0064	0.0063
Mice	(0.0049–0.0078)	(0.0047–0.0076)	(0.0051–0.0084)	(0.0050–0.0075)	(0.0046–0.0072)

Table 5.

Cross-validated misclassification errors in prediction from the risk model estimated by each method in the RV144 trial (with 95% bootstrap confidence intervals)

	Method
Impute	BISS w LASSO			Direct w LASSO	NoSel
Impute	\|${\pi }$\|=0.6	\|${\pi}$\|=0.75	\|${\pi}$\|=0.9	Direct w LASSO	NoSel
Para	0.0061	0.0058	0.0057	0.0065	0.0064
Para	(0.0047–0.0071)	(0.0047–0.0075)	(0.0047–0.0074)	(0.0053–0.0082)	(0.0046–0.0072)
Nonpara	0.0064	0.0062	0.0069	0.0067	0.0064
Nonpara	(0.0050–0.0082)	(0.0050–0.0079)	(0.0051–0.0081)	(0.0051–0.0081)	(0.0045–0.0070)
Mice	0.0068	0.0059	0.0073	0.0064	0.0063
Mice	(0.0049–0.0078)	(0.0047–0.0076)	(0.0051–0.0084)	(0.0050–0.0075)	(0.0046–0.0072)

	Method
Impute	BISS w LASSO			Direct w LASSO	NoSel
Impute	\|${\pi }$\|=0.6	\|${\pi}$\|=0.75	\|${\pi}$\|=0.9	Direct w LASSO	NoSel
Para	0.0061	0.0058	0.0057	0.0065	0.0064
Para	(0.0047–0.0071)	(0.0047–0.0075)	(0.0047–0.0074)	(0.0053–0.0082)	(0.0046–0.0072)
Nonpara	0.0064	0.0062	0.0069	0.0067	0.0064
Nonpara	(0.0050–0.0082)	(0.0050–0.0079)	(0.0051–0.0081)	(0.0051–0.0081)	(0.0045–0.0070)
Mice	0.0068	0.0059	0.0073	0.0064	0.0063
Mice	(0.0049–0.0078)	(0.0047–0.0076)	(0.0051–0.0084)	(0.0050–0.0075)	(0.0046–0.0072)

7. Discussion

In this article, we proposed methodology for estimating the risk model conditional on multiple vaccine-induced immune response biomarkers in order to identify principal surrogate endpoints that modifies a vaccine’s protective effect. To counteract the issue of missing potential marker values, we propose to use the BIP design augmented with a CPV group following the recommendation of Follmann (2006), and develop imputation methods to impute the missing potential marker values and perform biomarker selection through stepwise resampling and imputation.

Based on the numerical studies conducted in this article, clearly the BI-SS procedure with parametric imputation is the best performing method overall. We also showed that MICE fails to correctly impute the missing potential marker values more often than not, as it fails to utilize the randomization condition in (2.2) needed for their correct imputation in our special problem setting. This is also shown by the improved performance of the parametric and the nonparametric imputation methods over MICE.

As discussed in Long and Johnson (2015), usually there is no perfect value for the tuning parameter |$\pi$| in BI-SS, but values between 0.6 and 0.9 are considered good. Here, in most of our simulation settings, the optimal |$\pi$| fell in the proposed range (0.6–0.9), so the performance of the proposed estimator seemed to be relatively robust within this interval. Also, note that these different metrics behave differently with |$\pi$| (e.g. specificity increases with |$\pi$| while sensitivity decreases with |$\pi$|⁠), so finding an optimal |$\pi$| depends on the specific interest of the researcher.

The parametric method used in our simulations was based either on (i) Gaussian assumption or (ii) Exponential assumption on the conditional marker distributions, although it can be potentially implemented with any underlying parametric assumption. We studied one example in our simulation section (Setting 5) that considers a misspecified parametric model for imputation. As seen there, the imputation performance using a misspecified Gaussian model when the true model is exponential does not yield a very dramatic drop in performance than when we used the correct parametric model, suggesting the common Gaussian working model is reasonably robust. When baseline predictors |$\textbf{W}$| are discrete, our proposed nonparametric method to impute the missing potential marker values provides a valid alternative.

Lastly, note that Assumption A4 is not necessary for identifying disease risk model conditional on |$\textbf{S}(1)$| in a BIP+CPV design. The GLM model assumption is important for model estimation in a BIP-only design and has been commonly assumed for principal surrogates evaluation in vaccine research. We also adopt it here mainly for two reasons: (i) it is compatible with the parametric imputation of the missing |$\textbf{S}(1)$| values in infected placebo recipients using our parametric method; and (ii) it allows us to use commonly used and efficient feature selection algorithms such as LASSO.

8. Supplementary Material

Supplementary material is available at http://biostatistics.oxfordjournals.org.

Acknowledgments

The authors also thank the participants, investigators, and sponsors of the RV144 trials. Conflict of Interest: None declared.

Funding

National Institutes of Health (R01 GM106177-01 and 2R37AI05465-10).

References

Donoho,

D. L.

(

2000

).

High-dimensional data analysis: the curses and blessings of dimensionality

.

AMS Math Challenges Lecture

32

.

OpenURL Placeholder Text

Fan,

J.

and

Li,

R.

(

2001

).

Variable selection via nonconcave penalized likelihood and its oracle properties

.

Journal of the American Statistical Association

96

,

1348

–

1360

.

Fleming,

T. R.

and

DeMets,

D. L.

(

1996

).

Surrogate endpoints in clinical trials: are we being misled?

Annals of Internal Medicine

125

,

605

–

613

.

Follmann,

D. A.

(

2006

).

Augmented designs to assess immune response in vaccine trials

.

Biometrics

62

,

1161

–

1169

.

Gilbert,

P. B.

and

Hudgens,

M. G.

(

2008

).

Evaluating candidate principal surrogate endpoints

.

Biometrics

64

,

1146

–

1154

.

Haynes,

B. F.

,

Gilbert,

P. B.

,

McElrath,

M. J.

,

Zolla-Pazner,

S.

,

Tomaras,

G. D.

,

Alam,

S. M.

,

Evans,

D. T.

,

Montefiori,

D. C.

,

Karnasuta,

C.

,

Sutthent,

R. L.

, and others (

2012

).

Immune-correlates analysis of an HIV-1 vaccine efficacy trial

.

New England Journal of Medicine

366

,

1275

–

1286

.

Horvitz,

D. G.

and

Thompson,

D. J.

(

1952

).

A generalization of sampling without replacement from a finite universe

.

Journal of the American statistical Association

47

,

663

–

685

.

Huang,

Y.

(

2018

).

Evaluating principal surrogate markers in vaccine trials in the presence of multiphase sampling

.

Biometrics

74

,

27

–

39

.

Huang,

Y.

and

Gilbert,

P. B.

(

2011

).

Comparing biomarkers as principal surrogate endpoints

.

Biometrics

67

,

1442

–

1451

.

Long,

Q.

and

Johnson,

B. A.

(

2015

).

Variable selection in the presence of missing data: resampling and imputation

.

Biostatistics

16

,

596

–

610

.

Rerks-Ngarm,

S.

,

Pitisuttithum,

P.

,

Nitayaphan,

S.

,

Kaewkungwal,

J.

,

Chiu,

J.

,

Paris,

R.

,

Premsri,

N.

,

Namwat,

C.

,

de Souza,

M.

,

Adams,

E.

, and others. (

2009

).

Vaccination with ALVAC and AIDSVAX to prevent HIV-1 infection in Thailand

.

New England Journal of Medicine

361

,

2209

–

2220

.