Covariate-adjusted log-rank test: guaranteed efficiency gain and universal applicability

Table 1

Type-I errors (in percentages) based on 10 000 simulations

Case	Randomization	$T_{L}$	$T_{CL}$	$T_{SL}$	$T_{CSL}$
I	Simple	4.91	5.16	4.86	4.78
	Permuted block	3.25	5.22	4.80	4.85
	Minimization	3.40	5.43	5.02	5.23
II	Simple	5.39	5.14	5.00	4.97
	Permuted block	3.59	5.03	4.94	4.82
	Minimization	4.01	5.23	5.11	5.28
III	Simple	5.07	5.43	5.27	5.16
	Permuted block	2.29	4.79	4.76	4.82
	Minimization	2.88	5.43	5.23	5.52
IV	Simple	5.41	5.30	5.39	5.21
	Permuted block	4.44	5.48	5.10	5.49
	Minimization	4.21	5.18	5.04	5.06
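As a reading aid for Table 1, the following sketch computes the Monte Carlo standard error of an estimated rejection rate and checks which $T_{L}$ entries fall outside a 95% band around the nominal level. The 5% nominal level and the 1.96 normal cutoff are assumptions made here for illustration; the rates are copied from the $T_{L}$ column of Table 1.

```python
import math

N_SIM = 10_000   # number of simulation runs, from the table caption
NOMINAL = 0.05   # assumed nominal level (not stated in the table itself)

# Monte Carlo standard error of a rejection rate estimated from N_SIM runs.
mc_se = math.sqrt(NOMINAL * (1 - NOMINAL) / N_SIM)
lower = 100 * (NOMINAL - 1.96 * mc_se)   # band limits in percent
upper = 100 * (NOMINAL + 1.96 * mc_se)

# Type-I errors (%) of T_L from Table 1, keyed by (case, randomization).
t_l = {
    ("I", "Simple"): 4.91, ("I", "Permuted block"): 3.25, ("I", "Minimization"): 3.40,
    ("II", "Simple"): 5.39, ("II", "Permuted block"): 3.59, ("II", "Minimization"): 4.01,
    ("III", "Simple"): 5.07, ("III", "Permuted block"): 2.29, ("III", "Minimization"): 2.88,
    ("IV", "Simple"): 5.41, ("IV", "Permuted block"): 4.44, ("IV", "Minimization"): 4.21,
}

def classify(rate_percent):
    """Return where a rate sits relative to the 95% Monte Carlo band."""
    if rate_percent < lower:
        return "below"
    if rate_percent > upper:
        return "above"
    return "within"

labels = {key: classify(rate) for key, rate in t_l.items()}

print(f"95% band: {lower:.2f}% to {upper:.2f}%")
for (case, scheme), label in labels.items():
    print(f"case {case:>3}, {scheme:<15} {t_l[(case, scheme)]:.2f}%  {label}")
```

Under these assumptions the band runs from roughly 4.57% to 5.43%. The $T_{L}$ entries under simple randomization fall inside it, whereas those under permuted block randomization and minimization fall below it.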

Based on 10 000 simulations, Fig. 1 plots the power curves of the four tests for θ ranging from 0 to 0.6 under the four cases and stratified permuted block randomization. Similar figures for simple randomization and minimization are given in the Supplementary Material. In all cases, the covariate-adjusted tests $T_{CL}$ and $T_{CSL}$ are more powerful than the unadjusted tests $T_{L}$ and $T_{SL}$, especially the benchmark $T_{L}$. Under Cox’s model, $T_{CSL}$ is more powerful than $T_{CL}$, but this is not necessarily so under the non-Cox model. The stratified $T_{SL}$ is mostly more powerful than the unstratified $T_{L}$ but, unlike $T_{CL}$ and $T_{CSL}$, it has no guaranteed efficiency gain; see, e.g., case III when $θ > 0.4$. The difference in censoring models also has some effect on the power curves.

Fig. 1

Power curves based on 10 000 simulations.

More simulation results can be found in the Supplementary Material.

6 A real data application

We apply the four tests $T_{L}$, $T_{CL}$, $T_{SL}$ and $T_{CSL}$ to data from the AIDS Clinical Trials Group Study 175 (ACTG 175), a randomized controlled trial evaluating antiretroviral treatments in adults infected with human immunodeficiency virus type 1 whose CD4 cell counts were from 200 to 500 per cubic millimeter (Hammer et al., 1996). The primary endpoint was time to a composite event defined as a $\geq 50\%$ decline in the CD4 cell count, an AIDS-defining event, or death. Stratified permuted block randomization with equal allocation was applied with a covariate Z having three levels related to the length of prior antiretroviral therapy: Z = 1, 2 and 3, representing 0 weeks, between 1 and 52 weeks, and more than 52 weeks of prior antiretroviral therapy, respectively. The dataset is publicly available in the R package speff2trial (R Development Core Team, 2024).
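To make the randomization scheme concrete, here is a minimal sketch of stratified permuted block randomization with equal allocation over a three-level stratum variable. The block size of 4, the seed, the toy stratum sequence and the function name `stratified_permuted_block` are illustrative assumptions; this is not the allocation procedure actually used in ACTG 175.

```python
import random
from collections import defaultdict

def stratified_permuted_block(strata, block_size=4, seed=0):
    """Assign arm 0 or 1 to each patient, using permuted blocks within strata.

    `strata` lists each patient's stratum label in order of enrolment.
    Each block holds equal numbers of 0s and 1s in random order.
    """
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for equal allocation")
    rng = random.Random(seed)
    pending = defaultdict(list)  # unused assignments of the current block, per stratum
    arms = []
    for z in strata:
        if not pending[z]:
            # Open a fresh block for this stratum and permute it.
            block = [0, 1] * (block_size // 2)
            rng.shuffle(block)
            pending[z] = block
        arms.append(pending[z].pop())
    return arms

# Toy enrolment sequence with strata Z = 1, 2, 3 (hypothetical data).
strata = [1, 3, 1, 2, 3, 3, 1, 1, 2, 3, 3, 2, 2, 1, 3, 1]
arms = stratified_permuted_block(strata)

# Treatment imbalance (arm 1 count minus arm 0 count) within each stratum.
imbalance = {
    z: sum(1 if a == 1 else -1 for a, s in zip(arms, strata) if s == z)
    for z in sorted(set(strata))
}
print(imbalance)
```

Because every completed block is balanced, the within-stratum imbalance can never exceed half the block size, which is the property that distinguishes this scheme from simple randomization.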

We focus on the comparison of treatment 0 (zidovudine) versus treatment 1 (didanosine). For the stratified log-rank test $T_{SL}$, the three-level Z is used as the stratification variable. For covariate adjustment, two additional prognostic baseline covariates are considered as X: the baseline CD4 cell count and the number of days receiving antiretroviral therapy prior to treatment. In addition to testing the treatment effect for all patients, a subgroup analysis with the Z strata as subgroups is also of interest because responses to antiretroviral therapy may vary according to the extent of prior drug exposure. Within each subgroup defined by Z, the stratified tests become the same as their unstratified counterparts, and thus we only apply tests $T_{L}$ and $T_{CL}$ in the subgroup analysis.

Table 2 reports the number of patients, numerator and denominator of each test, and a p-value for testing with all patients or with a subgroup. The effect of covariate adjustment is clear: for the covariate-adjusted tests, the standard errors ${\hat{σ}}_{CL}$ and ${\hat{σ}}_{CSL}$ are smaller than ${\hat{σ}}_{L}$ and ${\hat{σ}}_{SL}$ in all analyses.
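The size of this reduction can be summarized with a short calculation on the denominators reported in Table 2. The sketch below computes the relative reduction in the squared denominator, $1 - ({\hat{σ}}_{\text{adjusted}} / {\hat{σ}}_{\text{unadjusted}})^2$; this is a rough descriptive summary chosen here for illustration, not an efficiency measure defined in the paper, and the numerators of the tests also differ.

```python
# Denominators (sigma-hat) copied from Table 2 as (unadjusted, adjusted) pairs.
sigma_pairs = {
    "All patients, L vs CL": (0.265, 0.257),
    "All patients, SL vs CSL": (0.264, 0.258),
    "Z = 1, L vs CL": (0.235, 0.230),
    "Z = 2, L vs CL": (0.270, 0.265),
    "Z = 3, L vs CL": (0.290, 0.282),
}

def squared_ratio_reduction(unadjusted, adjusted):
    """Relative reduction in the squared denominator after covariate adjustment."""
    return 1.0 - (adjusted / unadjusted) ** 2

reductions = {
    label: squared_ratio_reduction(unadj, adj)
    for label, (unadj, adj) in sigma_pairs.items()
}

for label, value in reductions.items():
    print(f"{label:<26} {100 * value:.1f}% reduction")
```

With the rounded values in Table 2, the reductions are on the order of a few percent in every analysis.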

Table 2

Statistics for the ACTG 175 example

		Subgroup
	All patients	Z = 1	Z = 2	Z = 3
Number of patients	1093	461	198	434
Log-rank test
$n^{1 / 2} {\hat{U}}_{L}$	–1.223	–0.542	–0.144	–1.292
${\hat{σ}}_{L}$	0.265	0.235	0.270	0.290
p-value (adjusted for subgroup analysis)	< 0.001	0.064	1	< 0.001
Estimated θ	–0.528	–0.455	–0.140	–0.740
Standard error of the estimated θ	0.116	0.199	0.263	0.171
Covariate-adjusted log-rank test
$n^{1 / 2} {\hat{U}}_{CL}$	–1.273	–0.553	–0.129	–1.382
${\hat{σ}}_{CL}$	0.257	0.230	0.265	0.282
p-value (adjusted for subgroup analysis)	< 0.001	0.049	1	< 0.001
Estimated θ	–0.550	–0.464	–0.127	–0.793
Standard error of the estimated θ	0.113	0.195	0.257	0.166
Stratified log-rank test
$n^{1 / 2} {\hat{U}}_{SL}$	–1.228
${\hat{σ}}_{SL}$	0.264
p-value	< 0.001
Estimated θ	–0.531
Standard error of the estimated θ	0.116
Covariate-adjusted stratified log-rank test
$n^{1 / 2} {\hat{U}}_{CSL}$	–1.284
${\hat{σ}}_{CSL}$	0.258
p-value	< 0.001
Estimated θ	–0.556
Standard error of the estimated θ	0.113

Here θ denotes the log hazard ratio for all patients and for each subgroup.
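Since θ is a log hazard ratio, the estimates in Table 2 can be read on the hazard-ratio scale by exponentiating. The sketch below does this for the rows under the covariate-adjusted log-rank test and attaches a Wald-type 95% interval; the normal approximation with cutoff 1.96 is an assumption made here, and the intervals are not adjusted for the subgroup analysis.

```python
import math

# (estimated theta, standard error) from the covariate-adjusted log-rank
# block of Table 2.
estimates = {
    "All patients": (-0.550, 0.113),
    "Z = 1": (-0.464, 0.195),
    "Z = 2": (-0.127, 0.257),
    "Z = 3": (-0.793, 0.166),
}

def hazard_ratio_interval(theta, se, z=1.96):
    """Return (hazard ratio, lower limit, upper limit) from a log hazard ratio."""
    return (
        math.exp(theta),
        math.exp(theta - z * se),
        math.exp(theta + z * se),
    )

summary = {
    group: hazard_ratio_interval(theta, se)
    for group, (theta, se) in estimates.items()
}

for group, (hr, lo, hi) in summary.items():
    print(f"{group:<13} HR = {hr:.3f}  (95% Wald interval {lo:.3f} to {hi:.3f})")
```

A hazard ratio below 1 corresponds to a negative θ, that is, a lower hazard of the composite event under treatment 1 than under treatment 0.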

For the analysis based on all patients, all four tests reject the null hypothesis $H_0$ of no treatment effect, each with a p-value below 0.001. In the subgroup analysis, the p-values are adjusted using Bonferroni’s correction to control the familywise error rate. From Table 2, p-values in the subgroup analysis are substantially larger than those in the analysis of all patients, because of reduced sample sizes as well as Bonferroni’s correction. The empirical result in this example illustrates the benefit of covariate adjustment in testing when the sample size is not very large: in the Z = 1 subgroup, the adjusted p-value falls from 0.064 for $T_{L}$ to 0.049 for $T_{CL}$. Using the adjusted log-rank test $T_{CL}$, together with the estimated effect size and its standard error shown in Table 2, we can conclude the superiority of treatment 1 for both Z = 1 and Z = 3, which is consistent with the evidence of Hammer et al. (1996).
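The subgroup p-values can be approximately recomputed from the numerators and denominators in Table 2. The sketch below assumes that each test statistic is the ratio $n^{1/2}\hat{U} / \hat{σ}$, that it is referred to a standard normal distribution with a two-sided alternative, and that Bonferroni’s correction multiplies each p-value by the number of subgroups (three) and caps it at 1. These assumptions are consistent with the reported values but are stated here for illustration.

```python
import math

def two_sided_p(numerator, sigma):
    """Two-sided standard normal p-value for the statistic numerator / sigma."""
    z = numerator / sigma
    return math.erfc(abs(z) / math.sqrt(2))

def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, n_tests * p)

# (numerator, denominator) pairs from Table 2 for the subgroup analysis.
subgroups = {
    "Z = 1": {"L": (-0.542, 0.235), "CL": (-0.553, 0.230)},
    "Z = 2": {"L": (-0.144, 0.270), "CL": (-0.129, 0.265)},
    "Z = 3": {"L": (-1.292, 0.290), "CL": (-1.382, 0.282)},
}

adjusted = {
    group: {
        test: bonferroni(two_sided_p(num, sigma), len(subgroups))
        for test, (num, sigma) in tests.items()
    }
    for group, tests in subgroups.items()
}

for group, tests in adjusted.items():
    for test, p in tests.items():
        print(f"{group}, T_{test}: adjusted p = {p:.4f}")
```

Because the inputs are rounded to three decimals, the recomputed values agree with Table 2 only up to rounding in the third decimal place, for example about 0.063 against the reported 0.064 for $T_{L}$ with Z = 1.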

Acknowledgement

We would like to thank all reviewers for useful comments and suggestions. Our research was supported by the National Natural Science Foundation of China and the U.S. National Science Foundation. Shao is also affiliated with the East China Normal University.

Supplementary material

The Supplementary Material contains all technical proofs and some additional results.

References

Baldi Antognini, A. & Zagoraiou, M. (2015). On the almost sure convergence of adaptive allocation procedures. Bernoulli 21, 881–908.

Cassel, C. M., Särndal, C. E. & Wretman, J. H. (1976). Some results on generalized difference estimation and generalized regression estimation for finite populations. Biometrika 63, 615–20.

Cheng, S., Wei, L. & Ying, Z. (1995). Analysis of transformation models with censored data. Biometrika 82, 835–45.

Ciolino, J. D., Palac, H. L., Yang, A., Vaca, M. & Belli, H. M. (2019). Ideal vs. real: a systematic review on handling covariates in randomized controlled trials. BMC Med. Res. Methodol. 19, 136.

Díaz, I., Colantuoni, E., Hanley, D. F. & Rosenblum, M. (2019). Improved precision in the analysis of randomized trials with survival outcomes, without assuming proportional hazards. Lifetime Data Anal. 25, 439–68.

DiRienzo, A. G. & Lagakos, S. W. (2002). Effects of model misspecification on tests of no randomized treatment effect arising from Cox’s proportional hazards model. J. R. Statist. Soc. B 63, 745–57.

EMA (2015). Guideline on adjustment for baseline covariates in clinical trials, EMA/CHMP/295050/2013. London, UK: European Medicines Agency (EMA). https://www.ema.europa.eu/en/documents/scientific-guideline/guideline-adjustment-baseline-covariates-clinical-trials_en.pdf. Accessed August 3, 2023.

FDA (2023). Adjusting for covariates in randomized clinical trials for drugs and biological products. Guidance for Industry. Center for Drug Evaluation and Research and Center for Biologics Evaluation and Research, Food and Drug Administration (FDA), U.S. Department of Health and Human Services. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/adjusting-covariates-randomized-clinical-trials-drugs-and-biological-products. Accessed August 3, 2023.

Hammer, S. M., Katzenstein, D. A., Hughes, M. D., Gundacker, H., Schooley, R. T., Haubrich, R. H., Henry, W. K., Lederman, M. M., Phair, J. P., Niu, M. et al. (1996). A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. New Engl. J. Med. 335, 1081–90.

ICH E9 (1998). Statistical principles for clinical trials E9. International Council for Harmonisation (ICH). https://database.ich.org/sites/default/files/E9_Guideline.pdf. Accessed August 3, 2023.

Kalbfleisch, J. D. & Prentice, R. L. (2011). The Statistical Analysis of Failure Time Data. New York: John Wiley.

Kong, F. H. & Slud, E. (1997). Robust covariate-adjusted logrank tests. Biometrika 84, 847–62.

Lin, D. Y. & Wei, L. J. (1989). The robust inference for the Cox proportional hazards model. J. Am. Statist. Assoc. 84, 1074–8.

Lin, W. (2013). Agnostic notes on regression adjustments to experimental data: reexamining Freedman’s critique. Ann. Appl. Statist. 7, 295–318.

Lu, X. & Tsiatis, A. A. (2008). Improving the efficiency of the log-rank test using auxiliary covariates. Biometrika 95, 679–94.

Lu, X. & Tsiatis, A. A. (2011). Semiparametric estimation of treatment effect with time-lagged response in the presence of informative censoring. Lifetime Data Anal. 17, 566–93.

Mantel, N. (1966). Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemother. Rep. 50, 163–70.

Moore, K. L. & van der Laan, M. J. (2009). Increasing power in randomized trials with right censored outcomes through covariate adjustment. J. Biopharm. Statist. 19, 1099–131.

Parast, L., Tian, L. & Cai, T. (2014). Landmark estimation of survival and treatment effect in a randomized clinical trial. J. Am. Statist. Assoc. 109, 384–94.

Peto, R., Pike, M. C., Armitage, P., Breslow, N. E., Cox, D. R., Howard, S. V., Mantel, N., McPherson, K., Peto, J. & Smith, P. G. (1976). Design and analysis of randomized clinical trials requiring prolonged observation of each patient. I. Introduction and design. Br. J. Cancer 34, 585–612.

Pocock, S. J. & Simon, R. (1975). Sequential treatment assignment with balancing for prognostic factors in the controlled clinical trial. Biometrics 31, 103–15.

R Development Core Team (2024). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. ISBN 3-900051-07-0, http://www.R-project.org.

Robins, J. M. & Finkelstein, D. M. (2000). Correcting for noncompliance and dependent censoring in an AIDS clinical trial with inverse probability of censoring weighted (IPCW) log-rank tests. Biometrics 56, 779–88.

Schulz, K. F. & Grimes, D. A. (2002). Generation of allocation sequences in randomised trials: chance, not choice. Lancet 359, 515–19.

Shao, J. (2021). Inference for covariate-adaptive randomization: aspects of methodology and theory (with discussions). Statist. Theory Rel. Fields 5, 172–86.

Taves, D. R. (1974). Minimization: a new method of assigning patients to treatment and control groups. Clin. Pharmacol. Ther. 15, 443–53.

Taves, D. R. (2010). The use of minimization in clinical trials. Contemp. Clin. Trials 31, 180–4.

Tsiatis, A. A., Davidian, M., Zhang, M. & Lu, X. (2008). Covariate adjustment for two-sample treatment comparisons in randomized clinical trials: a principled yet flexible approach. Statist. Med. 27, 4658–77.

Wang, B., Susukida, R., Mojtabai, R., Amin-Esmaeili, M. & Rosenblum, M. (2023). Model-robust inference for clinical trials that improve precision by stratified randomization and covariate adjustment. J. Am. Statist. Assoc. 118, 1152–63.

Ye, T. & Shao, J. (2020). Robust tests for treatment effect in survival analysis under covariate-adaptive randomization. J. R. Statist. Soc. B 82, 1301–23.

Ye, T., Shao, J., Yi, Y. & Zhao, Q. (2022). Toward better practice of covariate adjustment in analyzing randomized clinical trials. J. Am. Statist. Assoc., doi: 10.1080/01621459.2022.2049278.