New approaches for testing non-inferiority for three-arm trials with Poisson distributed outcomes

Table 1.

Frequencies of new MRI CLs over 1 and 2 years. |$N$| denotes sample size.

Arm	Period	Frequency of count 0	Frequency of count |$\geq1$|	|$N$|	Mean	Var
|$P$|	1 year	13	37	50	1.6	1.2
|$P$|	2 years	9	41	50	3.0	2.6
|$R$|	1 year	34	12	46	0.4	0.49
|$R$|	2 years	22	24	46	0.8	0.6
|$E$|	1 year	24	24	48	0.8	1.0
|$E$|	2 years	18	30	48	1.3	1.21

The rest of the article is organized as follows. In Section 2, we state the NI hypothesis, review existing Frequentist methods for testing count data, and introduce our proposed, more powerful conditional test. In Section 3, we propose a novel Bayesian methodology for the same problem, considering both conjugate and non-conjugate priors that incorporate the AS condition. In Section 4, the power and sample size calculations are discussed in detail for the Frequentist approach, the Bayesian normal approximation, and the exact Bayesian method. Section 5 presents the simulation results along with the power curves. Finally, in Section 6, we apply our proposed Bayesian methodology for NI testing to this clinical trial dataset. The article concludes with a discussion and future directions in Section 7. All proofs and additional simulation results are provided in the Supplementary Appendix available at Biostatistics online.

2. Frequentist approach for NI testing

We adopt the notation of Stucke and Kieser (2013) to illustrate the fraction margin approach for a three-arm NI trial. We denote the experimental treatment by |$E$|, the reference by |$R$|, and the placebo by |$P$|. The sample sizes of the three arms are denoted by |$n_{E}$|, |$n_{R}$|, and |$n_{P}$|, respectively, and are not necessarily equal. Let |$X_{kE}$|, |$X_{kR}$|, and |$X_{kP}$|, |$k=1,\ldots, n_{l}$|, denote the primary count-type random variables for the |$k{\rm th}$| individual in the respective treatment arms. Each |$X_{kl}$| is distributed as |$\text{Poisson}\left(\lambda_{l}t_{l}\right)$|, where |$\lambda_{l}(>0)$| is the rate parameter and |$t_{l}$| is the fixed follow-up time for |$l\in\left\{E, R, P\right\} $|. Hence, |$\lambda_{l}t_{l}$| is the expected number of counts per patient in the |$l$|th group. We assume that these random variables are mutually independent. Without loss of generality, we assume that the higher the Poisson rate |$\lambda_{l}$|, the greater the treatment benefit. We denote the total number of counts for all |$n_{l}$| patients in the |$l{\rm th}$| treatment arm by |$X_{l}=\sum_{k=1}^{n_{l}}X_{kl}$|, which is distributed as |$\text{Poisson}\left(\lambda_{l}t_{l}n_{l}\right)$|, |$l\in\left\{ E, R, P\right\}$|. In our later analysis, we consider homogeneous Poisson distributions for the treatment arms; that is, we take |$t_{l} = 1$| for |$l\in\{E,R,P\}$|. Modeling a non-homogeneous Poisson distribution is a possibility that we do not explore in this article. The usual NI hypothesis for a two-arm trial (without placebo) is

$$\begin{equation} H_{0}:\lambda_{E}-\lambda_{R}\leq\delta\mbox{ vs. }H_{1}:\lambda_{E}-\lambda_{R}>\delta,\label{ni1} \end{equation}$$

(2.1)

where |$\delta<0$| denotes the pre-specified NI margin. In the current three-arm trial, the construction of |$ \delta $| via the fraction margin approach (Pigeot and others, 2003) can be expressed as |$ \delta=f(\lambda_{R}-\lambda_{P}), $| where |$-1<f<0$|, assuming the condition of AS (|$\lambda_{R}>\lambda_{P}$|). Hence, the hypothesis in (2.1) can be rewritten using the expression for |$\delta$| as follows: |$ H_{0}:\lambda_{E}-\lambda_{R}\leq f {\left(\lambda_{R}-\lambda_{P}\right)}\mbox{ vs. }H_{1}:\lambda_{E}-\lambda_{R}>f(\lambda_{R}-\lambda_{P}) $|. Now, putting |$\theta=1+f$|, the above hypothesis becomes

$$\begin{equation} H_{0}:\frac{\lambda_{E}-\lambda_{P}}{\lambda_{R}-\lambda_{P}}\leq\theta\mbox{ vs. }H_{1}:\frac{\lambda_{E}-\lambda_{P}}{\lambda_{R}-\lambda_{P}}>\theta,\label{ni2} \end{equation}$$

(2.2)

where |$\theta$| is the pre-specified fraction of the effect of the reference drug relative to the placebo. Clearly, rejection of the null hypothesis ensures that the experimental treatment retains a portion of the unknown effect of the reference over placebo under the fraction margin approach (Kieser and Stucke, 2016) and would support NI of the experimental drug relative to the active control. Different choices of |$\theta (\in [0, 1])$| have been proposed in Pigeot and others (2003). In particular, for NI testing of the experimental drug, |$\theta$| is allowed to vary in the interval |$[0.5,1)$|, indicating at least |$50\%$| effect retention. The hypothesis in (2.2) can be expressed in the following form, which is used later for deriving the statistical test procedures

$$\begin{equation} H_{0}:\lambda_{E}-\theta\lambda_{R}-\left(1-\theta\right)\lambda_{P}\leq0\mbox{ vs. }H_{1}:\lambda_{E}-\theta\lambda_{R}-\left(1-\theta\right)\lambda_{P}>0.\label{NI hypo} \end{equation}$$

(2.3)
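
The passage from (2.1) to (2.3) is a short rearrangement. Spelled out for the alternative hypothesis, and assuming AS (|$\lambda_{R}>\lambda_{P}$|) so that dividing by |$\lambda_{R}-\lambda_{P}$| preserves the direction of the inequality:

$$\begin{align*} \lambda_{E}-\lambda_{R}>f\left(\lambda_{R}-\lambda_{P}\right) & \iff \lambda_{E}-\lambda_{P}>\left(1+f\right)\left(\lambda_{R}-\lambda_{P}\right)=\theta\left(\lambda_{R}-\lambda_{P}\right)\\ & \iff \frac{\lambda_{E}-\lambda_{P}}{\lambda_{R}-\lambda_{P}}>\theta\\ & \iff \lambda_{E}-\theta\lambda_{R}-\left(1-\theta\right)\lambda_{P}>0. \end{align*}$$

As a purely illustrative example with hypothetical rates |$\lambda_{P}=1$|, |$\lambda_{R}=3$|, and |$\theta=0.8$| (so |$f=-0.2$|), the margin is |$\delta=-0.4$| and the alternative holds exactly when |$\lambda_{E}>1+0.8\times 2=2.6$|.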

2.1. Existing Frequentist approaches

Mütze and others (2016) developed NI hypothesis testing where the count outcome is assumed to follow a negative binomial distribution. They constructed the test statistic from the maximum likelihood (ML) estimate of the linear contrast in |$H_{0}$| (in 2.3), given by |$ T=\hat{\lambda}_{E}-\theta\hat{\lambda}_{R}-\left(1-\theta\right)\hat{\lambda}_{P}, $| where |$\hat{\lambda}_{l}={X_{l}}/{(n_{l}t_{l})}$| is the maximum likelihood estimate (MLE) of |$\lambda_{l}$|, |$l\in\left\{ E,R,P\right\} $|. The variance of the test statistic is |${\rm Var}\left(T\right)={\lambda_{E}}/{(n_{E}t_{E})}+\theta^{2}{\lambda_{R}}/{(n_{R}t_{R})}+\left(1-\theta\right)^{2}{\lambda_{P}}/{(n_{P}t_{P})}. $| Both ML and restricted maximum likelihood (RML) estimation can be used to estimate |${\rm Var}(T)$|. The RML estimator is obtained subject to the constraint |$\lambda_{E}-\theta\lambda_{R}-\left(1-\theta\right)\lambda_{P}=0$|. Mütze and others (2016) also derived asymptotic sample size formulae along with the optimal sample size allocation, considering both balanced and unbalanced designs in the Frequentist setup. For a two-arm trial, Stucke and Kieser (2013) derived the statistical test procedure using the RML estimator and obtained approximate sample size formulae under the Frequentist setup. Note that this approach to NI testing is valid only if the AS null hypothesis has already been rejected. Hence, the NI testing step is always a conditional test. However, this AS conditioning is not used in any of the existing Frequentist tests. We show mathematically that if the pretested AS condition is used properly in the second step (i.e., in NI testing), this can lead to a more powerful test with considerable savings in sample size.
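
To make the unconditional Wald-type statistic concrete, the following minimal sketch computes |$T$| and the ML plug-in estimate of |${\rm Var}(T)$| for Poisson totals. The function name, the dictionary layout, and the counts are illustrative assumptions, not the trial data or code from the cited papers.

```python
import math

def wald_ni_statistic(x, n, theta, t=None):
    """Return (T, Var_hat(T), T / sqrt(Var_hat(T))) using ML plug-in estimates.

    x, n, t are dicts keyed by 'E', 'R', 'P' holding total counts, sample
    sizes, and follow-up times (t defaults to 1 in every arm).
    """
    t = t or {arm: 1.0 for arm in "ERP"}
    # MLE of each Poisson rate: lambda_hat_l = X_l / (n_l * t_l)
    rate = {arm: x[arm] / (n[arm] * t[arm]) for arm in "ERP"}
    # Coefficients of the linear contrast in (2.3)
    weight = {"E": 1.0, "R": -theta, "P": -(1.0 - theta)}
    T = sum(weight[a] * rate[a] for a in "ERP")
    var_T = sum(weight[a] ** 2 * rate[a] / (n[a] * t[a]) for a in "ERP")
    return T, var_T, T / math.sqrt(var_T)

# Hypothetical totals (assumption): a higher rate means more benefit.
x = {"E": 140, "R": 150, "P": 50}
n = {"E": 50, "R": 50, "P": 50}
T, var_T, z = wald_ni_statistic(x, n, theta=0.8)
print(f"T = {T:.3f}, Var(T) = {var_T:.4f}, z = {z:.3f}")
```

With these hypothetical inputs the estimated rates are 2.8, 3.0, and 1.0, so |$T=2.8-0.8\times3.0-0.2\times1.0=0.2$|.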

2.2. Proposed Frequentist approach

As mentioned in Section 1.1, it is often argued (Pigeot and others, 2003; Koch and Röhmel, 2004; Ghosh and others, 2011; Wu and others, 2018) that if the active control has not lost all of its effect over placebo, then the statistical power of joint testing (NI and AS) will be very similar to that of NI testing alone. As shown in Kieser and Friede (2007), this may not be true in all situations, except when the power of the pretest is close to unity. Nevertheless, NI testing only happens provided the AS condition |$(\lambda_R\,{>}\,\lambda_P)$| holds. However, this pretested AS condition has not been used further, even though the NI and AS test statistics are related. We introduce here a new conditional approach for NI hypothesis testing that incorporates the pretested AS condition (|$\lambda_R{\,>\,}\lambda_P$|) directly. We show that this approach performs at least as well as the existing approach. For finding the MLE, we truncate the parameter space of (|$\lambda_E, \lambda_R, \lambda_P$|) such that it belongs to |$\{\lambda_E,\lambda_R,\lambda_P: \lambda_E , \lambda_R, \lambda_P \in [0,\infty), \lambda_R\,{>}\,\lambda_P\}$|. One may develop a likelihood ratio test based on the statistic |$ T=\hat{\lambda}_{E}-\theta\hat{\lambda}_{R}-(1-\theta)\hat{\lambda}_{P}= (\hat{\lambda}_{E}- \hat{\lambda}_{P}) - \theta(\hat{\lambda}_{R}-\hat{\lambda}_{P})=U-\theta V $| under the null hypothesis, subject to the imposed condition |$\hat{\lambda}_R\,{>}\,\hat{\lambda}_P$|, via a Wald-type test.
Following the argument of Mütze and others (2016), one can improve the convergence of the Wald-type test via the RML, which requires solving, under |$H_0$|, |$ \left(\hat{\lambda}_{E,{\rm RML}},\hat{\lambda}_{R,{\rm RML}},\hat{\lambda}_{P,{\rm RML}}\right) = \mathop{\rm arg\,max}_{{\lambda}_{E}-\theta{\lambda}_{R}-(1-\theta){\lambda}_{P}\leq 0 ,{\lambda}_R\,{>}\,{\lambda}_P} \log l({\lambda}_{E},{\lambda}_{R},{\lambda}_{P}), $| where |$\log l({\lambda}_{E},{\lambda}_{R},{\lambda}_{P})$| is the log-likelihood of |$(\lambda_E, \lambda_R, \lambda_P)$|, to estimate |$T$| by |$T_{{\rm RML}}$|. This optimization problem can only be solved numerically, as no closed-form expression is available. To reduce the computational burden, one practical strategy that is often recommended is to work with the unrestricted MLE, |$ T_{ML}= \hat{\lambda}_{E, {\rm ML}}-\theta \hat{\lambda}_{R,{\rm ML}}-(1-\theta) \hat{\lambda}_{P,{\rm ML}}$|, while considering only the part restricted by |$\hat{\lambda}_{R,{\rm ML}}\,{>}\,\hat{\lambda}_{P,{\rm ML}}$|, which is

$$\begin{equation}\label{a1} T_{{\rm RML}} \simeq T_{{\rm ML}} \ast I[\hat{\lambda}_{R,{\rm ML}}\,{>}\,\hat{\lambda}_{P,{\rm ML}}]. \end{equation}$$

(2.4)

This strategy has proved quite useful in many practical applications (Huang and others, 2011; Kulldorff, 1997). Since working with the product of random variables in (2.4) is somewhat cumbersome, one can further show that |$f(T_{{\rm RML}})\simeq {f(T_{{\rm ML}}|\hat{\lambda}_{R, {\rm ML}}\,{>}\,\hat{\lambda}_{P,{\rm ML}})} \times Pr[\hat{\lambda}_{R,{\rm ML}}\,{>}\,\hat{\lambda}_{P,{\rm ML}}]$|. It is easy to prove that |$Pr[\hat{\lambda}_{R,{\rm ML}}\,{>}\,\hat{\lambda}_{P,{\rm ML}}]$| is a constant that can be absorbed as a proportionality constant. Hence, for all practical purposes, one can consider the distribution of the test statistic, |$f(T_{{\rm ML}}|\hat{\lambda}_{R}\,{>}\,\hat{\lambda}_{P}) \propto f(\hat{\lambda}_{E,{\rm ML}}-\theta \hat{\lambda}_{R,{\rm ML}}-(1-\theta) \hat{\lambda}_{P,{\rm ML}}|\hat{\lambda}_{R,{\rm ML}}\,{>}\,\hat{\lambda}_{P,{\rm ML}})$|. For notational simplicity, from now on we denote the ML estimate |$\hat{\lambda}_{l,{\rm ML}}$| by |$\hat{\lambda}_{l}$|, |$l \in \{E,R,P\}$|. This leads to the modified test statistic for NI testing: |$W=(\hat{\lambda}_{E}-\theta\hat{\lambda}_{R}-(1-\theta)\hat{\lambda}_{P}|\hat{\lambda}_{R}\,{>}\,\hat{\lambda}_{P}) = \left(U-\theta V|V\,{>}\,0\right)$|. Under the asymptotic normality of |$W$|, we have |$ {(W-\mu_{w})}/{\sigma_{w}}\sim AN\left(0,1\right)$|, where |$\mu_{w}\mbox{ and }\sigma_{w}^{2}$| are the mean and variance of |$W$|, respectively.
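
The conditional statistic |$W=(U-\theta V|V>0)$| can be explored by simulation. The sketch below draws Poisson arm totals with |$t_{l}=1$|, forms |$U$| and |$V$| from the MLEs, and keeps only replicates with |$\hat{\lambda}_{R}>\hat{\lambda}_{P}$|. The rates, sample sizes, helper names, and the small Poisson generator (Knuth's multiplication method, adequate only for moderate means) are illustrative assumptions.

```python
import math
import random

def poisson_draw(rng, mean):
    """Knuth's multiplication method; suitable for moderate means only."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_conditional_statistic(lam, n, theta, reps=20_000, seed=0):
    """Return draws of W = (U - theta*V | V > 0) and the share of replicates kept."""
    rng = random.Random(seed)
    kept = []
    for _ in range(reps):
        # Arm totals X_l ~ Poisson(n_l * lambda_l); the MLE is X_l / n_l.
        est = {a: poisson_draw(rng, n[a] * lam[a]) / n[a] for a in "ERP"}
        U = est["E"] - est["P"]
        V = est["R"] - est["P"]
        if V > 0:                      # AS condition on the estimates
            kept.append(U - theta * V)
    return kept, len(kept) / reps

# Hypothetical rates with a small gap between reference and placebo.
lam = {"E": 1.2, "R": 1.2, "P": 1.0}
n = {"E": 30, "R": 30, "P": 30}
W, p_keep = simulate_conditional_statistic(lam, n, theta=0.8)
mean_W = sum(W) / len(W)
var_W = sum((w - mean_W) ** 2 for w in W) / (len(W) - 1)
print(f"share with V > 0: {p_keep:.3f}, mean(W): {mean_W:.4f}, var(W): {var_W:.4f}")
```

The empirical mean and variance of the retained draws can then be compared with any closed-form approximation for the moments of |$W$|.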

Lemma 2.2.1.

Under conditional normal approximation, the mean |$\mu_{w}$| and variance |$\sigma_{w}^{2}$| of |$W=\hat{\lambda}_{E}-\theta\hat{\lambda}_{R}-(1-\theta)\hat{\lambda}_{P}|\hat{\lambda}_{R}\,{>}\,\hat{\lambda}_{P}$| are given by

$$\mu_{w}= \mu_{U}+\sigma_{U}\frac{\rho}{c}\phi\left(d\right)-\theta \left(\mu_{V}+\sigma_{V}\frac{1}{c}\phi\left(d\right)\right),$$

$$\begin{align*} \sigma_{w}^{2}= {} & \sigma_{U}^{2}\left[1+\frac{\rho^{2}}{c}d\phi\left(d\right)-\left(\frac{\rho}{c}\phi\left(d\right)\right)^{2}\right] + \theta^2 \sigma_{V}^{2}\left[1-\frac{\phi\left(d\right)}{c}\left(\frac{\phi\left(d\right)}{c}-d\right)\right]\\ & - 2\theta\left[\sigma_{U}\sigma_{V}\frac{\rho}{c}\left(c+d\phi\left(d\right)\right)+ \sigma_{U}\mu_{V}\frac{\rho}{c}\phi\left(d\right) + \sigma_{V}\mu_{U}\frac{1}{c}\phi\left(d\right)+\mu_{U}\mu_{V} \right.\\ & \left. {} -\left(\mu_{U} +\sigma_{U}\frac{\rho}{c}\phi\left(d\right)\right) \left(\mu_{V}+\sigma_{V}\frac{1}{c}\phi\left(d\right)\right)\right], \end{align*}$$

where |$\mu_{U} =\lambda_{E} -\lambda_{P}$|, |$\mu_{V}=\lambda_{R}-\lambda_{P}$|, |$\sigma^2_{l}={\lambda_{l}}/{n_{l}}$| for |$l \in \{E,R,P\}$|, |$d =-{\mu_{V}}/{\sigma_{V}}$|, |$c =1 -\Phi\left(d\right)$|, |$\sigma_{U}^{2} =\sigma_{E}^{2}+\sigma_{P}^{2}$|, |$\sigma_{V}^{2} =\sigma_{R}^{2}+\sigma_{P}^{2}$|, and |$\rho =\frac{Var(\hat{\lambda}_{P})}{\sqrt{Var\left(U\right)Var\left(V\right)}}=\frac{\sigma_{P}^{2}}{\sqrt{\sigma_{U}^{2}\sigma_{V}^{2}}}$|. Here |$\phi$| and |$\Phi$| denote the standard normal density and distribution function, respectively.

Proof: See Supplementary Appendix A available at Biostatistics online.

Now, under |$H_{0}$|, let us denote |$\lambda_{E}$| by |$\lambda_{E}^{\rm null}$| and, under |$H_{1}$|, denote |$\lambda_{E}$| by |$\lambda_{E}^{\rm alt}$| as a point alternative. Since |$\lambda_{E}^{\rm null}$| satisfies |$\lambda_{E}^{\rm null}-\theta \lambda_{R}-(1-\theta) \lambda_{P}=0$|, the expression for |$\lambda_{E}^{\rm null}$| is |$ \lambda_{E}^{\rm null} =\lambda_{P}+\theta\left(\lambda_{R}-\lambda_{P}\right). $| Under |$H_{1}$|, |$\lambda_{E}^{\rm alt}$| satisfies |$\lambda_{E}^{\rm alt}-\theta\lambda_{R}-(1-\theta)\lambda_{P}\,{>}\,0\Rightarrow(\lambda_{E}^{\rm alt}-\lambda_{P})\,{>}\,\theta(\lambda_{R}-\lambda_{P})$|. Since |$\lambda_{E}$| is involved in the expressions for the mean and variance of |$W$|, we denote |$E\left(W\right)$| and |${\rm Var}\left(W\right)$| under |$H_{0}$| by |$\mu_{w}^{\rm null}$| and |$\sigma_{w}^{2,{\rm null}}$| and, under |$H_{1}$|, by |$\mu_{w}^{\rm alt}$| and |$\sigma_{w}^{2,{\rm alt}}$|, respectively. Thus, we have |$ {(W-\mu_{w}^{\rm null})}/{\sigma_{w}^{\rm null}} \sim AN\left(0,1\right)\mbox{ under }H_{0}\mbox{ and } {(W-\mu_{w}^{\rm alt})}/{\sigma_{w}^{\rm alt}} \sim AN\left(0,1\right)\mbox{ under }H_{1}. $| In the Frequentist approach, the critical region of the test is given by |$W\,{>}\,k^{*}_{\alpha}$|, where |$k^{*}_{\alpha}$| is obtained by assuming a test of size |$\alpha$|: |$ P_{H_{0}}\left(W\,{>}\,k^{*}_{\alpha}\right) =\alpha\Rightarrow k^{*}_{\alpha}=\mu_{w}^{\rm null}+z_{1-\alpha}\sigma_{w}^{\rm null}, $| where |$z_{1-\alpha}$| is the |$100\left(1-\alpha\right)$|th percentile of the |$N\left(0,1\right)$| distribution. Traditionally, |$\alpha$| is chosen to be |$0.025$| (other choices are possible). The power of the test is given by |$ P_{H_{1}}\left(W\,{>}\,k^{*}_{\alpha}\right)=1-\Phi({(k^{*}_{\alpha}-\mu_{w}^{\rm alt})}/{\sigma_{w}^{\rm alt}}). $|
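
The moments in Lemma 2.2.1 and the power expression above translate directly into code. The following is a minimal sketch with |$t_{l}=1$|; the function names and the numerical inputs are illustrative assumptions, and the standard normal functions come from Python's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal: pdf, cdf, inv_cdf

def conditional_moments(lam_E, lam_R, lam_P, n_E, n_R, n_P, theta):
    """Mean and variance of W = (U - theta*V | V > 0) as in Lemma 2.2.1."""
    s2_E, s2_R, s2_P = lam_E / n_E, lam_R / n_R, lam_P / n_P
    mu_U, mu_V = lam_E - lam_P, lam_R - lam_P
    s_U, s_V = sqrt(s2_E + s2_P), sqrt(s2_R + s2_P)
    rho = s2_P / (s_U * s_V)
    d = -mu_V / s_V
    c = 1.0 - STD.cdf(d)
    h = STD.pdf(d) / c                              # phi(d) / c
    mean_U = mu_U + s_U * rho * h
    mean_V = mu_V + s_V * h
    var_U = s_U**2 * (1.0 + rho**2 * d * h - (rho * h) ** 2)
    var_V = s_V**2 * (1.0 - h * (h - d))
    # Conditional covariance: E[UV | V > 0] - E[U | V > 0] * E[V | V > 0]
    e_UV = (s_U * s_V * rho * (1.0 + d * h) + s_U * mu_V * rho * h
            + s_V * mu_U * h + mu_U * mu_V)
    cov_UV = e_UV - mean_U * mean_V
    mu_w = mean_U - theta * mean_V
    var_w = var_U + theta**2 * var_V - 2.0 * theta * cov_UV
    return mu_w, var_w

def conditional_test_power(lam_E_alt, lam_R, lam_P, n_E, n_R, n_P, theta, alpha=0.025):
    """Power of the test that rejects when W > k*_alpha."""
    lam_E_null = lam_P + theta * (lam_R - lam_P)
    mu0, v0 = conditional_moments(lam_E_null, lam_R, lam_P, n_E, n_R, n_P, theta)
    mu1, v1 = conditional_moments(lam_E_alt, lam_R, lam_P, n_E, n_R, n_P, theta)
    k = mu0 + STD.inv_cdf(1.0 - alpha) * sqrt(v0)   # critical value
    return 1.0 - STD.cdf((k - mu1) / sqrt(v1))

# Hypothetical design: lambda_R = 3, lambda_P = 1, theta = 0.8, 100 per arm.
for lam_E_alt in (2.6, 2.8, 3.0):
    pw = conditional_test_power(lam_E_alt, 3.0, 1.0, 100, 100, 100, 0.8)
    print(f"lambda_E = {lam_E_alt}: power = {pw:.3f}")
```

At |$\lambda_{E}=\lambda_{E}^{\rm null}$| (2.6 in this example) the function returns |$\alpha$| by construction, which is a quick sanity check on the critical value.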

Lemma 2.2.2.

Proof: See Supplementary Appendix B available at Biostatistics online.

This lemma shows that the conditional test yields an effective power gain or, conversely, that it requires a smaller sample size to attain a fixed power. Although the proof is given for the equal allocation case for simplicity, it can be easily extended to the more general unequal allocation case. As observed in Section 4.5, this power difference is substantial when the gap between |$ \lambda_R $| and |$ \lambda_P $| is small. In the Supplementary material available at Biostatistics online, we provide additional simulation results to demonstrate this fact. It should also be noted that this lemma generalizes to continuous as well as binary outcomes with slightly different algebra, indicating that our proposed conditional test should be the de facto standard for Pigeot’s fraction margin approach irrespective of the outcome type.

3. Bayesian approaches for NI testing

As indicated in Section 1, considerable prior information is almost always available in an active control trial, and an NI RCT is no exception. However, the use of this historical information via Frequentist approaches is rather limited. The Bayesian approach provides a natural path to leverage these historical data, which may result in a substantial gain in effective sample size. To the best of our knowledge, however, no Bayesian methodology paper exists for a three-arm trial with count-type endpoints. In this section, we discuss an exact Bayesian and an approximation-based Bayesian method for NI testing involving Poisson rates. Note that we do not develop a Bayesian counterpart of the existing Frequentist approach of Mütze and others (2016). Instead, since we proposed a more powerful Frequentist test in Section 2.2, our Bayesian development closely follows that procedure.

As stated earlier, the NI margin is constructed as a negative fraction of the unknown difference between the count rates in the reference and placebo arms. We consider |$\theta\geq0.5$| to test for the NI of |$E$| relative to |$R$| under two different prior scenarios, including the conjugate prior, in which the AS condition |$\left(\lambda_{R}\,{>}\,\lambda_{P}\right)$| is directly incorporated. This restriction reflects that the NI study is carried out under conditions similar to those of the former studies in which the efficacy of the active control was established, and that the active control still retains its effect over placebo. This is a very realistic assumption because, if the current trial is similar to the historical trial, the effect of the reference drug over placebo should be the same in both trials (constancy assumption). In the following sections, we discuss both the conjugate and non-conjugate prior settings. When no prior information is available, a flat non-informative prior is assigned to |$\lambda_{l}$|; this includes the Jeffreys prior and other priors whose parameters are adjusted to yield a large variance.

3.1. Exact Bayesian approach

3.1.1. Conjugate Gamma prior

In the conjugate prior setting, we use a Gamma distribution as the prior for the Poisson rate in each arm of the trial; that is, we assume |$\lambda_{l}\sim \text{Gamma}\left(\alpha_{l},\beta_{l}\right)$|, |$l\in\{E,R,P\}$|, where |$\alpha_{l}$| and |$\beta_{l}$| are fixed hyper-parameters. After incorporating the assumption of AS |$\left(\lambda_{R}\,{>}\,\lambda_{P}\right)$|, the joint prior distribution of the Poisson rates in the three arms becomes |$ f\left(\boldsymbol{\lambda}\right)=I\left(\lambda_{R}\,{>}\,\lambda_{P}\right)\prod_{l\in\left\{ E,R,P\right\} }f(\lambda_{l}|\alpha_{l},\beta_{l}), $| where |$f\left(\lambda_{l}|\alpha_{l},\beta_{l}\right)$| is the density of the |$\text{Gamma}(\alpha_{l}, \beta_{l})$| distribution, given as

$$f\left(\lambda_{l}|\alpha_{l},\beta_{l}\right)\varpropto\lambda_{l}^{\alpha_{l}-1}\exp{\{-\lambda_{l}\beta_{l}\}}, \mbox{ } \lambda_{l} \,{>}\, 0.$$

Since the total count |$X_{l}$| in each arm follows a Poisson distribution with parameter |$n_{l}\lambda_{l}t_{l}$|, |$l\in\left\{ E,R,P\right\} $|, the unrestricted posterior distribution of |$\lambda_{l}|X_{l}$| is |$\text{Gamma}\left(\alpha_{l}+X_{l},\beta_{l}+n_{l}t_{l}\right)$|, and the joint posterior distribution satisfying the AS condition is given by

$$f\left(\lambda_{E},\lambda_{R},\lambda_{P}|\text{Data},\alpha_{l},\beta_{l}\right)\varpropto I\left(\lambda_{R}\,{>}\,\lambda_{P}\right) \prod_{l\in\left\{ E,R,P\right\} }\lambda_{l}^{\alpha_{l}+x_{l}-1}{\rm exp}{\{-\lambda_{l}\left(\beta_{l}+n_{l}t_{l}\right)\}},\mbox{ }\lambda_{l} \,{>}\, 0,\mbox{ }l\in\{E,R,P\}.$$

Markov chain Monte Carlo (MCMC) samples can be easily generated from this joint posterior distribution. The hyper-parameters |$\alpha_{l}$| and |$\beta_{l}$|, |$l\in\{E,R,P\}$|, can be chosen depending on how much prior information is available. In the absence of prior information from a historical placebo-controlled trial, they are chosen to be vague. The mean |$\left(\mu\right)$|, mode |$\left(\mu^{0}\right)$|, and variance |$\left(\sigma^{2}\right)$| of |$\text{Gamma}\left(\alpha,\beta\right)$| are given as |$ \mu ={\alpha}/{\beta}$|, |$\mu^{0}={(\alpha-1)}/{\beta}$| (for |$\alpha\geq1$|), and |$\sigma^{2}={\alpha}/{\beta^{2}}$|. For informative priors, the variance is made smaller so that the prior is more concentrated.
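
Because the joint posterior is a product of independent Gamma densities multiplied by the indicator |$I(\lambda_{R}>\lambda_{P})$|, one simple way to sample from it is rejection: draw the three rates independently and keep a draw only if it satisfies the AS condition. The sketch below uses this rejection scheme rather than a general MCMC sampler; the function name, the vague prior values, and the counts are illustrative assumptions.

```python
import random

def sample_truncated_gamma_posterior(x, n, prior, draws=5000, seed=1, t=None):
    """Draw (lam_E, lam_R, lam_P) from the Gamma posteriors restricted to lam_R > lam_P.

    prior[arm] = (alpha, beta) with beta a rate parameter; x and n hold the
    arm totals and sample sizes. The loop can be slow when the posterior
    probability of lam_R > lam_P is very small.
    """
    rng = random.Random(seed)
    t = t or {a: 1.0 for a in "ERP"}
    shape = {a: prior[a][0] + x[a] for a in "ERP"}
    rate = {a: prior[a][1] + n[a] * t[a] for a in "ERP"}
    out = []
    while len(out) < draws:
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        lam = {a: rng.gammavariate(shape[a], 1.0 / rate[a]) for a in "ERP"}
        if lam["R"] > lam["P"]:            # AS condition
            out.append((lam["E"], lam["R"], lam["P"]))
    return out

# Hypothetical totals and a vague Gamma(0.5, 0.001) prior in every arm.
x = {"E": 140, "R": 150, "P": 50}
n = {"E": 50, "R": 50, "P": 50}
prior = {a: (0.5, 0.001) for a in "ERP"}
post = sample_truncated_gamma_posterior(x, n, prior)
mean_E = sum(s[0] for s in post) / len(post)
print(f"posterior mean of lambda_E ~ {mean_E:.3f}")
```

Each retained triple is an exact draw from the truncated joint posterior, so no burn-in or convergence check is needed for this conjugate case.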

3.1.2. Non-conjugate prior

In this case, prior distributions are assigned to the parameters |$\lambda_{E} $|, |$\lambda_{R}$|, and |$\lambda_{P}$| so that they satisfy the restriction |$0\,{<}\,\lambda_{P}\,{<}\,\lambda_{R}$|. We specify a joint prior on |$\left(\lambda_{R},\lambda_{P}\right)$| by putting a |$ \text{Gamma} $| prior on |$\lambda_{R}$| and a Beta prior on |$\lambda_{P}/\lambda_{R}$|, which ensures |$\lambda_{R}\,{>}\,\lambda_{P}$|. We put an unrestricted prior |$\text{Gamma}\left(\alpha_{E},\beta_{E}\right)$| on |$\lambda_{E}$|. The following transformation is made from |$\left(\lambda_{R},\lambda_{P}\right)$| to |$\left(u_{1},u_{2}\right)$|: |$ u_{1} ={\lambda_{P}}/{\lambda_{R}}\sim \text{Beta}\left(a,b\right),\mbox{ }u_{2}=\lambda_{R}\sim \text{Gamma}(p,r). $| So, we have |$0\,{<}\,u_{1}\,{<}\,1$| (which satisfies the AS condition |$(\lambda_{R}\,{>}\,\lambda_{P})$|) and |$u_{2}\,{>}\,0$|. The joint distribution of |$\left(u_{1},u_{2}\right)$| is given by |$ f\left(u_{1},u_{2}\right) =\text{Beta}\left(a,b\right)\times \text{Gamma}\left(p,r\right)\varpropto u_{1}^{a-1}\left(1-u_{1}\right)^{b-1}\exp{\{-ru_{2}\}}u_{2}^{p-1}, $| which gives the joint distribution of |$\left(\lambda_{R},\lambda_{P}\right)$| as |$ f\left(\lambda_{R},\lambda_{P}\right)\varpropto\frac{1}{\lambda_{R}}\left({\lambda_{P}}/{\lambda_{R}}\right)^{a-1}\left(1-{\lambda_{P}}/{\lambda_{R}}\right)^{b-1}\exp{\{-r\lambda_{R}\}}\lambda_{R}^{p-1}$|, |$0\,{<}\,\lambda_{P}\,{<}\,\lambda_{R}$|, where the factor |$1/\lambda_{R}$| is the Jacobian of the transformation. The joint prior distribution of |$(\lambda_{E},\lambda_{R},\lambda_{P})$| is obtained by multiplying |$f(\lambda_{R},\lambda_{P})$| by |$f(\lambda_{E})\equiv \text{Gamma}\left(\alpha_{E},\beta_{E}\right)$|, which gives

$$f\left(\lambda_{E},\lambda_{R},\lambda_{P}\right)\varpropto\lambda_{E}^{\alpha_{E}-1}\exp{\{-\beta_{E}\lambda_{E}\}}\lambda_{P}^{a-1}\left(\lambda_{R}-\lambda_{P}\right)^{b-1}\exp{\{-r\lambda_{R}\}}\lambda_{R}^{p-a-b}, $$

|$0\,{<}\,\lambda_{E}\,{<}\,\infty,\mbox{ }0\,{<}\,\lambda_{P}\,{<}\,\lambda_{R}\,{<}\,\infty$|. The joint posterior distribution |$f\left(\lambda_{E},\mbox{ }\lambda_{R},\mbox{ }\lambda_{P}|\text{Data}\right)$| is proportional to the product of the joint likelihood and the joint prior:

$$\begin{align*} f\left(\lambda_{E},\mbox{ }\lambda_{R},\mbox{ }\lambda_{P}| \text{Data}\right) & \varpropto \text{Gamma}\left(\lambda_{E} |\alpha_{E}+X_{E},\beta_{E}+n_{E}t_{E}\right)\times \exp{\{-n_{P}\lambda_{P}t_{P}\}}\lambda_{P}^{a+x_{P}-1}\times\\ & \exp{\{-\lambda_{R}\left(r+n_{R}t_{R}\right)\}} \lambda_{R}^{x_{R}+p-a-b}\left(\lambda_{R}-\lambda_{P}\right)^{b-1},\mbox{ }0\,{<}\,\lambda_{P}\,{<}\,\lambda_{R}\,{<}\,\infty,\mbox{ }0\,{<}\,\lambda_{E}\,{<}\,\infty. \end{align*}$$

The posterior is not available in closed form, and Metropolis–Hastings sampling with a suitable proposal density is required to generate posterior samples (Gelman and others, 2014). A convenient proposal density could be a |$ \text{Gamma} $| distribution with appropriately chosen parameters. In our simulation, we use “rjags” (R-package; Plummer and others, 2016) to generate samples from the posterior density.
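
As a rough illustration of how such a sampler might look outside of rjags, the sketch below runs a random-walk Metropolis chain for |$(\lambda_{R},\lambda_{P})$| on the transformed scale |$(\log\lambda_{R}, \text{logit}(\lambda_{P}/\lambda_{R}))$|, which keeps every proposal inside |$0<\lambda_{P}<\lambda_{R}$|. This random-walk proposal is a substitute for the Gamma proposal mentioned above, and the function names, step size, hyper-parameters, and counts are all illustrative assumptions with |$t_{l}=1$|. Since the posterior factorizes, |$\lambda_{E}$| can be drawn directly from |$\text{Gamma}(\alpha_{E}+X_{E},\beta_{E}+n_{E}t_{E})$| and is omitted here.

```python
import math
import random

def log_post_RP(lam_R, lam_P, x_R, x_P, n_R, n_P, a, b, p, r):
    """Unnormalised log posterior of (lambda_R, lambda_P) on 0 < lam_P < lam_R."""
    if not (0.0 < lam_P < lam_R):
        return -math.inf
    return (-n_P * lam_P + (a + x_P - 1) * math.log(lam_P)
            - (r + n_R) * lam_R + (x_R + p - a - b) * math.log(lam_R)
            + (b - 1) * math.log(lam_R - lam_P))

def metropolis_RP(x_R, x_P, n_R, n_P, a, b, p, r,
                  draws=4000, burn=1000, step=0.15, seed=2):
    """Random-walk Metropolis on s = log(lam_R), v = logit(lam_P / lam_R)."""
    rng = random.Random(seed)

    def target(s, v):
        lam_R = math.exp(s)
        u = 1.0 / (1.0 + math.exp(-v))
        # Jacobian of (s, v) -> (lam_R, lam_P) is lam_R^2 * u * (1 - u).
        log_jac = 2.0 * s + math.log(u) + math.log(1.0 - u)
        return log_post_RP(lam_R, lam_R * u, x_R, x_P, n_R, n_P, a, b, p, r) + log_jac

    s, v = math.log((x_R + 1.0) / n_R), 0.0       # crude starting point
    cur = target(s, v)
    samples, accepted = [], 0
    for i in range(draws + burn):
        s_new = s + rng.gauss(0.0, step)
        v_new = v + rng.gauss(0.0, step)
        new = target(s_new, v_new)
        # Symmetric proposal: accept with probability min(1, exp(new - cur)).
        if rng.random() < math.exp(min(0.0, new - cur)):
            s, v, cur = s_new, v_new, new
            accepted += 1
        if i >= burn:
            lam_R = math.exp(s)
            samples.append((lam_R, lam_R / (1.0 + math.exp(-v))))
    return samples, accepted / (draws + burn)

# Hypothetical totals; Beta(1, 1) on the ratio and a vague Gamma(1, 0.001) on lam_R.
chain, acc_rate = metropolis_RP(x_R=150, x_P=50, n_R=50, n_P=50, a=1, b=1, p=1, r=0.001)
mean_R = sum(c[0] for c in chain) / len(chain)
mean_P = sum(c[1] for c in chain) / len(chain)
print(f"acceptance {acc_rate:.2f}, mean lam_R {mean_R:.2f}, mean lam_P {mean_P:.2f}")
```

In practice the step size would be tuned and convergence checked with several chains; neither is done in this sketch.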

Remark 1:

Following Pigeot and others (2003) and Ghosh and others (2011), we continue to assume that the AS condition |$(\lambda_R\,{>}\,\lambda_P)$| is tested in Step 1, before proceeding to test for NI. As a result, truncated priors are chosen in Step 2, that is, at NI testing. This assumption explicitly reflects the fact that the active control still retains some of its effect over placebo. In a situation where this assumption is questionable, it is not advisable to carry out a three-arm NI trial; rather, a superiority trial of the new treatment over placebo is more realistic.

3.1.3. Test procedure

For NI testing, the value of |$\theta$| is chosen so that it is clinically acceptable to claim that an experimental drug is non-inferior to an active control. Usually, |$\theta$| is chosen in the range |$[0.5,1)$|, and NI of the test drug relative to the reference is claimed if the posterior probability of the alternative hypothesis given in (2.2) exceeds a pre-specified cutoff |$p^{*}$|. Thus, following Ghosh and others (2011, Section 3.3), the Bayesian decision rule to claim NI of the test drug relative to the reference is given as

$$\begin{equation}\label{dec} P\left(H_{1}:\frac{\lambda_{E}-\lambda_{P}}{\lambda_{R}-\lambda_{P}}\,{>}\,\theta| \lambda_{R}\,{>}\,\lambda_{P}, \text{Data}\right)\,{>}\,p^{*}. \end{equation}$$

(3.1)

The value of |$p^{*}$| is usually chosen to be 0.975 or 0.95. The above probability can be calculated empirically by generating samples from the posterior distribution of |$\lambda_{l}|X_{l}$|, |$l\in\{E,R,P\}$|. The estimated probability is given by

$$\begin{equation} \hat{P}\left(H_{1}:\frac{\lambda_{E}-\lambda_{P}}{\lambda_{R}-\lambda_{P}}\,{>}\,\theta|\lambda_{R}\,{>}\,\lambda_{P}, \text{Data}\right)\thickapprox\frac{1}{M}\sum_{m=1}^{M}I\left(\frac{\lambda_{E}^{m}-\lambda_{P}^{m}}{\lambda_{R}^{m}-\lambda_{P}^{m}}\,{>}\,\theta|\lambda^m_{R}\,{>}\,\lambda^m_{P}, \text{Data}\right)\!,\label{est_prob} \end{equation}$$

(3.2)

where |$\lambda_{E}^{m},\mbox{ }\lambda_{R}^{m}$|⁠, and |$\lambda_{P}^{m}$| denote the |$m$|th sample value drawn from the posterior distributions.
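
The Monte Carlo estimate in (3.2) and the decision rule in (3.1) amount to counting how often the ratio exceeds |$\theta$| among the posterior draws. A minimal sketch, assuming the draws already satisfy |$\lambda_{R}>\lambda_{P}$|; the function names and the four hand-made draws are hypothetical and serve only to show the arithmetic.

```python
def posterior_prob_ni(samples, theta):
    """Share of draws (lam_E, lam_R, lam_P) with (lam_E - lam_P) / (lam_R - lam_P) > theta."""
    hits = sum((e - p) / (r - p) > theta for e, r, p in samples)
    return hits / len(samples)

def claim_ni(samples, theta, p_star=0.975):
    """Bayesian decision rule: claim NI when the estimated probability exceeds p_star."""
    return posterior_prob_ni(samples, theta) > p_star

# Four hypothetical posterior draws, each with lam_R > lam_P.
draws = [(2.9, 3.0, 1.0), (2.5, 3.1, 0.9), (2.7, 2.9, 1.1), (3.0, 3.2, 1.0)]
prob = posterior_prob_ni(draws, theta=0.8)
print(prob, claim_ni(draws, theta=0.8))
```

Here the ratios are 0.95, about 0.73, about 0.89, and about 0.91, so three of the four draws exceed |$\theta=0.8$| and the estimate is 0.75, which is below |$p^{*}=0.975$|.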

3.2. Approximate Bayesian approach

Note that in the exact Bayesian approach, posterior samples must be generated to carry out the Bayesian inference, which is often computationally intensive. Here, we propose an approximation-based Bayesian approach for NI testing that incorporates the AS condition and gives a closed form for the posterior probability, and hence saves the computation time of generating MCMC samples from the posterior distribution. Consider the |$ \text{Gamma} $| prior for the rate |$\lambda_l$| in each arm, that is, |$\lambda_l\sim \text{Gamma}\left(\alpha_l,\beta_l\right)$|, and assume that the responses are distributed as Poisson; that is, |$X_l\sim \text{Poisson}\left(n_l\lambda_l t_l\right)$| for |$l\in\{E,R,P\}$|. The Frequentist test statistic for testing the hypothesis in equation (2.3) is given by |$ T={X_{E}}/{(n_{E}t_{E})}-\theta{X_{R}}/{(n_{R}t_{R})}-(1-\theta){X_{P}}/{(n_{P}t_{P})}. $| For the sake of simplicity, we take |$t_{l}=1$|, |$l\in \{E,R,P\}$|. Under the asymptotic normality assumption, we have |$T|\mu_{T}\sim AN\left(\mu_{T},\sigma_{T}^{2}\right)$|, where |$ \mu_{T}=\lambda_{E}-\theta\lambda_{R}-(1-\theta)\lambda_{P}=(\lambda_{E}-\lambda_{P})-\theta(\lambda_{R}-\lambda_{P}) $| and |$ \sigma_{T}^{2}={\lambda_{E}}/{n_{E}}+\theta^{2}{\lambda_{R}}/{n_{R}}+\left(1-\theta\right)^{2}{\lambda_{P}}/{n_{P}}. $| Putting a Normal prior on |$\mu_{T}$|, we have |$\mu_{T}\sim AN\left(\mu^{*},\sigma^{*2}\right)$|, where |$\mu^{*}=E\left(\mu_{T}\right)=\mu_{E}-\theta\mu_{R}-\left(1-\theta\right)\mu_{P}$| and |$\sigma^{*2}=\sigma_{E}^{2}+\theta^{2}\sigma_{R}^{2}+(1-\theta)^{2}\sigma_{P}^{2}$|; in this subsection, |$\mu_{l}$| and |$\sigma_{l}^{2}$|, |$l\in\{E,R,P\}$|, are the respective mean and variance of the |$ \text{Gamma} $| prior for the Poisson rates. Next, we bring in the condition of AS |$(\lambda_{R}\,{>}\,\lambda_{P})$|. So, instead of taking a prior on |$\mu_{T}$|, we take a prior on |$\nu_{T}\equiv\left(\mu_{T}|\lambda_{R}\,{>}\,\lambda_{P}\right)$|.
Assume that |$\nu_T\sim AN\left(\mu_{\nu}^{*},\sigma_{\nu}^{*2}\right)$| and that the posterior is |$\nu_T|\text{Data} \sim AN\left(\widetilde{\mu}_{T}\widetilde{\sigma}_{T}^{2},\widetilde{\sigma}_{T}^{2}\right)$|. We refer to Arnold and Beaver (1993) for the detailed derivation of |$\mu_{\nu}^{*}$|, |$\sigma_{\nu}^{*2}$|, |$\widetilde{\mu}_{T}$|, and |$\widetilde{\sigma}_{T}^{2}$| (see also Supplementary Appendix C available at Biostatistics online). The Bayesian decision rule for the experimental treatment to be non-inferior to the active comparator is given by (Gamalo and others, 2014) |$P\left(\nu_{T}\geq0|\text{Data}\right)\,{>}\, p^{*}, $| where |$p^{*}$| is the pre-specified clinically reasonable constant.
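
The AS-conditioned prior and posterior moments are derived in the cited references and are not reproduced here. As a simpler stand-in, the sketch below performs the standard normal-normal update for |$\mu_{T}$| without the AS truncation, with |$\sigma_{T}^{2}$| replaced by its ML plug-in estimate and |$t_{l}=1$|. It is therefore only an approximation to the rule above, and the function name, prior values, and counts are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def approx_posterior_prob(x, n, prior, theta):
    """P(mu_T >= 0 | Data) from a normal-normal update, ignoring the AS truncation.

    prior[arm] = (alpha, beta) of the Gamma prior, with mean alpha / beta and
    variance alpha / beta**2.
    """
    w = {"E": 1.0, "R": -theta, "P": -(1.0 - theta)}
    rate = {a: x[a] / n[a] for a in "ERP"}
    T = sum(w[a] * rate[a] for a in "ERP")
    var_T = sum(w[a] ** 2 * rate[a] / n[a] for a in "ERP")      # plug-in sigma_T^2
    mu_star = sum(w[a] * prior[a][0] / prior[a][1] for a in "ERP")
    var_star = sum(w[a] ** 2 * prior[a][0] / prior[a][1] ** 2 for a in "ERP")
    # Precision-weighted combination of the data and the prior.
    post_var = 1.0 / (1.0 / var_T + 1.0 / var_star)
    post_mean = post_var * (T / var_T + mu_star / var_star)
    return 1.0 - NormalDist(post_mean, sqrt(post_var)).cdf(0.0)

# Hypothetical totals with a diffuse Gamma(0.01, 0.01) prior in every arm.
x = {"E": 140, "R": 150, "P": 50}
n = {"E": 50, "R": 50, "P": 50}
prior = {a: (0.01, 0.01) for a in "ERP"}
prob = approx_posterior_prob(x, n, prior, theta=0.8)
print(f"approximate posterior probability: {prob:.3f}")
```

With such a diffuse prior the result is close to |$\Phi(T/\hat{\sigma}_{T})$|, as one would expect when the data dominate the prior.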

4. Power and sample size determination

We address the problem of calculating the sample size needed to attain a desired power for the assessment of NI using the three approaches described in Sections 2 and 3. The normal approximation-based approaches do not require any simulation to estimate the power function, as it can be expressed in closed form (as presented in the following subsections). However, the exact Bayesian approach requires simulation to obtain the empirical power, which is then set to a desired level to calculate the corresponding sample size. In our sample size calculation, we consider |$t_{l}=1$|, |$l\in\{E,R,P\}$|. We want to determine the sample sizes |$n_{l}$|, |$l\in\{E,R,P\}$|, setting the power at |$\left(1-\beta\right)$|, where |$\beta$| is the pre-specified type-II error rate. We assume |$n_{E}=n$|, |$n_{R}=r_{1}n$|, and |$n_{P}=r_{2}n$|, where |$r_{1}$| and |$r_{2}$| determine the allocation of the sample sizes in the reference and the placebo arms, respectively, relative to the experimental arm. The total sample size in that case is |$N=n\left(1+r_{1}+r_{2}\right)$|. In the following subsections, we discuss the power and sample size calculation under the proposed Frequentist, approximation-based Bayesian, and exact Bayesian approaches.

4.1. Frequentist approach

To obtain the empirical power function of the NI testing in (2.3) using the test procedure described in Section 2.2, we fix |$\lambda_{R}$|⁠, |$\lambda_{P}$|⁠, and |$\theta$| and vary |$\lambda_{E}$| such that the ratio |${(\lambda_{E}-\lambda_{P})}/{(\lambda_{R}-\lambda_{P})} $||$\in\left[0.5,\mbox{ }1.5\right]$|⁠. The ratio |${(\lambda_{E}-\lambda_{P})}/{(\lambda_{R}-\lambda_{P})}$| is so chosen that for NI testing under |$H_0$| it equals |$\theta \in [0.5,1)$| and exceeds |$\theta$| under |$H_1$|⁠. Under the null hypothesis, denote |$\lambda_{E}$| by |$\lambda_{E}^{\rm null}$| which satisfies |${(\lambda_{E}^{\rm null}-\lambda_{P})}=\theta{(\lambda_{R}-\lambda_{P})}$| and under |$H_{1}$|⁠, denote |$\lambda_{E}$| by |$\lambda_{E}^{\rm alt}$| which satisfies |$(\lambda_{E}^{\rm alt}-\lambda_{P})\,{>}\,\theta(\lambda_{R}-\lambda_{P})$|⁠. The empirical type-I error is obtained for |$\lambda_{E}=\lambda_{E}^{\rm null}$| and the power is obtained for |$\lambda_{E}=\lambda_{E}^{\rm alt}$|⁠. The sample size is calculated from the following equation, to achieve a power of at least |$100(1-\beta)\%$|

$$\begin{align} P_{H_{1}}\left(W\,{>}\,k^{*}_{\alpha}\right) & \geq1-\beta \Rightarrow\Phi\left(\frac{k^{*}_{\alpha}-\mu_{w}^{\rm alt}}{\sigma_{w}^{\rm alt}}\right)\leq\beta.\label{freqpower} \end{align}$$

(4.1)

Setting |$\beta$| at |$20\%$|⁠, that is, to have at least |$80\%$| power and at fixed |$\alpha(=0.025)$|⁠, |$n$| is determined from (4.1). We vary |$\lambda_{E}^{\rm alt}$| to get the minimum sample size satisfying at least |$80\%$| power for each |$\lambda_{E}^{\rm alt}$|⁠.
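
The sample size search itself is a one-dimensional scan over |$n$|. The sketch below takes any power function of |$n$| and returns the smallest |$n$| that reaches the target. To keep the example short, it plugs in the normal-approximation power of the unconditional Wald test as a stand-in; the conditional moments |$\mu_{w}$| and |$\sigma_{w}$| would replace `moments` for the proposed test. All names and inputs are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()

def unconditional_power(n, lam_E_alt, lam_R, lam_P, theta, r1=1.0, r2=1.0, alpha=0.025):
    """Normal-approximation power of the unconditional Wald test (stand-in for (4.1))."""
    def moments(lam_E):
        mu = lam_E - theta * lam_R - (1 - theta) * lam_P
        var = lam_E / n + theta**2 * lam_R / (r1 * n) + (1 - theta) ** 2 * lam_P / (r2 * n)
        return mu, sqrt(var)
    lam_E_null = lam_P + theta * (lam_R - lam_P)
    mu0, sd0 = moments(lam_E_null)
    mu1, sd1 = moments(lam_E_alt)
    k = mu0 + STD.inv_cdf(1 - alpha) * sd0          # critical value under H0
    return 1 - STD.cdf((k - mu1) / sd1)

def smallest_n(power_fn, target=0.80, n_max=100_000, **kwargs):
    """Smallest experimental-arm size n with power_fn(n, ...) >= target (linear search)."""
    for n in range(2, n_max + 1):
        if power_fn(n, **kwargs) >= target:
            return n
    raise ValueError("target power not reached within n_max")

# Hypothetical design: lambda_R = 3, lambda_P = 1, theta = 0.8, lambda_E_alt = 3.
design = dict(lam_E_alt=3.0, lam_R=3.0, lam_P=1.0, theta=0.8)
n_min = smallest_n(unconditional_power, **design)
total_N = n_min * (1 + 1 + 1)                       # N = n(1 + r1 + r2) with r1 = r2 = 1
print(f"n per arm = {n_min}, total N = {total_N}")
```

A linear scan is adequate here because the power is increasing in |$n$| for a fixed alternative; a bisection search would serve equally well for larger designs.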

4.2. Exact Bayesian approach

The Bayesian decision rule to declare NI as given in (3.1) can be written as:

$$\begin{equation} \begin{aligned}P\left(\lambda_{E}-\theta\lambda_{R}-(1-\theta)\lambda_{P}\,{>}\,0|\lambda_{R}\,{>}\,\lambda_{P},\mbox{ }\text{Data}\right)\,{>}\,p^{*}\end{aligned} ,\label{rule} \end{equation}$$

(4.2)

Define |$\eta_{RP}=\lambda_{R}-\lambda_{P}$|⁠. Since the probability in (4.2) does not have a closed form, it is approximated by generating samples from the posterior distribution and estimating the probability as

$$\begin{equation}\label{estimatedprob} \begin{aligned} P\left(\lambda_{E}-\theta\lambda_{R}-(1-\theta)\lambda_{P}\,{>}\,0|\lambda_{R}\,{>}\,\lambda_{P},\text{Data}\right) = & P\left(\left(\lambda_{E}-\lambda_{P}\right)\,{>}\,\theta\left(\lambda_{R}-\lambda_{P}\right)|\lambda_{R}\,{>}\,\lambda_{P},\right. \left. \text{Data}\right)\\ = \int_{0}^{\infty}P\left(\lambda_{E}-\lambda_{P}\,{>}\,\theta c|\eta_{RP}=c,\text{Data}\right)f_{\eta_{RP}|\eta_{RP}\,{>}\,0}\left(c\right)dc \approx & \frac{1}{M}\sum_{i=1}^{M}g(\theta c_{i},\text{Data}), \end{aligned} \end{equation}$$

(4.3)

where |$g\left(\theta c_{i},\text{Data}\right) = P\left(\lambda_{E}-\lambda_{P} \,{>}\, \theta c_{i} | \eta_{RP} = c_{i},\text{Data}\right)$| and |$c_{i}$| is the |$i$|th sample value of |$(\lambda_{R}-\lambda_{P}\mid\lambda_{R} \,{>}\, \lambda_{P})$|. We repeat the calculation of the estimated probability given in the left-hand side of equation (4.2) |$n^{*}$| times and obtain the proportion of times it exceeds the cutoff |$p^{*}$|. In simulation, the value of |$n^{*}$| is usually chosen to be 1000. As in the Frequentist approach, we keep |$\lambda_{R},\mbox{ }\lambda_{P}$|, and |$\theta$| fixed and vary |$\lambda_{E}$| such that |${(\lambda_{E}-\lambda_{P})}/{(\lambda_{R}-\lambda_{P})}$| varies within the range |$[0.5,1.5]$|. For |$\lambda_{E}^{\rm alt}\,{>}\,\lambda_{E}^{\rm null}$|, the estimated power of the test can be calculated as

$$\hat{\rm Power}=\frac{\mbox{No. of times }P\left(\lambda_{E}^{\rm alt}-\theta\lambda_{R}-(1-\theta)\lambda_{P}\,{>}\,0|\lambda_{R}\,{>}\,\lambda_{P}, \text{Data}\right)\,{>}\,p^{*} }{n^{*}}.$$

The sample size can be obtained by setting the estimated power to be at least |$100(1-\beta)\%$|, where |$\beta$| is usually chosen to be |$0.2$|. We note here that, since the estimation of power under the exact Bayesian approach involves generating samples from posterior distributions, there could be minor fluctuations in the estimated sample size.

4.3. Approximate Bayesian approach

For sample size determination under the approximation-based Bayesian approach, we choose |$n$| to satisfy the two conditions (Gamalo and others, 2014): (C1) |$P\left[P\left(\nu_{T}\geq0|\text{Data}\right)\,{>}\, p^{*}|H_{0}\right]\leq\alpha$|, (C2) |$P\left[P\left(\nu_{T}\geq0|\text{Data}\right)\,{>}\, p^{*}|H_{1}\right]\geq1-\beta$|, where the probability in (C1) is the estimated average type-I error while that in (C2) is the estimated power of the test, |$\beta$| being the type-II error. The sample size is determined from condition (C2) by fixing |$\beta$| to have at least |$100\left(1-\beta\right)\%$| power and simultaneously satisfying condition (C1). As in the Frequentist approach, we choose |$\alpha=0.025$|. We note that |$ P\left(\nu_{T}\geq0|\text{Data}\right) =P\left({(\nu_{T}-\widetilde{\sigma}_{T}^{2}\widetilde{\mu}_{T})}/{\widetilde{\sigma}_{T}}\geq{-\widetilde{\sigma}_{T}^{2}\widetilde{\mu}_{T}}/{\widetilde{\sigma}_{T}}\right)\,{>}\, p^{*} \Leftrightarrow-\widetilde{\sigma}_{T}\widetilde{\mu}_{T}\leq z_{1-p^{*}} \Leftrightarrow T\geq-z_{1-p^{*}}\left({1}/{\sigma_{T}^{2}}+{1}/{\sigma_{\nu}^{*2}}\right)^{1/2}\sigma_{T}^{2}-{\mu_{\nu}^{*}}/{\sigma_{\nu}^{*2}}\sigma_{T}^{2}, $| where |$\widetilde{\mu}_{T}={T}/{\sigma_{T}^{2}}+{\mu_{\nu}^{*}}/{\sigma_{\nu}^{*2}}\mbox{ and }\widetilde{\sigma}_{T}^{2}={1}/{({1}/{\sigma_{T}^{2}}+1/{\sigma_{\nu}^{*2}})}$| (see Supplementary Appendix C available at Biostatistics online). Here, |$z_{1-p^{*}}$| is the |$100\left(1-p^{*}\right)$|th percentile of the |$N\left(0,1\right)$| distribution. Now, the power function is obtained by varying |$\lambda_{E}$| such that |$0.5\leq {(\lambda_{E}-\lambda_{P})}/{(\lambda_{R}-\lambda_{P})}\leq1.5$|, keeping the other rates and |$\theta$| fixed.
Let us denote |$\mu_{T}$| and |$\sigma_{T}^{2}$| by |$\mu_{T}^{\rm null}$| and |$\sigma_{T}^{2\rm null}$|⁠, respectively, under |$H_{0}$|⁠, and similarly under |$H_{1}$|⁠, denote the respective quantities by |$\mu_{T}^{\rm alt}$| and |$\sigma_{T}^{2\rm alt}$|⁠. Thus condition (C1) can be rewritten in terms of |$T$| as

$$\begin{align} P_{H_{0}}\left[T\,{>}\,-z_{1-p^{*}}\left(\frac{1}{\sigma_{T}^{2\rm null}}+\frac{1}{\sigma_{\nu}^{*2}}\right)^{1/2}\sigma_{T}^{2\rm null}-\frac{\mu_{\nu}^{*}}{\sigma_{\nu}^{*2}}\sigma_{T}^{2\rm null}\right] & \leq\alpha\nonumber\\ \Leftrightarrow P_{H_{0}}\left[\frac{T-\mu_{T}^{\rm null}}{\sigma_{T}^{\rm null}}\,{>}\,\left(-z_{1-p^{*}}\left(\frac{1}{\sigma_{T}^{2\rm null}}+\frac{1}{\sigma_{\nu}^{*2}}\right)^{1/2}\sigma_{T}^{2\rm null}-\frac{\mu_{\nu}^{*}}{\sigma_{\nu}^{*2}}\sigma_{T}^{2\rm null}-\mu_{T}^{\rm null}\right)\bigg/\sigma_{T}^{\rm null}\right] & \leq\alpha\label{c1}\\ \Leftrightarrow\varPhi\left(z_{1-p^{*}}\left(\frac{1}{\sigma_{T}^{2\rm null}}+\frac{1}{\sigma_{\nu}^{*2}}\right)^{1/2}\sigma_{T}^{\rm null}+\frac{\mu_{\nu}^{*}}{\sigma_{\nu}^{*2}}\sigma_{T}^{\rm null}+\frac{\mu_{T}^{\rm null}}{\sigma_{T}^{\rm null}}\right) & \leq\alpha.\nonumber \end{align}$$

(4.4)

Similarly, condition (C2) becomes

$$\begin{equation} \varPhi\left(z_{1-p^{*}}\left(\frac{1}{\sigma_{T}^{2\rm alt}}+\frac{1}{\sigma_{\nu}^{*2}}\right)^{1/2}\sigma_{T}^{\rm alt}+\frac{\mu_{\nu}^{*}}{\sigma_{\nu}^{*2}}\sigma_{T}^{\rm alt}+\frac{\mu_{T}^{\rm alt}}{\sigma_{T}^{\rm alt}}\right)\geq1-\beta.\label{c2} \end{equation}$$

(4.5)

A similar derivation albeit for two-arm NI trial for binary outcome can be found in Gamalo and others (2014). Now, “|$n$|” can be solved from (4.5) by setting |$\beta=20\%$| and simultaneously satisfying condition (C1) for each |$\lambda_{E}^{\rm alt}$| (which is included in |$\mu_{T}^{\rm alt}$|⁠).
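The closed-form expressions in (4.4) and (4.5) share one functional form and are easy to evaluate numerically. The sketch below is illustrative only: it assumes that |$T$| is the plug-in estimate of |$\lambda_{E}-\theta\lambda_{R}-(1-\theta)\lambda_{P}$| with the corresponding Poisson variance (Section 3.2 defines the actual statistic), and the diffuse prior |$N(0,10^{6})$| on |$\nu_{T}$| as well as the helper names `approx_bayes_power` and `t_moments` are hypothetical choices.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def approx_bayes_power(mu_t, sigma_t, prior_mean, prior_var, p_star=0.975):
    """Left-hand side of (4.4)/(4.5) for given mean and sd of T."""
    # z is the 100(1 - p*)th percentile of N(0, 1); negative when p* > 0.5.
    z = STD_NORMAL.inv_cdf(1 - p_star)
    arg = (z * sqrt(1 / sigma_t ** 2 + 1 / prior_var) * sigma_t
           + prior_mean / prior_var * sigma_t
           + mu_t / sigma_t)
    return STD_NORMAL.cdf(arg)


def t_moments(n, lam_e, lam_r, lam_p, theta, r1=1.0, r2=1.0):
    """Assumed mean and sd of T when n_E = n, n_R = r1 * n, n_P = r2 * n."""
    mu_t = lam_e - theta * lam_r - (1 - theta) * lam_p
    var_t = (lam_e + theta ** 2 * lam_r / r1 + (1 - theta) ** 2 * lam_p / r2) / n
    return mu_t, sqrt(var_t)


# Null boundary: lambda_E = theta * lambda_R + (1 - theta) * lambda_P = 18.2.
mu_null, sd_null = t_moments(100, 18.2, 21.0, 7.0, theta=0.8)
mu_alt, sd_alt = t_moments(100, 20.0, 21.0, 7.0, theta=0.8)

type1 = approx_bayes_power(mu_null, sd_null, prior_mean=0.0, prior_var=1e6)
power = approx_bayes_power(mu_alt, sd_alt, prior_mean=0.0, prior_var=1e6)
print(round(type1, 4), round(power, 4))
```

With a very diffuse prior the prior terms become negligible and the expression reduces to a Frequentist-type normal calculation, which is why the value at the null boundary stays close to |$1-p^{*}$|.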

4.4. Sample size under different allocation

We determine the approximate sample size needed to attain a power of |$1-\beta=0.8$| under three different allocation scenarios for |$\left(E,\mbox{}R,\mbox{}P\right)$|: |$\left(1{:}1{:}1\right)$|, |$\left(2{:}2{:}1\right)$|, and |$\left(3{:}2{:}1\right)$| of the total sample size |$N(=n(1+r_{1}+r_{2}))$|. We express the sample sizes in the reference and the placebo groups as proportions |$r_{1}$| and |$r_{2}$| of the sample size |$n_{E}$| in the experimental group. Hence, for the allocation |$\left(1{:}1{:}1\right)$|, |$r_{1}=r_{2}=1$|; for |$\left(2{:}2{:}1\right)$|, |$r_{1}=1\mbox{ and }r_{2}=\frac{1}{2}$|; and for |$\left(3{:}2{:}1\right)$|, |$r_{1}=\frac{2}{3}$| and |$r_{2}=\frac{1}{3}$|. The type-I error |$\alpha=0.025$| is kept fixed for the Frequentist approach, while for the Bayesian approach we also ensured that equations (4.4) and (4.5) hold simultaneously for fixed |$(\alpha, \beta )$|. In practice, |$\theta$| is chosen in |$[0.5,1)$| to ensure retention of at least |$50\%$| of the effect of the active control. The sample sizes are presented in Table 2 for |$\theta=0.8$| and |$0.75$| and for a range of |$\lambda_{E}$|, keeping |$\lambda_{R}=21$| and |$\lambda_{P}=7$|. Other values of the |$\lambda$|’s are also possible, provided they satisfy the restriction |$\lambda_{R}\,{>}\,\lambda_{P}$|.

Table 2.

Sample sizes based on exact and approximate approaches to achieve a power of |$80\%$| for |$\theta=0.8$| and |$0.75$|⁠, |$\alpha=0.025$| and keeping |$\lambda_R=21$| and |$\lambda_P=7$| under three different allocations. The simulated power (⁠|$ \hat{\phi} $|⁠) and estimated average type-I error (⁠|$ \hat{\alpha} $|⁠) for exact Bayesian approach under non-informative Gamma prior are also reported to show that calculated sample size is adequate to guarantee |$80\%$| power except for minor numerical fluctuation. Note, Frequentist type-I error is always strictly maintained at |$\alpha= 0.025$| by equation 4.1.

					Frequentist normal			Approximate Bayesian			Exact Bayesian
\|$E$\|	\|$R$\|	\|$P$\|	\|$\theta$\|	\|$\lambda_{E}$\|	\|$n_{P}$\|	\|$N$\|	\|$ \hat{\phi} $\|	\|$n_{P}$\|	\|$N$\|	\|$ \hat{\phi} $\|	\|$n_{P}$\|	\|$N$\|	\|$ \hat{\phi} $\|	\|$ \hat{\alpha} $\|
1	1	1	0.80	20.0	79	237	0.802	78	234	0.801	79	237	0.802	0.0215
1	1	1	0.80	19.7	113	339	0.802	112	336	0.798	112	336	0.808	0.0222
1	1	1	0.80	19.4	176	528	0.790	175	525	0.795	175	525	0.796	0.0225
1	1	1	0.80	19.1	312	936	0.795	310	930	0.797	302	906	0.789	0.0229
1	1	1	0.80	18.8	700	2100	0.803	697	2091	0.799	685	2055	0.802	0.0224
1	1	1	0.75	20.0	39	117	0.810	38	114	0.813	38	114	0.807	0.0173
1	1	1	0.75	19.7	50	150	0.806	48	144	0.798	48	144	0.786	0.0208
1	1	1	0.75	19.4	66	198	0.805	65	195	0.804	65	195	0.804	0.0217
1	1	1	0.75	19.1	93	279	0.803	91	273	0.804	88	264	0.790	0.0185
1	1	1	0.75	18.8	140	420	0.801	138	414	0.806	133	399	0.787	0.0184
2	2	1	0.80	20.0	40	200	0.805	40	200	0.808	37	185	0.794	0.0219
2	2	1	0.80	19.7	57	285	0.801	57	285	0.803	52	260	0.798	0.0208
2	2	1	0.80	19.4	89	445	0.799	89	445	0.802	81	405	0.783	0.0217
2	2	1	0.80	19.1	158	790	0.798	157	785	0.800	153	765	0.805	0.0193
2	2	1	0.80	18.8	353	1765	0.804	352	1760	0.802	351	1755	0.797	0.0189
2	2	1	0.75	20.0	20	100	0.813	19	95	0.800	18	90	0.821	0.0210
2	2	1	0.75	19.7	26	130	0.816	25	125	0.809	24	120	0.811	0.0201
2	2	1	0.75	19.4	34	170	0.808	33	165	0.801	33	165	0.815	0.0181
2	2	1	0.75	19.1	48	240	0.813	46	230	0.801	45	225	0.831	0.0181
2	2	1	0.75	18.8	72	360	0.808	70	350	0.804	64	320	0.810	0.0180
3	2	1	0.80	20.0	33	198	0.819	32	192	0.813	31	186	0.795	0.0216
3	2	1	0.80	19.7	47	282	0.804	46	276	0.799	44	264	0.805	0.0210
3	2	1	0.80	19.4	72	432	0.798	71	426	0.796	71	426	0.786	0.0199
3	2	1	0.80	19.1	128	768	0.803	127	762	0.799	125	750	0.795	0.0205
3	2	1	0.80	18.8	287	1722	0.800	284	1704	0.797	277	1662	0.782	0.0209
3	2	1	0.75	20.0	16	96	0.814	15	90	0.799	15	90	0.802	0.0225
3	2	1	0.75	19.7	21	126	0.821	20	120	0.813	18	108	0.787	0.0201
3	2	1	0.75	19.4	27	162	0.799	27	162	0.809	26	156	0.802	0.0216
3	2	1	0.75	19.1	38	228	0.807	37	222	0.804	37	222	0.796	0.0193
3	2	1	0.75	18.8	58	348	0.807	56	336	0.800	54	324	0.811	0.0188

We present the sample size for the placebo group |$\left(n_{P}\right)$|⁠. The sample sizes |$n_{R}$| and |$n_{E}$| for the arms |$R$| and |$E$| can be obtained from the allocation ratios. The total sample size for (1:1:1) is |$N=3n_{P}^{(1)}$|⁠, that for (2:2:1) is |$N=5n_{P}^{(2)}$|⁠, while for (3:2:1) it is |$N=6n_{P}^{(3)}$|⁠, where |$n_{P}^{(1)},$||$n_{P}^{(2)}$|⁠, and |$n_{P}^{(3)}$| are the respective sample sizes for the placebo group under the three allocations. Although appealing at first glance, one may not want to use a balanced study design because of two aspects: (i) due to ethical reasons in case an effective treatment exists, the number of patients receiving the placebo should be kept as small as possible and (ii) as pointed out by Koch and Tangen (1999), the difference between |$E$| and |$R$| should be expected to be much smaller than their respective difference from placebo so that the latter are easier to detect. From Table 2, we note that the necessary sample size is remarkably smaller for the unbalanced allocation (2:2:1) as compared to a balanced design and a minor reduction is again obtained for the unbalanced allocation (3:2:1) as compared to (2:2:1). Some additional results on this are also provided in Supplementary Appendix available at Biostatistics online.
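The relation between the placebo-arm size reported in Table 2 and the remaining arm sizes is simple arithmetic, sketched below. The helper names `allocation_ratios` and `arm_sizes` are hypothetical, and the sketch assumes the placebo share of the allocation divides |$n_{P}$| exactly, which holds for the three allocations considered here.

```python
from fractions import Fraction


def allocation_ratios(allocation):
    """Return (r1, r2) = (n_R / n_E, n_P / n_E) for an (E, R, P) allocation."""
    e, r, p = allocation
    return Fraction(r, e), Fraction(p, e)


def arm_sizes(allocation, n_placebo):
    """Return (n_E, n_R, n_P, N) implied by the allocation and the placebo size."""
    e, r, p = allocation
    if n_placebo % p != 0:
        raise ValueError("placebo size must be a multiple of its allocation share")
    unit = n_placebo // p  # number of patients per allocation unit
    n_e, n_r = e * unit, r * unit
    return n_e, n_r, n_placebo, n_e + n_r + n_placebo


print(allocation_ratios((3, 2, 1)))  # (Fraction(2, 3), Fraction(1, 3))
print(arm_sizes((2, 2, 1), 40))      # (80, 80, 40, 200)
print(arm_sizes((3, 2, 1), 33))      # (99, 66, 33, 198)
```

The two example calls use placebo sizes taken from the |$\theta=0.8$|, |$\lambda_{E}=20.0$| rows of Table 2 under the Frequentist normal approach, and reproduce the totals |$N=200$| and |$N=198$| listed there.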

4.5. Sample size for marginal vs. conditional Frequentist approach

To compare the existing marginal Frequentist approach with the proposed conditional Frequentist approach, we present the sample sizes under both approaches in Table 3. For simplicity, we consider only equal allocation to the three treatment arms. We determine the sample size under the two approaches for |$\theta\in\{0.9, 0.8\}$| with (|$\lambda_R=21$|, |$\lambda_P= 7$|), (|$\lambda_R=18$|, |$\lambda_P= 17.5$|), and (|$\lambda_R=7.5$|, |$\lambda_P= 7$|). From Table 3, we observe that for |$\lambda_R=21$| and |$\lambda_P=7$| the sample size under the conditional approach is identical to that calculated under the marginal approach, while for |$\lambda_R= 18$| and |$\lambda_P= 17.5$| or |$\lambda_R= 7.5$| and |$\lambda_P= 7$|, the sample size needed to achieve a power of |$80 \%$| is smaller under the conditional approach than under the marginal approach. This observation indicates that for a smaller difference between |$\lambda_R$| and |$\lambda_P$|, the proposed conditional approach is more powerful than the existing marginal approach, while for a larger difference the two approaches behave similarly. This is in line with the theoretical result proven in Lemma 2.2.2.

Table 3.

Sample size for marginal vs. conditional Frequentist approach

	\|$ (\lambda_R = 21, \lambda_P = 7)$\|					\|$ (\lambda_R = 18, \lambda_P = 17.5)$\|					\|$ (\lambda_R = 7.5, \lambda_P = 7)$\|
		Marginal		Conditional			Marginal		Conditional			Marginal		Conditional
\|$\theta$\|	\|$\lambda_E$\|	\|$n_{P}$\|	\|$N$\|	\|$n_{P}$\|	\|$N$\|	\|$\lambda_E$\|	\|$n_{P}$\|	\|$N$\|	\|$n_{P}$\|	\|$N$\|	\|$\lambda_E$\|	\|$n_{P}$\|	\|$N$\|	\|$n_{P}$\|	\|$N$\|
0.9	23.0	26	78	26	78	20.3	48	144	44	132	10.0	18	54	16	48
0.9	22.7	31	93	31	93	20.0	63	189	57	171	9.7	23	69	21	63
0.9	22.4	38	114	38	114	19.7	86	258	79	237	9.4	30	90	27	81
0.9	22.1	47	141	47	141	19.4	124	372	115	345	9.1	41	123	38	114
0.9	21.8	61	183	61	183	19.1	197	591	185	555	8.8	61	183	57	171
0.9	21.5	81	243	81	243	18.8	359	1077	345	1035	8.5	100	300	91	273
0.8	23.0	12	36	12	36	20.3	43	129	40	120	10.0	16	48	15	45
0.8	22.7	13	39	13	39	20.0	55	165	52	156	9.7	20	60	19	57
0.8	22.4	15	45	15	45	19.7	75	225	71	213	9.4	26	78	25	75
0.8	22.1	18	54	18	54	19.4	107	321	102	306	9.1	36	108	34	102
0.8	21.8	20	60	20	60	19.1	167	501	160	480	8.8	52	156	50	150
0.8	21.5	24	72	24	72	18.8	295	885	287	861	8.5	84	252	80	240

5. Simulation studies

We present a few simulation studies to evaluate the performance of the Frequentist as well as the Bayesian procedures presented above. We generate the power curves for different values of |$\theta$| under both the conjugate and non-conjugate priors, and compare the informative and relatively non-informative |$ \text{Gamma} $| priors under the conjugate set up. We consider a randomized trial with the sample size allocation ratio |$n_E{:}n_R{:}n_P =$| 1:1:1. Unequal sample size allocation is also possible and is shown in Table 2 from the sample size perspective. For brevity, however, only equal allocation is described in detail for the current power comparisons.

5.1. Simulation steps

The following simulation steps are used to calculate the type-I error and power for the two different prior scenarios described earlier: (i) a conjugate |$ \text{Gamma} $| prior and (ii) a non-conjugate prior. For the conjugate prior setting, we choose two sets of hyper-parameters, one of which is relatively informative with respect to the other. Note that the priors are chosen so that the mean of the |$ \text{Gamma} $| distribution equals the Poisson rate, with a smaller variance for the informative priors than for the non-informative ones. For the non-conjugate prior, we put a non-informative |$ \text{Gamma} $| prior on |$\lambda_{E}$| and choose suitable values for the Beta and |$ \text{Gamma} $| hyper-parameters. In the following, we give the formal steps of the simulation:

Step 1. Specify |$n_{E}, n_{R}, n_{P}$| (or, the allocation ratios), |$\lambda_{l},$||$l\in\{E,R,P\}$| with |$\lambda_{R}\,{>}\,\lambda_{P}$|⁠, and |$\theta$| so that |$\lambda_{E}\in$||$\left[\lambda_{P}+0.5(\lambda_{R}- \lambda_{P}),\right.$||$\left.\lambda_{P}+1.5(\lambda_{R}-\lambda_{P})\right]$| to generate |$\{X_{E}, X_{R}, X_{P}\} = \text{Data}$|⁠.
Step 2. For a given value of |${(\lambda_{E}-\lambda_{P})}/{(\lambda_{R}-\lambda_{P})}$| or equivalently |$\lambda_{E}$|⁠, generate the data |$X_{l}$| from Poisson distribution |$\text{Poisson}\left(n_{l}\lambda_{l}\right)$|⁠, |$l\in\{E,R,P\}$|⁠.
Step 3. Generate |$M$| samples from the posterior distribution under the two priors given in Section 3.1. For the conjugate prior, we keep only those posterior values in the sample for which |$\lambda_{R}\,{>}\,\lambda_{P}.$| For the non-conjugate prior, the posterior sample values satisfy |$\lambda_{R}\,{>}\,\lambda_{P}$| automatically because of the in-built restriction. For the |$m$|th posterior sample, calculate the ratio |${(\lambda_{E}^{m}-\lambda_{P}^{m})}/{(\lambda_{R}^{m}-\lambda_{P}^{m})}$|.
Step 4. Calculate the posterior probability:
$$P\left(\frac{\lambda_{E}-\lambda_{P}}{\lambda_{R}-\lambda_{P}}\,{>}\,\theta|\lambda_{R}\,{>}\,\lambda_{P}, \text{Data}\right)\approx \frac{1}{M}\sum_{m=1}^{M}I\left(\frac{\lambda_{E}^{m}-\lambda_{P}^{m}}{\lambda_{R}^{m}-\lambda_{P}^{m}}\,{>}\,\theta|\lambda_{R}^m\,{>}\,\lambda_{P}^m, \text{Data}\right). $$
Step 5. Bayesian decision criterion: If |$P\left({(\lambda_{E}-\lambda_{P})}/{(\lambda_{R}-\lambda_{P})}\,{>}\,\theta|\lambda_{R}\,{>}\,\lambda_{P}, \text{Data}\right)\,{>}\, p^{*},$| increase COUNTS by 1; otherwise leave COUNTS unchanged.
Step 6. Go back to Step 2 and repeat the simulation |$n^{*}$| times, where |$n^{*}$| is a large number chosen a priori:
- i. Calculate the type-I error by using COUNTS divided by |$n^{*}$| for |$\lambda_{E}$| satisfying |${(\lambda_{E}-\lambda_{P})}/{(\lambda_{R}-\lambda_{P})}=\theta$|⁠.
- ii. Calculate the power by using COUNTS divided by |$n^{*}$| for |$\lambda_{E}$| satisfying |${(\lambda_{E}-\lambda_{P})}/{(\lambda_{R}-\lambda_{P})}\,{>}\,\theta$|⁠.
Step 7. The power curve is generated for a range of |$\lambda_{E}$| such that |$0.5\le{(\lambda_{E}-\lambda_{P})}/{(\lambda_{R}-\lambda_{P})}\leq1.5$|⁠.

Note that under the Frequentist and approximation-based Bayesian approaches, Step 3 is not needed and Step 5 needs to be replaced by the corresponding decision criterion given in Section 2.2 and Section 3.2, respectively.
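The simulation steps above can be sketched for the conjugate prior as follows. This is a minimal illustration rather than the code used for the reported results: it relies only on the Python standard library, assumes the |$\text{Gamma}(0.5,0.00001)$| prior is parameterized by shape and rate so that the posterior is |$\text{Gamma}(0.5+X_{l},\,0.00001+n_{l})$|, and uses smaller values of |$M$| and |$n^{*}$| than in the text to keep the run short. The function names are hypothetical.

```python
import random


def draw_poisson(rng, mean):
    """Exact Poisson draw: count unit-rate arrivals in the interval [0, mean]."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def posterior_prob_ni(counts, sizes, theta, rng, n_draws=1000,
                      shape=0.5, rate=1e-5):
    """Steps 3-4: Monte Carlo estimate of P(ratio > theta | lam_R > lam_P, Data)."""
    kept = hits = 0
    for _ in range(n_draws):
        # Conjugate update (assumed shape-rate form); gammavariate takes a scale.
        lam = {arm: rng.gammavariate(shape + counts[arm],
                                     1.0 / (rate + sizes[arm]))
               for arm in "ERP"}
        if lam["R"] <= lam["P"]:
            continue  # keep only draws satisfying lam_R > lam_P
        kept += 1
        hits += (lam["E"] - lam["P"]) > theta * (lam["R"] - lam["P"])
    return hits / kept if kept else 0.0


def rejection_rate(rates, sizes, theta, p_star=0.975,
                   n_rep=200, n_draws=500, seed=1):
    """Steps 2-6: proportion of simulated trials in which NI is declared."""
    rng = random.Random(seed)
    declared = 0
    for _ in range(n_rep):
        # Step 2: total count per arm is Poisson(n_l * lambda_l).
        counts = {arm: draw_poisson(rng, sizes[arm] * rates[arm]) for arm in "ERP"}
        # Step 5: declare NI when the posterior probability exceeds p*.
        if posterior_prob_ni(counts, sizes, theta, rng, n_draws) > p_star:
            declared += 1
    return declared / n_rep


sizes = {"E": 100, "R": 100, "P": 100}
# Ratio equal to theta (null boundary) gives the type-I error; ratio 1 gives power.
type1 = rejection_rate({"E": 18.2, "R": 21.0, "P": 7.0}, sizes, theta=0.8)
power = rejection_rate({"E": 21.0, "R": 21.0, "P": 7.0}, sizes, theta=0.8)
print(type1, power)
```

Calling `rejection_rate` over a grid of |$\lambda_{E}$| values traces out an estimate of the power curve described in Step 7; increasing `n_rep` and `n_draws` reduces the Monte Carlo noise at the cost of run time.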

5.2. Simulation results

For the conjugate prior, we chose the number of posterior samples, |$M$|, to be |$1000$|. For the non-conjugate prior, a trace plot of the posterior estimate for each parameter suggests that |$M=1000$| MCMC samples, obtained by retaining every 50th value of 50,000 MCMC draws with 1000 burn-in iterations, are more than sufficient to produce stable estimates of the parameters. We consider |$\lambda_{R}=21$|, |$\lambda_{P}=7$|, and varying |$\lambda_E$| as in Table 2 for generating the power curves. Additionally, we consider another specification of the parameters, |$\lambda_{R}=7$|, |$\lambda_{P}=1$|, with |$\lambda_{E}$| in the range |$\left[4,9\right]$|, to see the behavior of the power curves for smaller values of |$\lambda_l$|, |$l \in \{E,R,P\}$|. The choice of |$p^{*}$| is an important criterion. Following the Frequentist set up, we choose |$p^{*}=0.975$|. However, as reported in Gamalo and others (2011), this choice of |$p^{*}$| could give too restrictive a type-I error. One way to alleviate this problem is to perform Bayesian calibration; however, it is not pursued in the present paper. In Figure 1, we present four power curves corresponding to four different values of |$\theta \in \{0.8, 0.75, 0.7, 0.65\}$| with |$n=100$| for the parameter specifications (|$\lambda_R=21$|, |$\lambda_P=7$|) and (|$\lambda_R=7$|, |$\lambda_P=1$|) under the Frequentist approach and the Bayesian conjugate prior. We see that as |$\theta$| increases, the power curve shifts to the right, as the proposed NI test is more powerful for smaller values of |$\theta$|. This is because for smaller values of |$\theta$|, it is easier to declare NI. Note that for the exact Bayesian approach, we chose the Jeffreys prior as |$\text{Gamma}(0.5,0.00001)$|, which is a flat prior with large variance. 
The Jeffreys prior is obtained by equating |$\sqrt{(I(\lambda))}=c \lambda^{-0.5}$| with the density of |$\text{Gamma}(\alpha,\beta)$| and solving for |$\alpha$|, |$\beta$|, and the constant |$c$|, where |$I(\lambda)$| is the Fisher information of |$\lambda$|. This prior is also used in computing Table 2. For the interested reader, an excellent discussion on choosing other neutral and non-informative priors on the Gamma distribution is given in Kerman and others (2011). The horizontal red line in Figure 1 corresponds to |$\alpha=0.025$|. The type-I error rate under the Frequentist approach is always maintained at |$\alpha=0.025$|, while that under the exact Bayesian approach is maintained at or below |$\alpha=0.025$| (see Table 2). Additional simulation results comparing conjugate vs. non-conjugate as well as informative vs. non-informative priors are provided in Supplementary Appendix available at Biostatistics online.
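As a worked check of the Jeffreys prior described above (a sketch that assumes the shape-rate parameterization of the Gamma distribution, which is not stated explicitly in the text), the Fisher information of a single Poisson observation and the matching Gamma hyper-parameters are

$$\begin{align} \log f(x\mid\lambda) & =x\log\lambda-\lambda-\log x!, & I(\lambda) & =-E\left[\frac{\partial^{2}}{\partial\lambda^{2}}\log f(X\mid\lambda)\right]=\frac{E[X]}{\lambda^{2}}=\frac{1}{\lambda},\nonumber\\ \pi(\lambda) & \propto\sqrt{I(\lambda)}=\lambda^{-1/2}, & \text{Gamma}(\alpha,\beta) & \propto\lambda^{\alpha-1}e^{-\beta\lambda}\ \Rightarrow\ \alpha=\tfrac{1}{2},\ \beta\rightarrow0.\nonumber \end{align}$$

Under this parameterization, |$\text{Gamma}(0.5,0.00001)$| is a proper approximation to the limiting case, with mean |$\alpha/\beta=5\times10^{4}$| and variance |$\alpha/\beta^{2}=5\times10^{9}$|, which is consistent with its description as a flat prior with large variance.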

Power curves for different |$\theta$| under two sets of Poisson distribution parameter values: (1) |$\lambda_R=21$|, |$\lambda_P=7$| (top row) and (2) |$\lambda_R=7$|, |$\lambda_P=1$| (bottom row). Panels (a) and (c) (left column) are for the Frequentist approach, while panels (b) and (d) (right column) are for the exact Bayesian approach with conjugate prior.

Fig. 1.

6. Application

We illustrate our proposed Frequentist and Bayesian methodology with the MS example (Calabrese and others, 2012) described in Section 1.1. The lesions of MS typically arise within the optic nerves, spinal cord, brain stem, and the periventricular white matter of the cerebral hemispheres. Neuropathological techniques and magnetic resonance imaging (MRI) are used to identify the relationship of lesions to cortical veins. Further details of the data are presented in Table 1. For our NI testing, we consider GA as the experimental treatment |$\left(E\right)$|, subcutaneous (sc) IFN beta-1a as the reference drug |$\left(R\right)$|, and no therapy as the placebo |$\left(P\right)$| for both 1-year and 2-year data. As indicated before, the Poisson model provides a satisfactory fit in the goodness-of-fit test for each arm. For our illustration, we analyze the 1-year and 2-year CL count data separately using the exact Bayesian method under different priors, as well as using the Frequentist method. In formulating the original NI hypothesis, we assumed that a higher rate indicates greater treatment benefit; for the lesion count data, however, a smaller rate indicates greater treatment benefit. We therefore reformulate the hypothesis in (2.3) as

$$\begin{equation}\label{risk_NI} H_{0}:\lambda_{E}-\theta\lambda_{R}-{\left(1-\theta\right)}\lambda_{P}\geq0\mbox{ vs. }H_{1}:\lambda_{E}-\theta\lambda_{R}-{\left(1-\theta\right)}\lambda_{P}\,{<}\,0. \end{equation}$$

(6.1)

The required AS condition is now |$\left(\lambda_{P}\,{>}\,\lambda_{R}\right)$|. From |$H_{0}$| given in (6.1), and using |$\lambda_{P}-\lambda_{R}\,{>}\,0$| under AS, we have the following equivalent condition: |$ \left(\lambda_{E}-\lambda_{P}\right)-\theta\left(\lambda_{R}-\lambda_{P}\right)\geq0 \Rightarrow \left(\lambda_{P}-\lambda_{E}\right)\leq\theta\left(\lambda_{P}-\lambda_{R}\right) \Rightarrow {\left(\lambda_{P}-\lambda_{E}\right)}/{\left(\lambda_{P}-\lambda_{R}\right)}\leq\theta $|. Hence, the alternative hypothesis |$H_{1}$| in (6.1) becomes |$ H_{1}: {\left(\lambda_{P}-\lambda_{E}\right)}/{\left(\lambda_{P}-\lambda_{R}\right)}\,{>}\,\theta. $| From this, the Bayesian decision criterion is

$$\begin{equation}\label{decision} P\left(H_{1}: \frac{\lambda_{E}-\lambda_{P}}{\lambda_{R}-\lambda_{P}}\,{>}\,\theta|\text{Data}\right)\,{>}\,p^{*}. \end{equation}$$

(6.2)

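This shows that the Bayesian decision rule has the same form as in the previous case, since the ratio in (6.2) equals |${\left(\lambda_{P}-\lambda_{E}\right)}/{\left(\lambda_{P}-\lambda_{R}\right)}$|. We use |$p^{*}=0.975$| to determine NI of GA over sc-IFN beta-1a.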

From Table 1, we observe that after 12 months |$37/50$| |$\left(74\%\right)$| of the patients who did not receive any therapy developed |${\geq}1$| new CLs, compared with |$12/46$| |$\left(26\%\right)$| of the patients treated with sc-IFN beta-1a and |$24/48$| |$\left(50\%\right)$| of those treated with GA. After 24 months, these figures were |$41/50$| |$\left(82\%\right)$| for the patients with no therapy, |$24/46$| |$\left(52\%\right)$| for those treated with sc-IFN beta-1a, and |$30/48$| |$\left(62\%\right)$| for the GA-treated patients. Thus the percentage of patients with at least one lesion increased from 1 year to 2 years in all treatment arms. Also, the calculated rates of occurrence of CLs for no therapy, reference, and test drug are, respectively, 1.53, 0.37, and 0.79 after 1 year and 2.94, 0.72, and 1.29 after 2 years. The rate in the untreated (placebo) group is much higher than those in the treated groups, which indicates that the treatments have a beneficial effect in lowering the development of new CLs. We first carry out the analysis under the Frequentist approach and calculate the |$p$|-value for testing the hypothesis in equation (6.1) as |$p$|-value |$=P_{H_0}(W\,{<}\,W_{\rm obs}), $| where |$W$| is the test statistic given by |$ W=(\hat{\lambda}_{E}-\theta \hat{\lambda}_{R}-\left(1-\theta\right)\hat{\lambda}_{P}|\hat{\lambda}_{P}\,{>}\,\hat{\lambda}_{R}) $| and |$W_{\rm obs}$| is the observed value of |$W$|. The |$p$|-value is then compared with |$\alpha=0.025$| to deduce the Frequentist decision of NI. For the Bayesian conjugate analysis, we consider both non-informative and informative priors. 
For the non-informative case, we take the Jeffreys prior as |$\text{Gamma}\left(0.5,0.00001\right)$| for |$\lambda_{l}$|, |$l\in\left\{ E,R,P\right\} $| and generate posterior samples for the three rates from |$ \text{Gamma} $| distributions as in Step 3 of the simulation studies, keeping only the samples that satisfy |$\lambda_{P}\,{>}\,\lambda_{R}$| to account for the AS condition. We calculate the posterior probability of |$H_{1}$| as given on the left-hand side of (6.2). We report |$P(H_{1}|\text{Data})$| in Table 4 for different values of |$\theta$| in the range |$[0.5,1)$|, in order to ensure that the test drug has a meaningful effect. These posterior probabilities are compared with |$p^{*}$| to deduce the Bayesian decision. In Table 4, we also report the decisions: 1 (if NI is claimed) or 0 (otherwise) for the Frequentist as well as the Bayesian analyses.
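The sampling procedure described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' R implementation: the function name `posterior_prob_ni` is hypothetical, the Gamma prior is assumed to be in shape-rate form with the standard Gamma-Poisson conjugate update, and the per-arm lesion totals are assumptions back-calculated from the rounded 1-year rates and sample sizes (rate times |$N$|), since the exact totals are not stated in this section.

```python
import random

def posterior_prob_ni(totals, sizes, theta, prior=(0.5, 0.00001),
                      draws=20000, seed=1):
    """Monte Carlo estimate of P(H1 | Data) under independent Gamma priors.

    totals, sizes: dicts keyed by "E", "R", "P" (summed counts and N per arm).
    prior: (shape, rate) of the Gamma prior shared by the three arms.
    """
    rng = random.Random(seed)
    a, b = prior
    kept = hits = 0
    while kept < draws:
        # Conjugate update: Gamma(a + total, b + N); gammavariate takes a scale.
        lam = {arm: rng.gammavariate(a + totals[arm], 1.0 / (b + sizes[arm]))
               for arm in ("E", "R", "P")}
        if lam["P"] <= lam["R"]:
            continue  # discard draws that violate the AS condition
        kept += 1
        if (lam["P"] - lam["E"]) / (lam["P"] - lam["R"]) > theta:
            hits += 1
    return hits / kept

sizes = {"E": 48, "R": 46, "P": 50}
totals = {"E": 38, "R": 17, "P": 76}  # ASSUMED: rounded rate * N, not reported data
for theta in (0.8, 0.65, 0.5):
    p = posterior_prob_ni(totals, sizes, theta)
    print(theta, round(p, 3), int(p > 0.975))  # theta, P(H1|Data), NI decision
```

Because the same seed is reused for every |$\theta$|, the estimated probabilities are guaranteed to be non-increasing in |$\theta$|. If the assumed totals are close to the true ones, the output should be broadly comparable to the conjugate non-informative column of Table 4. As a point of reference, the plug-in estimate of the retained fraction from the reported rates is |$(1.53-0.79)/(1.53-0.37)\approx0.64$| at 1 year and |$(2.94-1.29)/(2.94-0.72)\approx0.74$| at 2 years, so posterior probabilities near one half are to be expected around those values of |$\theta$|.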

Table 4.

Bayesian and Frequentist decisions for the lesion count data, where “1” stands for rejection and “0” stands for acceptance of the null hypothesis. Posterior probabilities are also reported under different priors.

	Posterior probabilities					Decision
	Conjugate		Non-conjugate			Conjugate		Non-conjugate
\|$\theta$\|	Non-informative	Informative	Non-informative	Informative	Frequentist decision	Non-informative	Informative	Non-informative	Informative
1-year data
0.80	0.102	0.717	0.220	0.152	0	0	0	0	0
0.75	0.198	0.786	0.275	0.220	0	0	0	0	0
0.70	0.322	0.859	0.310	0.291	0	0	0	0	0
0.65	0.461	0.925	0.350	0.398	0	0	0	0	0
0.60	0.590	0.957	0.384	0.615	0	0	0	0	0
0.55	.00717	0.976	0.413	0.845	0	0	1	0	0
0.50	0.829	0.988	0.445	0.976	0	0	1	0	1
2-year data
0.80	0.262	0.226	0.266	0.280	0	0	0	0	0
0.75	0.481	0.456	0.302	0.456	0	0	0	0	0
0.70	0.705	0.729	0.336	0.650	0	0	0	0	0
0.65	0.847	0.902	0.355	0.821	0	0	0	0	0
0.60	0.930	0.979	0.381	0.913	0	0	1	0	0
0.55	0.983	0.998	0.404	0.963	1	1	1	0	0
0.50	0.994	0.999	0.431	0.990	1	1	1	0	1

From Table 4, we observe that the posterior probabilities increase as |$\theta$| decreases, implying a higher chance of declaring NI for smaller values of |$\theta$|. For the 1-year data, under the Jeffreys prior, the posterior probabilities |$P(H_{1}|\text{Data})$| stay below |$p^{*}$| for all values of |$\theta$|, meaning that NI of GA cannot be claimed for |$\theta\in[0.5,0.8].$| This is because the rate of lesion occurrence for GA-treated patients is higher than that for patients treated with sc-IFN beta-1a, which indicates that GA is possibly inferior to sc-IFN beta-1a since its effect is not within the NI margin. But if we choose an informative prior with suitable parameters, then NI can be claimed for |$\theta\leq0.55$|. In this case, we chose the following priors for the three arms: |$E$|: |$\text{Gamma}(8,10)$|, |$R$|: |$\text{Gamma}(20,5)$|, and |$P$|: |$\text{Gamma}(12,8)$|. For the 2-year data also, the rate of lesion count for GA is higher than that of the reference group; however, the difference between the rates is within the NI margin |$\delta$|, so NI of GA can be claimed for small values of |$\theta$|, even under the Jeffreys prior. With an informative prior, the posterior probabilities improve further. Choosing the priors |$E$|: |$\text{Gamma}(64,49.6)$|, |$R$|: |$\text{Gamma}(12,17)$|, |$P$|: |$\text{Gamma}(60,20.4)$|, NI is claimed for |$\theta\leq0.6$|. Finally, for the non-conjugate prior under the reformulated hypothesis in (6.1), we assume the following: |$ u_{1} ={\lambda_{R}}/{\lambda_{P}}\sim \text{Beta}(a,b)\mbox{,} u_{2} =\lambda_{P}\sim \text{Gamma}(p,r), $| where |$0\,{<}\,u_{1}\,{<}\,1$|, which satisfies the AS condition (|$\lambda_{P}\,{>}\,\lambda_{R}$|), and |$u_{2}\,{>}\,0$|. 
For the 1-year data, assuming a relatively non-informative prior, |$\text{Gamma}\left(0.8,1\right)$| for |$E$|; |$\text{Gamma}\left(1.5,1\right)$| for |$P$|; and |$\text{Beta}\left(1,3.1\right)$| for |${\lambda_{R}}/{\lambda_{P}}$|, we observe that NI cannot be claimed for any |$\theta\in[0.5,0.8]$|. However, if the following priors are chosen: |$\text{Gamma}\left(160,200\right)$| for |$E$|; |$\text{Gamma}\left(600,400\right)$| for |$P$|; and |$\text{Beta}\left(200,630\right)$| for |${\lambda_{R}}/{\lambda_{P}}$|; then NI can be claimed for |$\theta=0.5$|. For the 2-year data, the observations are similar for the non-informative prior in the non-conjugate setting: NI cannot be claimed for the priors |$\text{Gamma}\left(0.8,0.62\right)$| for |$E$|; |$\text{Gamma}\left(0.75,0.255\right)$| for |$P$|; and |$\text{Beta}\left(1.2,3.7\right)$| for the ratio |${\lambda_{R}}/{\lambda_{P}}$|. However, for the relatively informative priors |$\text{Gamma}\left(80,62\right)$| for |$E$|; |$\text{Gamma}\left(75,25.5\right)$| for |$P$|; and |$\text{Beta}\left(12,37\right)$| for |${\lambda_{R}}/{\lambda_{P}}$|, NI can be claimed for |$\theta=0.5$|. We note that the hyper-parameters for both conjugate and non-conjugate priors are chosen so that the mean of the |$ \text{Gamma} $| distribution equals the estimated count rate in the respective arm. Also, we observed that for the 1-year data, more informative priors are needed to claim NI than for the 2-year data. This indicates that the present trial data for the 1-year endpoint do not support NI strongly, while for the 2-year endpoint, NI can be claimed if we choose |$\theta \,{<}\, 0.6$|.
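The non-conjugate construction can be illustrated by drawing from the prior itself. The sketch below is an assumption-laden illustration rather than the authors' code: it takes |$\text{Gamma}(p,r)$| to be in shape-rate form, treats |$u_{1}$| and |$u_{2}$| as independent a priori, uses the 2-year informative hyper-parameters quoted above, and introduces the hypothetical helper `draw_prior`. It checks that every draw satisfies |$\lambda_{R}\,{<}\,\lambda_{P}$| by construction and that the implied prior means match the 2-year rates.

```python
import random

def draw_prior(rng, e=(80, 62), p=(75, 25.5), ratio=(12, 37)):
    """One draw of (lambda_E, lambda_R, lambda_P) from the non-conjugate prior.

    e, p: (shape, rate) of the Gamma priors for E and P (shape-rate form ASSUMED).
    ratio: (a, b) of the Beta prior on u1 = lambda_R / lambda_P.
    """
    lam_e = rng.gammavariate(e[0], 1.0 / e[1])  # gammavariate expects a scale
    lam_p = rng.gammavariate(p[0], 1.0 / p[1])
    u1 = rng.betavariate(*ratio)                # 0 < u1 < 1
    return lam_e, u1 * lam_p, lam_p             # lambda_R = u1 * lambda_P

rng = random.Random(2024)
draws = [draw_prior(rng) for _ in range(5000)]

# AS (lambda_P > lambda_R) holds for every draw because u1 < 1.
as_holds = all(lam_r < lam_p for _, lam_r, lam_p in draws)

# Analytic prior means: Gamma mean = shape / rate, Beta mean = a / (a + b).
mean_e = 80 / 62
mean_p = 75 / 25.5
mean_r = (12 / (12 + 37)) * mean_p  # uses the ASSUMED prior independence of u1 and lambda_P
print(as_holds, round(mean_e, 2), round(mean_r, 2), round(mean_p, 2))
# prints: True 1.29 0.72 2.94
```

The printed means agree with the 2-year rates of 1.29, 0.72, and 2.94 reported for the test, reference, and no-therapy arms, which is consistent with the statement that the hyper-parameters were centered on the estimated rates. Placing the Beta prior on the ratio, rather than a Gamma prior on |$\lambda_{R}$| directly, is what removes the need to discard draws that violate AS.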

7. Discussion

According to several guidelines, the NI margin should be pre-specified in the protocol, while some allow the flexibility of pre-specifying a fixed amount of effect retention (e.g., FDA, 2016; ICH Steering Committee, 1998, 2000; EMA, 2005; Wangge and others, 2013). Thus the value of the NI margin can vary greatly depending on the estimated effect size of the reference treatment (|$\lambda_{R}-\lambda_{P}$|). In this article, we presented novel Frequentist and Bayesian test procedures for three-arm NI trials under the fraction margin approach. We proposed a more powerful conditional test (Lemma 2.2.2) based on the Frequentist principle, which directly incorporates the AS condition in NI testing. We believe this is a better use of the available information. Under the AS assumption, the conditional principle is more realistic and more powerful than the traditional marginal NI testing, and it does not result in a biased test (e.g., joint testing of NI and AS). In the conditional Frequentist approach, we conditioned the NI test statistic on |$\hat{\lambda}_R \,{>}\, \hat{\lambda}_P$|; it is also possible to condition on the AS test statistic. This is not done in the current paper because it would make the Bayesian (conditioned on |$ \lambda_R\,{>}\, \lambda_P $|) and Frequentist approaches incomparable, since each approach would then use a slightly different conditioning statement. In the Bayesian context, we explored conjugate priors and also specified more flexible non-conjugate prior choices. In Section 4.2, for integer-valued parameters, we have also shown an interesting connection between the Bayesian posterior probability and the Frequentist exact probability. This could be further exploited to connect Bayesian and Frequentist sample sizes along the lines of Zaslavsky (2013). Since Bayesian power calculation requires additional computation, we tabulated the sample sizes in Table 2 under three different types of allocation. 
We hope that clinicians will find this readily useful in designing such NI trials.

We have observed that the Bayesian normal approximation and the exact Bayesian approach yield greater power and hence require smaller sample sizes than the Frequentist approach. However, we would like to caution users about the control of type-I error in the Bayesian context, as pointed out in recent papers by Kopp-Schneider and others (2019) and Psioda and Ibrahim (2018). It is reported that, with an informative prior, strict type-I error control in the Frequentist sense is not possible under the Bayesian setup. In this article, all reported type-I errors are “average type-I errors” as defined in Gravestock and others (2017), which are essentially averages over all possible outcomes under the null distribution. We thank an anonymous reviewer for pointing this out. Also, it is evident that an unbalanced allocation of the sample size in an NI trial reduces the number of patients required to achieve a given power. According to Pigeot and others (2003), an unbalanced allocation of the total sample size in an NI trial is desirable from an ethical and substantial point of view. We also applied our proposed Bayesian test procedure to clinical trial data on MS. The results suggest that the Bayesian methods perform favorably in all situations and that, unlike the Frequentist method, they do not depend on any asymptotic approximation.

Notably, with Poisson distributed outcomes, the rate/count difference is not the only function of interest. In the binary context, apart from the risk difference, similar methods for the risk ratio and odds ratio have been developed very recently in both the Frequentist (Chowdhury and others, 2018b) and Bayesian (Chowdhury and others, 2018a) contexts. Along similar lines, one may frame an NI trial using the ratio of Poisson rates, as done in Stucke and Kieser (2013) for the two-arm trial. However, for a three-arm trial, defining such a functional (in ratio form) is non-trivial. We are currently developing both Frequentist and Bayesian methods for these types of functionals. Also, for count outcomes, over-dispersion (and under-dispersion) is a frequent issue, in which case the Poisson model is not an ideal choice. However, given the dearth of Bayesian articles on count data, we did not consider those issues in the current paper. One could use the negative binomial (Mütze and others, 2016) or generalized Poisson distribution instead; however, the resulting Bayesian (and Frequentist) procedures would be much more involved and are therefore left as future work.

Software

The open-source R code for all the simulation studies and real data analyses performed in this manuscript is available at https://github.com/erina633/Poisson3armNI. A README.md file there describes the contents of the R files and all the source functions. All proofs and additional results are placed in the Supplementary Appendix available at Biostatistics online.

Supplementary material

Supplementary material is available at http://biostatistics.oxfordjournals.org.

Acknowledgments

Conflict of Interest: None declared. The article reflects the views of the author and should not be construed to represent FDA’s views or policies.

Funding

The research of the first author is partly supported by PCORI (contract number ME-1409-21410) and NIH (P30-ES020957).

References

Arnold, B. C. and Beaver, R. J. (1993). The nontruncated marginal of a truncated bivariate normal distribution. Psychometrika 58, 471–488.

Berger, R. L. (1997). Likelihood ratio tests and intersection-union tests. In Advances in Statistical Decision Theory and Applications. Boston: Birkhäuser, pp. 225–237.

Calabrese, M., Bernardi, V., Atzori, M., Mattisi, I., Favaretto, A., Rinaldi, F., Perini, P., and Gallo, P. (2012). Effect of disease-modifying drugs on cortical lesions and atrophy in relapsing-remitting multiple sclerosis. Multiple Sclerosis Journal 18, 418–424.

Chowdhury, S., Tiwari, R., and Ghosh, S. (2018a). Approaches for testing non-inferiority in two-arm trial for risk ratio and odds ratio. Journal of Biopharmaceutical Statistics 29, 425–445.

Chowdhury, S., Tiwari, R. C., and Ghosh, S. (2018b). Non-inferiority testing for risk ratio, odds ratio and number needed to treat in three-arm trial. Computational Statistics & Data Analysis.

Chuang-Stein, C., Stryszak, P., Dmitrienko, A., and Offen, W. (2007). Challenge of multiple co-primary endpoints: a new approach. Statistics in Medicine 26, 1181–1192.

D’Agostino Sr, R. B., Massaro, J. M., and Sullivan, L. M. (2003). Non-inferiority trials: design concepts and issues – the encounters of academic consultants in statistics. Statistics in Medicine 22, 169–186.

EMA (2005). Guideline on the Choice of the Noninferiority Margin (Doc. Ref. EMEA/CPMP/EWP/2158/99). European Medicines Agency: Pre-authorisation Evaluation of Medicines for Human Use.

FDA (2016). Non-inferiority Clinical Trials to Establish Effectiveness: Guidance for Industry. Silver Spring, MD: US Department of Health and Human Services, Food and Drug Administration, Center for Drug Evaluation and Research (CDER), Center for Biologics Evaluation and Research (CBER).

Friede, T. and Schmidli, H. (2010). Blinded sample size reestimation with count data: methods and applications in multiple sclerosis. Statistics in Medicine 29, 1145–1156.

Gamalo, M. A., Tiwari, R. C., and LaVange, L. M. (2014). Bayesian approach to the design and analysis of non-inferiority trials for anti-infective products. Pharmaceutical Statistics 13, 25–40.

Gamalo, M. A., Wu, R., and Tiwari, R. C. (2011). Bayesian approach to noninferiority trials for proportions. Journal of Biopharmaceutical Statistics 21, 902–919.

Gamalo, M. A., Wu, R., and Tiwari, R. C. (2016). Bayesian approach to non-inferiority trials for normal means. Statistical Methods in Medical Research 25, 221–240.

Gbur, E. E. (1981). On the poisson index of dispersion: On the poisson index. Communications in Statistics-Simulation and Computation 10, 531–535.

Gelman, A., Carlin, J. B., Stern, H. S., and Rubin, D. B. (2014). Bayesian Data Analysis, Volume 2. Boca Raton, FL: Chapman.

Ghosh, P., Nathoo, F., Gonen, M., and Tiwari, R. C. (2011). Assessing noninferiority in a three-arm trial using the Bayesian approach. Statistics in Medicine 30, 1795–1808.

Ghosh, S., Chatterjee, A., and Ghosh, S. (2017). Non-inferiority test based on transformations for non-normal distributions. Computational Statistics & Data Analysis 113, 73–87.

Ghosh, S., Ghosh, S., and Tiwari, R. (2016). Bayesian approach for assessing non-inferiority in a three-arm trial with pre-specified margin. Statistics in Medicine 35, 695–708.

Gravestock, I., Held, L.; COMBACTE-Net Consortium. (2017). Adaptive power priors with empirical Bayes for clinical trials. Pharmaceutical Statistics 16, 349–360.

Hida, E. and Tango, T. (2013). Three-arm noninferiority trials with a prespecified margin for inference of the difference in the proportions of binary endpoints. Journal of Biopharmaceutical Statistics 23, 774–789.

Huang, L., Zalkikar, J., and Tiwari, R. C. (2011). A likelihood ratio test based method for signal detection with application to FDA’s drug safety data. Journal of the American Statistical Association 106, 1230–1241.

Hung, H. M. J. and Wang, S. J. (2004). Multiple testing of noninferiority hypotheses in active controlled trials. Journal of Biopharmaceutical Statistics 14, 327–335.

ICH Steering Committee (1998). ICH harmonised tripartite guideline: statistical principles for clinical trials.

ICH Steering Committee (2000). ICH harmonised tripartite guideline: choice of control group and related issues in clinical trials.

Kerman, J. (2011). Neutral noninformative and informative conjugate beta and gamma prior distributions. Electronic Journal of Statistics 5, 1450–1470.

Kieser, M. and Friede, T. (2007). Planning and analysis of three-arm non-inferiority trials with binary endpoints. Statistics in Medicine 26, 253–273.

Kieser, M. and Stucke, K. (2016). Assessing additional benefit in noninferiority trials. Biometrical Journal 58, 154–169.

Koch, A. and Röhmel, J. (2004). Hypothesis testing in the “gold standard” design for proving the efficacy of an experimental treatment relative to placebo and a reference. Journal of Biopharmaceutical Statistics 14, 315–325.

Koch, G. G. and Tangen, C. M. (1999). Non parametric analysis of covariance and its role in non-inferiority clinical trials. Drug Information Journal 33, 1145–1159.

Kopp-Schneider, A., Calderazzo, S., and Wiesenfarth, M. (2019). Power gains by using external information in clinical trials are typically not possible when requiring strict type I error control. Biometrical Journal 62, 361–374.

Kulldorff, M. (1997). A spatial scan statistic. Communications in Statistics-Theory and Methods 26, 1481–1496.

Lu, N. T., Xu, Y., and Yang, Y. (2018). Incorporating a companion test into the noninferiority design of medical device trials. Journal of Biopharmaceutical Statistics 29, 143–150.

McIntosh, J. (2001). Analyzing counts, durations, and recurrences in clinical trials. Journal of Biopharmaceutical Statistics 11, 65–74.

Mielke, M., Munk, A., and Schacht, A. (2008). The assessment of non-inferiority in a gold standard design with censored, exponentially distributed endpoints. Statistics in Medicine 27, 5093–5110.

Mütze, T., Munk, A., and Friede, T. (2016). Design and analysis of three-arm trials with negative binomially distributed endpoints. Statistics in Medicine 35, 505–521.

Noseworthy, J. H. (2003). Management of multiple sclerosis: current trials and future options. Current Opinion in Neurology 16, 289–297.

Pigeot, I., Schäfer, J., Röhmel, J., and Hauschke, D. (2003). Assessing noninferiority of a new treatment in a three-arm clinical trial including a placebo. Statistics in Medicine 22, 883–899.

Plummer, M., Stukalov, A., and Denwood, M. (2016). Bayesian graphical models using MCMC. R News 364, 1.

Psioda, M. A. and Ibrahim, J. G. (2018). Bayesian clinical trial design using historical data that inform the treatment effect. Biostatistics 20, 400–415.

Schumi, J. and Wittes, J. T. (2011). Through the looking glass: understanding non-inferiority. Trials 12, 106–118.

Silcocks, P., Whitham, D., and Whitehouse, W. P. (2010). P3MC: a double blind parallel group randomised placebo controlled trial of Propranolol and Pizotifen in preventing migraine in children. Trials 11, 71.

Simon, R. (1999). Bayesian design and analysis of active control clinical trials. Biometrics 55, 484–487.

Soeiro-de Souza, M. G., Andreazza, A. C., Carvalho, A. F., Machado-Vieira, R., Young, L. T., and Moreno, R. A. (2013). Number of manic episodes is associated with elevated DNA oxidation in bipolar I disorder. International Journal of Neuropsychopharmacology 16, 1505–1512.