Testing hypotheses under covariate-adaptive randomisation and additive models

Ting Ye
Pages 96-101 | Received 05 Apr 2018, Accepted 13 May 2018, Published online: 25 May 2018

ABSTRACT

Covariate-adaptive randomisation has a long history of applications in clinical trials. Shao, Yu, and Zhong [(2010). A theory for testing hypotheses under covariate-adaptive randomization. Biometrika, 97, 347–360] and Shao and Yu [(2013). Validity of tests under covariate-adaptive biased coin randomization and generalized linear models. Biometrics, 69, 960–969] showed that the simple t-test is conservative under covariate-adaptive biased coin (CABC) randomisation in terms of type I error, and proposed a valid test using the bootstrap. Under a general additive model with CABC randomisation, we construct a calibrated t-test that shares the same property as the bootstrap method in Shao et al. (2010), but does not need the large computation required by the bootstrap method. Some simulation results are presented to show the finite sample performance of the calibrated t-test.

1. Introduction

In clinical trials and medical studies, patients arrive sequentially and must be treated immediately. When two treatments are compared under simple randomisation (SR), patients are allocated randomly into two treatment groups. The statistical inference may suffer from the disadvantage of not balancing patients' prognostic factors such as the age category, gender, disease stage, prior chemotherapy and geographical region that may influence the outcomes, although simple randomisation still produces valid statistical tests. Various randomisation methods have been proposed in the literature and they have advantages such as minimising imbalance between treatment groups, reducing selection bias, minimising accidental bias and improving efficiency in inference; see, for example, Efron (1971), Taves (1974), Pocock and Simon (1975), Kalish and Begg (1985), Aickin (2001), Weir and Lees (2003), Shao, Yu, and Zhong (2010), Shao and Yu (2013) and Ma, Hu, and Zhang (2015). A common characteristic of these methods is the use of a randomised treatment allocation that depends on covariates or prognostic factors but is conditionally independent of the outcomes given the covariates used in randomisation. Thus, they are called covariate-adaptive randomisation methods. The current paper focuses on one such method that applies the biased coin method (Efron, 1971) to patients grouped by prognostic factors, which is referred to as the covariate-adaptive biased coin (CABC) method by Shao et al. (2010). Similar results can be obtained for the minimisation procedure (Pocock & Simon, 1975; Taves, 1974) and the stratified block randomisation (Kalish & Begg, 1985), which together with the CABC are the most popular covariate-adaptive randomisation methods in clinical trials.

For any given randomisation method, statistical tests valid under the particular randomisation scheme should be used for testing the possible treatment effect. A statistical test is said to be valid if the type I error rate of the test is at most α, a given significance level, at least in the limiting case when the total sample size increases to infinity. The validity of various statistical tests under SR has been extensively studied in the statistical literature. For covariate-adaptive randomisation, however, there exist only a few theoretical results about the validity of statistical tests (e.g. Ma et al., 2015; Shao et al., 2010; Shao & Yu, 2013), although covariate-adaptive randomisation has been used in clinical trials for a long time and there are many empirical results regarding properties of tests under covariate-adaptive randomisation (e.g. Aickin, 2002; Brikett, 1985; Forsythe, 1987; Hagino et al., 2004; Weir & Lees, 2003). As Rosenberger and Sverdlov (2008, Section 4) pointed out in their review, ‘Very little theoretical work has been done in this area, despite the proliferation of papers. The original source papers are fairly uninformative about theoretical properties of the procedures’.

Under linear and generalised linear models, Shao et al. (2010) and Shao and Yu (2013), respectively, derived valid tests for comparing two treatments under CABC. Their tests are based on a modification of the tests developed under SR, where the modification is to apply a bootstrap variance estimation method that has a CABC component to address the variation in CABC randomisation. This bootstrap test was shown to be valid asymptotically and robust against misspecification of model and link function.

The purpose of this paper is to show that we can construct an asymptotically valid test under CABC without using the bootstrap by directly providing a consistent variance estimator in a general additive model that includes both linear and generalised linear models as special cases. The new test shares the robustness property with the bootstrap, but does not need the large computation required by the bootstrap. The same idea can be applied to the other two popular covariate-adaptive randomisation methods in clinical trials, the minimisation and stratified block randomisation.

2. Notation and preliminaries

Let N be the number of patients under two treatments, $I_i$ be the treatment indicator that equals j if patient i is assigned to treatment j, j=0,1, and $Y_{ij}$ be the outcome of patient i under treatment j. For patient i, $Y_{iI_i}$ is observed. Associated with patient i, let $X_i$ be a vector of covariates and prognostic factors and $Z_i$ be a function of $X_i$ used in CABC, where $Z_i$ is discrete with values $z_k$, $k=1,\dots,K$, and K is a fixed integer. We assume that $(Y_{i0}, Y_{i1}, X_i)$, $i=1,\dots,N$, are independent and identically distributed random vectors from some distribution.

Under SR, $I_i$'s are independent with $P(I_i=1)=1/2$ for all i and are independent of $(Y_{i0}, Y_{i1}, X_i)$'s. With a fixed constant $p \in (1/2, 1)$, the biased coin method in Efron (1971) assigns the ith patient according to $P(I_i=1 \mid I_1,\dots,I_{i-1}) = 1/2$ if $D_{i-1}=0$, $p$ if $D_{i-1}<0$, and $1-p$ if $D_{i-1}>0$, where $D_0=0$ and $D_{i-1}$ is the difference between the number of patients in treatment 1 and the number of patients in treatment 0 after i−1 assignments have been made. This assignment rule tends to achieve balance between the numbers of patients in two treatment groups, since $p>1/2$ and $|D_{i-1}|$ is an imbalance metric. The CABC method applies the biased coin within each category of patients with $Z_i=z_k$, $k=1,\dots,K$. The motivation is to achieve balance between treatment groups for each prognostic factor. A characteristic of CABC, which is common for all covariate-adaptive randomisation methods, is that $I_i$'s and $Y_{ij}$'s are conditionally independent given $Z_i$'s, although unconditionally $I_i$'s and $Y_{ij}$'s are dependent.
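The allocation rule described above can be sketched in code. The following is a minimal illustration and is not taken from the paper: the function name `cabc_assign` and its arguments are hypothetical, `strata` stands for the observed values of the discrete covariate used in randomisation, and the default p = 2/3 is the value used later in the simulation study.

```python
import random
from collections import defaultdict


def cabc_assign(strata, p=2 / 3, seed=None):
    """Assign treatments 0/1 with Efron's biased coin applied within each stratum.

    strata: sequence of hashable stratum labels, one per patient, in arrival order.
    p: probability of assigning the under-represented treatment (assumed > 1/2).
    """
    rng = random.Random(seed)
    # Per-stratum imbalance: (# in treatment 1) - (# in treatment 0) so far.
    imbalance = defaultdict(int)
    assignments = []
    for z in strata:
        d = imbalance[z]
        if d < 0:
            prob_one = p        # treatment 1 is behind, favour it
        elif d > 0:
            prob_one = 1 - p    # treatment 1 is ahead, favour treatment 0
        else:
            prob_one = 0.5      # balanced, toss a fair coin
        t = 1 if rng.random() < prob_one else 0
        imbalance[z] += 1 if t == 1 else -1
        assignments.append(t)
    return assignments
```

Passing a single common label for all patients reduces this sketch to the unstratified biased coin.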

A statistical test T is a function of observed $(Y_{iI_i}, I_i, X_i)$, $i=1,\dots,N$, constructed such that we reject a given null hypothesis $H_0$ if and only if $T > z_\alpha$, where α is a given significance level and $z_\alpha$ is a quantile of the standard normal distribution or a t-distribution. T is said to be (asymptotically) valid if, when $H_0$ holds, $\lim_{N\to\infty} P(T > z_\alpha) \le \alpha$, (1) with equality holding for at least some cases.

One of the main results in Shao et al. (2010), followed by Shao and Yu (2013), is that if a test T is constructed using covariates $Z_i$'s under a correctly specified model between $Y_{ij}$ and $Z_i$, and T is valid according to Equation (1) under SR, then T is still valid under CABC.

However, there are practical considerations under which some covariates are not included in the construction of the test T. For example, including all covariates may turn a simple test procedure into a complicated one, such as from one-way analysis of variance to two-way analysis of variance; data in some discrete covariate categories may also be sparse, so that including these covariates may make the test behave poorly. When the covariate used in randomisation is not included in the construction of T and CABC is used, the result in Shao et al. (2010) indicates that the test is conservative in the sense that its limiting type I error rate is bounded by a fixed constant strictly smaller than α. The reason is that typically T is the ratio of an effect estimator derived under SR to its standard error; although the estimator remains asymptotically valid under CABC, its standard error derived under SR overestimates the standard error under CABC.

To obtain a valid test under CABC, it suffices to derive a standard error of the effect estimator that is asymptotically consistent, or equivalently a consistent estimator of its variance. Shao et al. (2010) proposed a bootstrap variance estimator that re-assigns treatment indicators in bootstrapping. This bootstrap method, however, requires a large amount of computation.

3. The main result

We consider the following general additive model: (2) where is an unknown function satisfying and , and is the response mean under treatment j=0,1. We consider either the two-sided hypotheses versus , or the one-sided hypotheses versus .

The two sample t-test is $T_S = (\bar{Y}_1 - \bar{Y}_0) / \sqrt{S_1^2/n_1 + S_0^2/n_0}$ (3) or the absolute value of $T_S$, where $n_1$ and $n_0$ are, respectively, the numbers of patients in treatment groups 1 and 0, and $\bar{Y}_j$ and $S_j^2$ are, respectively, the sample mean and sample variance within treatment j.

Suppose that CABC is applied within each group formed by $Z_i$, which is a discrete function of $X_i$ taking values $z_1,\dots,z_K$ with a fixed K. As proved in Shao et al. (2010), the two sample t-test in Equation (3) is conservative under CABC because CABC does not introduce any bias and the variance estimator in Equation (3) does not account for the correlation between the two sample means $\bar{Y}_1$ and $\bar{Y}_0$. They then suggested applying a particular bootstrap method to construct a consistent variance estimator of $\bar{Y}_1 - \bar{Y}_0$ under CABC, which leads to a valid bootstrap t-test, denoted as $T_B$.
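The source of the conservativeness can be made explicit with the law of total variance. The following is a heuristic sketch rather than the derivation given in the appendix. It assumes that, under the null hypothesis, the observed outcome Y has the same conditional distribution given the stratification variable Z in both arms, and that allocation is asymptotically balanced within each stratum; the subscripts SR and CABC are labels introduced here for the randomisation scheme.

```latex
\begin{align*}
\operatorname{Var}(Y) &= E\{\operatorname{Var}(Y \mid Z)\} + \operatorname{Var}\{E(Y \mid Z)\}, \\
N \operatorname{Var}_{\mathrm{SR}}(\bar{Y}_1 - \bar{Y}_0) &\to 4 \operatorname{Var}(Y), \\
N \operatorname{Var}_{\mathrm{CABC}}(\bar{Y}_1 - \bar{Y}_0) &\to 4\, E\{\operatorname{Var}(Y \mid Z)\} \le 4 \operatorname{Var}(Y).
\end{align*}
```

The inequality is strict whenever E(Y | Z) is not constant. In that case a t-statistic standardised by the SR variance has limiting variance below one under CABC, so it rejects less often than the nominal level.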

Explicitly, as shown in the appendix, under model (2) and CABC, (4) where is convergence in distribution, (5) and . An interesting observation is that, under model (2) and the null hypothesis, (6) which can be consistently estimated by (7) where is the sample variance of within and is the number of subjects in the data set with , . The proof is given in the appendix. This alternative way of obtaining a consistent variance estimator is not only computationally easy but also robust against any model misspecification. The two sample t-test with variance estimated by (7) is (8) which is named the calibrated t-test.
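The construction described above, a difference in sample means standardised by a pooled within-stratum variance, can be sketched as follows. This is an illustration under stated assumptions, not the paper's exact formula: it assumes the variance estimator takes the form 4 Σ_k (n_k/N) S_k², which is the limiting variance of √N(Ȳ₁ − Ȳ₀) under balanced within-stratum allocation, and the normalisation used in (7) and (8) may differ. The function names are hypothetical, the unpooled form of the two sample t-statistic is likewise an assumption, and every stratum is assumed to contain at least two patients.

```python
import math
from collections import defaultdict


def sample_variance(values):
    """Unbiased sample variance (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def mean_difference(y, treat):
    """Return (difference in sample means, outcomes in arm 1, outcomes in arm 0)."""
    y1 = [v for v, t in zip(y, treat) if t == 1]
    y0 = [v for v, t in zip(y, treat) if t == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0), y1, y0


def two_sample_t(y, treat):
    """Two sample t-statistic with the usual (unpooled) variance estimator."""
    diff, y1, y0 = mean_difference(y, treat)
    se = math.sqrt(sample_variance(y1) / len(y1) + sample_variance(y0) / len(y0))
    return diff / se


def calibrated_t(y, treat, strata):
    """t-statistic standardised by a within-stratum variance estimator (assumed form)."""
    n_total = len(y)
    diff, _, _ = mean_difference(y, treat)
    by_stratum = defaultdict(list)
    for v, z in zip(y, strata):
        by_stratum[z].append(v)
    # Weighted average of within-stratum sample variances, weights n_k / N.
    within = sum(len(vals) / n_total * sample_variance(vals)
                 for vals in by_stratum.values())
    sigma2_hat = 4.0 * within  # factor 4 assumes equal allocation to the two arms
    return math.sqrt(n_total) * diff / math.sqrt(sigma2_hat)
```

When the stratum means differ substantially, the within-stratum variance is much smaller than the overall variance, so the calibrated statistic is larger in absolute value than the two sample t-statistic computed from the same data.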

Consider the following working model, (9) This model is a special case of model (2) but it is not necessarily correct. Wald's test statistic under SR is (10) where and are, respectively, the sample mean and sample variance based on 's under treatment j, and is the least squares estimator of β assuming model (9). As shown in Shao and Yu (2013), under CABC and model (2), (11) and (12) Under model (2) and CABC, (13) where (14) Since in Equation (14) and in Equation (5) are related by (15) results (12)–(15) show that Wald's test is conservative under CABC unless is a constant, i.e. is independent of . Thus, Wald's test is not valid in the sense of Equation (1), unless the working model (9) is a correct model.

If we borrow the idea of consistently estimating the variance of under , a calibrated Wald's test can also be constructed as (16) which is valid and asymptotically equivalent to its counterpart in Equation (8).

This calibrated variance idea can also be extended to the case where working model (Equation9) is replaced by a more complicated one.

4. Simulation results

4.1. Linear model

A simulation study was carried out to examine the type I error of the calibrated t-test and the calibrated Wald's test under CABC, along with five other tests: the two sample t-test under SR, Wald's test under SR, the two sample t-test under CABC, Wald's test under CABC and the bootstrap t-test under CABC.

In the simulation study, the significance level is ; is ; the probability p in CABC is 2/3; the sample size N is 200; the bootstrap variance estimator is approximated by Monte Carlo with B=200; and the simulated type I error and power are based on 10,000 runs and 2000 runs, respectively. The simulation setting is , where and are both binary with . Both and are used in the CABC and in the construction of Wald's test, but the interaction term is ignored in the construction of Wald's test.
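A simulation of this kind can be sketched in a few lines. The code below is an illustration only and does not reproduce the study: the outcome model (two independent binary covariates with an interaction, unit coefficients and standard normal noise), the function names, the default of 1000 runs and the critical value 1.96 are all assumptions introduced here, and the calibrated statistic uses the assumed variance form 4 Σ_k (n_k/N) S_k². Only N = 200 and p = 2/3 are taken from the setting above. The sketch estimates the type I error of the two sample t-test under SR and CABC and of the calibrated t-test under CABC.

```python
import math
import random
from collections import defaultdict


def _mean(v):
    return sum(v) / len(v)


def _var(v):
    m = _mean(v)
    return sum((a - m) ** 2 for a in v) / (len(v) - 1)


def assign(strata, rng, p=None):
    """Simple randomisation if p is None, otherwise a biased coin within each stratum."""
    imbalance = defaultdict(int)  # per stratum: (# in arm 1) - (# in arm 0)
    out = []
    for z in strata:
        d = imbalance[z]
        if p is None or d == 0:
            prob_one = 0.5
        else:
            prob_one = p if d < 0 else 1 - p
        t = 1 if rng.random() < prob_one else 0
        imbalance[z] += 2 * t - 1
        out.append(t)
    return out


def t_stats(y, treat, strata):
    """Return (two sample t-statistic, calibrated t-statistic)."""
    y1 = [v for v, t in zip(y, treat) if t == 1]
    y0 = [v for v, t in zip(y, treat) if t == 0]
    diff = _mean(y1) - _mean(y0)
    t_plain = diff / math.sqrt(_var(y1) / len(y1) + _var(y0) / len(y0))
    groups = defaultdict(list)
    for v, z in zip(y, strata):
        groups[z].append(v)
    within = sum(len(g) / len(y) * _var(g) for g in groups.values())
    t_cal = math.sqrt(len(y)) * diff / math.sqrt(4 * within)
    return t_plain, t_cal


def type_one_error(n=200, runs=1000, p=2 / 3, crit=1.96, seed=1):
    """Empirical two-sided rejection rates when there is no treatment effect."""
    rng = random.Random(seed)
    rej = {"t_SR": 0, "t_CABC": 0, "calibrated_CABC": 0}
    for _ in range(runs):
        x = [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(n)]
        # Hypothetical null model: the outcome does not depend on the treatment arm.
        y = [a + b + a * b + rng.gauss(0, 1) for a, b in x]
        t_sr = t_stats(y, assign(x, rng), x)[0]
        t_cabc, t_cal = t_stats(y, assign(x, rng, p), x)
        rej["t_SR"] += abs(t_sr) > crit
        rej["t_CABC"] += abs(t_cabc) > crit
        rej["calibrated_CABC"] += abs(t_cal) > crit
    return {k: v / runs for k, v in rej.items()}


if __name__ == "__main__":
    print(type_one_error())
```

Under these assumptions the two sample t-test should reject far less often under CABC than under SR, while the calibrated statistic should reject at a rate much closer to the nominal level; the exact rates depend on the assumed model and on the seed.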

The simulation results and values of are shown in Table 1. A few conclusions from Table 1 are:

  1. The two sample t-test and Wald's test derived under the simplified working model are conservative under CABC.

  2. The type I errors of the bootstrap t-test , calibrated t-test and calibrated Wald's test under CABC are reasonably close to the nominal level 5%, depicting the validity of all three tests, and the consistency of .

  3. , and have almost the same empirical power, which agrees with the asymptotic equivalence of , and under CABC.

Table 1. Simulation power in % under linear model (, N=200, 10,000 simulation runs when , 2000 simulation runs when ).

The advantage of the bootstrap t-test is that it directly estimates the variance of the difference in sample means by Monte Carlo sampling, which performs well under small sample sizes and is robust against model misspecification. The one-way analysis of covariance test is invalid under CABC if the model is misspecified, whereas the calibrated one-way analysis of covariance test is robust against model misspecification, computationally easy and performs well with regard to both type I error and power. The calibrated t-test is also computationally easy, but it requires a sample size large enough for the gap between the variance estimator and its limit to be negligible.

4.2. Logistic model

The second simulation setting is , where and are both binary with . Both and are used in the CABC and in the construction of Wald's test, but the interaction term is ignored in the analysis. The rest of the parameters are the same as in Table 1.

The simulation results and values of are shown in Table 2. A few conclusions from Table 2 are:

  1. The two sample t-test is conservative under CABC, while Wald's test is valid though derived under the simplified working model.

  2. is valid under CABC, indicating that under the generalised linear model, the new variance estimator is still valid.

  3. and have almost the same power as Wald's test .

Table 2. Simulation power in % under logistic model (, N=200, 10,000 simulation runs when , 2000 simulation runs when ).

Acknowledgements

The author would like to thank two referees for their helpful comments and suggestions.

Disclosure statement

No potential conflict of interest was reported by the author.

Additional information

Notes on contributors

Ting Ye

Ting Ye is a Ph.D. student in the Department of Statistics at the University of Wisconsin-Madison. Her research interests focus on clinical trial design, survival analysis and missing data.

References

  • Aickin M. (2001). Randomization, balance, and the validity and efficiency of design-adaptive allocation methods. Journal of Statistical Planning and Inference, 94, 97–119. doi: 10.1016/S0378-3758(00)00228-7
  • Aickin M. (2002). Beyond randomization. The Journal of Alternative Medicine, 8, 765–772.
  • Brikett N. J. (1985). Adaptive allocation in randomized controlled trials. Controlled Clinical Trials, 6, 146–155. doi: 10.1016/0197-2456(85)90120-5
  • Efron B. (1971). Forcing a sequential experiment to be balanced. Biometrika, 58, 403–417. doi: 10.1093/biomet/58.3.403
  • Forsythe A. B. (1987). Validity and power of tests when groups have been balanced for prognostic factors. Computational Statistics and Data Analysis, 5, 193–200. doi: 10.1016/0167-9473(87)90015-6
  • Hagino A., Hamada C., Yoshimura I., Ohashi Y., Sakamoto J., & Nakazato H. (2004). Statistical comparison of random allocation methods in cancer clinical trials. Controlled Clinical Trials, 25, 572–584. doi: 10.1016/j.cct.2004.08.004
  • Kalish L. A., & Begg C. B. (1985). Treatment allocation methods in clinical trials: A review. Statistics in Medicine, 4, 129–144. doi: 10.1002/sim.4780040204
  • Ma W., Hu F., & Zhang L. (2015). Testing hypotheses of covariate-adaptive randomized clinical trials. Journal of the American Statistical Association, 110(510), 669–680. doi: 10.1080/01621459.2014.922469
  • Pocock S. J., & Simon R. (1975). Sequential treatment assignment with balancing for prognostic factors in the controlled clinical trial. Biometrics, 31, 103–115. doi: 10.2307/2529712
  • Shao J., & Yu X. (2013). Validity of tests under covariate-adaptive biased coin randomization and generalized linear models. Biometrics, 69, 960–969. doi: 10.1111/biom.12062
  • Shao J., Yu X., & Zhong B. (2010). A theory for testing hypotheses under covariate-adaptive randomization. Biometrika, 97, 347–360. doi: 10.1093/biomet/asq014
  • Taves D. R. (1974). Minimization: A new method of assigning patients to treatment and control groups. Clinical Pharmacology and Therapeutics, 15, 443–453. doi: 10.1002/cpt1974155443
  • Weir C. J., & Lees K. R. (2003). Comparison of stratification and adaptive methods for treatment allocation in an acute stroke clinical trial. Statistics in Medicine, 22, 705–726. doi: 10.1002/sim.1366

Appendix. Proofs of (4)–(7)

Proof of (4).

Applying result (7.9) in Efron (1971) to each category defined by and using the fact that and , where , we obtain that (A1) Applying (A1), we obtain Letting , we obtain that Applying result (7.9) in Efron (1971) to each category defined by and using the fact that is discrete, we conclude that the last term in the previous expression is conditionally on . Thus, The asymptotic mean of is , which follows from the fact that 's are conditionally independent of given , , and by the definition of .

Since 's are of mean 0 and independent of , and Since 's and are conditionally independent given and , we obtain that where . Therefore, and Given , 's and 's are conditionally independent. Hence, by the central limit theorem and the above results, the conditional distribution of given , is asymptotically normal with mean 0 and variance , which converges to by the law of large numbers. Thus, conditionally on or unconditionally, the quantity in (4) is asymptotically normal with mean 0 and variance .

Proof of (7).

Without loss of generality, we assume that under , in the proof. From the fact that ,

where is the number of subjects satisfying . Recall that 's and 's are independent and identically distributed. By the law of large numbers, Now that can be expressed as , which together with the dominated convergence theorem and the fact that imply that .
