Abstract
Researchers designing a clinical trial to demonstrate that a new treatment is superior or noninferior to an established control face an important choice: to conduct a randomized controlled trial (RCT), or to take advantage of historical data on the control treatment and conduct a single-arm historically controlled trial (HCT). The primary advantage of the RCT is that it minimizes bias between the treatment and control arms with respect to known and unknown confounders (i.e., it ensures exchangeability), while the advantage of the HCT is potentially greater efficiency, leading to a smaller required sample size. However, a naïve comparison of sample size requirements, which suggests a 4-fold sample size ratio, is flawed because the two sample size calculations involve different null hypotheses and therefore have different error rates for the null hypothesis of interest. In this article, we define four approaches for calibrating the error rates of the RCT and HCT under that common null hypothesis, which allows for a fair comparison of sample size requirements. We show that the HCT has an inflated Type I error rate for the null hypothesis of interest even in the absence of bias, that with appropriately calibrated error rates the sample size advantage of the HCT is always less than that suggested by a naïve calculation, and that the RCT can in some cases be more efficient.
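
To illustrate where the naïve 4-fold ratio comes from, the following is a minimal sketch under assumptions that are not stated in the abstract: a normally distributed outcome with known common standard deviation, a one-sided test, 1:1 allocation in the RCT, and a historical control mean treated as a fixed, known constant in the HCT. The function name and parameters are hypothetical and chosen only for illustration.

```python
from math import ceil
from statistics import NormalDist


def naive_sample_sizes(delta, sigma, alpha=0.025, power=0.9):
    """Naive total sample sizes for an HCT and a 1:1 RCT (illustrative only).

    Assumptions (not taken from the article): normal outcome, known common
    standard deviation `sigma`, one-sided level `alpha`, and a historical
    control mean treated as a known constant in the HCT.
    """
    z = NormalDist().inv_cdf
    # Common factor in both formulas: (z_{1-alpha} + z_{power})^2 * sigma^2 / delta^2
    k = (z(1 - alpha) + z(power)) ** 2 * sigma ** 2 / delta ** 2
    n_hct = k              # one-sample test against a fixed control mean
    n_rct_total = 2 * (2 * k)  # two arms, each needing 2k subjects
    return n_hct, n_rct_total


n_hct, n_rct = naive_sample_sizes(delta=0.5, sigma=1.0)
print(ceil(n_hct), ceil(n_rct), n_rct / n_hct)
```

Under these assumptions the ratio equals 4 regardless of the effect size, standard deviation, significance level, or power, because the HCT calculation ignores the sampling variability of the historical control estimate. That omitted variability is the reason the two calculations correspond to different null hypotheses.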