Original Articles

Modelling and simulating early stopping of RCTs: a case study of early stop due to harm

Pages 513-526 | Received 26 Mar 2012, Accepted 29 Apr 2012, Published online: 05 Sep 2012
 

Abstract

Despite efforts from regulatory agencies (e.g. NIH, FDA), recent systematic reviews of randomised controlled trials (RCTs) show that top medical journals continue to publish trials without requiring authors to report details for readers to evaluate early stopping decisions carefully. This article presents a systematic way of modelling and simulating interim monitoring decisions of RCTs. By taking an approach that is both general and rigorous, the proposed framework models and evaluates early stopping decisions of RCTs based on a clear and consistent set of criteria. The framework allows decision analysts to generate and quickly answer ‘what-if’ questions by simulating alternate trial scenarios. I illustrate the framework with a case study of an RCT that was stopped early due to harm. This was a trial of vitamin A supplement in relation to HIV transmission from mother-to-child through breastfeeding.

Acknowledgements

I thank Patrick Grim, Eric Dietrich, Jay Kadane, Teddy Seidenfeld, Hubert Wong, Alan Richardson, Nils-Eric Sahlin and the audiences at the epistemology of modelling and simulation national conference (Pittsburgh, April 2011). A modified version of this study was also presented at the conference New Science – New Risks (March 2012) at the Center for Philosophy of Science, University of Pittsburgh, for which I am also grateful. I am most grateful to Paul Bartha for his guidance, useful comments and criticisms. This study was supported by the Department of Philosophy at UBC and travel grants from the Center for Philosophy of Science at the University of Pittsburgh.

Notes

1. The DMC's responsibilities include safeguarding the interests of trial participants by assessing the safety and efficacy of interventions, and keeping vigilance over the conduct of the trial. These responsibilities imply providing recommendations about stopping, modifying or continuing the trial on the basis of interim data analyses (Ellenberg, Fleming, and DeMets 2002).

2. Stanev (2011) is an example of modelling early stop due to benefit, and Stanev (2012) is an example of modelling early stop due to futility.

3. This point about counterfactual questions indicates a Woodward-style intuition about explanations, namely that we have an explanation if we can answer questions counterfactually. ‘We see whether and how some factor or event is causally or explanatorily relevant to another when we see whether (and if so, how) changes in the former are associated with changes in the latter’ (Woodward 2003, p. 14).

4. Stanev (2011) is an example of such evaluations.

5. I should be precise with the terms efficacy and harm. For efficacy, I follow Gordis (2009): efficacy is the extent of reduction in the primary event of interest – e.g. death, HIV mother-to-child transmission – by use of an intervention. For a new intervention, efficacy = [(rate in those who received the standard) − (rate in those who received the new)]/(rate in those who received the standard). In this fashion, the risks of death or of transmitting HIV in each group can be calculated, and the reduction in risk as seen in frequencies is what we refer to as efficacy (Gordis 2009, p. 152). The term ‘harm’, however, is often ambiguous in practice. A harmful event often means either an unfavourable event such as a ‘negative’ trend – where the trend for efficacy is in the opposite direction, that is, in favour of placebo or standard treatment – or an altogether independent event such as toxicity or an adverse effect. In this article, harm means a negative trend: the trend for efficacy is in the opposite direction, i.e. against vitamin A.

6. Including infection or death during the intrauterine period.

7. RR = 1.38, 95% CI 1.09–1.76; p-value = 0.01.

8. Benefits of vitamin A supplementation include reduced morbidity and mortality rates among African children, such as infants suffering from measles; see e.g. Coutsoudis, Broughton, and Coovadia (1991).

9. With 16% of the transmitting mothers versus 6% of the non-transmitting mothers having severe vitamin A deficiency; p-value = 0.05 (Greenberg et al. 1997, p. 325).

10. Low vitamin A was associated with increased mother-to-child transmission of HIV; p-value < 0.0001 (Semba et al. 1991, p. 1594).

11. ‘This may not be cost-effective to pursue unless such a counselling programme is already in place as part of a strategy to reduce mother-to-child transmission’ (Fawzi et al. 2002, p. 1942).

12. 95% CI 1.09–1.76; ‘15.4% were infected by 6 weeks, compared with 21.2% of those whose mothers received vitamin A’ (Fawzi et al. 2002).

13. The Pocock rule as a spending function of the overall type I error is given by $\alpha(t) = \alpha \ln(1 + (e - 1)t)$, where $\alpha$ is the overall type I error, $\alpha(t)$ is the nominal type I error at $t$, and $t$ is the information fraction of the trial. Note: I follow Proschan et al. (2006, chap. 5) in their Brownian motion approach to the computation of spending functions.

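14. The O–F rule as a spending function of the overall type I error is given by $\alpha(t) = 4\left(1 - \Phi\left(z_{\alpha/4}/t^{1/2}\right)\right)$, where $\alpha$ is the overall type I error, $\alpha(t)$ is the nominal type I error at $t$, $t$ is the information fraction of the trial, $\Phi$ is the standard normal distribution function and $z_{\alpha/4}$ is its upper $\alpha/4$ quantile, so that $\alpha(1) = \alpha$; see Proschan et al. (2006, chap. 5) for a method of computing spending functions.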

15. ‘Of the children whose mothers did not receive vitamin A, 15.4% were infected by 6 weeks, compared with 21.2% of those whose mothers received vitamin A. At 6 months, the cumulative incidences were 22.4% and 28.1%, respectively, and at 24 months they were 33.8% and 42.4%, respectively’ (Fawzi et al. 2002, p. 1938).

16. These are based on a two-sided statistical test of proportions, i.e. the difference between two independent proportions (z-test).

17. Efficacy = [(event rate in multivitamin group) − (event rate in vitamin A group)]/(event rate in multivitamin group) = (0.36−0.21)/(0.36) ≈ 0.417.

18. Based on a sensitivity power analysis for computing required effect size given α = 0.05, power = 0.9 and n = 180 per group.

19. (0.16−0.06)/0.16 = 0.625.

20. Based on a sensitivity power analysis for computing required effect size given α = 0.001, power = 0.9 and n = 180 per group.

21. The assumption here is that a trial that goes to completion is capable of changing clinical practice, whereas a shorter one is not, thus the difference in the loss function above.
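
The quantities defined in notes 5, 13, 14 and 16 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the simulations in the article: the function names are hypothetical, the O–F function is written so that it spends exactly α at t = 1, the z-test uses a pooled standard error (the notes do not say whether the pooled or unpooled form was used), and the four equally spaced looks in the usage loop are chosen only for illustration.

```python
import math
from statistics import NormalDist

# Standard normal distribution (mean 0, sd 1), used for Phi and its quantiles.
_STD_NORMAL = NormalDist()


def pocock_spending(t: float, alpha: float = 0.05) -> float:
    """Pocock-type spending function (note 13): alpha * ln(1 + (e - 1) * t)."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("information fraction t must lie in [0, 1]")
    return alpha * math.log(1.0 + (math.e - 1.0) * t)


def obf_spending(t: float, alpha: float = 0.05) -> float:
    """O-F-type spending function (note 14): 4 * (1 - Phi(z_{alpha/4} / sqrt(t)))."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("information fraction t must lie in [0, 1]")
    if t == 0.0:
        return 0.0  # nothing is spent before any information accrues
    z = _STD_NORMAL.inv_cdf(1.0 - alpha / 4.0)  # upper alpha/4 quantile
    return 4.0 * (1.0 - _STD_NORMAL.cdf(z / math.sqrt(t)))


def efficacy(rate_standard: float, rate_new: float) -> float:
    """Relative reduction in event rate (note 5): (standard - new) / standard."""
    return (rate_standard - rate_new) / rate_standard


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Two-sided z-test for two independent proportions (note 16).

    Assumption: pooled standard error. Returns (z statistic, p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = 2.0 * (1.0 - _STD_NORMAL.cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical schedule of four equally spaced interim looks.
    for t in (0.25, 0.5, 0.75, 1.0):
        print(f"t={t:.2f}  Pocock={pocock_spending(t):.4f}  O-F={obf_spending(t):.4f}")
    # Rates taken from notes 17 and 19.
    print(efficacy(0.36, 0.21), efficacy(0.16, 0.06))
```

Both spending functions reach the overall α at t = 1, but the O–F form spends far less of it at early looks than the Pocock form, which is why it is the more conservative rule for early stopping.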
