ABSTRACT
This paper compares confidence intervals (CIs) with a sensitivity analysis, the number needed to disturb (NNTD), in the analysis of research findings expressed as ‘effect’ sizes. Using 1,000 simulations of randomised trials with up to 1,000 cases each, the paper shows that the two approaches produce very similar outcomes, and that each is highly predictable from the other. CIs are intended as a measure of likelihood or uncertainty in the results, showing the range of effect sizes that could have been produced by random sampling variation alone. NNTD is intended as a measure of the robustness of the effect size to any variation, including that produced by missing data. Given that the two are largely equivalent and interchangeable under the conditions tested here, the paper suggests that both are really measures of robustness. It concludes that NNTD is to be preferred because it requires far fewer assumptions, is more tolerant of missing data, is easier to explain, and directly addresses the key question of whether the underlying effect size is zero.
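The comparison summarised above can be sketched in a minimal simulation. This is an illustrative sketch, not the paper's actual code: it assumes Cohen's d as the effect size, a conventional normal-approximation 95% CI, and NNTD taken as the absolute effect size multiplied by the number of cases in the smaller group; the true effect of 0.5 SDs, the group-size range, and all function names are assumptions for illustration.

```python
import math
import random
import statistics

random.seed(1)

def cohens_d(a, b):
    """Standardised mean difference (Cohen's d) between two groups."""
    na, nb = len(a), len(b)
    pooled_sd = math.sqrt(((na - 1) * statistics.variance(a)
                           + (nb - 1) * statistics.variance(b)) / (na + nb - 2))
    return (statistics.mean(b) - statistics.mean(a)) / pooled_sd

def ci95_lower(d, na, nb):
    """Lower bound of an approximate 95% CI for d (normal approximation)."""
    se = math.sqrt((na + nb) / (na * nb) + d * d / (2 * (na + nb)))
    return d - 1.96 * se

def nntd(d, na, nb):
    """Number needed to disturb: |effect size| times the smaller group size
    (one common formulation; an assumption here)."""
    return abs(d) * min(na, nb)

# 1,000 simulated two-arm trials with an assumed true effect of 0.5 SDs
lower_bounds, nntds = [], []
for _ in range(1000):
    n = random.randint(10, 500)          # up to 1,000 cases per trial in total
    a = [random.gauss(0.0, 1.0) for _ in range(n)]
    b = [random.gauss(0.5, 1.0) for _ in range(n)]
    d = cohens_d(a, b)
    lower_bounds.append(ci95_lower(d, n, n))
    nntds.append(nntd(d, n, n))

# Pearson correlation between the CI lower bound and NNTD across trials
mx, my = statistics.mean(lower_bounds), statistics.mean(nntds)
cov = sum((x - mx) * (y - my) for x, y in zip(lower_bounds, nntds))
r = cov / math.sqrt(sum((x - mx) ** 2 for x in lower_bounds)
                    * sum((y - my) ** 2 for y in nntds))
print(f"correlation between CI lower bound and NNTD: {r:.2f}")
```

Under these assumptions the two quantities track each other closely across trials, consistent with the paper's claim that each is highly predictable from the other: both grow with the size of the observed effect and with the number of cases.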
Acknowledgements
With thanks to Jonathan Gorard for helpful discussions.
Disclosure statement
No potential conflict of interest was reported by the author.
Additional information
Notes on contributors
Stephen Gorard
Stephen Gorard is Professor of Education and Public Policy, and Director of the Evidence Centre for Education, Durham University (https://www.dur.ac.uk/). He is a Fellow of the Academy of Social Sciences, a member of the British Academy grants panel, and Lead Editor for Review of Education. His work concerns the robust evaluation of education as a lifelong process, focused on issues of equity, especially regarding school intakes.