
Knowing how effective an intervention, treatment, or manipulation is and increasing replication rates: accuracy in parameter estimation as a partial solution to the replication crisis

Pages 59-77 | Received 20 Jun 2017, Accepted 10 Feb 2020, Published online: 07 May 2020
 

Abstract

Objective

Although basing conclusions on confidence intervals for effect size estimates is preferred over relying on null hypothesis significance testing alone, confidence intervals in psychology are typically very wide. One reason may be a lack of easily applicable methods for planning studies to achieve sufficiently tight confidence intervals. This paper presents tables and freely accessible tools to facilitate planning studies for the desired accuracy in parameter estimation for a common effect size (Cohen’s d). In addition, the importance of such accuracy is demonstrated using data from the Reproducibility Project: Psychology (RPP).

Results

It is shown that the sampling distribution of Cohen’s d is very wide unless sample sizes are considerably larger than what is common in psychology studies. This means that effect size estimates can vary substantially from sample to sample, even with perfect replications. The RPP replications’ confidence intervals for Cohen’s d have widths of around 1 standard deviation (95% confidence interval from 1.05 to 1.39). Therefore, point estimates obtained in replications are likely to vary substantially from the estimates from earlier studies.
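How interval width depends on sample size can be illustrated with a minimal sketch. It assumes the common large-sample normal approximation to the standard error of a between-samples Cohen's d, which may differ from the procedure used in this paper; the function names `approx_d_ci` and `approx_ci_width` are hypothetical and serve only as illustration.

```python
import math
from statistics import NormalDist


def approx_d_ci(d, n1, n2, conf_level=0.95):
    """Approximate confidence interval for a between-samples Cohen's d.

    Assumption: large-sample normal approximation with
    var(d) ~ (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)).
    This is a sketch, not an exact interval.
    """
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return d - z * se, d + z * se


def approx_ci_width(d, n1, n2, conf_level=0.95):
    """Full width (upper bound minus lower bound) of the approximate interval."""
    lower, upper = approx_d_ci(d, n1, n2, conf_level)
    return upper - lower


if __name__ == "__main__":
    # Illustrative values only: d = 0.5 with equal group sizes.
    for n_per_group in (20, 50, 100, 400):
        width = approx_ci_width(0.5, n_per_group, n_per_group)
        print(f"n per group = {n_per_group:4d}: approximate 95% CI width = {width:.2f}")
```

Under this approximation the width shrinks roughly with the square root of the sample size, so halving the width requires about four times as many participants.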

Conclusion

The implication is that researchers in psychology, and funders, will have to get used to conducting considerably larger studies if they are to build a strong evidence base.

Acknowledgements

We would like to thank Robert Calin-Jageman and Geoff Cumming for constructive corrections on the preprint of this paper, Guy Prochilo for pointing out an inconsistency in the algorithms and constructive comments, and the editor Rob Ruiter and reviewers Rink Hoekstra and David Trafimow for constructive comments during the peer review process.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 Statistically, all effects are simply associations: whether an association involves variables that are manipulated or only measured is theoretically crucial but statistically irrelevant.

2 Following Lakens (2013), we use the s subscript to unequivocally refer to the between-samples Cohen’s d; note that Goulet-Pelletier and Cousineau (2018) use dp for this same form of Cohen’s d.

3 Note that the exact distribution of Pearson’s r is available in the R package SuppDists (Wheeler, 2016).

4 A number of free and easy-to-use tools exist that can help get a handle on how different values of r and d convert to each other. One is the FromR2D2 spreadsheet by Daniel Lakens, hosted at the Open Science Framework at https://osf.io/ixgcd. In addition, a family of conversion functions is available in the R package userfriendlyscience, such as convert.r.to.d and convert.d.to.r.
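For readers who want to see the arithmetic behind such conversions, the following is a minimal sketch using the standard formulas for two groups of equal size, d = 2r / sqrt(1 - r^2) and r = d / sqrt(d^2 + 4). The function names `r_to_d` and `d_to_r` are hypothetical; they are not the API of userfriendlyscience or of the spreadsheet mentioned above, and those tools may handle unequal group sizes differently.

```python
import math


def r_to_d(r):
    """Convert Pearson's r to Cohen's d, assuming two groups of equal size."""
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return 2 * r / math.sqrt(1 - r ** 2)


def d_to_r(d):
    """Convert Cohen's d to Pearson's r, assuming two groups of equal size."""
    return d / math.sqrt(d ** 2 + 4)


if __name__ == "__main__":
    # Illustrative values only.
    for r in (0.1, 0.3, 0.5):
        d = r_to_d(r)
        print(f"r = {r:.2f} -> d = {d:.3f} -> back to r = {d_to_r(d):.2f}")
```

Under the equal-group-size assumption the two functions are inverses of each other, so a round trip returns the starting value.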

5 Note that whether this single confidence interval of [.10; .90] is among that 95% is not known: knowing this would require knowing the population value, knowledge of which would make collecting a sample redundant in the first place.

6 The analysis script and produced files are all available at the Open Science Framework at https://osf.io/5ejd8.

7 For some of these studies, no effect size estimate was available. For these studies, we constructed the confidence interval around zero to obtain the narrowest possible (i.e. most optimistic) confidence intervals.