The American Statistician Volume 76, 2022 - Issue 2

Publication Policies for Replicable Research and the Community-Wide False Discovery Rate

Joshua Habiger, Department of Statistics, Oklahoma State University, Stillwater, OK (corresponding author)

Ye Liang, Department of Statistics, Oklahoma State University, Stillwater, OK

https://orcid.org/0000-0002-6513-8962

Pages 131-141 | Received 12 Jun 2020, Accepted 17 Oct 2021, Published online: 04 Jan 2022

https://doi.org/10.1080/00031305.2021.1999857
References

Amrhein, V., Korner-Nievergelt, F., and Roth, T. (2017), “The Earth is Flat (p > 0.05): Significance Thresholds and the Crisis of Unreplicable Research,” PeerJ, 5. DOI: https://doi.org/10.7717/peerj.3544.
Amrhein, V., Trafimow, D., and Greenland, S. (2019), “Inferential Statistics as Descriptive Statistics: There is No Replication Crisis If We Don’t Expect Replication,” The American Statistician, 73, 262–270. DOI: https://doi.org/10.1080/00031305.2018.1543137.
Anderson, C. J., Bahník, Š., Barnett-Cowan, M., Bosco, F. A., Chandler, J., Chartier, C. R., Cheung, F., Christopherson, C. D., Cordes, A., Cremata, E. J., Penna, N. D., Estel, V., Fedor, A., Fitneva, S. A., Frank, M. C., Grange, J. A., Hartshorne, J. K., Hasselman, F., Henninger, F., van der Hulst, M., Jonas, K. J., Lai, C. K., Levitan, C. A., Miller, J. K., Moore, K. S., Meixner, J. M., Munafò, M. R., Neijenhuijs, K. I., Nilsonne, G., Nosek, B. A., Plessow, F., Prenoveau, J. M., Ricker, A. A., Schmidt, K., Spies, J. R., Stieger, S., Strohminger, N., Sullivan, G. B., van Aert, R. C. M., van Assen, M. A. L. M., Vanpaemel, L. M., Vianello, W., Voracek, M., and Zuni, K. (2016), “Response to Comment on ‘Estimating the Reproducibility of Psychological Science’,” Science, 351, 1037–1037.
Benjamin, D. J., J. O. Berger, M. Johannesson, B. A. Nosek, E. J. Wagenmakers, R. Berk, K. A. Bollen, B. Brembs, L. Brown, C. Camerer, D. Cesarini, C. D. Chambers, M. Clyde, T. D. Cook, P. De Boeck, Z. Dienes, A. Dreber, K. Easwaran, C. Efferson, E. Fehr, F. Fidler, A. P. Field, M. Forster, E. I. George, R. Gonzalez, S. Goodman, E. Green, D. P. Green, A. G. Greenwald, J. D. Hadfield, L. V. Hedges, L. Held, T. Hua Ho, H. Hoijtink, D. J. Hruschka, K. Imai, G. Imbens, J. P. A. Ioannidis, M. Jeon, J. H. Jones, M. Kirchler, D. Laibson, J. List, R. Little, A. Lupia, E. Machery, S. E. Maxwell, M. McCarthy, D. A. Moore, S. L. Morgan, M. Munafó, S. Nakagawa, B. Nyhan, T. H. Parker, L. Pericchi, M. Perugini, J. Rouder, J. Rousseau, V. Savalei, F. D. Schönbrodt, T. Sellke, B. Sinclair, D. Tingley, T. Van Zandt, S. Vazire, D. J. Watts, C. Winship, R. L. Wolpert, Y. Xie, C. Young, J. Zinman, and V. E. Johnson (2017), “Redefine Statistical Significance,” Nature Human Behaviour, 2, 6–10. DOI: https://doi.org/10.1038/s41562-017-0189-z.
Benjamini, Y. (2020), “Selective Inference: The Silent Killer of Replicability,” Harvard Data Science Review, 2, available at https://hdsr.mitpress.mit.edu/pub/l39rpgyc. DOI: https://doi.org/10.1162/99608f92.fc62b261.
Benjamini, Y., De Veaux, R. D., Efron, B., Evans, S., Glickman, M., Graubard, B. I., He, X., Meng, X.-L., Reid, N., and Stigler, S. M. (2021), “The ASA President’s Task Force Statement on Statistical Significance and Replicability,” The Annals of Applied Statistics, 15, 1084–1085. DOI: https://doi.org/10.1214/21-AOAS1501.
Benjamini, Y., and Hochberg, Y. (1995), “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing,” Journal of the Royal Statistical Society, Series B, 57, 289–300. DOI: https://doi.org/10.1111/j.2517-6161.1995.tb02031.x.
Berger, J. O. (1993), Statistical Decision Theory and Bayesian Analysis, New York: Springer Science & Business Media.
Berger, J. O., Boukai, B., and Wang, Y. (1997), “Unified Frequentist and Bayesian Testing of a Precise Hypothesis,” Statistical Science, 12, 133–148. DOI: https://doi.org/10.1214/ss/1030037904.
Berger, J. O., and Delampady, M. (1987), “Testing Precise Hypotheses” (with discussion), Statistical Science, 2, 317–352.
Betensky, R. A. (2019), “The p-Value Requires Context, Not a Threshold,” The American Statistician, 73(sup1), 115–117. DOI: https://doi.org/10.1080/00031305.2018.1529624.
Bickel, D. R. (2020), “Null Hypothesis Significance Testing Interpreted and Calibrated by Estimating Probabilities of Sign Errors: A Bayes-Frequentist Continuum,” The American Statistician, 75, 104–112. DOI: https://doi.org/10.1080/00031305.2020.1816214.
Bickel, D. R. (2021), “Null Hypothesis Significance Testing Defended and Calibrated by Bayesian Model Checking,” The American Statistician, 75, 249–255.
Cai, T. T., and Sun, W. (2009), “Simultaneous Testing of Grouped Hypotheses: Finding Needles in Multiple Haystacks,” Journal of the American Statistical Association, 104, 1467–1481. DOI: https://doi.org/10.1198/jasa.2009.tm08415.
Cai, T. T., Sun, W., and Wang, W. (2019), “Covariate-Assisted Ranking and Screening for Large-Scale Two-Sample Inference,” Journal of the Royal Statistical Society, Series B, 81, 187–234. DOI: https://doi.org/10.1111/rssb.12304.
Campbell, H., and Gustafson, P. (2019), “The World of Research Has Gone Berserk: Modeling the Consequences of Requiring ‘Greater Statistical Stringency’ for Scientific Publication,” The American Statistician, 73, 358–373. DOI: https://doi.org/10.1080/00031305.2018.1555101.
Cohen, J. (1988), Statistical Power Analysis for the Behavioral Sciences (2nd ed.), Hillsdale, NJ: Erlbaum.
Cohen, J. (1994), “The Earth is Round (p < 0.05),” The American Psychologist, 49, 997–1003.
Colquhoun, D. (2019), “The False Positive Risk: A Proposal Concerning What to Do About p-Values,” The American Statistician, 73, 192–201. DOI: https://doi.org/10.1080/00031305.2018.1529622.
Dumas-Mallet, E., Button, K. S., Boraud, T., Gonon, F., and Munafò, M. R. (2017), “Low Statistical Power in Biomedical Science: A Review of Three Human Research Domains,” Royal Society Open Science, 4, 160254. DOI: https://doi.org/10.1098/rsos.160254.
Efron, B. (2008), “Microarrays, Empirical Bayes and the Two-Group Model,” Statistical Science, 23, 1–22.
Efron, B. (2010), Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction, Volume 1 of Institute of Mathematical Statistics (IMS) Monographs, Cambridge: Cambridge University Press.
Efron, B., Tibshirani, R., Storey, J. D., and Tusher, P. (2001), “Empirical Bayes Analysis of a Microarray Experiment,” Journal of the American Statistical Association, 96, 1151–1160. DOI: https://doi.org/10.1198/016214501753382129.
Fisher, R. A. (1915), “Frequency Distribution of the Values of the Correlation Coefficient in Samples From an Indefinitely Large Population,” Biometrika, 10, 507–521. DOI: https://doi.org/10.2307/2331838.
Fisher, R. A. (1925), Statistical Methods for Research Workers, Edinburgh: Oliver and Boyd.
Gelman, A. (2019), “Don’t Calculate Post-Hoc Power Using Observed Estimate of Effect Size,” Annals of Surgery, 269(1), 9–10. DOI: https://doi.org/10.1097/SLA.0000000000002908.
Genovese, C., and Wasserman, L. (2002), “Operating Characteristic and Extensions of the False Discovery Rate Procedure,” Journal of the Royal Statistical Society, Series B, 64, 499–517. DOI: https://doi.org/10.1111/1467-9868.00347.
Gigerenzer, G. (2004), “Mindless Statistics,” The Journal of Socio-Economics, 33, 587–606. DOI: https://doi.org/10.1016/j.socec.2004.09.033.
Gilbert, D. T., King, G., Pettigrew, S., and Wilson, T. D. (2016), “Comment on ‘Estimating the Reproducibility of Psychological Science’,” Science, 351, 1037–1037.
Goodman, S. N. (1999), “Toward Evidence-Based Medical Statistics. 2: The Bayes Factor,” Annals of Internal Medicine, 130, 1005–1013. DOI: https://doi.org/10.7326/0003-4819-130-12-199906150-00019.
Goodman, W. M., Spruill, S. E., and Komaroff, E. (2019), “A Proposed Hybrid Effect Size Plus p-Value Criterion: Empirical Evidence Supporting Its Use,” The American Statistician, 73, 168–185. DOI: https://doi.org/10.1080/00031305.2018.1564697.
Grimes, D. R., Bauch, C. T., and Ioannidis, J. P. A. (2018), “Modelling Science Trustworthiness Under Publish or Perish Pressure,” Royal Society Open Science, 5, 171511. DOI: https://doi.org/10.1098/rsos.171511.
Habiger, J. D. (2017), “Adaptive False Discovery Rate Control for Heterogeneous Data,” Statistica Sinica, 27, 1731–1756.
Haller, H., and Krauss, S. (2002), “Misinterpretations of Significance: A Problem Students Share With Their Teachers,” Methods of Psychological Research, 7, 1–20.
Higginson, A. D., and Munafò, M. R. (2016), “Current Incentives for Scientists Lead to Underpowered Studies With Erroneous Conclusions,” PLoS Biology, 14, e2000995. DOI: https://doi.org/10.1371/journal.pbio.2000995.
Hubbard, R. (2004), “Alphabet Soup: Blurring the Distinctions Between p’s and α’s in Psychological Research,” Theory & Psychology, 14, 295–327.
Hubbard, R. (2019), “Will the ASA’s Efforts to Improve Statistical Practice be Successful? Some Evidence to the Contrary,” The American Statistician, 73, 31–35.
Hurlbert, S. H., Levine, R. A., and Utts, J. (2019), “Coup de Grâce for a Tough Old Bull: ‘Statistically Significant’ Expires,” The American Statistician, 73, 352–357. DOI: https://doi.org/10.1080/00031305.2018.1543616.
Hurlbert, S. H., and Lombardi, C. M. (2009), “Final Collapse of the Neyman–Pearson Decision Theoretic Framework and Rise of the Neofisherian,” Annales Zoologici Fennici, 46, 311–349. DOI: https://doi.org/10.5735/086.046.0501.
Ioannidis, J. P. (2005), “Why Most Published Research Findings Are False,” PLoS Medicine, 2, e124. DOI: https://doi.org/10.1371/journal.pmed.0020124.
Ioannidis, J. P., Hozo, I., and Djulbegovic, B. (2013), “Optimal Type I and Type II Error Pairs When the Available Sample Size is Fixed,” Journal of Clinical Epidemiology, 66, 903–910. DOI: https://doi.org/10.1016/j.jclinepi.2013.03.002.
Ioannidis, J. P. A. (2013), “Discussion: Why ‘An Estimate of the Science-Wise False Discovery Rate and Application to the Top Medical Literature’ is False,” Biostatistics, 15, 28–36. DOI: https://doi.org/10.1093/biostatistics/kxt036.
Ioannidis, J. P. A. (2019), “What Have We (Not) Learnt From Millions of Scientific Papers With p Values?” The American Statistician, 73, 20–25. DOI: https://doi.org/10.1080/00031305.2018.1447512.
Jager, L. R., and Leek, J. T. (2013), “An Estimate of the Science-Wise False Discovery Rate and Application to the Top Medical Literature,” Biostatistics, 15, 1–12. DOI: https://doi.org/10.1093/biostatistics/kxt007.
Johnson, D. H. (1999), “The Insignificance of Statistical Significance Testing,” The Journal of Wildlife Management, 63, 763–772. DOI: https://doi.org/10.2307/3802789.
Johnson, V. E. (2013), “Revised Standards for Statistical Evidence,” Proceedings of the National Academy of Sciences, 110, 19313–19317. DOI: https://doi.org/10.1073/pnas.1313476110.
Johnson, V. E., Payne, R. D., Wang, T., Asher, A., and Mandal, S. (2017), “On the Reproducibility of Psychological Science,” Journal of the American Statistical Association, 112 (517), 1–10. DOI: https://doi.org/10.1080/01621459.2016.1240079.
Kennedy-Shaffer, L. (2019), “Before p < 0.05 to Beyond p < 0.05: Using History to Contextualize p-Values and Significance Testing,” The American Statistician, 73, 82–90.
Krantz, D. H. (1999), “The Null Hypothesis Testing Controversy in Psychology,” Journal of the American Statistical Association, 94, 1372–1381. DOI: https://doi.org/10.1080/01621459.1999.10473888.
Lakens, D., Adolfi, F. G., Albers, C., Anvari, F., Apps, M., Argamon, S., Baguley, T., Becker, R., Benning, S. D., Bradford, D., Buchanan, E. M., Caldwell, A. R., Calster, B., Carlsson, R., Chin Chen, S., Chung, B., Colling, L. J., Collins, G., Crook, Z., Cross, E. S., Daniels, S., Danielsson, H., DeBruine, L., Dunleavy, D. J., Earp, B., Feist, M. I., Ferrell, J. D., Field, J. G., Fox, N. W., Friesen, A., Gomes, C., Gonzalez-Marquez, M., Grange, J., Grieve, A., Guggenberger, R., Grist, J., Harmelen, A.-L., Hasselman, F., Hochard, K. D., Hoffarth, M., Holmes, N., Ingre, M., Isager, P., Isotalus, H., Johansson, C., Juszczyk, K., Kenny, D., Khalil, A., Konat, B., Lao, J., Larsen, E. G., Lodder, G., Lukavský, J., Madan, C., Manheim, D., Martin, S. R., Martin, A. E., Mayo, D., McCarthy, R. J., McConway, K., McFarland, C., Nio, A., Nilsonne, G., Oliveira, C. L., Xivry, J. O., Parsons, S., Pfuhl, G., Quinn, K., Sakon, J. J., Saribay, S. A., Schneider, I., Selvaraju, M., Sjoerds, Z., Smith, S. G., Smits, T., Spies, J. R., Sreekumar, V., Steltenpohl, C. N., Stenhouse, N., Wiatkowski, W., Vadillo, M. A., Assen, M. V., Williams, M., Williams, S. E., Williams, D. R., Yarkoni, T., Ziano, I., and Zwaan, R. A. (2018), “Justify Your Alpha,” Nature Human Behaviour, 2, 168–171. DOI: https://doi.org/10.1038/s41562-018-0311-x.
Matthews, R. (2021), “The p-Value Statement, Five Years On,” Significance, 18 (2), 16–19. DOI: https://doi.org/10.1111/1740-9713.01505.
Matthews, R. A. (2001), “Why Should Clinicians Care About Bayesian Methods?,” Journal of Statistical Planning and Inference, 94, 43–58.
McCann, M. H., and Habiger, J. D. (2020), “The Detection of Nonnegligible Directional Effects With Associated Measures of Statistical Significance,” The American Statistician, 74, 213–217. DOI: https://doi.org/10.1080/00031305.2018.1497538.
McLachlan, G. J., and Peel, D. (2000), Finite Mixture Models, New York: Wiley Series in Probability and Statistics.
McShane, B. B., Böckenholt, U., and Hansen, K. T. (2020), “Average Power: A Cautionary Note,” Advances in Methods and Practices in Psychological Science, 3, 185–199. DOI: https://doi.org/10.1177/2515245920902370.
McShane, B. B., and Gal, D. (2016), “Blinding Us to the Obvious? The Effect of Statistical Training on the Evaluation of Evidence,” Management Science, 62, 1707–1718. DOI: https://doi.org/10.1287/mnsc.2015.2212.
McShane, B. B., and Gal, D. (2017), “Statistical Significance and the Dichotomization of Evidence,” Journal of the American Statistical Association, 112, 885–895.
McShane, B. B., Gal, D., Gelman, A., Robert, C., and Tackett, J. L. (2019), “Abandon Statistical Significance,” The American Statistician, 73, 235–245. DOI: https://doi.org/10.1080/00031305.2018.1527253.
McShane, B. B., Tackett, J. L., Böckenholt, U., and Gelman, A. (2019), “Large-Scale Replication Projects in Contemporary Psychological Research,” The American Statistician, 73, 99–105. DOI: https://doi.org/10.1080/00031305.2018.1505655.
Morton, N. E. (1955), “Sequential Tests for the Detection of Linkage,” American Journal of Human Genetics, 7, 277.
Moss, J., and De Bin, R. (2021+), “Modelling Publication Bias and p-Hacking,” Biometrics, available at https://onlinelibrary.wiley.com/doi/abs/10.1111/biom.13560.
OSC (2015), “Estimating the Reproducibility of Psychological Science,” Science, 349.
Robbins, H. (1951), “Asymptotically Subminimax Solutions of Compound Statistical Decision Problems,” in Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, 1950, pp. 131–148. Berkeley: University of California Press.
Sellke, T., Bayarri, M. J., and Berger, J. O. (2001), “Calibration of p-Values for Testing Precise Null Hypotheses,” The American Statistician, 55, 62–71. DOI: https://doi.org/10.1198/000313001300339950.
Shi, H., and Yin, G. (2021), “Reconnecting p-Value and Posterior Probability Under One- and Two-Sided Tests,” The American Statistician, 75, 265–275. DOI: https://doi.org/10.1080/00031305.2020.1717621.
Storey, J. (2003), “The Positive False Discovery Rate: A Bayesian Interpretation and the q-Value,” The Annals of Statistics, 31, 2012–2035. DOI: https://doi.org/10.1214/aos/1074290335.
Sun, W., and Cai, T. (2007), “Oracle and Adaptive Compound Decision Rules for False Discovery Rate Control,” Journal of the American Statistical Association, 102, 901–912. DOI: https://doi.org/10.1198/016214507000000545.
Szucs, D., and Ioannidis, J. (2017), “When Null Hypothesis Significance Testing is Unsuitable for Research: A Reassessment,” Frontiers in Human Neuroscience, 11, 390. DOI: https://doi.org/10.3389/fnhum.2017.00390.
Thomas, L. (1997), “Retrospective Power Analysis,” Conservation Biology, 11, 276–280. DOI: https://doi.org/10.1046/j.1523-1739.1997.96102.x.
Wasserstein, R. L., and Lazar, N. A. (2016), “The ASA’s Statement on p-Values: Context, Process, and Purpose,” The American Statistician, 70, 129–133. DOI: https://doi.org/10.1080/00031305.2016.1154108.
Wasserstein, R. L., Schirm, A. L., and Lazar, N. A. (2019), “Moving to a World Beyond p < 0.05,” The American Statistician, 73, 1–19.
Yuan, K.-H., and Maxwell, S. (2005), “On the Post Hoc Power in Testing Mean Differences,” Journal of Educational and Behavioral Statistics, 30, 141–167. DOI: https://doi.org/10.3102/10769986030002141.
