References
- Amrhein, V., S. Greenland, and B. McShane. 2019a. Scientists rise up against statistical significance. Nature 567:305–307. doi:https://doi.org/10.1038/d41586-019-00857-9.
- Amrhein, V., D. Trafimow, and S. Greenland. 2019b. Inferential statistics as descriptive statistics: There is no replication crisis if we don’t expect replication. The American Statistician 73 (sup1):262–270. doi:https://doi.org/10.1080/00031305.2018.1543137.
- Bayarri, M. J., D. J. Benjamin, J. O. Berger, and T. Sellke. 2016. Rejection odds and rejection ratios: A proposal for statistical practice in testing hypotheses. Journal of Mathematical Psychology 72:90–103. doi:https://doi.org/10.1016/j.jmp.2015.12.007.
- Benjamin, D. J., and J. O. Berger. 2019. Three recommendations for improving the use of p-values. The American Statistician 73 (sup1):186–191. doi:https://doi.org/10.1080/00031305.2018.1543135.
- Benjamin, D. J., J. O. Berger, M. Johannesson, B. A. Nosek, E. J. Wagenmakers, and R. Berk. 2018. Redefine statistical significance. Nature Human Behaviour 2:6–10. doi:https://doi.org/10.1038/s41562-017-0189-z.
- Betensky, R. A. 2019. The p-value requires context, not a threshold. The American Statistician 73 (sup1):115–117. doi:https://doi.org/10.1080/00031305.2018.1529624.
- Bhatt, D. L., and C. Mehta. 2016. Adaptive designs for clinical trials. New England Journal of Medicine 375 (1):65–74. doi:https://doi.org/10.1056/NEJMra1510061.
- Bickel, D. R. 2021. Null hypothesis significance testing defended and calibrated by Bayesian model checking. The American Statistician 75 (3):249–255.
- Blume, J. D., L. D’Agostino McGowan, W. D. Dupont, and R. A. Greevy Jr. 2018. Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses. PLoS One 13 (3):e0188299. doi:https://doi.org/10.1371/journal.pone.0188299.
- Blume, J. D., R. A. Greevy, V. F. Welty, J. R. Smith, and W. D. Dupont. 2019. An introduction to second-generation p-values. The American Statistician 73 (sup1):157–167. doi:https://doi.org/10.1080/00031305.2018.1537893.
- Cao, B., Y. Wang, D. Wen, W. Liu, J. Wang, and G. Fan. 2020. A trial of lopinavir–ritonavir in adults hospitalized with severe Covid-19. New England Journal of Medicine 382 (19):1787–1799. doi:https://doi.org/10.1056/NEJMoa2001282.
- Cole, S. R., J. K. Edwards, and S. Greenland. 2021. Surprise! American Journal of Epidemiology 190 (2):191–193. doi:https://doi.org/10.1093/aje/kwaa136.
- Colquhoun, D. 2014. An investigation of the false discovery rate and the misinterpretation of p-values. Royal Society Open Science 1:140216. doi:https://doi.org/10.1098/rsos.140216.
- Colquhoun, D. 2019. The false positive risk: A proposal concerning what to do about p-values. The American Statistician 73 (sup1):192–201. doi:https://doi.org/10.1080/00031305.2018.1529622.
- de Bragança Pereira, C. A., and S. Wechsler. 1993. On the concept of p-value. Brazilian Journal of Probability and Statistics 7:159–177.
- DeGroot, M. H. 1986. Probability and Statistics. Reading, MA: Addison-Wesley Publishing Company.
- Efron, B. 2010. Large-scale inference: Empirical Bayes methods for estimation, testing, and prediction. Cambridge: Cambridge University Press.
- Fisher, R. A. 1922. On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society of London. Series A 222 (594–604):309–368.
- Foulley, J. 2020. Benjamin, D. J., and Berger, J. O. (2019), ‘Three recommendations for improving the use of p-values’, The American Statistician, 73, 186–191: Comment by Foulley. The American Statistician 74 (1):101–102. doi:https://doi.org/10.1080/00031305.2019.1668850.
- Fraser, D. A. S. 2017. P-values: The insight to modern statistical inference. Annual Review of Statistics and Its Application 4:1–14. doi:https://doi.org/10.1146/annurev-statistics-060116-054139.
- Fraser, D. A. S. 2019. The P-value function and statistical inference. The American Statistician 73 (sup1):135–147. doi:https://doi.org/10.1080/00031305.2018.1556735.
- Fricker, R. D., Jr, K. Burke, X. Han, and W. H. Woodall. 2019. Assessing the statistical analyses used in Basic and Applied Social Psychology after their p-value ban. The American Statistician 73 (sup1):374–384. doi:https://doi.org/10.1080/00031305.2018.1537892.
- Gannon, M. A., C. A. de Bragança Pereira, and A. Polpo. 2019. Blending Bayesian and classical tools to define optimal sample-size-dependent significance levels. The American Statistician 73 (sup1):213–222. doi:https://doi.org/10.1080/00031305.2018.1518268.
- Goodman, S. N. 2018. How sure are you of your result? Put a number on it. Nature 564:7. doi:https://doi.org/10.1038/d41586-018-07589-2.
- Goodman, W. M., S. E. Spruill, and E. Komaroff. 2019. A proposed hybrid effect size plus p-value criterion: Empirical evidence supporting its use. The American Statistician 73 (sup1):168–185. doi:https://doi.org/10.1080/00031305.2018.1564697.
- Greenland, S. 2017. Invited commentary: The need for cognitive science in methodology. American Journal of Epidemiology 186 (6):639–645. doi:https://doi.org/10.1093/aje/kwx259.
- Greenland, S. 2019. Valid P-values behave exactly as they should: Some misleading criticisms of p-values and their resolution with s-values. The American Statistician 73 (sup1):106–114. doi:https://doi.org/10.1080/00031305.2018.1529625.
- Greenland, S., S. J. Senn, K. J. Rothman, J. B. Carlin, C. Poole, S. N. Goodman, and D. G. Altman. 2016. Statistical tests, p values, confidence intervals and power: A guide to misinterpretations. European Journal of Epidemiology 31:337–350. doi:https://doi.org/10.1007/s10654-016-0149-3.
- Held, L. 2019. The assessment of intrinsic credibility and a new argument for p < 0.005. Royal Society Open Science 6 (3).
- Held, L., and M. Ott. 2018. On p-values and Bayes factors. Annual Review of Statistics and Its Application 5:393–419. doi:https://doi.org/10.1146/annurev-statistics-031017-100307.
- Krueger, J. I., and P. R. Heck. 2019. Putting the p-value in its place. The American Statistician 73 (sup1):122–128. doi:https://doi.org/10.1080/00031305.2018.1470033.
- Lakens, D., F. G. Adolfi, C. J. Albers, F. Anvari, M. A. J. Apps, and S. E. Argamon. 2018. Justify your alpha. Nature Human Behaviour 2:168–171. doi:https://doi.org/10.1038/s41562-018-0311-x.
- Lash, T. L. 2017. The harm done to reproducibility by the culture of null hypothesis significance testing. American Journal of Epidemiology 186 (6):627–635. doi:https://doi.org/10.1093/aje/kwx261.
- Liao, J. G., Y. Lin, Z. E. Selvanayagam, and W. J. Shih. 2004. A mixture model for estimating local false discovery rate in DNA microarray analysis. Bioinformatics 20 (16):2694–2701. doi:https://doi.org/10.1093/bioinformatics/bth310.
- Matthews, R. A. J. 2018. Beyond ‘significance’: Principles and practice of the analysis of credibility. Royal Society Open Science 5 (1):171047. doi:https://doi.org/10.1098/rsos.171047.
- Matthews, R. A. J. 2019. Moving towards the post p < 0.05 era via the analysis of credibility. The American Statistician 73 (sup1):202–212.
- McShane, B. B., D. Gal, A. Gelman, C. Robert, and J. L. Tackett. 2019. Abandon statistical significance. The American Statistician 73 (sup1):235–245. doi:https://doi.org/10.1080/00031305.2018.1527253.
- Northfelt, D. W., B. J. Dezube, and J. A. Thommes. 1998. Pegylated-liposomal doxorubicin versus doxorubicin, bleomycin, and vincristine in the treatment of AIDS-related Kaposi’s sarcoma: Results of a randomized phase III clinical trial. Journal of Clinical Oncology 16 (7):2445–2451. doi:https://doi.org/10.1200/JCO.1998.16.7.2445.
- Pogrow, S. 2019. How effect size (practical significance) misleads clinical practice: The case for switching to practical benefit to assess applied research findings. The American Statistician 73 (sup1):223–234. doi:https://doi.org/10.1080/00031305.2018.1549101.
- Quatto, P., E. Ripamonti, and D. Marasini. 2020. Best uses of p-values and complementary measures in medical research: Recent developments in the frequentist and Bayesian frameworks. Journal of Biopharmaceutical Statistics 30:121–142. doi:https://doi.org/10.1080/10543406.2019.1632874.
- Ripamonti, E., C. Lloyd, and P. Quatto. 2017. Contemporary frequentist views of the 2x2 binomial trial. Statistical Science 32 (4):600–615. doi:https://doi.org/10.1214/17-STS627.
- Sellke, T., M. J. Bayarri, and J. O. Berger. 2001. Calibration of p values for testing precise null hypotheses. The American Statistician 55:62–71. doi:https://doi.org/10.1198/000313001300339950.
- Simpson, D., H. Rue, A. Riebler, T. G. Martins, and S. H. Sørbye. 2017. Penalising model component complexity: A principled, practical approach to constructing priors. Statistical Science 32 (1):1–28. doi:https://doi.org/10.1214/16-STS576.
- Trafimow, D. 2017. Using the coefficient of confidence to make the philosophical switch from a posteriori to a priori inferential statistics. Educational and Psychological Measurement 77 (5):831–854. doi:https://doi.org/10.1177/0013164416667977.
- Trafimow, D. 2019. A frequentist alternative to significance testing, p-values, and confidence intervals. Econometrics 7 (2):26. doi:https://doi.org/10.3390/econometrics7020026.
- Trafimow, D., V. Amrhein, C. N. Areshenkoff, C. Barrera-Causil, and E. J. Beh. 2018. Manipulating the alpha level cannot cure significance testing. Frontiers in Psychology 9:1–17. doi:https://doi.org/10.3389/fpsyg.2018.00699.
- Wagenmakers, E. J. 2007. A practical solution to the pervasive problems of p values. Psychonomic Bulletin & Review 14 (5):779–804. doi:https://doi.org/10.3758/BF03194105.
- Wasserstein, R. L., and N. A. Lazar. 2016. The ASA’s statement on p-values: Context, process, and purpose. The American Statistician 70:129–133. doi:https://doi.org/10.1080/00031305.2016.1154108.
- Wasserstein, R. L., A. L. Schirm, and N. A. Lazar. 2019. Moving to a world beyond ‘p < 0.05’. The American Statistician 73 (sup1):1–19.
- Wellek, S. 2017. A critical evaluation of the current ‘p-value controversy’. Biometrical Journal 59:854–872. doi:https://doi.org/10.1002/bimj.201700001.
- Wetzels, R., D. Matzke, M. D. Lee, J. N. Rouder, G. J. Iverson, and E. J. Wagenmakers. 2011. Statistical evidence in experimental psychology: An empirical comparison using 855 t tests. Perspectives on Psychological Science 6 (3):291–298. doi:https://doi.org/10.1177/1745691611406923.