The American Statistician Volume 73, 2019 - Issue sup1: Statistical Inference in the 21st Century: A World Beyond p < 0.05

Open access

Adopting More Holistic Approaches

Inferential Statistics as Descriptive Statistics: There Is No Replication Crisis if We Don’t Expect Replication

Valentin Amrhein, Zoological Institute, University of Basel, Basel, Switzerland (corresponding author)

David Trafimow, Department of Psychology, New Mexico State University, Las Cruces, NM

Sander Greenland, Department of Epidemiology and Department of Statistics, University of California, Los Angeles, CA

Pages 262-270 | Received 19 Mar 2018, Accepted 25 Oct 2018, Published online: 20 Mar 2019

https://doi.org/10.1080/00031305.2018.1543137

References

Amrhein, V. (2018), “Inferential Statistics is Not Inferential,” sci five, University of Basel, available at http://bit.ly/notinfer.
Amrhein, V., and Greenland, S. (2018), “Remove, Rather Than Redefine, Statistical Significance,” Nature Human Behaviour, 2, 4. DOI: 10.1038/s41562-017-0224-0.
Amrhein, V., Korner-Nievergelt, F., and Roth, T. (2017), “The Earth is Flat (p > 0.05): Significance Thresholds and the Crisis of Unreplicable Research,” PeerJ, 5, e3544. DOI: 10.7717/peerj.3544.
Amrhein, V., Trafimow, D., and Greenland, S. (2018), “Abandon Statistical Inference,” PeerJ Preprints, 6, e26857v1.
Baker, M. (2016), “Is There a Reproducibility Crisis?” Nature, 533, 452–454. DOI: 10.1038/533452a.
Barnard, G. A. (1996), “Fragments of a Statistical Autobiography,” Student, 1, 257–268.
Bayarri, M. J., and Berger, J. O. (2000), “P-values for Composite Null Models,” Journal of the American Statistical Association, 95, 1127–1142. DOI: 10.2307/2669749.
Boring, E. G. (1919), “Mathematical vs. Scientific Significance,” Psychological Bulletin, 16, 335–338. DOI: 10.1037/h0074554.
Box, G. E. P. (1980), “Sampling and Bayes’ Inference in Scientific Modeling and Robustness,” Journal of the Royal Statistical Society, Series A, 143, 383–430. DOI: 10.2307/2982063.
Brown, H. K., Ray, J. G., Wilton, A. S., Lunsky, Y., Gomes, T., and Vigod, S. N. (2017), “Association Between Serotonergic Antidepressant Use During Pregnancy and Autism Spectrum Disorder in Children,” JAMA: Journal of the American Medical Association, 317, 1544–1552. DOI: 10.1001/jama.2017.3415.
Camerer, C. F., Dreber, A., Holzmeister, F., Ho, T. H., Huber, J., Johannesson, M., Kirchler, M., Nave, G., Nosek, B. A., Pfeiffer, T., Altmejd, A., Buttrick, N., Chan, T. Z., Chen, Y. L., Forsell, E., Gampa, A., Heikensten, E., Hummer, L., Imai, T., Isaksson, S., Manfredi, D., Rose, J., Wagenmakers, E. J., and Wu, H. (2018), “Evaluating the Replicability of Social Science Experiments in Nature and Science Between 2010 and 2015,” Nature Human Behaviour, 2, 637–644. DOI: 10.1038/s41562-018-0399-z.
Cohen, J. (1994), “The Earth Is Round (p < .05),” American Psychologist, 49, 997–1003.
Cox, D. R. (1978), “Foundations of Statistical Inference: The Case for Eclecticism,” Australian Journal of Statistics, 20, 43–59. DOI: 10.1111/j.1467-842X.1978.tb01094.x.
Crane, H. (2017), “Why ‘Redefining Statistical Significance’ Will Not Improve Reproducibility and Could Make the Replication Crisis Worse,” available at https://arxiv.org/abs/1711.07801.
Cumming, G. (2014), “The New Statistics: Why and How,” Psychological Science, 25, 7–29. DOI: 10.1177/0956797613504966.
Edgeworth, F. Y. (1885), “Methods of Statistics,” Journal of the Statistical Society of London, Jubilee Volume, 181–217.
Efron, B., and Hastie, T. (2016), Computer Age Statistical Inference: Algorithms, Evidence, and Data Science, New York: Cambridge University Press.
Fisher, R. A. (1937), The Design of Experiments (2nd ed.), Edinburgh: Oliver and Boyd.
Gelman, A. (2016), “The Problems with P-values are not Just with P-values,” The American Statistician, this issue Supplemental Material to the ASA Statement on P-values and Statistical Significance.
Gelman, A., and Hennig, C. (2017), “Beyond Subjective and Objective in Statistics,” Journal of the Royal Statistical Society, Series A, 180, 967–1033. DOI: 10.1111/rssa.12276.
Gelman, A., and Stern, H. (2006), “The Difference Between ‘Significant’ and ‘Not Significant’ is Not Itself Statistically Significant,” The American Statistician, 60, 328–331. DOI: 10.1198/000313006X152649.
Gigerenzer, G. (1993), “The Superego, the Ego, and the Id in Statistical Reasoning,” in A Handbook for Data Analysis in the Behavioral Sciences, eds. G. Keren and C. Lewis, Hillsdale, NJ: Lawrence Erlbaum Associates, pp. 311–339.
Good, I. J. (1957), “Some Logic and History of Hypothesis Testing,” in Philosophical Foundations of Economics, ed. J. C. Pitt, Dordrecht, Holland: D. Reidel, pp. 149–174. (Reprinted as Ch. 14 in Good, I. J. (1983), Good Thinking, 129–148, Minneapolis, MN: University of Minnesota Press).
Goodman, S. N. (1992), “A Comment on Replication, P-values and Evidence,” Statistics in Medicine, 11, 875–879. DOI: 10.1002/sim.4780110705.
Greenland, S. (2011), “Null Misinterpretation in Statistical Testing and Its Impact on Health Risk Assessment,” Preventive Medicine, 53, 225–228. DOI: 10.1016/j.ypmed.2011.08.010.
Greenland, S. (2017), “Invited Commentary: The Need for Cognitive Science in Methodology,” American Journal of Epidemiology, 186, 639–645. DOI: 10.1093/aje/kwx259.
Greenland, S. (2019a), “Valid P-values Behave Exactly as They Should: Some Misleading Criticisms of P-values and Their Resolution with S-values,” The American Statistician, this issue.
Greenland, S. (2019b), “The Unconditional Information in P-values, and Its Refutational Interpretation via S-values,” submitted.
Greenland, S., Senn, S. J., Rothman, K. J., Carlin, J. B., Poole, C., Goodman, S. N., and Altman, D. G. (2016), “Statistical Tests, Confidence Intervals, and Power: A Guide to Misinterpretations,” The American Statistician, 70, online supplement 1 at http://amstat.tandfonline.com/doi/suppl/10.1080/00031305.2016.1154108/suppl_file/utas_a_1154108_sm5368.pdf; reprinted in European Journal of Epidemiology, 31, 337–350. DOI: 10.1007/s10654-016-0149-3.
Halsey, L. G., Curran-Everett, D., Vowler, S. L., and Drummond, G. B. (2015), “The Fickle P-value Generates Irreproducible Results,” Nature Methods, 12, 179–185. DOI: 10.1038/nmeth.3288.
Hurlbert, S. H., and Lombardi, C. M. (2009), “Final Collapse of the Neyman-Pearson Decision Theoretic Framework and Rise of the neoFisherian,” Annales Zoologici Fennici, 46, 311–349. DOI: 10.5735/086.046.0501.
John, L. K., Loewenstein, G., and Prelec, D. (2012), “Measuring the Prevalence of Questionable Research Practices with Incentives for Truth Telling,” Psychological Science, 23, 524–532. DOI: 10.1177/0956797611430953.
Lakens, D., Scheel, A. M., and Isager, P. M. (2018), “Equivalence Testing for Psychological Research: A Tutorial,” Advances in Methods and Practices in Psychological Science, 1, 259–269. DOI: 10.1177/2515245918770963.
Lehmann, E. L. (1986), Testing Statistical Hypotheses (2nd ed.), New York: Springer.
Little, R. J. (2006), “Calibrated Bayes: A Bayes/Frequentist Roadmap,” The American Statistician, 60, 213–223. DOI: 10.1198/000313006X117837.
Locascio, J. (2017), “Results Blind Science Publishing,” Basic and Applied Social Psychology, 39, 239–246. DOI: 10.1080/01973533.2017.1336093.
Martinson, B. C., Anderson, M. S., and de Vries, R. (2005), “Scientists Behaving Badly,” Nature, 435, 737–738. DOI: 10.1038/435737a.
McShane, B. B., Gal, D., Gelman, A., Robert, C., and Tackett, J. L. (2019), “Abandon Statistical Significance,” The American Statistician.
Meehl, P. E. (1990), “Why Summaries of Research on Psychological Theories Are Often Uninterpretable,” Psychological Reports, 66, 195–244. DOI: 10.2466/pr0.1990.66.1.195.
Neyman, J., and Pearson, E. S. (1933), “The Testing of Statistical Hypotheses in Relation to Probabilities a priori,” Mathematical Proceedings of the Cambridge Philosophical Society, 29, 492–510. DOI: 10.1017/S030500410001152X.
Open Science Collaboration. (2015), “Estimating the Reproducibility of Psychological Science,” Science, 349, aac4716.
Poole, C. (1987a), “Beyond the Confidence Interval,” American Journal of Public Health, 77, 195–199.
Poole, C. (1987b), “Confidence Intervals Exclude Nothing,” American Journal of Public Health, 77, 492–493. DOI: 10.2105/AJPH.77.4.492.
Popper, K. R. (1968), The Logic of Scientific Discovery (2nd English ed.), London: Routledge.
Robins, J. M., van der Vaart, A., and Ventura, V. (2000), “Asymptotic Distribution of P-values in Composite Null Models,” Journal of the American Statistical Association, 95, 1143–1156. DOI: 10.2307/2669750.
Rothman, K., Greenland, S., and Lash, T. L. (2008), Modern Epidemiology (3rd ed., Ch. 10), Philadelphia, PA: Lippincott Williams & Wilkins.
Senn, S. J. (2001), “Two Cheers for P-values?” Journal of Epidemiology and Biostatistics, 6, 193–204.
Senn, S. J. (2002), “‘Letter to the Editor’ Re: Goodman 1992,” Statistics in Medicine, 21, 2437–2444. DOI: 10.1002/sim.1072.
Senn, S. J. (2011), “You May Believe You Are a Bayesian But You Are Probably Wrong,” Rational Markets and Morals, 2, 48–66.
Stark, P. B., and Saltelli, A. (2018), “Cargo-Cult Statistics and Scientific Crisis,” Significance, 15, 40–43. DOI: 10.1111/j.1740-9713.2018.01174.x.
Trafimow, D., Amrhein, V., Areshenkoff, C. N., Barrera-Causil, C., Beh, E. J., Bilgiç, Y., Bono, R., Bradley, M. T., Briggs, W. M., Cepeda-Freyre, H. A., Chaigneau, S. E., Ciocca, D. R., Carlos Correa, J., Cousineau, D., de Boer, M. R., Dhar, S. S., Dolgov, I., Gómez-Benito, J., Grendar, M., Grice, J., Guerrero-Gimenez, M. E., Gutiérrez, A., Huedo-Medina, T. B., Jaffe, K., Janyan, A., Karimnezhad, A., Korner-Nievergelt, F., Kosugi, K., Lachmair, M., Ledesma, R., Limongi, R., Liuzza, M. T., Lombardo, R., Marks, M., Meinlschmidt, G., Nalborczyk, L., Nguyen, H. T., Ospina, R., Perezgonzalez, J. D., Pfister, R., Rahona, J. J., Rodríguez-Medina, D. A., Romão, X., Ruiz-Fernández, S., Suarez, I., Tegethoff, M., Tejo, M., van de Schoot, R., Vankov, I., Velasco-Forero, S., Wang, T., Yamada, Y., Zoppino, F. C., and Marmolejo-Ramos, F. (2018), “Manipulating the Alpha Level Cannot Cure Significance Testing,” Frontiers in Psychology, 9, 699.
Trafimow, D., and Marks, M. (2015), “Editorial,” Basic and Applied Social Psychology, 37, 1–2. DOI: 10.1080/01973533.2015.1012991.
Wellek, S. (2010), Testing Statistical Hypotheses of Equivalence and Noninferiority (2nd ed.), New York: Chapman & Hall.
Ziliak, S. T., and McCloskey, D. N. (2008), The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives, Ann Arbor, MI: University of Michigan Press.