Getting to a Post “p<0.05” Era

Before p < 0.05 to Beyond p < 0.05: Using History to Contextualize p-Values and Significance Testing

Pages 82-90 | Received 09 Mar 2018, Accepted 03 Oct 2018, Published online: 20 Mar 2019

References

  • Abelson, R. P. (1997), “A Retrospective on the Significance Test Ban of 1999 (If There Were No Significance Tests, They Would Be Invented),” in What If There Were No Significance Tests?, eds. L. L. Harlow, S. A. Mulaik and J. H. Steiger, Multivariate Applications, Mahwah, NJ: Lawrence Erlbaum Associates, pp. 117–141.
  • AP (2010), “Statistics Course Description,” available at https://secure-media.collegeboard.org/ap-student/course/ap-statistics-2010-course-exam-description.pdf.
  • Arbuthnott, J. (1710), “An Argument for Divine Providence, Taken From the Constant Regularity Observ’d in the Births of Both Sexes,” Philosophical Transactions (1683–1775), 27, 186–190.
  • Bakan, D. (1966), “The Test of Significance in Psychological Research,” Psychological Bulletin, 66, 423–437. DOI: 10.1037/h0020412.
  • Benjamin, D. J. and Berger, J. O. (2016), “Comment: A Simple Alternative to p-values,” The American Statistician, Online Discussion, 70, 1–2.
  • Benjamin, D. J., Berger, J. O., Johannesson, M., Nosek, B. A., Wagenmakers, E.-J., Berk, R., Bollen, K. A., Brembs, B., Brown, L., Camerer, C., Cesarini, D., Chambers, C. D., Clyde, M., Cook, T. D., De Boeck, P., Dienes, Z., Dreber, A., Easwaran, K., Efferson, C., Fehr, E., Fidler, F., Field, A. P., Forster, M., George, E. I., Gonzalez, R., Goodman, S., Green, E., Green, D. P., Greenwald, A. G., Hadfield, J. D., Hedges, L. V., Held, L., Ho, T. H., Hoijtink, H., Hruschka, D. J., Imai, K., Imbens, G., Ioannidis, J. P. A., Jeon, M., Jones, J. H., Kirchler, M., Laibson, D., List, J., Little, R., Lupia, A., Machery, E., Maxwell, S. E., McCarthy, M., Moore, D. A., Morgan, S. L., Munafò, M., Nakagawa, S., Nyhan, B., Parker, T. H., Pericchi, L., Perugini, M., Rouder, J., Rousseau, J., Savalei, V., Schönbrodt, F. D., Sellke, T., Sinclair, B., Tingley, D., Van Zandt, T., Vazire, S., Watts, D. J., Winship, C., Wolpert, R. L., Xie, Y., Young, C., Zinman, J. and Johnson, V. E. (2018), “Redefine Statistical Significance,” Nature Human Behaviour, 2, 6–10. DOI: 10.1038/s41562-017-0189-z.
  • Benjamini, Y. (2016), “It’s Not the p-values’ Fault,” The American Statistician, Online Discussion, 70, 1–2.
  • Berkson, J. (1938), “Some Difficulties of Interpretation Encountered in the Application of the Chi-square Test,” Journal of the American Statistical Association, 33, 526–536. DOI: 10.1080/01621459.1938.10502329.
  • Berkson, J. (1942), “Tests of Significance Considered as Evidence,” Journal of the American Statistical Association, 37, 325–335. DOI: 10.1080/01621459.1942.10501760.
  • Berry, D. A. (2016), “p-values Are Not What They’re Cracked Up to Be,” The American Statistician, Online Discussion, 70, 1–2.
  • Box, J. F. (1978), R. A. Fisher, the Life of a Scientist, Wiley Series in Probability and Mathematical Statistics, New York: Wiley.
  • Chambers, C. (2017), The Seven Deadly Sins of Psychology: A Manifesto for Reforming the Culture of Scientific Practice, Princeton, NJ: Princeton University Press.
  • Cournot, A. A. (1843), Exposition de la Théorie des Chances et des Probabilités, Paris: L. Hachette.
  • David, H. A. and Edwards, A. W. F. (2001), Annotated Readings in the History of Statistics, Springer Series in Statistics: Perspectives in Statistics, New York: Springer.
  • De Finetti, B. (1937), “La Prévision: Ses Lois Logiques, Ses Sources Subjectives,” Annales de l’Institut Henri Poincaré, 7, 1–68.
  • Edgeworth, F. Y. (1885), “Methods of Statistics,” Journal of the Statistical Society of London, Jubilee Volume, 181–217.
  • Elderton, W. P. (1902), “Tables for Testing the Goodness of Fit of Theory to Observation,” Biometrika, 1, 155–163. DOI: 10.2307/2331485.
  • Fauvel, J. (1991), “Using History in Mathematics Education,” For the Learning of Mathematics, 11, 3–6.
  • Fisher, R. A. (1922a), “The Goodness of Fit of Regression Formulae, and the Distribution of Regression Coefficients,” Journal of the Royal Statistical Society, 85, 597–612. DOI: 10.2307/2341124.
  • Fisher, R. A. (1922b), “On the Interpretation of χ² from Contingency Tables, and the Calculation of P,” Journal of the Royal Statistical Society, 85, 87–94. DOI: 10.2307/2340521.
  • Fisher, R. A. (1922c), “On the Mathematical Foundations of Theoretical Statistics,” Philosophical Transactions of the Royal Society of London, Series A, 222, 309–368. DOI: 10.1098/rsta.1922.0009.
  • Fisher, R. A. (1925), Statistical Methods for Research Workers, Edinburgh: Oliver and Boyd.
  • Fisher, R. A. (1935), The Design of Experiments, Edinburgh: Oliver and Boyd.
  • Fisher, R. A. (1956), Statistical Methods and Scientific Inference, Edinburgh: Oliver and Boyd.
  • Fisher, R. A. and Yates, F. (1938), Statistical Tables for Biological, Agricultural and Medical Research, Edinburgh: Oliver and Boyd.
  • Gelman, A. (2016), “The Problems With p-values Are Not Just with p-values,” The American Statistician, Online Discussion, 70, 1–2.
  • Gigerenzer, G. (2004), “Mindless Statistics,” Journal of Socio-Economics, 33, 587–606. DOI: 10.1016/j.socec.2004.09.033.
  • Gigerenzer, G., Swijtink, Z., and Daston, L. (1989), The Empire of Chance: How Probability Changed Science and Everyday Life, New York: Cambridge University Press.
  • Goodman, S. N. (2016), “The Next Questions: Who, What, When, Where, and Why?” The American Statistician, Online Discussion, 70, 1–2.
  • Greenland, S., Senn, S. J., Rothman, K. J., Carlin, J. B., Poole, C., Goodman, S. N. and Altman, D. G. (2016), “Statistical Tests, p-values, Confidence Intervals, and Power: A Guide to Misinterpretations,” The American Statistician, Online Discussion, 70, 1–12.
  • Hogben, L. T. (1957), Statistical Theory: The Relationship of Probability, Credibility, and Error, London: Allen & Unwin.
  • Hubbard, R. (2004), “Alphabet Soup: Blurring the Distinctions Between p’s and α’s in Psychological Research,” Theory & Psychology, 14, 295–327. DOI: 10.1177/0959354304043638.
  • Hubbard, R. (2016), Corrupt Research: The Case for Reconceptualizing Empirical Management and Social Science, Thousand Oaks, CA: SAGE Publications.
  • Ioannidis, J. P. A. (2005), “Why Most Published Research Findings Are False,” PLoS Medicine, 2, e124. DOI: 10.1371/journal.pmed.0020124.
  • Jeffreys, H. (1939), The Theory of Probability, Oxford: Oxford University Press.
  • Kadane, J. B., Fienberg, S. E. and DeGroot, M. H. (1986), Statistics and the Law, Wiley Series in Probability and Mathematical Statistics, New York: Wiley.
  • Kennedy-Shaffer, L. (2017), “When the Alpha is the Omega: p-values, Substantial Evidence, and the 0.05 Standard at FDA,” Food & Drug Law Journal, 72, 595–635.
  • Laplace, P. S. (1827), Traité de Mécanique Céleste, Supplément, Paris: Duprat.
  • Lehmann, E. L. (1993), “The Fisher, Neyman–Pearson Theories of Testing Hypotheses: One Theory or Two?” Journal of the American Statistical Association, 88, 1242–1249. DOI: 10.1080/01621459.1993.10476404.
  • Lenhard, J. (2006), “Models and Statistical Inference: The Controversy Between Fisher and Neyman–Pearson,” British Journal for the Philosophy of Science, 57, 69–91. DOI: 10.1093/bjps/axi152.
  • Lew, M. J. (2016), “Three Inferential Questions, Two Types of p-value,” The American Statistician, Online Discussion, 70, 1–2.
  • Little, R. J. (2016), “Comment,” The American Statistician, Online Discussion, 70, 1–2.
  • Liu, P.-H. (2003), “Do Teachers Need to Incorporate the History of Mathematics in Their Teaching?” The Mathematics Teacher, 96, 416–421.
  • MacKenzie, D. A. (1981), Statistics in Britain, 1865–1930: The Social Construction of Scientific Knowledge, Edinburgh: Edinburgh University Press. DOI: 10.1086/ahr/87.4.1091.
  • McGrayne, S. B. (2011), The Theory that Would Not Die: How Bayes’ Rule Cracked the Enigma Code, Hunted Down Russian Submarines, & Emerged Triumphant from Two Centuries of Controversy, New Haven, CT: Yale University Press.
  • Meehl, P. (1967), “Theory-testing in Psychology and Physics: A Methodological Paradox,” Philosophy of Science, 34, 103–115. DOI: 10.1086/288135.
  • Millar, A. M. (2016), “ASA Statement on p-values: Some Implications for Education,” The American Statistician, Online Discussion, 70, 1–2.
  • Morrison, D. E., and Henkel, R. E. (1970), The Significance Test Controversy: A Reader, Methodological Perspectives, Chicago, IL: Aldine Publishing.
  • Neyman, J. and Pearson, E. S. (1933), “The Testing of Statistical Hypotheses in Relation to Probabilities a Priori,” Mathematical Proceedings of the Cambridge Philosophical Society, 29, 492–510. DOI: 10.1017/S030500410001152X.
  • Pearson, E. S., Plackett, R. L. and Barnard, G. A. (1990), ‘Student’: A Statistical Biography of William Sealy Gosset, Oxford: Clarendon Press.
  • Pearson, K. (1900), “X. On the Criterion that a Given System of Deviations from the Probable in the Case of a Correlated System of Variables is Such that it can be Reasonably Supposed to Have Arisen from Random Sampling,” The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 50, 157–175. DOI: 10.1080/14786440009463897.
  • Poisson, S.-D. (1837), Recherches sur la Probabilité des Jugements en Matière Criminelle et en Matière Civile: Précédées des Règles Générales du Calcul des Probabilités, Paris: Bachelier.
  • Porter, T. M. (1986), The Rise of Statistical Thinking, 1820–1900, Princeton, NJ: Princeton University Press. DOI: 10.1086/ahr/93.1.116.
  • Rothman, K. J. (2016), “Disengaging From Statistical Significance,” The American Statistician, Online Discussion, 70, 1.
  • Rozeboom, W. W. (1960), “The Fallacy of the Null-Hypothesis Significance Test,” Psychological Bulletin, 57, 416–428.
  • Salsburg, D. (2001), The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century, New York: W.H. Freeman and Co.
  • Savage, L. J. (1954), The Foundations of Statistics, New York: Wiley.
  • Skellam, J. G. (1969), “Models, Inference, and Strategy,” Biometrics, 25, 457–475.
  • Skipper, J. K., Guenther, A. L., and Nass, G. (1967), “The Sacredness of .05: A Note Concerning the Uses of Statistical Levels of Significance in Social Science,” The American Sociologist, 2, 16–18.
  • Stangl, D. (2016), “Comment,” The American Statistician, Online Discussion, 70, 1.
  • Stigler, S. M. (1986), The History of Statistics: The Measurement of Uncertainty Before 1900, Cambridge, MA: Belknap Press of Harvard University Press. DOI: 10.1086/ahr/93.4.1019.
  • Student (1908), “The Probable Error of a Mean,” Biometrika, 6, 1–25.
  • Tippett, L. H. C. (1931), The Methods of Statistics: An Introduction Mainly for Workers in the Biological Sciences, London: Williams and Norgate.
  • Wasserstein, R. L., and Lazar, N. A. (2016), “The ASA’s Statement on p-values: Context, Process, and Purpose,” The American Statistician, 70, 129–133. DOI: 10.1080/00031305.2016.1154108.
  • Weisberg, H. I. (2014), Willful Ignorance: The Mismeasure of Uncertainty, Hoboken, NJ: Wiley.
  • Yates, F. (1951), “The Influence of Statistical Methods for Research Workers on the Development of the Science of Statistics,” Journal of the American Statistical Association, 46, 19–34. DOI: 10.2307/2280090.
  • Ziliak, S. T. (2016), “The Significance of the ASA Statement on Statistical Significance and p-values,” The American Statistician, Online Discussion, 70, 1–2.
  • Ziliak, S. T. and McCloskey, D. N. (2008), The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives, Ann Arbor, MI: University of Michigan Press.