Coup de Grâce for a Tough Old Bull: “Statistically Significant” Expires

Pages 352-357 | Received 16 Mar 2018, Accepted 18 Oct 2018, Published online: 20 Mar 2019
