Original Articles

What Statistical Significance Testing Is, and What It Is Not

Pages 293-316 | Published online: 15 Apr 2014


Read on this site (27)

Melanie Hwalek, Cassandra Solomon-Filer & Deborah Wasserman. (2022) Retrospective Pretests: Recent Use in Visitor Studies Research and Ways to Make Them More Informative. Visitor Studies 25:1, pages 1-21.
Sharmila Gamlath. (2021) Business undergraduates’ progress and satisfaction with learning experiences: the role of group assessment. Assessment & Evaluation in Higher Education 46:3, pages 360-375.
Jonathan Scourfield, Gualtiero Colombo, Pete Burnap, Rhiannon Evans, Nina Jacob, Matthew Williams & Sarah Caul. (2019) The Number and Characteristics of Newspaper and Twitter Reports on Suicides and Road Traffic Deaths in Young People. Archives of Suicide Research 23:3, pages 507-522.
Ana Ortiz de Guinea & Jane Webster. (2015) The missing links: cultural, software, task and personal influences on computer self-efficacy. The International Journal of Human Resource Management 26:7, pages 905-931.
Ana Ortiz de Guinea, Ryad Titah & Pierre-Majorique Léger. (2014) Explicit and Implicit Antecedents of Users' Behavioral Beliefs in Information Systems: A Neuropsychological Investigation. Journal of Management Information Systems 30:4, pages 179-210.
Per Davidsson. (2013) Some reflection on research ‘Schools’ and geographies. Entrepreneurship & Regional Development 25:1-2, pages 100-110.
Niti B. Mishra, Kelley A. Crews & Amy L. Neuenschwander. (2012) Sensitivity of EVI-based harmonic regression to temporal resolution in the lower Okavango Delta. International Journal of Remote Sensing 33:24, pages 7703-7726.
Jianjun Wang. (2008) Effect size and practical importance: a non‐monotonic match. International Journal of Research & Method in Education 31:2, pages 125-132.
Lennart Sjöberg & Elisabeth Engelberg. (2005) Lifestyles, and Risk Perception Consumer Behavior. International Review of Sociology 15:2, pages 327-362.
Kelli M. Paul & Jonathan A. Plucker. (2004) Two steps forward, one step back: Effect size reporting in gifted education research from 1995–2000. Roeper Review 26:2, pages 68-72.
Lisa L. Harlow. (2002) Book Review of Using Multivariate Statistics by Barbara G. Tabachnick and Linda S. Fidell. Structural Equation Modeling: A Multidisciplinary Journal 9:4, pages 621-636.
Denise de Souza Fleith, Joseph S. Renzulli & Karen L. Westberg. (2002) Effects of a Creativity Training Program on Divergent Thinking Abilities and Self-Concept in Monolingual and Bilingual Classrooms. Creativity Research Journal 14:3-4, pages 373-386.
Xitao Fan. (2001) Statistical Significance and Effect Size in Education Research: Two Sides of a Coin. The Journal of Educational Research 94:5, pages 275-282.
Thomas A. Devaney. (2001) Statistical Significance, Effect Size, and Replication: What Do the Journals Say?. The Journal of Experimental Education 69:3, pages 310-320.
Todd C. Riniolo & Louis A. Schmidt. (2000) Searching for Reliable Relationships With Statistics Packages: An Empirical Example of the Potential Problems. The Journal of Psychology 134:2, pages 143-151.
Rebecca P. Ang. (1998) Use of the Jackknife Statistic to Evaluate Result Replicability. The Journal of General Psychology 125:3, pages 218-228.
Tammi Vacha-Haase & Johanna E. Nilsson. (1998) Statistical Significance Reporting: Current Trends and Uses in MECD. Measurement and Evaluation in Counseling and Development 31:1, pages 46-57.
Deborah Durso Cupal. (1998) Psychological interventions in sport injury prevention and rehabilitation. Journal of Applied Sport Psychology 10:1, pages 103-123.
Jeffrey D. Kromrey & Lynn Foster-Johnson. (1996) Determining the Efficacy of Intervention: The Use of Effect Sizes for Data Analysis in Single-Subject Research. The Journal of Experimental Education 65:1, pages 73-93.
Xitao Fan & Lin Wang. (1996) Comparability of Jackknife and Bootstrap Results: An Investigation for a Case of Canonical Correlation Analysis. The Journal of Experimental Education 64:2, pages 173-189.
Isadore Newman, John W. Fraas & Timothy Norfolk. (1995) Binomial index of model fit: An elaboration. Structural Equation Modeling: A Multidisciplinary Journal 2:2, pages 155-162.
Per Davidsson. (1995) Culture, structure and regional levels of entrepreneurship. Entrepreneurship & Regional Development 7:1, pages 41-62.
William Asher. (1993) The Role of Statistics in Research. The Journal of Experimental Education 61:4, pages 388-393.
William D. Schafer. (1993) Interpreting Statistical Significance and Nonsignificance. The Journal of Experimental Education 61:4, pages 383-387.
Joel R. Levin. (1993) Statistical Significance Testing From Three Perspectives. The Journal of Experimental Education 61:4, pages 378-382.
Ronald P. Carver. (1993) The Case Against Statistical Significance Testing, Revisited. The Journal of Experimental Education 61:4, pages 287-292.

Articles from other publishers (76)

Antonio Calvani, Antonio Marzano, Lorena Montesano, Marta Pellegrini, Amalia Rizzo, Marianna Traversetti & Giuliano Vivanet. (2023) Improving Reading Comprehension and Summarising Skills in Primary School: A Quasi-Experimental Study. Journal of Educational, Cultural and Psychological Studies (ECPS Journal):28.
H. R. Ganesha & P. S. Aithal. (2022) Why and When Statistics is Required, and How to Simplify Choosing Appropriate Statistical Techniques During Ph.D. Program in India?. International Journal of Management, Technology, and Social Sciences, pages 514-547.
Mustafa Semih Sadak, Nihan Kahraman & Umut Uludağ. (2022) Dynamic and static feature fusion for increased accuracy in signature verification. Signal Processing: Image Communication 108, pages 116823.
Mark Rubin. (2019) What type of Type I error? Contrasting the Neyman–Pearson and Fisherian approaches in the context of exact and direct replications. Synthese 198:6, pages 5809-5834.
Michael C. Acree. 2021. The Myth of Statistical Inference, pages 393-443.
Pierre Sindambiwe. 2020. Applied Social Science Approaches to Mixed Methods Research, pages 201-215.
Tasha L. Olson, Lori A. Roggman & Mark S. Innocenti. 2020. Encyclopedia of Infant and Early Childhood Development, pages 456-468.
Carsten Dormann. 2020. Environmental Data Analysis, pages 177-184.
Alexander Koplenig. (2019) A non-parametric significance test to compare corpora. PLOS ONE 14:9, pages e0222703.
J. Gregory Jenkins, Velina Popova & Mark D. Sheldon. (2016) In Support of Public or Private Interests? An Examination of Sanctions Imposed Under the AICPA Code of Professional Conduct. Journal of Business Ethics 152:2, pages 523-549.
Nick D. Jeffery, Simon T. Bate, Sina Safayi, Matthew A. Howard III, Lawrence Moon & Unity Jeffery. (2018) When neuroscience met clinical pathology: partitioning experimental variation to aid data interpretation in neuroscience. European Journal of Neuroscience 47:5, pages 371-379.
Laura Badenes-Ribera & Dolores Frias-Navarro. (2017) Falacias sobre el valor p compartidas por profesores y estudiantes universitarios. Universitas Psychologica 16:3, pages 1.
Alonso Ortega & Gorka Navarrete. 2017. Bayesian Inference.
Andrew T. Jebb, Scott Parrigon & Sang Eun Woo. (2017) Exploratory data analysis as a foundation of inductive research. Human Resource Management Review 27:2, pages 265-276.
Sheera Joy Olasky & David F. Greenberg. 2016. On the Cross Road of Polity, Political Elites and Mobilization, pages 93-119.
Lawrence D. Igl & Douglas H. Johnson. (2016) Effects of haying on breeding birds in CRP grasslands. The Journal of Wildlife Management 80:7, pages 1189-1204.
Pia Borlund. (2016) Framing of different types of information needs within simulated work task situations: An empirical study in the school context. Journal of Information Science 42:3, pages 313-323.
Stephen Gorard. (2016) Damaging Real Lives through Obstinacy: Re-Emphasising Why Significance Testing is Wrong. Sociological Research Online 21:1, pages 102-115.
Per Davidsson. 2016. Researching Entrepreneurship, pages 247-284.
Roger E. Kirk. 2014. Wiley StatsRef: Statistics Reference Online, pages 1-13.
Roger E. Kirk. 2014. Wiley StatsRef: Statistics Reference Online.
Pia Borlund & Sabine Dreier. (2014) An investigation of the search behaviour associated with Ingwersen’s three types of information needs. Information Processing & Management 50:4, pages 493-507.
William L. Gross & Jeffrey R. Binder. (2014) Alternative thresholding methods for fMRI data optimized for surgical planning. NeuroImage 84, pages 554-561.
Curtis A. Olson & Conor A. Richardson. (2014) How Significant Is Statistical Significance? Observations on the Independence of P Values and Importance in Effectiveness Studies of Educational Interventions. Journal of Continuing Education in the Health Professions 34:3, pages 151-154.
Jon Sprouse, Carson T. Schütze & Diogo Almeida. (2013) A comparison of informal and formal acceptability judgments using a random sample from Linguistic Inquiry 2001–2010. Lingua 134, pages 219-248.
Thijs Bosker, Joseph F. Mudge & Kelly R. Munkittrick. (2013) Statistical reporting deficiencies in environmental toxicology. Environmental Toxicology and Chemistry 32:8, pages 1737-1739.
Jesper W. Schneider. (2013) Caveats for using statistical significance tests in research assessments. Journal of Informetrics 7:1, pages 50-62.
Andrea L. Behrman, Mark G. Bowden & Dorian K. Rose. 2013. Neurological Rehabilitation, pages 61-66.
Michael L. Morrison. (2012) The habitat sampling and analysis paradigm has limited value in animal conservation: A prequel. The Journal of Wildlife Management 76:3, pages 438-450.
Pia Borlund, Sabine Dreier & Katriina Byström. (2012) What does time spent on searching indicate?
Edward Purssell & Alison While. (2011) P=nothing, or why we should not teach healthcare students about statistics. Nurse Education Today 31:8, pages 837-840.
김민성. (2011) Quantitative Methods in Geography Education Research: Concept and Application of Effect Size. The Journal of The Korean Association of Geographic and Environmental Education 19:2, pages 205-220.
Michael Dickson & Davis Baird. 2011. Philosophy of Statistics, pages 199-229.
Daniel T. L. Shek. (2009) Using Students’ Weekly Diaries to Evaluate Positive Youth Development Programs: Are Findings Based on Multiple Studies Consistent?. Social Indicators Research 95:3, pages 475-487.
Stuart H. Hurlbert & Celia M. Lombardi. (2009) Final Collapse of the Neyman-Pearson Decision Theoretic Framework and Rise of the neoFisherian. Annales Zoologici Fennici 46:5, pages 311-349.
Daniel T. L. Shek. 2011. Quality of Life of Chinese People in a Changing World, pages 119-131.
Roger E. Kirk. (2007) Effect magnitude: A different focus. Journal of Statistical Planning and Inference 137:5, pages 1634-1646.
Thomas J. Kehle, Melissa A. Bray, Sandra M. Chafouleas & Takuji Kawano. (2007) Lack of statistical significance. Psychology in the Schools 44:5, pages 417-422.
Daniel T. L. Shek. (2016) A Longitudinal Study of Perceived Differences in Parental Control and Parent-Child Relational Qualities in Chinese Adolescents in Hong Kong. Journal of Adolescent Research 22:2, pages 156-188.
Daniel T. L. Shek & T. Y. Lee. (2006) Family Life Quality and Emotional Quality of Life in Chinese Adolescents with and Without Economic Disadvantage. Social Indicators Research 80:2, pages 393-410.
Pélagie M. Beeson & Randall R. Robey. (2006) Evaluating Single-Subject Treatment Research: Lessons Learned from the Aphasia Literature. Neuropsychology Review 16:4, pages 161-169.
Glenn Suter II. 2006. Ecological Risk Assessment, Second Edition.
Clint D Kelly. (2006) Replicating Empirical Research In Behavioral Ecology: How And Why It Should Be Done But Rarely Ever Is. The Quarterly Review of Biology 81:3, pages 221-236.
Christopher W. Kuhar. (2006) In the deep end: pooling data and other statistical challenges of zoo and aquarium research. Zoo Biology 25:4, pages 339-352.
Ross D. Crosby, Stephen A. Wonderlich, James E. Mitchell, Martina de Zwaan, Scott G. Engel, Kevin Connolly, Chris Flessner, Jennifer Redlin, Mary Markland, Heather Simonich, Traci L. Wright, Jodi M. Swanson & Mohammad Taheri. (2006) An empirical analysis of eating disorders and anxiety disorders publications (1980-2000)—part II: Statistical hypothesis testing. International Journal of Eating Disorders 39:1, pages 49-54.
Matthew D Alexander & Kerry TB MacQuarrie. (2005) The measurement of groundwater temperature in shallow piezometers and standpipes. Canadian Geotechnical Journal 42:5, pages 1377-1390.
Roger E. Kirk. 2005. Encyclopedia of Statistics in Behavioral Science.
Nekane Balluerka, Juana Gómez & Dolores Hidalgo. (2005) The Controversy over Null Hypothesis Significance Testing Revisited. Methodology 1:2, pages 55-70.
James Miller. (2004) Statistical significance testing––a panacea for software technology experiments?. Journal of Systems and Software 73:2, pages 183-192.
Ratnawati Mohd Asraf & James K. Brewer. (2004) Conducting tests of hypotheses: The need for an adequate sample size. The Australian Educational Researcher 31:1, pages 79-94.
Randall R. Robey. (2004) Reporting point and interval estimates of effect-size for planned contrasts: fixed within effect analyses of variance. Journal of Fluency Disorders 29:4, pages 307-341.
Anthony J. Onwuegbuzie & Joel R. Levin. (2003) Without Supporting Statistical Evidence, Where Would Reported Measures of Substantive Importance Lead? To No Good Effect. Journal of Modern Applied Statistical Methods 2:1, pages 133-151.
Timothy B. Smith, Christopher R. Stones & Anthony Naidoo. (2003) Racial Attitudes among South African Young Adults: A Four-year Follow-up Study. South African Journal of Psychology 33:1, pages 39-43.
Xitao Fan. (2016) Using Commonly Available Software For Bootstrapping In Both Substantive And Measurement Analyses. Educational and Psychological Measurement 63:1, pages 24-50.
Roger E. Kirk. 2003. Handbook of Research Methods in Experimental Psychology, pages 83-105.
John C. Hanes. (2016) A Nonparametric Approach to Program Evaluation: Utilizing Number Needed to Treat, L’Abbé Plots, and Event Rate Curves for Outcome Analysis. American Journal of Evaluation 23:2, pages 165-182.
Shlomo S. Sawilowsky & Jina S. Yoon. (2002) The Trouble With Trivials (p > .05). Journal of Modern Applied Statistical Methods 1:1, pages 143-144.
Daniel T. L. Shek. (2016) Psychometric Properties of the Chinese Version of the Self-Report Family Inventory: Findings Based on a Longitudinal Study. Research on Social Work Practice 11:4, pages 485-502.
Tammi Vacha-Haase. (2016) Statistical Significance should not be Considered one of Life’s Guarantees: Effect Sizes are Needed. Educational and Psychological Measurement 61:2, pages 219-224.
Douglas H. Johnson & Lawrence D. Igl. (2001) Area Requirements of Grassland Birds: A Regional Perspective. The Auk 118:1, pages 24-34.
Jill A. Marshall & James T. Dorward. (2000) Inquiry experiences as a lecture supplement for preservice elementary teachers and general education students. American Journal of Physics 68:S1, pages S27-S36.
Sorel Cahan. (2016) Research news and Comment: Statistical Significance is not a “Kosher Certificate” for Observed Effects: A Critical Analysis of the Two-Step Approach to the Evaluation of Empirical Results. Educational Researcher 29:1, pages 31-34.
Timothy B. Smith & Christopher R. Stones. (2016) Identities and Racial Attitudes of South African and American Adolescents: A Cross-Cultural Examination. South African Journal of Psychology 29:1, pages 23-29.
Mark A. Nafziger, Gwena C. Couillard & Timothy B. Smith. (2011) Research: Evaluating Therapy Outcome at a University Counseling Center With the College Adjustment Scales. Journal of College Counseling 2:1, pages 3-13.
Michael J. Chen & Xitao Fan. (1998) The relationship between variance components and mean difference effect size. Current Psychology 17:4, pages 301-311.
Bruce Thompson & Patricia A. Snyder. (2011) Statistical Significance and Reliability Analyses in Recent Journal of Counseling & Development Research Articles. Journal of Counseling & Development 76:4, pages 436-441.
Rebecca P. Ang. (2016) Use of Double Cross-Validation and Bootstrap Methods to Estimate Replicability of Results of Multiple Regression. Perceptual and Motor Skills 86:3_suppl, pages 1143-1152.
J. Thomas Kellow. (2016) Beyond Statistical Significant Tests: The Importance of Using Other Estimates of Treatment Effects to Interpret Results Evaluation. American Journal of Evaluation 19:1, pages 123-134.
Juan Miguel Campanario. (2016) Peer Review for Journals as it Stands Today—Part 1. Science Communication 19:3, pages 181-211.
Michael Borenstein. 1998. Comprehensive Clinical Psychology, pages 313-349.
Raymond Hubbard, Rahul A. Parsa & Michael R. Luthy. (2016) The Spread of Statistical Significance Testing in Psychology. Theory & Psychology 7:4, pages 545-554.
Marley W. Watkins. (2016) Diagnostic Utility of the WISC-III Developmental Index as a Predictor of Learning Disabilities. Journal of Learning Disabilities 29:3, pages 305-312.
Bruce Thompson. (2016) Research news and Comment: AERA Editorial Policies Regarding Statistical Significance Testing: Three Suggested Reforms. Educational Researcher 25:2, pages 26-30.
Xitao Fan & William G. Jacoby. (2016) BOOTSREG: An SAS Matrix Language Program for Bootstrapping Linear Regression Models. Educational and Psychological Measurement 55:5, pages 764-768.
Melisa Genaux, Daniel P. Morgan & S. G. Friedman. (2017) Substance Use and Its Prevention: A Survey of Classroom Practices. Behavioral Disorders 20:4, pages 279-289.
Kenneth J. Ottenbacher. (1995) Why rehabilitation research does not work (As well as we think it should). Archives of Physical Medicine and Rehabilitation 76:2, pages 123-129.
