Supplementing or Replacing p

How Effect Size (Practical Significance) Misleads Clinical Practice: The Case for Switching to Practical Benefit to Assess Applied Research Findings

Pages 223-234 | Received 16 Feb 2018, Accepted 12 Nov 2018, Published online: 20 Mar 2019

References

  • Begley, C. G., and Ellis, L. M. (2012), “Drug Development: Raise Standards for Preclinical Cancer Research,” Nature, 483(7391), 531–533. DOI: 10.1038/483531a.
  • Berwick, D.M. (2008), “The Science of Improvement,” JAMA, 299(10), 1182–1184. DOI: 10.1001/jama.299.10.1182.
  • Borman, G. D., and Hewes, G. M. (2002), “The Long-term Effects and Cost-Effectiveness of Success for All,” Educational Evaluation and Policy Analysis, 24, 243–266. DOI: 10.3102/01623737024004243.
  • Borman, G. D., Grigg, J., and Hanselman, P. (2016), “An Effort to Close Achievement Gaps at Scale Through Self-affirmation,” Educational Evaluation and Policy Analysis, 38, 21–42. DOI: 10.3102/0162373715581709.
  • Borman, G. D., Slavin, R. E., Cheung, A., Chamberlain, A. M., Madden, N. A., and Chambers, B. (2005), “Success for All: First-year Results From the National Randomized Field Trial,” Educational Evaluation and Policy Analysis, 27, 1–22. DOI: 10.3102/01623737027001001.
  • Borman, G. D., Slavin, R. E., Cheung, A. C., Chamberlain, A. M., Madden, N. A., and Chambers, B. (2005), “The National Randomized Field Trial of Success for All: Second-year Outcomes,” American Educational Research Journal, 42, 673–696. DOI: 10.3102/00028312042004673.
  • Borman, G. D., Slavin, R. E., Cheung, A. C., Chamberlain, A. M., Madden, N. A., and Chambers, B. (2007), “Final Reading Outcomes of the National Randomized Field Trial of Success for All,” American Educational Research Journal, 44, 701–731. DOI: 10.3102/0002831207306743.
  • James-Burdumy, S., Mansfield, W., Deke, J., Carey, N., Lugo-Gil, J., Hershey, A., and Douglas, A. (2009, June 8), “Effectiveness of Selected Supplemental Reading Comprehension Interventions: Impacts on a First Cohort of Fifth-Grade Students,” Mathematica Inc., Presentation at The Institute of Education Sciences research conference, available at http://www.mathematica-mpr.com/~/media/publications/pdfs/education/ies_readcomp_james-burdumy0609.pdf.
  • Cohen, J. (1988), Statistical Power Analysis for the Behavioral Sciences. Hillsdale, NJ: Lawrence Erlbaum.
  • CREDO (2013), National Charter School Study, Center for Research on Education Outcomes, Stanford, CA: Stanford University, available at http://credo.stanford.edu/documents/NCSS%202013%20Final%20Draft.pdf.
  • Deke, J., Wei, T., and Kautz, T. (2017), Asymdystopia: The Threat of Small Biases in Evaluations of Education Interventions That Need to Be Powered to Detect Small Impacts, Washington, DC: Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance.
  • Gawande, A. (2007), Better: A Surgeon’s Notes on Performance, New York: Metropolitan Books.
  • Ginsburg, A., and Smith, M. S. (2016), Do Randomized Control Trials Meet the “Gold Standard”? A Study of the Usefulness of RCTs in the What Works Clearinghouse, Washington, DC: American Enterprise Institute.
  • Glass, G. V. (2016), “One Hundred Years of Research: Prudent Aspirations,” Educational Researcher, 45, 69–72. DOI: 10.3102/0013189X16639026.
  • Hattie, J. A. C. (2009), Visible Learning: A Synthesis of 800+ Meta-analyses on Achievement, Abingdon, UK: Routledge.
  • Hojat, M., and Xu, G. (2004), “A Visitor’s Guide to Effect Sizes – Statistical Significance Versus Practical (Clinical) Importance of Research Findings,” Advances in Health Sciences Education Theory and Practice, 9, 241–249. DOI: 10.1023/B:AHSE.0000038173.00909.f6.
  • Ioannidis, J. P. (2005), “Why Most Published Research Findings Are False,” PLoS Medicine, 2, e124, available at http://journals.plos.org/plosmedicine/article/authors?id=10.1371%2Fjournal.pmed.0020124.
  • Kerwin, J., and Thornton, R. (2018), “Making the Grade: The Sensitivity of Education Program Effectiveness to Input Choices and Outcome Measures,” Paper presented at RISE Annual Conference, Oxford, UK, June 21–22, available at https://www.riseprogramme.org/sites/www.riseprogramme.org/files/inline-files/Thornton.pdf.
  • Kirk, R. E. (1996), “Practical Significance: A Concept Whose Time Has Come,” Educational and Psychological Measurement, 56, 746–759. DOI: 10.1177/0013164496056005002.
  • Kolata, G. (2015), “A Faster Way to Try Many Drugs on Many Cancers,” New York Times, available at http://www.nytimes.com/2015/02/26/health/fast-track-attacks-on-cancer-accelerate-hopes.html?_r=0.
  • Kraemer, H. C. (2016), “Messages for Clinicians: Moderators and Mediators of Treatment Outcome in Randomized Clinical Trials,” American Journal of Psychiatry, 173(7), 672–679. DOI: 10.1176/appi.ajp.2016.15101333.
  • Lipsey, M. W., Puzio, K., Yun, C., Hebert, M. A., Steinka-Fry, K., Cole, M. W., Roberts, M., and Busick, M. D. (2012), Translating the Statistical Representation of the Effects of Education Interventions Into More Readily Interpretable Forms, U.S. Department of Education, Institute of Education Sciences, available at https://ies.ed.gov/ncser/pubs/20133000/.
  • Madden, N. A., Slavin, R. E., Karweit, N. L., Dolan, L. J., and Wasik, B. A. (1993), “Success for All: Longitudinal Effects of a Restructuring Program for Inner-City Elementary Schools,” American Educational Research Journal, 30, 123–148. DOI: 10.3102/00028312030001123.
  • Maul, A., and McClelland, A. (2013), Review of the National Charter School Study, National Education Policy Center. Boulder, CO: University of Colorado.
  • McCartney, K., and Rosenthal, R. (2000), “Effect Size, Practical Importance, and Social Policy for Children,” Child Development, 71, 173–180.
  • Open Science Collaboration (2015), “Estimating the Reproducibility of Psychological Science,” Science, 349(6251), aac4716-1–aac4716-8. DOI: 10.1126/science.aac4716.
  • Plsek, P. E. (1999), “Quality Improvement Methods in Clinical Medicine,” Pediatrics, 203–214.
  • Pogrow, S. (1998), “What is an Exemplary Program and why Should Anyone Care? A Reaction to Slavin and Klein,” Educational Researcher, 27, 22–29. DOI: 10.2307/1176057.
  • Pogrow, S. (1999), “Rejoinder: Consistent Large Gains and High Levels of Achievement are the Best Measures of Program Quality: Author Responds to Slavin,” Educational Researcher, 28, 24–26, 31. DOI: 10.3102/0013189X028008024.
  • Pogrow, S. (2000a), “The Unsubstantiated ‘Success’ of Success for All: Implications for Policy, Practice, and the Soul of the Profession,” Phi Delta Kappan, 81, 596–600.
  • Pogrow, S. (2000b), “Success for All Does not Produce Success for Students,” Phi Delta Kappan, 82, 67–80. DOI: 10.1177/003172170008200114.
  • Pogrow, S. (2002), “Success for All is a Failure,” Phi Delta Kappan, 83, 463–468. DOI: 10.1177/003172170208300612.
  • Pogrow, S. (2005), “HOTS Revisited: A Thinking Development Approach to Reducing the Learning Gap After Grade 3,” Phi Delta Kappan, 87, 64–75. DOI: 10.1177/003172170508700111.
  • Quint, J. C., Balu, R., DeLaurentis, M., Rappaport, S., Smith, T. J., and Zhu, P. (2013), The Success for All Model of School Reform: Early Findings from the Investing in Innovation (i3) Scale-Up, New York: MDRC, available at https://www.mdrc.org/sites/default/files/The_Success_for_All_Model_FR_0.pdf.
  • Ross, S. M., Smith, L. J., Casey, J., and Slavin, R. E. (1995), “Increasing the Academic Success of Disadvantaged Children: An Examination of Alternative Early Intervention Programs,” American Educational Research Journal, 32, 773–800. DOI: 10.2307/1163335.
  • Ruffini, S. et al. (1992), Assessment of Success for All [Unpublished research study], Baltimore, MD: Baltimore City Public Schools.
  • Scammacca, N., Vaughn, S., Roberts, G., Wanzek, J., and Torgesen, J. K. (2007), Extensive Reading Interventions in Grades K–3: From Research to Practice, Portsmouth, NH: RMC Research Corporation, Center on Instruction.
  • Slavin, R. E., Madden, N. A., Karweit, N. L., Livermon, B. J., and Dolan, L. (1990), “Success for All: First-year Outcomes of a Comprehensive Plan for Reforming Urban Education,” American Educational Research Journal, 27, 255–278. DOI: 10.3102/00028312027002255.
  • Sohn, K. (2010), “A Skeptic’s Guide to Project STAR,” KEDI Journal of Educational Policy, 7, 257–272.
  • Sohn, K. (2015), “Nonrobustness of the Carryover Effects of Small Classes in Project STAR,” Teachers College Record, 117, 1–26.
  • Sparks, S. D. (2013, October 30), “School Improvement Model Shows Promise in First i3 Evaluation,” Education Week (online), available at http://www.edweek.org/ew/articles/2013/10/30/11successforall.h33.html?qs=sparks+AND+%22Success+for+All%22.
  • Sullivan, G. M., and Feinn, R. (2012), “Using Effect Size—or Why the P Value Is Not Enough,” Journal of Graduate Medical Education, 4, 279–282. DOI: 10.4300/JGME-D-12-00156.1.
  • Urdegar, S. (2000), Evaluation of the Success for All Program: 1998–1999 [Unpublished study], Miami, FL: Office of Evaluation and Research, Miami-Dade County Public Schools.
  • Venezky, R. L. (1998), “An Alternative Perspective on Success for All,” in Advances in Educational Policy, ed. K. Wong, Vol. 4, Greenwich, CT: JAI Press, pp. 145–165.
  • Willingham, D. T. (2007), “Critical Thinking: Why Is It So Hard to Teach?,” American Educator, 31, 8–19.
  • Ziliak, S. T., and McCloskey, D. N. (2004), “Size Matters: The Standard Error of Regressions in the American Economic Review,” The Journal of Socio-Economics, 33, 527–546. DOI: 10.1016/j.socec.2004.09.024.
  • Ziliak, S. T., and McCloskey, D. N. (2008), The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives, Ann Arbor, MI: University of Michigan Press.