
Separating arguments from conclusions: the mistaken role of effect size in educational policy research

Pages 99-109 | Published online: 02 Jun 2019
 

ABSTRACT

Effect size is the basis of much evidence-based education policymaking. In particular, it is assumed to measure the educational effectiveness of interventions. Policy is being driven by the influential work of John Hattie, the Education Endowment Foundation, and others, which is grounded in this assumption. This article demonstrates that the assumption is false and notes that, when criticized, proponents either attempt to inoculate themselves by listing (without checking) assumptions or fall back on the specious reasoning that, however flawed their argument, no one has disproved their conclusions.

Disclosure statement

No potential conflict of interest was reported by the author.

Notes

1. Throughout this paper, “effect size” will stand for “standardised effect size” (as it does in the literature cited). While not exempt from all critique, raw effect size (where, e.g., differences in scores are not scaled by a measure of variance) is not subject to some of the most serious problems noted here.
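For concreteness, here is a minimal sketch of the distinction this note draws, using Cohen's d as a standard example of a standardised effect size (the specific estimator is an illustration, not one prescribed by the article):

```latex
% Raw effect size: the unscaled difference in mean outcomes.
\Delta = \bar{x}_{T} - \bar{x}_{C}

% Standardised effect size (e.g. Cohen's d): the same difference divided by a
% pooled standard deviation, so its value depends on the spread of whichever
% outcome measure was chosen.
d = \frac{\bar{x}_{T} - \bar{x}_{C}}{s_{p}},
\qquad
s_{p} = \sqrt{\frac{(n_{T}-1)\,s_{T}^{2} + (n_{C}-1)\,s_{C}^{2}}{n_{T}+n_{C}-2}}
```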

2. The argument from some meta-analysts is that all of the different design influences will “wash out” when a large number of studies is combined. But this relies on the obviously absurd assumption that studies are drawn independently, at random, and from a common distribution over those design decisions: control activities, intervention-testing intervals, dosages, and so forth. It is as if, for example, the tests used in studies were selected at random from a population of possible tests, so that the number of open-answer versus multiple-choice tests “balances out” in some way (see Berk, 2007, 2011).
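A minimal simulation sketch of this point, with invented bias values and an invented share of studies per test format, purely to illustrate that averaging many studies leaves a design influence in place when designs are not sampled from some neutral population:

```python
import random

random.seed(1)

TRUE_EFFECT = 0.10  # hypothetical "real" effect, in SD units

# Hypothetical systematic influence of one design decision on the reported
# effect size (values invented for the illustration).
DESIGN_BIAS = {"researcher_designed_test": 0.20, "standardised_test": 0.00}


def pooled_effect(n_studies: int, p_researcher_designed: float) -> float:
    """Mean effect size over n_studies whose test format is chosen with a
    fixed probability rather than drawn from a neutral population of designs."""
    total = 0.0
    for _ in range(n_studies):
        design = ("researcher_designed_test"
                  if random.random() < p_researcher_designed
                  else "standardised_test")
        total += TRUE_EFFECT + DESIGN_BIAS[design] + random.gauss(0.0, 0.05)
    return total / n_studies


# Combining ever more studies does not "wash out" the design influence: the
# pooled estimate converges on true effect + average bias, not the true effect.
print(round(pooled_effect(10_000, 0.8), 3))  # ~0.26, not 0.10
print(round(pooled_effect(10_000, 0.5), 3))  # ~0.20, still not 0.10
```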

3. Cheung and Slavin (2016) noted that, across their sample of studies, researcher-designed measures resulted in effect sizes on average twice the size of those from studies with independently designed measures. The reason follows directly from the thought experiment here: researchers can (and do) reduce noise by designing tests which target only the impact of the intervention, whereas standardised tests will not do so. Though, of course, researchers can (and do) still select standardised tests to reduce noise and amplify the signal.
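A short illustrative sketch of the noise-reduction mechanism described in this note (the numbers are invented): the same raw score gain yields a doubled standardised effect size when the chosen measure has half the spread.

```python
def cohens_d(mean_difference: float, pooled_sd: float) -> float:
    """Standardised effect size: raw mean difference scaled by score spread."""
    return mean_difference / pooled_sd


# The same raw gain from an intervention in both scenarios.
raw_gain = 5.0

# A broad standardised test also picks up variation unrelated to the
# intervention, so scores are noisier.
d_broad_test = cohens_d(raw_gain, pooled_sd=20.0)      # 0.25

# A researcher-designed test targeting only the taught content halves the
# spread, so the identical raw gain doubles once standardised.
d_targeted_test = cohens_d(raw_gain, pooled_sd=10.0)   # 0.50

print(d_broad_test, d_targeted_test)
```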
