Auditing for Score Inflation Using Self-Monitoring Assessments: Findings From Three Pilot Studies

Abstract

Test-based accountability often produces score inflation. Most studies have evaluated inflation by comparing trends on a high-stakes test and a lower stakes audit test. However, Koretz and Beguin (2010) noted weaknesses of audit tests and suggested self-monitoring assessments (SMAs), which incorporate audit items into high-stakes tests. This article reports the first three trials of SMAs, evaluating whether SMAs can detect inflation that had already been documented. The studies were conducted with mathematics tests in three grades. Despite severe conservative biases, the audit component functioned as expected in many of the trials. The difference in performance between nonaudit and audit items was associated with factors that earlier research showed to be related to test preparation and score inflation, such as scoring just below the Proficient cut in the previous year and school poverty. However, a number of null findings underscore the need for additional research into the design of audit items.

Funding

The research reported here was supported by the Institute of Education Sciences, U.S. Department of Education, through Grant R305A110420, and by the Spencer Foundation, through Grants 201100075 and 201200071, to the President and Fellows of Harvard College.

Acknowledgments

We thank the New York State Education Department for providing the data used in this study. The opinions expressed are those of the authors and do not represent views of the Institute of Education Sciences, the U.S. Department of Education, the Spencer Foundation, or the New York State Education Department or its staff.

Notes

1 Score inflation has typically been operationalized as the divergence in trends between scores on a high-stakes test and on a lower stakes audit test designed to support similar inferences, using either identical students or randomly equivalent groups. For a discussion of methods for validating scores under high-stakes conditions, see Koretz and Hamilton (2006).

2 We also dropped a very small number of students with mismatched form booklets. In addition, we dropped P.S. 184 Shuang Wen School, a public school in New York City with an immersion program in Mandarin Chinese. A large percentage of the school's students were Asian, and this value, extreme relative to the other schools in our sample, inflated coefficients for the school proportion–Asian variable.

3 In most cases, parents reported race. When parents did not report race, districts were responsible for assigning classifications.

4 For detailed information about the criteria for the low-income variable, see University of the State of New York (2011, p. 44).

5 Conceptually, reliability cannot be negative. In practice, when reliability is very low, one can obtain negative estimates from sampling error. One characteristic of our data increases the probability of negative sample estimates. Using the classical model, the estimated reliability of a difference score will be negative whenever $\sigma_x^2 \rho_{xx'} + \sigma_y^2 \rho_{yy'} < 2 \rho_{xy} \sigma_x \sigma_y$, where x and y are the two tests that are differenced, $\sigma_x^2$ and $\sigma_y^2$ are their estimated variances, $\rho_{xx'}$ and $\rho_{yy'}$ are their estimated reliabilities, and $\rho_{xy}$ is their estimated correlation. This inequality is more likely to hold when the test with a larger variance has a considerably lower reliability, which is precisely what our data produce. Following convention, we set all negative reliability estimates to zero.
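
The calculation described in Note 5 can be illustrated with a minimal sketch. It assumes the standard classical-test-theory expression for the reliability of a difference score D = x - y: the reliability-weighted variances of the two tests minus twice their covariance, divided by the variance of D. The function names and the numeric inputs below are hypothetical, chosen only for illustration; they are not the authors' code or data.

```python
import math


def difference_score_reliability(var_x, var_y, rel_x, rel_y, r_xy):
    """Classical-model reliability estimate for the difference D = x - y.

    var_x, var_y : estimated variances of the two tests
    rel_x, rel_y : estimated reliabilities of the two tests
    r_xy         : estimated correlation between the two tests
    """
    cov_xy = r_xy * math.sqrt(var_x * var_y)
    # Estimated true-score variance of D. With sample estimates this can
    # fall below zero, which is what makes the ratio negative.
    numerator = var_x * rel_x + var_y * rel_y - 2 * cov_xy
    # Observed variance of D; positive unless x and y are perfectly
    # correlated with equal variances.
    denominator = var_x + var_y - 2 * cov_xy
    if denominator <= 0:
        raise ValueError("difference score has no observed variance")
    return numerator / denominator


def truncated_reliability(var_x, var_y, rel_x, rel_y, r_xy):
    """Set negative estimates to zero, following the convention in Note 5."""
    return max(0.0, difference_score_reliability(var_x, var_y, rel_x, rel_y, r_xy))


# Hypothetical inputs: the test with the larger variance (x) is also the
# less reliable one, the situation the note describes.
raw = difference_score_reliability(var_x=4.0, var_y=1.0, rel_x=0.5, rel_y=0.9, r_xy=0.8)
reported = truncated_reliability(var_x=4.0, var_y=1.0, rel_x=0.5, rel_y=0.9, r_xy=0.8)
```

With these illustrative inputs the raw estimate is negative because the numerator is negative while the denominator stays positive, so the truncated value is zero.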