
The Consequences of Using One Assessment System to Pursue Two Objectives

Pages 339-352 | Published online: 27 Sep 2013
 

Abstract

Education officials often use one assessment system both to create measures of student achievement and to create performance metrics for educators. However, modern standardized testing systems are not designed to produce performance metrics for teachers or principals; they are designed to produce reliable measures of individual student achievement in a low-stakes testing environment. The design features that promote reliable measurement provide opportunities for teachers to profitably coach students on test-taking skills, and educators typically exploit these opportunities whenever modern assessments are used in high-stakes settings as vehicles for gathering information about their performance. Because these coaching responses often contaminate measures of both student achievement and educator performance, it is likely possible to acquire more accurate measures of both by developing separate assessment systems designed specifically for each measurement task.


Acknowledgments

The author thanks the Searle Freedom Trust for research support and also thanks Lindy and Michael Keiser for research support through a gift to the University of Chicago's Committee on Education. He further thanks Michael Greenstone, Diane Whitmore Schanzenbach, and Robert S. Gibbons for useful comments, Robert D. Gibbons and David Thissen for their insights on psychometrics and assessment development, and Ian Fillmore, Sarah A. G. Komisarow, and Richard Olson for excellent research assistance.

Notes

1. The SMARTER Balanced Assessment Consortium (SBAC) and the Partnership for Assessment of Readiness for College and Careers (PARCC) are the two groups developing assessment systems for the Common Core State Standards using funds awarded as part of the Obama Administration's Race to the Top initiative.

2. The pattern described here is not definitive proof that student test score gains on high-stakes assessments fail to reflect real gains in subject mastery, because the parallel low-stakes assessments are never identical in subject content. See Koretz (2002) for more on this point. However, many studies in this literature present results that are difficult to explain credibly without some account of how high-stakes assessment scores can rise quickly without commensurate improvements in true levels of math or reading achievement.

3. Glewwe, Ilias, and Kremer (2010), Jacob (2005), Klein et al. (2000), Koretz and Barron (1998), Koretz (2002), and Vigdor (2009) all present results that show divergence between student results on two assessments of the same subject matter in settings where one assessment became a high-stakes assessment for educators and the other continued to involve relatively low stakes. Neal (2012) provides a detailed discussion of this literature.

4. See Campbell (1979), Kerr (1995), and Rothstein (2009) for many examples.

5. The effort distortions induced by assessment-based accountability are not one-time costs. Any system that induces teachers to adopt teaching methods that raise test scores but degrade the true quality of instruction imposes an ongoing cost on students, and students bear these costs throughout all grades and classes where their teachers are subject to assessment-based accountability.

6. See Hambleton, Swaminathan, and Rogers (1991, 135).

7. See Kolen and Brennan (2010, 19).

8. For more on IRT models, see Hambleton, Swaminathan, and Rogers (1991).

9. See Bay-Borelli et al. (2010, 25).

10. Kolen and Brennan (2010) asserted that proper ex post equating of the results from different exam forms is not possible without an ex ante commitment to systematic procedures that govern item and form development, and they gave several examples of cases where equating procedures did not work well ex post because different assessment forms in a series were not developed and administered in a consistent manner (see Chapter 8).
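Kolen and Brennan's argument concerns the conditions under which equating is valid at all. Purely as an illustration of what equating does, the sketch below implements one textbook method, mean-sigma linear equating under a random-groups design; the function name and the data are hypothetical and are not drawn from the cases they discuss.

```python
import numpy as np

def mean_sigma_equate(scores_x, scores_y, x):
    """Map a raw score x from Form X onto the Form Y scale using
    mean-sigma linear equating: y = (sd_y/sd_x) * (x - mean_x) + mean_y.
    Assumes the two forms were taken by randomly equivalent groups."""
    mu_x, sd_x = np.mean(scores_x), np.std(scores_x, ddof=1)
    mu_y, sd_y = np.mean(scores_y), np.std(scores_y, ddof=1)
    return (sd_y / sd_x) * (x - mu_x) + mu_y

# Hypothetical example: Form X runs slightly harder, so raw scores sit lower.
rng = np.random.default_rng(0)
form_x = rng.normal(48, 10, size=2000)  # raw scores on Form X
form_y = rng.normal(52, 11, size=2000)  # raw scores on Form Y
print(mean_sigma_equate(form_x, form_y, 48.0))  # roughly 52 on the Form Y scale
```

Even this simple method presupposes a consistent development and administration design (here, randomly equivalent groups); if forms differ in content or conditions, no ex post adjustment of means and standard deviations restores comparability, which is Kolen and Brennan's point.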

12. The new exam was not given in the first quarter of 2004, and pass rates historically vary by quarter, with first-quarter pass rates falling below the corresponding year-wide averages. The pass rate over the final three quarters of 2004 was almost identical to the 2005 annual pass rate, and the 2004 annual rate might have been slightly lower than the 2005 rate had the exam been given in all four quarters of 2004.

13. The pass rates for the other three components of the exam follow a similar pattern, but those patterns are more difficult to interpret because both the format and the item content of the other three exams changed substantially to reflect new international accounting standards. The 2011 drop in annual pass rates is less than one percentage point for BEC and roughly five percentage points for AUD and FAR. For all three exams, the decline is more pronounced when one compares pass rates from the first two quarters of 2011 with those from the first two quarters of 2010. The changes in content specifications for all exams were announced more than a year before the 2011 exams were administered.

14. See Lazear and Rosen (1981) as well as Chapters 10 and 11 in Lazear and Gibbs (2008).

15. The performance metric we propose is called the Percentile Performance Index (PPI). It is similar in construction to the Student Growth Percentile (SGP) measures already used in some states as accountability measures (see Betebenner 2009). Free PPI software is available at http://sites.google.com/site/dereknealresearch/home/pay-for-percentile-software.
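The linked software implements the authors' actual method; the snippet below is only a stylized sketch of the underlying idea, with a hypothetical function name and made-up data. For each student, it finds the student's percentile within a comparison set of students who began the year at a similar achievement level, then averages those percentiles across a teacher's roster. Real SGP and PPI implementations condition on baseline achievement far more carefully.

```python
import numpy as np

def percentile_index(own_gains, comparison_gains):
    """Stylized percentile-style performance index: the average
    percentile of a teacher's students' score gains within comparison
    sets of students with similar baseline achievement.
    comparison_gains[i] holds the gains of student i's comparison set."""
    pct = [
        np.mean(np.asarray(comp) < gain)  # share of comparable peers outgained
        for gain, comp in zip(own_gains, comparison_gains)
    ]
    return 100 * np.mean(pct)  # scaled to 0-100

# Hypothetical data: three students, each matched to peers who started
# the year at a similar achievement level.
gains = [4.0, 1.5, 3.2]
peers = [[1.0, 2.5, 3.8, 5.1], [0.5, 1.0, 2.2, 3.0], [2.0, 2.9, 3.5, 4.4]]
print(percentile_index(gains, peers))  # average within-group percentile
```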

16. Standard results on optimal incentive contracts show that if educators are risk-neutral, a reduction in reliability does not hamper efficient incentive provision. On the other hand, if educators are risk-averse, they will demand to be compensated for assuming the extra risk created by any drop in reliability. However, as the number of students that any educator or group of educators teaches grows large, this effect may well become a second-order concern.
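To see why, consider a back-of-the-envelope calculation under assumptions the note leaves implicit: measurement errors that are independent across students and the standard CARA-normal approximation from the incentive-contract literature.

```latex
% Measured performance of an educator who teaches n students, where
% student i's score carries measurement error \varepsilon_i with
% variance \sigma_\varepsilon^2, independent across students:
\bar{m} = \mu + \frac{1}{n}\sum_{i=1}^{n}\varepsilon_i,
\qquad
\operatorname{Var}(\bar{m}\mid\mu) = \frac{\sigma_\varepsilon^2}{n}.
% With coefficient of absolute risk aversion r, the educator's
% risk premium is roughly
\text{risk premium} \approx \frac{r}{2}\cdot\frac{\sigma_\varepsilon^2}{n}
\xrightarrow{\,n\to\infty\,} 0.
```

A drop in reliability raises the error variance, but the resulting risk premium shrinks in proportion to the number of students, which is why the cost can become second order for educators with large rosters.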

17. See Prendergast (1999) and Neal (2012).
