Focus Article

Adapting Educational Measurement to the Demands of Test-Based Accountability
