REFERENCES
- American Educational Research Association, American Psychological Association, National Council on Measurement in Education [AERA, APA, NCME]. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
- Baker, E. L., Barton, P. E., Darling-Hammond, L., Haertel, E., Ladd, H. F., Linn, R. L., … Shepard, L. A. (2010, August 27). Problems with the use of student test scores to evaluate teachers. Economic Policy Institute. Retrieved from http://www.epi.org/publication/bp278/
- Barlevy, G., & Neal, D. (2012). Pay for percentile. American Economic Review, 102, 1805–1831.
- Bejar, I. I. (2002). Generative testing: From conception to implementation. In S. H. Irvine & P. C. Kyllonen (Eds.), Item generation for test development (pp. 199–217). Mahwah, NJ: Lawrence Erlbaum.
- Bock, R. D., Thissen, D., & Zimowski, M. F. (1997). IRT estimation of domain scores. Journal of Educational Measurement, 34, 197–211.
- Embretson, S. E. (1999). Generating items during testing: Psychometric issues and models. Psychometrika, 64, 407–433.
- Koretz, D. (2015). Adapting educational measurement to the demands of test-based accountability. Measurement: Interdisciplinary Research and Perspectives, 13, 1–25.
- Koretz, D., & Beguin, A. (2010). Self-monitoring assessments for educational accountability systems. Measurement: Interdisciplinary Research and Perspectives, 8(2–3), 92–109.
- Neal, D. (2013). The consequences of using one assessment system to pursue two objectives (NBER Working Paper No. 19214). Cambridge, MA: National Bureau of Economic Research.
- Rose, M. (2015). School reform fails the test. American Scholar, 84, 18–30.
- Swift, J. (1729). A modest proposal for preventing the children of poor people from being a burden to their parents or country, and making them beneficial to the publick. Dublin, Ireland: S. Harding.
- Thissen, D. (2001). Psychometric engineering as art. Psychometrika, 66, 473–486.