References
- American Educational Research Association [AERA], American Psychological Association, National Council on Measurement in Education. 1999. Standards for educational and psychological testing. Washington (DC): American Educational Research Association.
- Black P, Wiliam D. 1998. Assessment and classroom learning. Assess Educ: Principles Policy Pract. 5:7–73.
- Cizek G. 1993. Reconsidering standards and criteria. J Educ Meas. 30:93–106.
- Cizek G. 1996. Setting passing scores. Educ Meas: Issues Pract. 15:20–31.
- Clauser B, Mee J, Baldwin S, Margolis M, Dillon G. 2009. Judges' use of examinee performance data in an Angoff standard-setting exercise for a medical licensing examination: an experimental study. J Educ Meas. 46:390–407.
- Downing S. 2002. Threats to the validity of locally developed multiple-choice tests in medical education: construct-irrelevant variance and construct underrepresentation. Adv Health Sci Educ. 7:235–241.
- Downing S, Tekian A, Yudkowsky R. 2006. Procedures for establishing defensible absolute passing scores on performance examinations in health professions education. Teach Learn Med. 18:50–57.
- Ebel R. 1972. Essentials of educational measurement. London: Prentice-Hall International, Inc.
- Epstein R. 2007. Assessment in medical education. N Engl J Med. 356:387–396.
- Haladyna T, Hess R. 1999. An evaluation of conjunctive and compensatory standard-setting strategies for test decisions. Educ Assess. 6:129–153.
- Haladyna TM, Downing S. 1988. Functional distractors: implications for test-item writing and test design. [accessed 2015 Aug 10]. http://files.eric.ed.gov/fulltext/ED293851.pdf.
- Hambleton R, Pitoniak M, Copella J. 2012. Essential steps in setting performance standards on educational tests and strategies for assessing the reliability of results. In: Cizek G, editor. Setting performance standards. London: Routledge.
- Hurtz G, Auerbach M. 2003. A meta-analysis of the effects of modifications to the Angoff method on cut-off scores and judgment consensus. Educ Psychol Meas. 63:584–601.
- Kane M. 2002. Validating high-stakes testing programs. Educ Meas: Issues Pract. 21:31–41.
- Kelley T, Ebel R, Linacre J. 2002. Item discrimination indices. Rasch Meas Trans. 16:883–884.
- Kolen M. 2006. Scaling and norming. In: Brennan R, editor. Educational measurement. Westport (CT): American Council on Education.
- McKinley D, Norcini J. 2014. How to set standards on performance-based examinations: AMEE Guide No. 85. Med Teach. 36:97–110.
- Miller M, Linn R, Gronlund N. 2013. Measurement and assessment in teaching. Boston: Pearson.
- Norcini J, Dawson-Saunders B. 1994. Issues in recertification in North America. In: Newble D, Jolly B, Wakeford R, editors. The certification and recertification of doctors. Cambridge: Cambridge University Press.
- Schmeiser C, Welch C. 2006. Test development. In: Brennan R, editor. Educational measurement. Westport (CT): American Council on Education.
- Shepard L. 2006. Classroom assessment. In: Brennan R, editor. Educational measurement. Westport (CT): American Council on Education.
- Tavakol M, Dennick R. 2016. Post-examination analysis: a means of improving the exam cycle. Acad Med. 91:1324.
- Zieky M, Perie M. 2006. A primer on setting cut scores on tests of educational achievement. ETS. [accessed 2016 Jun 10]. https://www.ets.org/Media/Research/pdf/Cut_Scores_Primer.pdf.