ABSTRACT
Classroom observation is a common approach to teacher evaluation, yet concerns about differences in rater judgment are widespread. Despite these concerns, few researchers have examined the practical impact of such differences on teachers’ judged effectiveness. This study addresses that gap. Using data from a large-scale teacher evaluation system, we found substantial differences in principal severity that affected teachers’ classification within performance categories. We then demonstrate a technique that researchers and practitioners can use to control for differences in rater severity, thus limiting the degree to which such differences threaten the fairness of classroom observations. We discuss implications for research and practice.
Disclosure statement
No potential conflict of interest was reported by the authors.
Additional information
Notes on contributors
Stefanie A. Wind
Stefanie A. Wind is an Associate Professor of Educational Measurement in the Department of Educational Studies in Psychology, Research Methodology, and Counseling in the College of Education at The University of Alabama.
Eli Jones
Eli Jones is an Assistant Professor of Educational Research in Counseling, Educational Psychology & Research in the College of Education at The University of Memphis.
Christi Bergin
Christi Bergin is Associate Dean for Research & Innovation in the Department of Educational, School & Counseling Psychology in the College of Education at the University of Missouri. She is also the co-founder of the Network for Educator Effectiveness.