A Case Study of a Multi-Faceted Approach to Evaluating Teacher Candidate Ratings

References

  • Allen, D., & Tanner, K. (2006). Rubrics: Tools for making learning goals and evaluation criteria explicit for both teachers and learners. CBE Life Sciences Education, 5(3), 197–203. https://doi.org/10.1187/cbe.06-06-0168
  • Andrich, D. (1978). Application of a psychometric rating model to ordered categories which are scored with successive integers. Applied Psychological Measurement, 2(4), 581–594. https://doi.org/10.1177/014662167800200413
  • Barrett, S. (2001). The impact of training on rater variability. International Education Journal, 2(1), 49–58.
  • Bell, C. A., Gitomer, D. H., McCaffrey, D. F., Hamre, B. K., Pianta, R. C., & Qi, Y. (2012). An argument approach to observation protocol validity. Educational Assessment, 17(2–3), 62–87. https://doi.org/10.1080/10627197.2012.715014
  • Bergin, C., Wind, S. A., Grajeda, S., & Tsai, C.-L. (2017). Teacher evaluation: Are principals’ classroom observations accurate at the conclusion of training? Studies in Educational Evaluation, 55, 19–26. https://doi.org/10.1016/j.stueduc.2017.05.002
  • Bond, T. G., & Fox, C. M. (2015). Applying the Rasch model: Fundamental measurement in the human sciences (3rd ed.). Routledge.
  • Bourke, T., Ryan, M., & Ould, P. (2018). How do teacher educators use professional standards in their practice? Teaching and Teacher Education, 75, 83–92. https://doi.org/10.1016/j.tate.2018.06.005
  • Bryant, C. L., Maarouf, S., Burcham, J., & Greer, D. (2016). The examination of a teacher candidate assessment rubric: A confirmatory factor analysis. Teaching and Teacher Education, 57, 79–96. https://doi.org/10.1016/j.tate.2016.03.012
  • Casabianca, J. M., McCaffrey, D. F., Gitomer, D. H., Bell, C. A., Hamre, B. K., & Pianta, R. C. (2013). Effect of observation mode on measures of secondary mathematics teaching. Educational and Psychological Measurement, 73(5), 757–783. https://doi.org/10.1177/0013164413486987
  • Cash, A. H., Hamre, B. K., Pianta, R. C., & Myers, S. S. (2012). Rater calibration when observational assessment occurs at large scale: Degree of calibration and characteristics of raters associated with calibration. Early Childhood Research Quarterly, 27(3), 529–542. https://doi.org/10.1016/j.ecresq.2011.12.006
  • Caughlan, S., & Jiang, H. (2014). Observation and teacher quality: Critical analysis of observational instruments in preservice teacher performance assessment. Journal of Teacher Education, 65(5), 375–388. https://doi.org/10.1177/0022487114541546
  • Choi, H., Benson, N. F., & Shudak, N. J. (2016). Assessment of teacher candidate dispositions: Evidence of reliability and validity. Teacher Education Quarterly, 43(3), 71–89.
  • Congdon, P. J., & McQueen, J. (2000). The stability of rater severity in large-scale assessment programs. Journal of Educational Measurement, 37(2), 163–178. https://doi.org/10.1111/j.1745-3984.2000.tb01081.x
  • Council of Chief State School Officers. (2011). Interstate Teacher Assessment and Support Consortium (InTASC) model core teaching standards: A resource for state dialogue. Council of Chief State School Officers.
  • Danielson, C. (1996). Enhancing professional practice: A framework for teaching. Association for Supervision and Curriculum Development.
  • Danielson, C. (2011). The framework for teaching evaluation instrument. The Danielson Group.
  • Darling-Hammond, L. (2006). Assessing teacher education: The usefulness of multiple measures for assessing program outcomes. Journal of Teacher Education, 57(2), 120–138. https://doi.org/10.1177/0022487105283796
  • Darling-Hammond, L. (2010). Teacher education and the American future. Journal of Teacher Education, 61(1–2), 35–47. https://doi.org/10.1177/0022487109348024
  • Darling-Hammond, L., & Cook-Harvey, C. M. (2018). Educating the whole child: Improving school climate to support student success. Learning Policy Institute.
  • De Ayala, R. J. (2009). The theory and practice of item response theory. Guilford Publications.
  • Eisenman, G., Edwards, S., & Cushman, C. A. (2015). Bringing reality to classroom management in teacher education. Professional Educator, 39(1), 1–12.
  • Engelhard, G., & Wind, S. A. (2018). Invariant measurement with raters and rating scales: Rasch models for rater-mediated assessments. Routledge.
  • Graham, M., Milanowski, A., & Miller, J. (2012). Measuring and promoting inter-rater agreement of teacher and principal performance ratings. Center for Educator Compensation Reform.
  • Haj-Ali, R., & Feil, P. (2006). Rater reliability: Short- and long-term effects of calibration training. Journal of Dental Education, 70(4), 428–433. https://doi.org/10.1002/j.0022-0337.2006.70.4.tb04097.x
  • Hill, H. C., Charalambous, C. Y., & Kraft, M. A. (2012). When rater reliability is not enough: Teacher observation systems and a case for the generalizability study. Educational Researcher, 41(2), 56–64. https://doi.org/10.3102/0013189X12437203
  • Ho, A. D., & Kane, T. J. (2013). The reliability of classroom observations by school personnel. Research paper. MET project. Bill & Melinda Gates Foundation. https://eric.ed.gov/?id=ED540957
  • Hoyt, W. T., & Kerns, M.-D. (1999). Magnitude and moderators of bias in observer ratings: A meta-analysis. Psychological Methods, 4(4), 403–424. https://doi.org/10.1037/1082-989X.4.4.403
  • Jackson, E. D., Kelsey, K. D., & Rice, A. H. (2018). A case study of technology mediated observation in pre-service teaching experiences for edTPA implementation. NACTA Journal, 62(1), 1–10.
  • Jones, E., & Bergin, C. (2019). Evaluating teacher effectiveness using classroom observations: A Rasch analysis of the rater effects of principals. Educational Assessment, 24(2), 91–118. https://doi.org/10.1080/10627197.2018.1564272
  • Knoch, U., Read, J., & von Randow, J. (2007). Re-training writing raters online: How does it compare with face-to-face training? Assessing Writing, 12(1), 26–43. https://doi.org/10.1016/j.asw.2007.04.001
  • Ladd, K. L. (2000). A comparison of teacher education programs and graduates' perceptions of experiences. University of Missouri-Columbia.
  • Linacre, J. M. (1989). Many-facet Rasch measurement. MESA Press.
  • Linacre, J. M. (1994). Sample size and item calibration stability. Rasch Measurement Transactions, 7, 328.
  • Linacre, J. M. (2002). Optimizing rating scale category effectiveness. Journal of Applied Measurement, 3(1), 85–106.
  • Linacre, J. M. (2020). Facets computer program for many-facet Rasch measurement (Version 3.83.4). Winsteps.com.
  • Linacre, J. M. (2021). Re: Minimum sample size for many-facets Rasch measurement [Discussion post]. Rasch Measurement Forum. https://raschforum.boards.net/thread/3521/minimum-sample-facet-rasch-measurement
  • Lumley, T., & McNamara, T. F. (1995). Rater characteristics and rater bias: Implications for training. Language Testing, 12(1), 54–71. https://doi.org/10.1177/026553229501200104
  • Mantzicopoulos, P., French, B. F., Patrick, H., Watson, J. S., & Ahn, I. (2018). The stability of kindergarten teachers’ effectiveness: A generalizability study comparing the framework for teaching and the classroom assessment scoring system. Educational Assessment, 23(1), 24–46. https://doi.org/10.1080/10627197.2017.1408407
  • Marzano, R. J. (2007). The art and science of teaching: A comprehensive framework for effective instruction. Association for Supervision and Curriculum Development.
  • Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47(2), 149–174. https://doi.org/10.1007/BF02296272
  • Murray, F. B. (2005). On building a unified system of accreditation in teacher education. Journal of Teacher Education, 56(4), 307–317. https://doi.org/10.1177/0022487105279842
  • Myford, C. M., & Wolfe, E. W. (2003). Detecting and measuring rater effects using many-facet Rasch measurement: Part I. Journal of Applied Measurement, 4(4), 386–422.
  • Raczynski, K. R., Cohen, A. S., Engelhard, G., Jr., & Lu, Z. (2015). Comparing the effectiveness of self-paced and collaborative frame-of-reference training on rater accuracy in a large-scale writing assessment. Journal of Educational Measurement, 52(3), 301–318. https://doi.org/10.1111/jedm.12079
  • Raths, J., & Lyman, F. (2003). Summative evaluation of student teachers: An enduring problem. Journal of Teacher Education, 54(3), 206–216. https://doi.org/10.1177/0022487103054003003
  • Reagan, E. M., Terrell, D. G., Rogers, A. P., Schram, T., Tompkins, P., Ward, C., Birch, M. L., McCurdy, K., & McHale, G. (2019). Performance assessment for teacher candidate learning. Teacher Education Quarterly, 46(2), 114–141.
  • Sandholtz, J. H., & Shea, L. M. (2012). Predicting performance: A comparison of university supervisors’ predictions and teacher candidates’ scores on a teaching performance assessment. Journal of Teacher Education, 63(1), 39–50. https://doi.org/10.1177/0022487111421175
  • Wei, R. C., & Pecheone, R. L. (2010). Assessment for learning in preservice teacher education: Performance-based assessments. In M. M. Kennedy (Ed.), Teacher assessment and the quest for teacher quality (pp. 69–132). Jossey-Bass.
  • Wind, S. A. (2019). A nonparametric procedure for exploring differences in rating quality across test-taker subgroups in rater-mediated writing assessments. Language Testing, 36(4), 595–616. https://doi.org/10.1177/0265532219838014
  • Wind, S. A., & Jones, E. (2019). Not just generalizability: A case for multifaceted latent trait models in teacher observation systems. Educational Researcher, 48(8), 521–533. https://doi.org/10.3102/0013189X19874084
  • Wolfe, E. W. (2013). A bootstrap approach to evaluating person and item fit to the Rasch model. Journal of Applied Measurement, 14(1), 1–9.
  • Wu, M., & Adams, R. J. (2013). Properties of Rasch residual fit statistics. Journal of Applied Measurement, 14(4), 339–355.
  • Youngs, P., & Whittaker, A. (2015). The role of edTPA in assessing content-specific instructional practices. In P. Youngs & J. Grissom (Eds.), Improving teacher evaluation systems (pp. 89–101). Teachers College Press.
