
Strategies for Assessing Classroom Teaching: Examining Administrator Thinking as Validity Evidence

References

  • American Educational Research Association, American Psychological Association, and National Council on Measurement in Education. (2007). Standards for reporting on empirical social science research in AERA publications. Washington, DC: American Educational Research Association.
  • American Educational Research Association, American Psychological Association, and National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
  • Archer, J., Cantrell, S., Holtzman, S. L., Joe, J. N., Tocci, C. M., & Wood, J. (2016). Better feedback for better teaching: A practical guide to improving classroom observations. New York, NY: John Wiley & Sons.
  • Bejar, I. I. (2012). Rater cognition: Implications for validity. Educational Measurement: Issues and Practice, 31(3), 2–9. doi:10.1111/j.1745-3992.2012.00238.x
  • Bell, C., Jones, N., Lewis, J., Qi, Y., Stickler, L., Liu, S., & McLeod, M. (2016). Understanding consequential assessment systems of teaching: Year 1 final report to Los Angeles Unified School District (Research Memorandum No. RM-16-12). Princeton, NJ: Educational Testing Service.
  • Bell, C. A., Gitomer, D. H., McCaffrey, D., Hamre, B., Pianta, R., & Qi, Y. (2012). An argument approach to observation protocol validity. Educational Assessment, 17(2–3), 62–87. doi:10.1080/10627197.2012.715014
  • The Bill and Melinda Gates Foundation. (2012). Gathering feedback for teaching: Combining high quality observations with student surveys and achievement gains. Seattle, WA: Author.
  • Brown, A. (1995). The effect of rater variables in the development of an occupation-specific language performance test. Language Testing, 12(1), 1–15.
  • Brown, M. W. (2009). The teacher-tool relationship: Theorizing the design and use of curriculum materials. In J. T. Remillard, B. A. Herbel-Eisenmann, & G. M. Lloyd (Eds.), Mathematics teachers at work: Connecting curriculum materials and classroom instruction. New York, NY: Routledge.
  • Casabianca, J. M., McCaffrey, D. F., Gitomer, D. H., Bell, C. A., Hamre, B. K., & Pianta, R. C. (2013). Effect of observation mode on measures of secondary mathematics teaching. Educational and Psychological Measurement, 73(5), 757–783. doi:10.1177/0013164413486987
  • Cash, A. H., Hamre, B. K., Pianta, R. C., & Myers, S. S. (2012). Rater calibration when observational assessment occurs at large scale: Degree of calibration and characteristics of raters associated with calibration. Early Childhood Research Quarterly, 27(3), 529–542. doi:10.1016/j.ecresq.2011.12.006
  • Cohen, J., & Goldhaber, D. (2016). Building a more complete understanding of teacher evaluation using classroom observations. Educational Researcher, 45(6), 378–387. doi:10.3102/0013189x16659442
  • Crisp, V. (2008). Exploring the nature of examiner thinking during the process of examination marking. Cambridge Journal of Education, 38(2), 247–264. doi:10.1080/03057640802063486
  • Crisp, V. (2012). An investigation of rater cognition in the assessment of projects. Educational Measurement: Issues and Practice, 31(3), 10–20. doi:10.1111/j.1745-3992.2012.00239.x
  • Danielson, C. (1996). Enhancing professional practice: A framework for teaching. Alexandria, VA: Association for Supervision and Curriculum Development.
  • Doherty, K., & Jacobs, S. (2013). Connect the dots: Using evaluations of teacher effectiveness to inform policy and practice. Washington, DC: National Council on Teacher Quality.
  • Eckes, T. (2008). Rater types in writing performance assessments: A classification approach to rater variability. Language Testing, 25(2), 155–185. doi:10.1177/0265532207086780
  • Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. Chapman & Hall/CRC monograph series on statistics and applied probability 57. Boca Raton, FL: Chapman & Hall/CRC.
  • Freedman, S. W., & Calfee, R. C. (1983). Holistic assessment of writing: Experimental design and cognitive theory. In P. Mosenthal, L. Tamor, & S. A. Walmsley (Eds.), Research on writing: Principles and methods (pp. 75–98). New York, NY: Longman.
  • Gitomer, D. H., Crouse, K., & Joyce, J. (2015). A review of the DC IMPACT teacher evaluation system. Commissioned paper for the Committee on the Five-Year Summative Evaluation of the District of Columbia’s Public Schools. Washington, DC: National Academies.
  • Goldring, E., Grissom, J. A., Rubin, M., Neumerski, C. M., Cannata, M., Drake, T., & Schuermann, P. (2015). Make room value added: Principals’ human capital decisions and the emergence of teacher observation data. Educational Researcher, 44(2), 96–104. doi:10.3102/0013189X15575031
  • Greatorex, J., & Suto, W. I. (2006). An empirical exploration of human judgement in the marking of school examinations. Paper presented at the International Association for Educational Assessment Conference, Singapore. Retrieved from http://iaea.info/documents/paper_1162a2471.pdf.
  • Grossman, P. L., Compton, C., Igra, D., Ronfeldt, M., Shahan, E., & Williamson, P. (2009). Teaching practice: A cross-professional perspective. Teachers College Record, 111(9), 2055–2100. Retrieved from http://www.tcrecord.org (ID Number: 15018)
  • Grossman, P. L., & Stodolsky, S. S. (1995). Content as context: The role of school subjects in secondary school teaching. Educational Researcher, 24(8), 5–11. doi:10.3102/0013189X024008005
  • Harris, D. N., & Sass, T. R. (2014). Skills, productivity and the evaluation of teacher performance. Economics of Education Review, 40(Supplement C), 183–204. doi:10.1016/j.econedurev.2014.03.002
  • Heck, R., & Hallinger, P. (2009). Assessing the contribution of principal and teacher leadership to school improvement. American Educational Research Journal, 46, 659–689. doi:10.3102/0002831209340042
  • Heneman, R. L., Wexley, K. N., & Moore, M. L. (1987). Performance-rating accuracy: A critical review. Journal of Business Research, 15(5), 431–448. doi:10.1016/0148-2963(87)90011-7
  • Herlihy, C., Karger, E., Pollard, C., Hill, H. C., Kraft, M. A., Williams, M., & Howard, S. (2014). State and local efforts to investigate the validity and reliability of scores from teacher evaluation systems. Teachers College Record, 116(1), 1–28. Retrieved from http://www.tcrecord.org (ID Number: 17292)
  • Hill, H. C., & Grossman, P. L. (2013). Teacher observations: Challenges and opportunities posed by new teacher evaluation systems. Harvard Educational Review, 83(2), 371–384.
  • Hull, J. (2013). Trends in teacher evaluation: How states are measuring teacher performance. Alexandria, VA: Center for Public Education. Retrieved from http://www.centerforpubliceducation.org/Main-Menu/Evaluating-performance/Trends-in-Teacher-Evaluation-At-A-Glance/Trends-in-Teacher-Evaluation-Full-Report-PDF.pdf
  • Kachur, D. S., Edwards, C. L., & Stout, J. A. (2010). Classroom walkthroughs to improve teaching and learning. Larchmont, NY: Routledge.
  • Kraft, M. A., & Gilmour, A. F. (2016). Can principals promote teacher development as evaluators? A case study of principals’ views and experiences. Educational Administration Quarterly, 52(5), 711–753. doi:10.1177/0013161X16653445
  • Kraft, M. A., & Gilmour, A. F. (2017). Revisiting the widget effect: Teacher evaluation reforms and the distribution of teacher effectiveness. Educational Researcher, 46(5), 234–249. doi:10.3102/0013189X17718797
  • Krishnakumar, P. (2015, December 15). LAUSD by the numbers. Los Angeles Times. Retrieved from http://www.latimes.com/visuals/graphics/la-me-g-0818-lausd-numbers-20150818-htmlstory.html
  • Lamb, M. L., & Weick, K. J. (1975). A historical overview of teacher observation. Educational Forum, 39(2), 238–247.
  • Los Angeles Unified School District (LAUSD). (2017). L.A. Unified fingertip facts 2017-2018: Destination graduation. Los Angeles, CA: Author. Retrieved from https://achieve.lausd.net/cms/lib/CA01000043/Centricity/Domain/32/NewlyUpdatedFingertip%20Facts2017-18_English.pdf
  • McClellan, C., Atkinson, M., & Danielson, C. (2012, March). Teacher evaluator training and certification: Lessons learned from the Measures of Effective Teaching project (Practitioner series for teacher evaluation). San Francisco, CA: Teachscape. Retrieved from http://www.teachscape.com/binaries/content/assets/teachscape-marketing-website/resources/march_13whitepaperteacherevaluatortraining.pdf
  • Moss, P. A., Sutherland, L. M., Haniford, L., Miller, R., Johnson, D., Geist, P. K., … Pecheone, R. L. (2004). Interrogating the generalizability of portfolio assessments of beginning teachers: A qualitative study. Education Policy Analysis Archives, 12(32), 1–70.
  • Murphy, J. F., Goldring, E. B., Cravens, X. C., Elliott, S. N., & Porter, A. C. (2011). The Vanderbilt Assessment of Leadership in Education: Measuring learning-centered leadership. Journal of East China Normal University, 29(1), 1–10.
  • Murphy, K. R., & Cleveland, J. N. (1995). Understanding performance appraisal: Social, organizational, and goal-based perspectives. Thousand Oaks, CA: Sage Publications.
  • Myford, C. M. (2012). Rater cognition research: Some possible directions for the future. Educational Measurement: Issues and Practice, 31(3), 48–49.
  • Nelson, B. S. (2010). How elementary school principals with different leadership content knowledge profiles support teachers’ mathematics instruction. New England Mathematics Journal, 42(1), 43–53.
  • Papay, J. P., & Johnson, S. M. (2012). Is PAR a good investment? Understanding the costs and benefits of teacher peer assistance and review programs. Educational Policy, 26(5), 696–729.
  • Porter, A. C., Polikoff, M. S., Goldring, E., Murphy, J., Elliott, S. N., & May, H. (2010a). Developing a psychometrically sound assessment of school leadership: The VAL-ED as a case study. Educational Administration Quarterly, 46(2), 135–173. doi:10.1177/1094670510361747
  • Porter, A. C., Polikoff, M. S., Goldring, E. B., Murphy, J., Elliott, S. N., & May, H. (2010b). Investigating the validity and reliability of the Vanderbilt Assessment of Leadership in Education. The Elementary School Journal, 111(2), 282–313.
  • Prendergast, C., & Topel, R. H. (1996). Favoritism in organizations. Journal of Political Economy, 104(5), 958–978. doi:10.2307/2138948
  • Rockoff, J. E., & Speroni, C. (2010). Subjective and objective evaluations of teacher effectiveness. The American Economic Review, 100(2), 261–266.
  • Sartain, L., Stoelinga, S. R., & Brown, E. R. (2011). Rethinking teacher evaluation in Chicago: Lessons learned from classroom observations, principal-teacher conferences, and district implementation. Chicago, IL: Consortium on Chicago School Research.
  • Song, T., Wolfe, E. W., Hahn, L., Less-Petersen, M., Sanders, R., & Vickers, D. (2014). Relationship between rater background and rater performance. Pearson. Retrieved from https://www.pearson.com/content/dam/one-dot-com/one-dot-com/global/Files/efficacy-and-research/schools/022_Song_RaterBackground_04_21_2014.pdf
  • Steinberg, M. P., & Garrett, R. (2016). Classroom composition and measured teacher performance: What do teacher observation scores really measure? Educational Evaluation and Policy Analysis, 38(2), 293–317. doi:10.3102/0162373715616249
  • Stodolsky, S. S. (1993). A framework for subject matter comparisons in high schools. Teaching and Teacher Education, 9(4), 333–346. doi:10.1016/0742-051X(93)90001-W
  • Suto, I. (2012). A critical review of some qualitative research methods used to explore rater cognition. Educational Measurement: Issues and Practice, 31(3), 21–30. doi:10.1111/j.1745-3992.2012.00240.x
  • Vaughan, C. (1991). Holistic assessment: What goes on in the rater’s mind? In L. Hamp-Lyons (Ed.), Assessing second language writing in academic contexts. Hillsdale, NJ: Ablex.
  • White, T. (2014a). Evaluating teachers more strategically: Using performance results to streamline evaluation systems. Stanford, CA: Carnegie Foundation for the Advancement of Teaching. Retrieved from https://www.carnegiefoundation.org/wp-content/uploads/2014/12/BRIEF_evaluating_teachers_strategically_Jan2014.pdf
  • White, T. (2014b). Adding eyes: The rise, rewards, and risks of multi-rater teacher observation systems. Stanford, CA: Carnegie Foundation for the Advancement of Teaching. Retrieved from http://cdn.carnegiefoundation.org/wp-content/uploads/2014/12/BRIEF_Multi-rater_evaluation_Dec2014.pdf
  • Whitehurst, G. J., Chingos, M. M., & Lindquist, K. M. (2014). Evaluating teachers with classroom observations: Lessons learned in four districts. Providence, RI: Brown Center on Education Policy at Brookings.
  • Wolfe, E. M. (1997). The relationship between essay reading style and scoring proficiency in a psychometric scoring system. Assessing Writing, 4(1), 83–106. doi:10.1016/S1075-2935(97)80006-2
  • Wolfe, E. M. (2006). Uncovering rater’s cognitive processing and focus using think-aloud protocols. Journal of Writing Assessment, 2(1), 37–56.
