How reliable are students’ evaluations of teaching quality? A variance components approach


