School Effectiveness and School Improvement
An International Journal of Research, Policy and Practice
Volume 30, 2019 - Issue 4

Measuring teaching skills in elementary education using the Rasch model

Pages 455-486 | Received 06 Feb 2018, Accepted 30 Jan 2019, Published online: 10 Apr 2019


  • Aaronson, D., Barrow, L., & Sander, W. (2007). Teachers and student achievement in the Chicago public high schools. Journal of Labor Economics, 25, 95–135. doi:10.1086/508733
  • Andersen, E. B. (1973). A goodness of fit test for the Rasch model. Psychometrika, 38, 123–140. doi:10.1007/BF02291180
  • Andersen, E. B. (1977). Sufficient statistics and latent trait models. Psychometrika, 42, 69–81. doi:10.1007/BF02293746
  • Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 397–479). Reading, MA: Addison-Wesley.
  • Brandsma, H. P., & Knuver, J. W. M. (1989). Effects of school and classroom characteristics on pupil progress in language and arithmetic. International Journal of Educational Research, 13, 777–788. doi:10.1016/0883-0355(89)90028-1
  • Butler, D. L., & Winne, P. H. (1995). Feedback and self-regulated learning: A theoretical synthesis. Review of Educational Research, 65, 245–281. doi:10.3102/00346543065003245
  • Cai, L., Thissen, D., & Du Toit, S. (2005–2013). IRTPRO (Version 2.1) [Computer software]. Lincolnwood, IL: Scientific Software.
  • Camilli, G., & Congdon, P. (1999). Application of a method of estimating DIF for polytomous test items. Journal of Educational and Behavioral Statistics, 24, 323–341. doi:10.2307/1165366
  • Capie, W., Johnson, C. E., Anderson, S. J., Ellet, C., & Okey, J. R. (1980). Teacher performance assessment instruments. Athens, GA: University of Georgia.
  • Chen, F., Curran, P. J., Bollen, K. A., Kirby, J., & Paxton, P. (2008). An empirical evaluation of the use of fixed cutoff points in RMSEA test statistic in structural equation models. Sociological Methods & Research, 36, 462–494. doi:10.1177/0049124108314720
  • Chen, W.-H., & Thissen, D. (1997). Local dependence indexes for item pairs using item response theory. Journal of Educational and Behavioral Statistics, 22, 265–289. doi:10.3102/10769986022003265
  • Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
  • Commissie Evaluatie Basisonderwijs. (1994). Inhoud en opbrengsten van het basisonderwijs [Curriculum, process and output of elementary schools]. De Meern: Inspectie van het Onderwijs.
  • Cotton, K. (1995). Effective schooling practices: A research synthesis 1995 update. Portland, OR: Northwest Regional Educational Laboratory.
  • Cox, D. R. (1958). The regression analysis of binary sequences. Journal of the Royal Statistical Society: Series B (Methodological), 20, 215–242.
  • Creemers, B. P. M. (1991). Effectieve instructie [Effective instruction]. Den Haag: SVO.
  • Creemers, B. P. M. (1994). The effective classroom. London: Cassell.
  • Danielson, C. (2011). The Framework for Teaching Evaluation Instrument: 2011 Louisiana edition. Princeton, NJ: The Danielson Group.
  • Department for Education. (2012). Teacher appraisal and capability: A model policy for schools (DFE-57518-2012). Retrieved from
  • Doherty, K. M., & Jacobs, S. (2013). State of the states 2013: Connect the dots: Using evaluations of teacher effectiveness to inform policy and practice. Washington, DC: National Council on Teacher Quality. Retrieved from
  • Ellis, E. S., & Worthington, L. A. (1994). Research synthesis on effective teaching principles and the design of quality tools for educators (Technical Report No. 5). Eugene, OR: University of Oregon, National Center to Improve the Tools of Educators.
  • Evertson, C. (1987). Classroom activity record: Observation record for project STAR. Nashville, TN: Vanderbilt University.
  • Evertson, C. M., & Burry, J. A. (1989). Capturing classroom context: The observation instrument as lens for assessment. Journal of Personnel Evaluation in Education, 2, 297–320. doi:10.1007/BF00139647
  • Fischer, G. H., & Scheiblechner, H. H. (1970). Algorithmen und Programme für das probabilistische Testmodell von Rasch [Algorithms and programs for the probabilistic test model of Rasch]. Psychologische Beiträge, 12, 23–51.
  • Flanders, N. A. (1961). Interaction analysis: A technique for quantifying teacher influence. Minneapolis, MN: University of Minnesota, College of Education, Bureau of Educational Research.
  • Flanders, N. A. (1970). Analyzing teaching behavior. Reading, MA: Addison-Wesley.
  • Florida Coalition for the Development of a Performance Measurement System. (1983). Domains: Knowledge base of the Florida performance measurement system. Tallahassee, FL: Office of Teacher Education, Certification and In-service Staff Development.
  • Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77, 81–112. doi:10.3102/003465430298487
  • Hill, H. C., Blunk, M. L., Charalambous, C. Y., Lewis, J. M., Phelps, G. C., Sleep, L., & Ball, D. L. (2008). Mathematical knowledge for teaching and the mathematical quality of instruction: An exploratory study. Cognition and Instruction, 26, 430–511. doi:10.1080/07370000802177235
  • Hill, H. C., Umland, K., Litke, E., & Kapitula, L. R. (2012). Teacher quality and quality teaching: Examining the relationship of a teacher assessment to practice. American Journal of Education, 118, 489–519. doi:10.1086/666380
  • Hoijtink, H., & Boomsma, A. (1995). On person parameter estimation in the dichotomous Rasch model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models: Foundations, recent developments, and applications (pp. 53–68). New York, NY: Springer.
  • Houtveen, A. A. M. (1990). Begeleiden van vernieuwingen [Supporting educational innovations]. De Lier: Academisch Boeken Centrum.
  • Houtveen, A. A. M., Booij, N., De Jong, R., & Van de Grift, W. J. C. M. (1999). Adaptive instruction and pupil achievement. School Effectiveness and School Improvement, 10, 172–192. doi:10.1076/sesi.
  • Houtveen, A. A. M., & Overmars, A. M. (1996). Instructie bij rekenen en wiskunde [Instruction in mathematics education]. Utrecht: ISOR.
  • Houtveen, A. A. M., & Van de Grift, W. J. C. M. (2007a). Effects of metacognitive strategy instruction and instruction time on reading comprehension. School Effectiveness and School Improvement, 18, 173–190. doi:10.1080/09243450601058717
  • Houtveen, A. A. M., & Van de Grift, W. J. C. M. (2007b). Reading instruction for struggling learners. Journal of Education for Students Placed at Risk, 12, 405–424. doi:10.1080/10824660701762001
  • Houtveen, A. A. M., Van de Grift, W. J. C. M., & Brokamp, S. K. (2014). Fluent reading in special elementary education. School Effectiveness and School Improvement, 25, 555–569. doi:10.1080/09243453.2013.856798
  • Houtveen, A. A. M., Van de Grift, W. J. C. M., & Creemers, B. P. M. (2004). Effective school improvement in mathematics. School Effectiveness and School Improvement, 15, 337–376. doi:10.1080/09243450512331383242
  • Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1–55. doi:10.1080/10705519909540118
  • Inspectie van het Onderwijs. (1998). Schooltoezicht primair onderwijs [Inspection of primary education]. De Meern: Author.
  • Klieme, E., Schümer, G., & Knoll, S. (2001). Mathematikunterricht in der Sekundarstufe I: “Aufgabenkultur” und Unterrichtsgestaltung [Mathematics lessons in secondary I: “Task culture” and teaching design]. In E. Klieme & J. Baumert (Eds.), TIMSS – Impulse für Schule und Unterricht (pp. 43–57). Bonn: Bundesministerium für Bildung und Forschung.
  • Kline, R. B. (2005). Principles and practice of structural equation modeling (2nd ed.). New York, NY: Guilford Press.
  • Kluger, A. N., & DeNisi, A. (1996). The effects of feedback interventions on performance: A historical review, a meta-analysis, and a preliminary feedback intervention theory. Psychological Bulletin, 119, 254–284. doi:10.1037/0033-2909.119.2.254
  • Levine, D. U., & Lezotte, L. W. (1990). Unusually effective schools: A review and analysis of research and practice. Madison, WI: The National Center for Effective Schools Research and Development.
  • Levine, D. U., & Lezotte, L. W. (1995). Effective schools research. In J. A. Banks & C. A. M. Banks (Eds.), Handbook of research on multicultural education (pp. 525–547). New York, NY: Macmillan.
  • Lipsey, M. W. (1990). Design sensitivity: Statistical power for experimental research. Newbury Park, CA: Sage.
  • Liu, I.-M., & Agresti, A. (1996). Mantel-Haenszel-type inference for cumulative odds ratios with a stratified ordinal response. Biometrics, 52, 1223–1234. doi:10.2307/2532838
  • Mair, P., & Hatzinger, R. (2007). Extended Rasch modeling: The eRm package for the application of IRT models in R. Journal of Statistical Software, 20(9), 1–20. doi:10.18637/jss.v020.i09
  • Mantel, N. (1963). Chi-square tests with one degree of freedom; extensions of the Mantel-Haenszel procedure. Journal of the American Statistical Association, 58(303), 690–700. doi:10.1080/01621459.1963.10500879
  • Marsh, H. W., Balla, J. R., & McDonald, R. P. (1988). Goodness-of-fit indexes in confirmatory factor analysis: The effect of sample size. Psychological Bulletin, 103, 391–410. doi:10.1037/0033-2909.103.3.391
  • Marsh, H. W., Hau, K.-T., & Wen, Z. (2004). In search of golden rules: Comment on hypothesis-testing approaches to setting cutoff values for fit indexes and dangers in overgeneralizing Hu and Bentler’s (1999) findings. Structural Equation Modeling, 11, 320–341. doi:10.1207/s15328007sem1103_2
  • Maulana, R., Helms-Lorenz, M., & Van de Grift, W. J. C. M. (2015). A longitudinal study of induction on the acceleration of growth in teaching quality of beginning teachers through the eyes of their students. Teaching and Teacher Education, 51, 225–245. doi:10.1016/j.tate.2015.07.003
  • Mourshed, M., Chijioke, C., & Barber, M. (2010). How the world’s most improved school systems keep getting better. New York, NY: McKinsey & Company. Retrieved from
  • Muijs, D., & Reynolds, D. (Eds.). (2010). Effective teaching: Evidence and practice (3rd ed.). London: Sage.
  • Muthén, L. K., & Muthén, B. O. (1998–2015). Mplus user’s guide (7th ed.). Los Angeles, CA: Authors.
  • Office for Standards in Education. (1995). Guidance on the inspection of nursery and primary schools. London: Author.
  • Organisation for Economic Co-operation and Development. (2012). PISA 2012 results in focus: What 15-year-olds know and what they can do with what they know. Paris: Author.
  • Organisation for Economic Co-operation and Development. (2014). TALIS 2013 results: An international perspective on teaching and learning. Paris: Author.
  • Penfield, R. D., & Algina, J. (2003). Applying the Liu‐Agresti estimator of the cumulative common odds ratio to DIF detection in polytomous items. Journal of Educational Measurement, 40, 353–370. doi:10.1111/j.1745-3984.2003.tb01151.x
  • Pianta, R. C., LaParo, K. M., & Hamre, B. K. (2008). Classroom Assessment Scoring System (CLASS). Baltimore, MD: Brookes.
  • Purkey, S. C., & Smith, M. S. (1983). Effective schools: A review. The Elementary School Journal, 83, 427–452. doi:10.1086/461325
  • Purkey, S. C., & Smith, M. S. (1985). School reform: The district policy implications of the effective schools literature. The Elementary School Journal, 85, 352–389. doi:10.1086/461410
  • Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research.
  • Rasch, G. (1961). On general laws and the meaning of measurement in psychology. Copenhagen: Danish Institute for Educational Research.
  • Rizopoulos, D. (2006). ltm: An R package for latent variable modeling and item response theory analysis. Journal of Statistical Software, 17(5), 1–25. doi:10.18637/jss.v017.i05
  • Rockoff, J. E. (2004). The impact of individual teachers on student achievement: Evidence from panel data. American Economic Review, 94(2), 247–252. doi:10.1257/0002828041302244
  • Roeleveld, J. (2003). Herkomstkenmerken en begintoets: Secundaire analyses op het PRIMA-cohort onderzoek [Social background and testing in the early years: Secondary analyses on the PRIMA cohort study]. Amsterdam: SCO Kohnstamm Instituut.
  • Sammons, P., Hillman, J., & Mortimore, P. (1995). Key characteristics of effective schools: A review of school effectiveness research. London: Office for Standards in Education.
  • Schaffer, E. C., & Nesselrodt, P. S. (1992, April). The development and testing of the special strategies observation system. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco, CA.
  • Scheerens, J. (1989). Wat maakt scholen effectief? Samenvattingen en analyses van onderzoeksresultaten [What explains a school’s effectivity? Summaries and analyses of research outcomes]. ’s-Gravenhage: Instituut voor Onderzoek van het Onderwijs SVO.
  • Scheerens, J. (1992). Effective schooling: Research, theory and practice. London: Cassell.
  • Scheerens, J. (2008). Een overzichtsstudie naar school- en instructie-effectiviteit: Samenvattingen en analyses van onderzoeksresultaten [Review of school and instruction effectiveness: Summaries and analyses of research outcomes]. Enschede: Universiteit Twente.
  • Slavin, R. E. (1987). Ability grouping and achievement in elementary schools: A best-evidence synthesis. Review of Educational Research, 57, 293–336. doi:10.3102/00346543057003293
  • Stallings, J. (1980). Allocated academic learning time revisited, or beyond time on task. Educational Researcher, 9(11), 11–16. doi:10.3102/0013189X009011011
  • Stallings, J. A., & Kaskowitz, D. H. (1974). Follow through classroom observation evaluation, 1972–1973. Menlo Park, CA: SRI International.
  • Teddlie, C., Creemers, B., Kyriakides, L., Muijs, D., & Yu, F. (2006). The international system for teacher observation and feedback: Evolution of an international study of teacher effectiveness constructs. Educational Research and Evaluation, 12, 561–582. doi:10.1080/13803610600874067
  • Teddlie, C., Virgilio, I., & Oescher, J. (1990). Development and validation of the Virgilio teachers’ behavior instrument. Educational and Psychological Measurement, 50, 421–430. doi:10.1177/0013164490502021
  • Thurlings, M. C. G. (2012). Peer to peer feedback: A study on teachers’ feedback processes. Maastricht: Universitaire Pers Maastricht.
  • Trickett, E. J., & Moos, R. H. (1974). The Classroom Environment Scale (CES). Palo Alto, CA: Consulting Psychologists Press.
  • Tucker, L. R., & Lewis, C. (1973). A reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38(1), 1–10. doi:10.1007/BF02291170
  • Van de Grift, W. (1985). Onderwijsleerklimaat en leerlingprestaties [Educational climate and student achievement]. Pedagogische Studiën, 62, 401–414.
  • Van de Grift, W. J. C. M. (1994). Technisch rapport van het onderzoek onder 386 basisscholen ten behoeve van de evaluatie van het basisonderwijs [Report on the evaluation of 386 schools for primary education]. De Meern: Inspectie van het Onderwijs.
  • Van de Grift, W. (2007). Quality of teaching in four European countries: A review of the literature and an application of an assessment instrument. Educational Research, 49, 127–152. doi:10.1080/00131880701369651
  • Van de Grift, W. J. C. M. (2014). Measuring teaching quality in several European countries. School Effectiveness and School Improvement, 25, 295–311. doi:10.1080/09243453.2013.794845
  • Van de Grift, W., & Helms-Lorenz, M. (2013). Waarom verlaten zoveel beginnende leraren de school waar ze hun carrière begonnen? [Why do so many beginning teachers leave the school where they started their careers?]. Van Twaalf tot Achttien, 23(8), 12–15.
  • Van de Grift, W., Helms-Lorenz, M., & Maulana, R. (2014). Teaching skills of student teachers: Calibration of an evaluation instrument and its value in predicting student academic engagement. Studies in Educational Evaluation, 43, 150–159. doi:10.1016/j.stueduc.2014.09.003
  • Van de Grift, W. J. C. M., & Lam, J. F. (1998). Het didactisch handelen in het basisonderwijs [Teaching in primary education]. Tijdschrift voor Onderwijsresearch, 23(3), 224–241.
  • Van de Grift, W., Van der Wal, M., & Torenbeek, M. (2011). Ontwikkeling in de pedagogisch didactische vaardigheid van leraren in het basisonderwijs [Development of teaching skills in primary education]. Pedagogische Studiën, 88, 416–432.
  • Van den Hurk, H. T. G., Houtveen, A. A. M., & Van de Grift, W. J. C. M. (2016). Fostering effective teaching behaviour through the use of data-feedback. Teaching and Teacher Education, 60, 444–451. doi:10.1016/j.tate.2016.07.003
  • Van der Lans, R. M., Van de Grift, W. J. C. M., & Van Veen, K. (2018). Developing an instrument for teacher feedback: Using the Rasch model to explore teachers’ development of effective teaching strategies and behaviors. The Journal of Experimental Education, 86, 247–264. doi:10.1080/00220973.2016.1268086
  • Van der Lans, R. M., Van de Grift, W. J. C. M., Van Veen, K., & Fokkens-Bruinsma, M. (2016). Once is not enough: Establishing reliability criteria for feedback and evaluation decisions based on classroom observations. Studies in Educational Evaluation, 50, 88–95. doi:10.1016/j.stueduc.2016.08.001
  • Veenman, S., Lem, P., Voeten, B., Winkelmolen, B., & Lassche, H. (1986). Onderwijs in combinatie-klassen [Education in multigraded classrooms]. ’s-Gravenhage: SVO.
  • Virgilio, I. (1987). An examination of the relationships among school effectiveness in elementary and junior high schools (Doctoral dissertation). New Orleans, LA: University of New Orleans.
  • Virgilio, I., & Teddlie, C. (1989). Technical manual for the Virgilio Teacher Behavior Inventory (Unpublished manuscript). University of New Orleans, New Orleans, LA.
  • von Davier, M. (1994). WINMIRA: A program system for analyses with the Rasch model, with the latent class analysis and with the mixed Rasch model. Kiel: Institute for Science Education (IPN).
  • Vygotskij, L. S. (2002). Denken und Sprechen [Thinking and speech]. Weinheim: Beltz Verlag. (Original work published 1934)
  • Walberg, H. J., & Haertel, G. D. (1992). Educational psychology’s first century. Journal of Educational Psychology, 84, 6–19. doi:10.1037/0022-0663.84.1.6
  • Warm, T. A. (1989). Weighted likelihood estimation of ability in the item response theory. Psychometrika, 54, 427–450. doi:10.1007/BF02294627
  • Wijnstra, J., Ouwens, M., & Béguin, A. (2003). De toegevoegde waarde van de basisschool [Added value of schools in elementary education]. Arnhem: CITOgroep.
  • Wright, S. P., Horn, S. P., & Sanders, W. L. (1997). Teacher and classroom context effects on student achievement: Implications for teacher evaluation. Journal of Personnel Evaluation in Education, 11, 57–67. doi:10.1023/A:1007999204543
  • Zwick, R., Thayer, D. T., & Mazzeo, J. (1997). Descriptive and inferential procedures for assessing differential item functioning in polytomous items. Applied Measurement in Education, 10, 321–344. doi:10.1207/s15324818ame1004_2