338
Views
0
CrossRef citations to date
0
Altmetric
Research Article

Evaluating Human Scoring Using Generalizability Theory

, , &
Pages 198-209 | Published online: 21 Jul 2020
 

ABSTRACT

Many high-stakes examinations in the United Kingdom (UK) use both constructed-response items and selected-response items. We need to evaluate the inter-rater reliability for constructed-response items that are scored by humans. While there are a variety of methods for evaluating rater consistency across ratings in the psychometric literature, we apply generalizability theory (G theory) to data from routine monitoring of ratings to derive an estimate for inter-rater reliability. UK examinations use a combination of double or multiple rating for routine monitoring, creating a more complex design that consists of cross-pairing of raters and overlapping of raters for different groups of candidates or items. This sampling design is neither fully crossed nor is it nested. Each double- or multiple-scored item takes a different set of candidates, and the number of sampled candidates per item varies. Therefore, the standard G theory method, and its various forms for estimating inter-rater reliability, cannot be directly applied to the operational data. We propose a method that takes double or multiple rating data as given and analyzes the datasets at the item level in order to obtain more accurate and stable variance component estimates. We adapt the variance component in observed scores for an unbalanced one-facet crossed design with some missing observations. These estimates can be used to make inferences about the reliability of the entire scoring process. We illustrate the proposed method by applying it to real scoring data.

Disclosure statement

No potential conflict of interest was reported by the authors.

Log in via your institution

Log in to Taylor & Francis Online

PDF download + Online access

  • 48 hours access to article PDF & online version
  • Article PDF can be downloaded
  • Article PDF can be printed
USD 53.00 Add to cart

Issue Purchase

  • 30 days online access to complete issue
  • Article PDFs can be downloaded
  • Article PDFs can be printed
USD 400.00 Add to cart

* Local tax will be added as applicable

Related Research

People also read lists articles that other readers of this article have read.

Recommended articles lists articles that we recommend and is powered by our AI driven recommendation engine.

Cited by lists all citing articles based on Crossref citations.
Articles with the Crossref icon will open in a new tab.