Complementary strengths? Evaluation of a hybrid human-machine scoring approach for a test of oral academic English

Larry Davis & Spiros Papageorgiou
Pages 437-455 | Received 31 May 2020, Accepted 31 Aug 2021, Published online: 21 Sep 2021
 

ABSTRACT

Human raters and machine scoring systems potentially have complementary strengths in evaluating language ability; specifically, it has been suggested that automated systems might be used to make consistent measurements of specific linguistic phenomena, whilst humans evaluate more global aspects of performance. We report on an empirical study that explored the possibility of combining human and machine scores using responses from the speaking section of the TOEFL iBT® test. Human raters awarded scores for three sub-constructs: delivery, language use, and topic development. The SpeechRater℠ automated scoring system produced scores for delivery and language use. Composite scores computed from three different combinations of human and automated analytic scores were equally or more reliable than human holistic scores, probably due to the inclusion of multiple observations in composite scores. However, composite scores calculated solely from human analytic scores showed the highest reliability, and reliability steadily decreased as more machine scores replaced human scores.

Acknowledgments

We thank Anastassia Loukina for her assistance in producing the automated scoring models.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Ethics

This research was deemed exempt from review by the ETS Committee for Prior Review of Research. De-identified data was obtained for test takers who, at the time of test registration, had indicated that their data could be used for research purposes. Human raters used in the study were ETS employees working within the scope of their employment.

Additional information

Funding

This work was supported by Educational Testing Service.

Notes on contributors

Larry Davis

Larry Davis is a Senior Research Scientist in the Center for Language Education and Assessment Research at Educational Testing Service (ETS) in Princeton, New Jersey. He has a Ph.D. in Second Language Studies from the University of Hawaiʻi, Manoa, where his dissertation work examined the effects of training and experience on rater behavior and cognition. His research interests span all aspects of the assessment of speaking, including assessment of spoken interaction, development of technology-enhanced speaking tasks, creation of rubrics, rater cognition, and automated evaluation of speaking ability. Most recently, his work has focused on the development of speaking and writing tasks optimized for rapid assessment in technology-mediated contexts.

Spiros Papageorgiou

Spiros Papageorgiou is a Managing Senior Research Scientist in the Center for Language Education and Assessment Research at Educational Testing Service (ETS) in Princeton, New Jersey. Spiros received his doctoral degree in linguistics, specializing in language testing, from Lancaster University, UK. In his current position at ETS, Spiros conducts research on the assessment of English as a second language, primarily supporting the TOEFL family of assessments. His research interests include standard setting, particularly in relation to the CEFR (Common European Framework of Reference), score reporting and interpretation, and listening assessment. He has published numerous journal articles, book chapters, and technical reports on these topics and, with Professor Kathleen M. Bailey, co-edited the volume Global perspectives on language assessment: Research, theory, and practice (2019). He has also been an active member of the language testing community, having served as a member-at-large of the Executive Boards of the International Language Testing Association (ILTA) and the Midwest Association of Language Testers (MwALT).
