Research Article

Comparing Rating Modes: Analysing Live, Audio, and Video Ratings of IELTS Speaking Test Performances

Pages 83-106 | Published online: 26 Aug 2020
 

ABSTRACT

This mixed methods study compared IELTS examiners’ scores when assessing spoken performances under live and two ‘non-live’ testing conditions using audio and video recordings. Six IELTS examiners assessed 36 test-takers’ performances under the live, audio, and video rating conditions. Scores in the three rating modes were calibrated using the many-facet Rasch model (MFRM). For all three modes, examiners provided written justifications for their ratings, and verbal reports were also collected to gain insights into examiner perceptions towards performance under the audio and video conditions. Results showed that, for all rating criteria, audio ratings were significantly lower than live and video ratings. Examiners noticed more negative performance features under the two non-live rating conditions, compared to the live condition. However, richer information about test-taker performance in the video mode appeared to cause raters to rely less on such negative evidence than audio raters when awarding scores. Verbal report data showed that having visual information in the video-rating mode helped examiners to understand what the test-takers were saying, to comprehend better what test-takers were communicating using non-verbal means, and to understand with greater confidence the source of test-takers’ hesitation, pauses, and awkwardness.

Acknowledgments

This research was funded and supported by the IELTS Partners: British Council, Cambridge Assessment English, and IDP: IELTS Australia under the IELTS Joint Funded Research Programme (Round 19). We would also like to express our gratitude to the IELTS examiners who participated in this study and provided their insightful comments. Special thanks go to Kate Connolly for her assistance in transcribing examiner comments.

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

1 IELTS is a large-scale, high-stakes English test designed to measure the language proficiency of those wishing to study or work in English-speaking contexts. The test contains Listening, Reading, and Writing subtests and a face-to-face Speaking test component. IELTS uses a nine-band scale to report levels of proficiency, from non-user (band 1) through to expert user (band 9). In 2018 the test was available in over 140 countries and was taken by over 3.5 million candidates. It is owned and run jointly by the British Council, IDP: IELTS Australia, and Cambridge Assessment English (see www.ielts.org/).

2 The rating scale model was applied to the dataset rather than the partial credit model, as the level steps across the four analytical criterion scales of IELTS Speaking were designed to be comparable, consistent with the original holistic rating scale design for IELTS Speaking (Taylor & Falvey, 2007).

3 See http://winsteps.com/facetman/biasestimation.htm for more details on how bias/interaction analyses work in Facets.

4 Because the coding was straightforward, manual coding of examiner comments was considered appropriate. It was also considered beneficial for the researchers to engage with individual comments during manual coding, in order to allow for better triangulation of findings from different data sources.

5 This research adopted the “productive for measurement” range suggested by Wright and Linacre (1994), which seems to have been the most widely used in validation research on operational speaking tests. It should, however, be acknowledged that this is a potential limitation of this study, as researchers in the field of educational measurement have recently started recommending a bootstrap approach (Wolfe, 2013) to identify sample-specific cut-off values rather than applying fixed cut-off values.

6 These three test-takers were retained in the model, as reducing the number of responses would increase measurement errors.
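
As an illustration of the distinction drawn in Note 2, the two models can be written in a general many-facet form. This is a sketch under assumptions rather than the specification used in the study: the facets shown (test-taker, examiner, rating mode, criterion) are inferred from the abstract, and all symbols are introduced here for illustration only.

```latex
% Rating scale model: one set of category thresholds F_k shared by all criteria
\log\!\left(\frac{P_{njmik}}{P_{njmi(k-1)}}\right) = B_n - C_j - M_m - D_i - F_k

% Partial credit model: thresholds F_{ik} estimated separately for each criterion i
\log\!\left(\frac{P_{njmik}}{P_{njmi(k-1)}}\right) = B_n - C_j - M_m - D_i - F_{ik}

% Assumed notation (not taken from the article):
%   P_{njmik} : probability that test-taker n receives category k from examiner j
%               in rating mode m on criterion i
%   B_n       : ability of test-taker n
%   C_j       : severity of examiner j
%   M_m       : difficulty associated with rating mode m (live, audio, video)
%   D_i       : difficulty of criterion i
%   F_k       : difficulty of the step from category k-1 to category k
```

The only difference between the two forms is the threshold term: the rating scale model treats the step from one band to the next as equally difficult on every criterion, which matches the rationale given in Note 2 that the level steps were designed to be comparable across the four criteria.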
