Research Article

An Investigation of the Impact of Jagged Profile on L2 Speaking Test Ratings: Evidence from Rating and Eye-tracking Data

Pages 394-421 | Published online: 12 Jun 2022
 

ABSTRACT

The factors that influence rater scoring have long been of great interest to researchers in second language assessment. However, little research has examined how test-takers’ speech profiles (e.g., a jagged or a flat profile reflecting analytic subscores) affect raters’ scoring behaviors. To investigate the role of speech profiles in scoring, we collected analytic and holistic rating scores from 28 trained raters while they were marking the performances of three groups of speakers with distinct profiles, determined by prior ratings. We tracked the eye-movements of eleven of the raters to record how often and how long they looked at the various categories on the rating scales. We found that the raters perceived speakers with better pronunciation as more competent speakers overall. Meanwhile, speakers’ score profiles influenced raters’ attention: raters fixated longer and more often on, and made more eye-visits to, the lexical grammar category while assessing speakers with a jagged profile. Raters spent less time assessing the pronunciation of the speakers who were pre-identified as having better pronunciation. The findings shed light on the impact of speech characteristics on raters’ cognition and score assignments and therefore have important implications for rater training in L2 speaking assessments.

摘要

影响评分者评分的因素一直是二语语言测试学者非常感兴趣的一个主题。然而, 关于应试者的语音特征 (例如, 平衡或非平衡的分解评分子分数) 对评分者评分行为影响的研究仍有待于深入。为了研究语音分数概况在评分中的作用, 我们从 28 位受过培训的评分者那里收集了三组具有不同分数概况的说话者的分解和整体评分分数, 这些分数概况是根据先前的评分确定的。我们追踪了 11 名评分者的眼球运动, 以记录他们查看评分量表上各个评分标准的频率和时长。我们发现, 评分者认为发音更好的说话者是综合口语能力更强的说话者。同时, 说话者的分解分数概况影响了评估者的注意力:评估者在评估具有不平衡分解分数概况的说话者时, 在词汇语法类别上停留的时间更长, 更频繁, 眼看的次数也更多。评估者花费更少的时间来评估被预先确定为具有更好发音的说话者的发音。研究结果揭示了语音特征对评分者打分过程和结果的影响, 对二语口语测试中的评分者培训具有重要意义。

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 In an exploratory study that aimed to map how fluency scoring, potentially mediated by task design features, impacted final Aptis speaking-test scores, Tavakoli et al. (2017) recorded various fluency metrics (such as those related to speech rate and repairs or repetition) within 32 Aptis test-takers’ speech samples. However, the 32 test takers were selected by removing from the larger sample anyone who had a jagged score profile. The researchers did not mention how many test takers were removed for having a jagged score profile, so readers cannot ascertain how frequent such profiles are within a sample of Aptis test takers. Such information would help readers understand the study's findings more broadly and better contextualize the field's more established claims about frequently studied individual speech components, such as fluency.

2 We originally designed the study to focus on the eye-movement behaviors of L1 speakers of English. We did not intend to include L2 English speakers in the eye-tracking data collection phase of the study because previous eye-tracking research has shown that language processing differs between L1 and L2 (e.g., Cop et al., 2017) and that raters from different educational or experience backgrounds may rate differently (Chalhoub-Deville, 1995). However, we did collect eye-movement data from L2 English speakers (international graduate students) because we recruited raters from a university participant pool that required that everyone in the pool be able to participate fully if they wanted to. Because the L2 English speakers were not part of the original study design, we did not analyze their eye-tracking data, but we include their data in the published data file for this study (if 90% of their eye-movements were recorded) for future analyses and use.

3 According to the manual for the Tobii TX-300, the percentage is calculated by dividing the number of eye-tracking samples with usable gaze data that were correctly identified by the number of attempts.
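
The calculation described in Note 3 can be sketched in a few lines of Python. This is only an illustrative sketch, not the Tobii software's implementation: the function names `gaze_sample_percent` and `meets_inclusion_threshold` are hypothetical, and reading the 90% criterion in Note 2 as "at least 90%" is an assumption.

```python
def gaze_sample_percent(valid_samples: int, attempted_samples: int) -> float:
    """Percentage of attempted samples that yielded usable gaze data.

    Follows the ratio described in Note 3: usable, correctly identified
    samples divided by the number of attempts, expressed as a percentage.
    """
    if attempted_samples <= 0:
        raise ValueError("attempted_samples must be positive")
    if not 0 <= valid_samples <= attempted_samples:
        raise ValueError("valid_samples must lie between 0 and attempted_samples")
    return 100.0 * valid_samples / attempted_samples


def meets_inclusion_threshold(percent: float, threshold: float = 90.0) -> bool:
    """Assumed inclusion rule: keep a recording if percent >= threshold."""
    return percent >= threshold


# Hypothetical recording: 270 usable samples out of 300 attempts.
percent = gaze_sample_percent(270, 300)
print(percent, meets_inclusion_threshold(percent))
```

The sample counts in the usage lines are invented for illustration and do not come from the study's data.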
