Contextualizing Performances: Comparing Performances During TOEFL iBT™ and Real-Life Academic Speaking Activities

Pages 353-373 | Published online: 14 Nov 2014
 

Abstract

In this study we compare test takers’ performance on the Speaking section of the TOEFL iBT™ with their performance during their real-life academic studies. Thirty international graduate students from mixed language backgrounds in two different disciplines (Sciences and Social Sciences) responded to two independent and four integrated speaking tasks of the TOEFL iBT and participated in semistructured interviews. For the real-life academic contexts, we recorded the performances of our participants in one in-class and one out-of-class speaking activity. On the basis of an analysis of the participants’ speaking (examining grammatical, discourse, and lexical features), we demonstrate that their performances overlap in some respects and differ distinctly in others across contexts. Our findings both support and raise questions about the extrapolation inference claim of the validity argument of the Speaking section of the TOEFL iBT.

Notes

1 The TOEFL iBT Speaking tasks were scored by six ETS raters, who each scored answers to two different prompts. Each task was scored by two different raters. Of the total rating decisions, in only two instances was adjudication necessary.

2 Although yep, yeah, and yes can have a range of intended meanings and discourse functions, we did not make a distinction between literal and intended meanings of these utterances; all were classified as the literal meaning of yes.

3 We decided to use contexts instead of types of activity (presentations vs. group discussions) to report our results. However, it should be noted that we ran each analysis by activity type, and our results revealed the same patterns.

4 Following Field (2009), we used Pearson’s correlation coefficient r as a measure of effect size, with an r of 0 meaning there is no effect and an r of 1 meaning there is a perfect effect. Following Cohen (1992), Field suggests that r = .10 is a small effect; r = .30 is a medium effect; and r = .50 is a large effect (Field, 2009, p. 57).
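As an illustration only, the benchmarks in Note 4 can be expressed as a small lookup. This is a minimal sketch, not the authors' procedure: the function name `label_effect_size`, the use of the benchmarks as lower bounds, the "negligible" label for values below .10, and the use of the absolute value of r are all assumptions made for the example.

```python
def label_effect_size(r: float) -> str:
    """Label a correlation-based effect size using the benchmarks in Note 4.

    Assumption: each benchmark (.10, .30, .50) is treated as a lower bound,
    and values below .10 are labeled "negligible" (a label not in the source).
    """
    magnitude = abs(r)  # assumption: the sign of r is ignored
    if magnitude > 1:
        raise ValueError("r must lie between -1 and 1")
    if magnitude >= 0.50:
        return "large"
    if magnitude >= 0.30:
        return "medium"
    if magnitude >= 0.10:
        return "small"
    return "negligible"


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise each branch.
    for value in (0.05, 0.10, 0.30, 0.50):
        print(value, label_effect_size(value))
```

Treating the benchmarks as cut-offs is a simplification; they are rough guides to interpretation rather than strict category boundaries.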

5 As explained earlier, we decided to use the clause as the common denominator for our analyses. However, clauses based on the AS-unit may still raise the issue of comparability, though to a lesser degree than AS-units, across the three contexts. The out-of-class context in particular produced a great number of very short independent subclausal units (counted as one-clause AS-units) consisting of only one or two words (e.g., short answers, such as yes, or sure in spoken interaction). These clauses obviously have little room for errors. This type of clause never occurred in the SSTiBT context, and this may have been a major factor contributing to significantly higher grammatical accuracy in the out-of-class context as seen above. To examine if this trend would still hold if we compared the grammatical inaccuracy measure calculated with longer AS-units from the three contexts, we selected AS-units from each context that are three, four, and five clauses long and aggregated all the errors occurring in those clauses and calculated the average number of errors per clause. Results indicated that the same trend holds. The participants made .443 errors per clause in the SSTiBT, .257 in the in-class context, and .215 in the out-of-class context.
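The follow-up check described in Note 5 (restricting the comparison to AS-units of three, four, and five clauses, pooling their errors, and dividing by the number of clauses) can be sketched as follows. This is a hedged illustration under stated assumptions: the representation of each AS-unit as a `(clause_count, error_count)` pair, the function name `errors_per_clause`, and the sample values are hypothetical and are not taken from the study's data.

```python
from typing import Iterable, Tuple


def errors_per_clause(
    as_units: Iterable[Tuple[int, int]],
    min_clauses: int = 3,
    max_clauses: int = 5,
) -> float:
    """Average errors per clause over AS-units within a clause-length range.

    Each AS-unit is given as (clause_count, error_count); this representation
    is an assumption made for the example.
    """
    total_errors = 0
    total_clauses = 0
    for clause_count, error_count in as_units:
        # Keep only AS-units that are three to five clauses long by default.
        if min_clauses <= clause_count <= max_clauses:
            total_errors += error_count
            total_clauses += clause_count
    if total_clauses == 0:
        raise ValueError("no AS-units fall within the requested clause range")
    return total_errors / total_clauses


if __name__ == "__main__":
    # Invented data: the one-clause and seven-clause units are excluded.
    sample = [(1, 0), (3, 1), (4, 2), (5, 1), (7, 3)]
    print(errors_per_clause(sample))  # 4 errors over 12 clauses
```

Running the same function separately on the AS-units from each context would yield one figure per context, comparable in form to the per-clause values reported in the note.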

6 Our measure of informal language refers to colloquial use of language such as the use of like as a filler in conversation.

7 Our findings from the grammar measures show a clear pattern of decreasing syntactic complexity and increasing grammatical accuracy moving from the SSTiBT to the out-of-class context. To some, this pattern may imply cognitive trade-offs between syntactic complexity and grammatical accuracy (Skehan, 1998). However, as with other measures in our study, syntactic complexity and grammatical accuracy may have been affected by a complex interplay of different aspects of the context (both cognitive and affective). Therefore, the grammatical findings should not be taken to suggest a simple inverse relationship in which complexity and accuracy compete for attentional resources.
