Shifting Sands in Second Language Pronunciation Teaching and Assessment Research and Practice

ABSTRACT

This article brings to the fore trends in second language (L2) pronunciation research, teaching, and assessment by highlighting the ways in which pronunciation instructional priorities and assessment targets have shifted over time, social dimensions that, although presented in a different guise, appear to have remained static, and principles in need of clearer conceptualization. The reorientation of the pedagogical goal in pronunciation teaching from the traditional focus on accent reduction to the more suitable goal of intelligibility will feed into a discussion of major constructs subsumed under the umbrella term of “pronunciation.” We discuss theoretical gaps, definitional quagmires, and challenges in operationalizing major constructs in assessment instruments, with an emphasis on research findings on which pronunciation features are most consequential for intelligibility and implications for instructional priorities and assessment targets. Considerations related to social judgments of pronunciation, accent familiarity effects, the growth of lingua franca communication, and technological advances, including machine scoring of pronunciation, pervade the discussion, bridging past and present. Recommendations for advancing an ambitious research agenda are proposed to disassociate pronunciation assessment from the neglect of the past, secure its presence as an integral part of the L2 speaking construct, and propel it to the forefront of developments in assessment.

Notes

1 For example, the speaking construct being measured in direct speaking tests (e.g., IELTS) tends to be markedly different from both semidirect speaking tests that are human scored (e.g., TOEFL iBT) and fully automated speaking tests (e.g., Versant; see De Jong, this issue; Galaczi & Taylor, this issue). This is most obviously reflected in the different ways that speaking ability is scored across tests, which often draw on qualitatively different assessment criteria. Pronunciation and pronunciation-relevant constructs are, in turn, differentially operationalized in relation to each given speaking ability measure.

2 Flege’s (1995) model posits these hypotheses at an abstract level without specifying which particular substitutions will take place for each sound, although, based on the general principles of phonetics, this could be presumed to relate to the place and manner of articulation (consonants) or to tongue height, frontness, and lip rounding (vowels; Reetz & Jongman, 2009).

3 Two independently established functional load systems prioritizing minimal pairs in terms of error gravity, as proposed by Brown (1988) and Catford (1987), are normed on Received Pronunciation and General American English, respectively (i.e., they use standard NS varieties as their point of reference). The former rank orders 10 contrasts with which learners often have difficulty on a 10-point scale, having first determined the probability of occurrence of phonemes and their likelihood of being conflated. The latter represents functional load on a percent scale and describes a different process for selecting and ordering these contrasts. Notably, these hypotheses about which contrasts are most and least problematic are specific to English and cannot be generalized to other target languages.

4 Notably, the new incarnation of the Cambridge ESOL Common Scale for Speaking, the Overall speaking scales, lists intelligibility as a criterion without referring to accent or nativeness (Cambridge English, 2016). However, the “phonological features” that lead to varying degrees of intelligibility are only vaguely defined in that scale, which introduces a different set of challenges in using the scale.

5 It is, as yet, unclear how accurately and consistently raters are able to channel the views of imagined or idealized listeners while rating. In the absence of further evidence, the general recommendation has been to involve raters from the target audience(s) in conducting ratings, including screening raters for the desired individual difference characteristics where possible (Isaacs & Thomson, 2013; Winke et al., 2013).
