Special Issue: Prosody in Context

Prosody in context: a review

Pages 1-31 | Received 02 Oct 2012, Accepted 06 Jun 2014, Published online: 08 Oct 2014
 

Abstract

Prosody conveys information about the linguistic context of an utterance at every level of linguistic organisation, from the word up to the discourse context. Acoustic correlates of prosody cue this rich contextual information, but interpreting prosodic cues in terms of the lexical, syntactic and discourse information they encode also requires recognising prosodic variation due to speaker, language variety, speech style and other properties of the situational context. This review reveals the complex interaction among contextual factors that influence the phonological form and phonetic expression of prosody. Empirical challenges in prosodic transcription are discussed along with production evidence that reveals striking variability in the phonological encoding of prosody and in its phonetic expression. The review points to the need for a model of prosody that is robust to contextually driven variation affecting the production and perception of prosodic form.

Acknowledgements

I thank José I. Hualde and Stefanie Shattuck-Hufnagel for critical input on many of the topics discussed here.

Funding

This work was supported by the National Science Foundation BCS 12-51343, IIS 04-14117 and IIS 07-03624.

Notes

1. The view of prosody as the organisational structure of speech goes back to early works such as Trubetzkoy (1958), as noted by Beckman (1996), and is well established in contemporary linguistic theory, forming the basis for metrical stress theory (Liberman, 1975; Liberman & Prince, 1977; Hayes, 1995) and the autosegmental-metrical theory of intonation (Beckman & Pierrehumbert, 1986; Gussenhoven, 2004, p. 58; Ladd, 2008, ch. 8; Pierrehumbert, 1980). An opposing view in contemporary work holds that prosodic features are directly determined on the basis of syntactic or semantic properties of an utterance, denying a role for phonological prosodic structure (Xu & Xu, 2005; see also the discussion in Wagner and Watson, 2010, p. 936).

2. Examples of meaning conveyed by (paradigmatically contrastive) prosodic features are plentiful. Prosodic contrast involving tone features at the word level is found in many Asian and African languages. For example, in many Bantu languages the contrast between a High tone and a Low tone on the initial syllable of a verb stem (treated as a prosodic domain) can mark a lexical or grammatical distinction (Downing, 2011). In varieties of English, a similar contrast between High and Low tones at the end of a prosodic phrase (again, a prosodic structure) can signal a difference in pragmatic meaning. As described by Gussenhoven (2004, pp. 296–301), a pitch fall on the final syllable in a declarative sentence signals new information that the speaker is introducing to the discourse, while the same sentence with a slight pitch rise at the end signals that the information in the utterance is already shared by the speaker and listener.

3. Works on English include Nespor and Vogel (1986/2007); Selkirk (1984, 1986); Frazier et al. (2004); Watson and Gibson (2004); and Breen, Watson, et al. (2010). See Wagner and Watson (2010) for further discussion. For some examples of work on prosodic marking of syntactic boundaries in other languages, see Nespor and Vogel (1986/2007) and Jun (2005b).

4. These examples are fluently produced utterances taken from data-sets the author has used for investigating prosody production and perception in conversational speech. (1a) is from the Buckeye corpus (Pitt et al., 2007), and (1b) is from the American MapTask Corpus, collected by Stefanie Shattuck-Hufnagel and shared with the author. The boundary transcriptions represent the consensus labelling of two or three independent, trained transcribers using the ToBI annotation system.

5. The studies employed different criteria for syntactic labelling, but both adopted an essentially clause-level analysis that identifies main clauses, relative clauses, parentheticals and clausal complements, and their internal structures, but which does not label multi-clause constituents as such.

6. Identifying pauses in speech is not entirely straightforward, since listeners' perceptions of pause do not necessarily coincide with silent intervals, but are also influenced by final lengthening and pitch (Nooteboom & Eefting, 1994). Moreover, the duration threshold for perceiving a silent interval as pause may depend on speech rate and other factors. Unfortunately, these concerns are not often addressed in works that report pause as an acoustic correlate of prosodic boundaries.

7. The terms ‘topic unit’ and ‘topic structure’ are used by authors who draw on the terminology of theories of discourse structure. For instance, Geluykens and Swerts (1994) use the term ‘topic unit’ to refer to an informational segment, which in their speech materials is operationally defined in terms of the components of the linguistic task performed by their speakers. These authors note that topic units may overlap with talker turns: A single topic may continue across a change of turn, and on the other hand, a single turn may comprise more than one topic unit.

8. Ladd (2008, pp. 277–278) attributes the pattern of nuclear prominence on a sentence-final intransitive verb to prosodic phrasing: nuclear prominence on the predicate occurs only when the argument and predicate are in separate prosodic phrases. Ladd argues that in such structures the predicate may be perceived as having stronger prominence than the argument, which he takes as evidence for the recursive layering of prosodic phrase structure, with greater prominence assigned to the predicate's phrase, e.g., [[DOGs]W [must be CARRIED]S].

9. The status of F0 in the prosodic encoding of focus or information status is called into question by Kochanski, Grabe, Coleman, and Rosner (2005). In their corpus study of seven varieties of British English, they looked for acoustic correlates of prominence as marked by expert prosody transcribers. Kochanski and colleagues find that among the many acoustic measures they tested, intensity and duration are correlated with prominence, but not measures of F0. Bearing in mind that prominent words do not necessarily express focus or new information (Calhoun, 2010a, 2010b), this finding leaves open the possibility that F0 does in fact mark focus or information status in their materials, but F0 may not be a reliable correlate of prominence construed more broadly to include prominence due to rhythm, accessibility or possibly other factors.

10. Another explanation for the perceived impoliteness of (9) might be that the focal prominence on anything evokes a set of alternatives (‘you know [X]’) which may be construed as impolite even without the rising intonation that Culpeper describes for this utterance. This analysis also serves to illustrate the point that prosody conveys a non-truth-conditional aspect of sentence meaning.

11. The authors do not identify the variety of English spoken by participants in their study, but they give institutional affiliations in South Africa, suggesting that the study reports on South African English.

12. Downstep patterns also arise in lexical tone systems, as widely reported for African and Chinese languages, where the second High tone in a sequence of High tones is realised with a step-wise lowered pitch. Downstep in lexical tone systems can be restricted to contexts where a Low tone intervenes between the successive High tones: H – L – H. For an overview of downstep in lexical tone systems and acoustic evidence for its interaction with phrase- and discourse-level prosody in Mandarin Chinese, see Wang and Xu (2011). Laniran and Clements (2003) present an acoustic study of downstep in the African language Yoruba. For an overview of downstep phenomena in a variety of African languages, see Hyman (2011) and references cited there.

13. Swerts and colleagues find that other properties related to pitch, such as range, register and contour shape and slope, also contribute to the perception of finality in utterances judged as appropriate at the end of a discourse unit, as noted in Section 3.3.

14. A central argument for a phonological representation of prosodic phrase structure lies in the frequent mismatch between prosodic and syntactic structure. Similarly, a phonological representation for prominence is motivated by the fact that prominence may correspond to semantic focus, information status (discourse-new or -given) or may be motivated on purely phonological grounds, to mark the beginning of a phrase or to promote rhythmic alternation across syllables in a phrase. See Ladd (2008, pp. 18–33) for further arguments supporting a phonological representation of prosody.

