Original Articles

Visual speech segmentation: using facial cues to locate word boundaries in continuous speech

Pages 771-780 | Received 24 Jun 2012, Accepted 15 Mar 2013, Published online: 03 May 2013
 

Abstract

Speech is typically a multimodal phenomenon, yet few studies have focused on the exclusive contributions of visual cues to language acquisition. To address this gap, we investigated whether visual prosodic information can facilitate speech segmentation. Previous research has demonstrated that language learners can use lexical stress and pitch cues to segment speech and that learners can extract this information from talking faces. Thus, we created an artificial speech stream that contained minimal segmentation cues and paired it with two synchronous facial displays in which visual prosody was either informative or uninformative for identifying word boundaries. Across three familiarisation conditions (audio stream alone, facial streams alone, and paired audiovisual), learning occurred only when the facial displays were informative to word boundaries, suggesting that facial cues can help learners solve the early challenges of language acquisition.

Acknowledgements

We would like to thank Dr. Chip Gerfen for assistance in creating the auditory stimuli and Brian Giallorenzo and Kevin Weiss for assistance in creating the visual stimuli. We would also like to thank Dr. Judy Kroll and Dr. Reginald Adams for helpful comments during the preparation of this manuscript. This research was supported by NIH R03 grant HD048996-01 (DW).

Notes

1. This is consistent with previous segmentation studies (see Mitchel & Weiss, 2010; Weiss, Gerfen, & Mitchel, 2010).

2. Outliers were defined as any data point that fell outside the range specified by the following formula (see Lea & Cohen, 2004): lower bound = Quartile 1 − 1.5 × (Quartile 3 − Quartile 1); upper bound = Quartile 3 + 1.5 × (Quartile 3 − Quartile 1).
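
The outlier rule in Note 2 can be illustrated with a minimal Python sketch. The function names and the sample scores are hypothetical, and the quartile method (linear interpolation, `method="inclusive"`) is an assumption: the note does not say how the quartiles were computed, and other conventions give slightly different bounds.

```python
from statistics import quantiles


def tukey_bounds(values):
    """Return (lower, upper) bounds: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR."""
    # Assumption: quartiles by linear interpolation over the sample itself.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range, Quartile 3 - Quartile 1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values):
    """Return the data points that fall outside the bounds."""
    lower, upper = tukey_bounds(values)
    return [v for v in values if v < lower or v > upper]


# Hypothetical scores, for illustration only (not data from the study).
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(tukey_bounds(scores))
print(find_outliers(scores))
```

Under this quartile convention, only the extreme value in the hypothetical sample falls outside the bounds.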

3. Transitional probabilities were calculated using the formula described in Aslin, Saffran, and Newport (1998): P(Y|X) = (frequency of XY) / (frequency of X).
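
As a rough illustration of the formula in Note 3, the following Python sketch counts units and adjacent pairs in a stream and divides the two. The syllables and the function name are hypothetical and are not the stimuli used in the study; the sketch also assumes that the frequency of X is counted over the whole stream, including a final unit that has no successor.

```python
from collections import Counter


def transitional_probabilities(stream):
    """Map each adjacent pair (X, Y) to P(Y|X) = freq(XY) / freq(X)."""
    unit_counts = Counter(stream)                   # frequency of each X
    pair_counts = Counter(zip(stream, stream[1:]))  # frequency of each XY
    return {
        (x, y): count / unit_counts[x]
        for (x, y), count in pair_counts.items()
    }


# Hypothetical syllable stream, for illustration only.
stream = ["ba", "di", "ku", "ba", "di", "go", "ba", "di", "ku"]
tps = transitional_probabilities(stream)
for pair, probability in sorted(tps.items()):
    print(pair, round(probability, 2))
```

In this toy stream "di" always follows "ba", so that pair has the highest transitional probability, while pairs in which the second syllable varies score lower; dips of this kind are what statistical segmentation accounts treat as candidate word boundaries.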
