Original Articles

Visual speech primes open-set recognition of spoken words

Pages 580-610 | Received 01 Sep 2007, Published online: 03 Apr 2009

