Speech-in-speech recognition: A training study. Language, Cognition and Neuroscience, Vol 27, No 7-8


Abstract

This study aims to identify aspects of speech-in-noise recognition that are susceptible to training, focusing on whether listeners can learn to adapt to target talkers (“tune in”) and learn to better cope with various maskers (“tune out”) after short-term training. Listeners received training on English sentence recognition in speech-shaped noise (SSN), Mandarin babble, or English babble. Results from a speech-in-babble posttest showed evidence of both tuning in and tuning out: (1) listeners were able to take advantage of target talker familiarity; (2) training with babble was more effective than SSN training; and (3) after babble training, listeners improved most in coping with the babble in which they were trained. In general, the results show that processes related both to tuning in to speech targets and tuning out speech maskers can be improved with auditory training.

Keywords:

Speech perception in noise
Auditory training
Speech masker

Acknowledgements

I am deeply grateful to Ann Bradlow for helpful discussions throughout this project, to Chun Chan and Lauren Calandruccio for technical assistance, to Matt Goldrick for advice on statistical analysis, and to Kelsey Mok for help with data collection. This research was supported by NIH-NIDCD Award No. F31DC009516. The content is solely the responsibility of the author and does not necessarily represent the official views of the NIDCD or the NIH.

Notes

¹Although the precise definition of informational masking is still under discussion (Kidd, Mason, Richards, Gallun, & Durlach, 2007), the term is used here in this broad sense (i.e., nonenergetic masking) to draw the important distinction between interference that occurs in the auditory periphery and interference that occurs at higher levels of auditory and cognitive processing during speech-in-speech listening.

²While energetic masking may also differ across these maskers, Van Engen (2010a) provided evidence for differences in informational masking in cross-language maskers by showing that the relative effects of English and Mandarin maskers differed across listener populations with different experiences with the two languages.

³Since the HINT test was developed from the original set of BKB sentences, some sentences appear in both the HINT and the BKB lists. To eliminate any overlap, 10 sentences from BKB lists 2 and 20 were used to replace items in the training/test lists that also appeared in HINT lists 1 and 2. Replacement sentences were selected to match the number of keywords and the basic sentence structure of the items they replaced.

⁴Female speakers were used for all targets and babble to eliminate the variable of gender differences in speech-in-speech intelligibility (Brungart, Simpson, Ericson, & Scott, 2001).

⁵In addition to providing evidence for listener adaptation to a wide range of speech signal types, perceptual learning studies have shown that the use of multiple talkers in training can be beneficial to learning. For example, Bradlow and Bent's (2008) comparison of various training conditions for promoting adaptation to foreign-accented English showed that training with multiple talkers of a given accent facilitated the intelligibility of a test talker just as much as training on only that talker. Multiple-talker training has also been shown to be particularly effective for generalized learning in studies of non-native phoneme contrast perception (e.g., Lively, Logan, & Pisoni, 1993; Logan, Lively, & Pisoni, 1991), lexical tone (Wang, Jongman, & Sereno, 2003; Wang, Spence, Jongman, & Sereno, 1999), and dialect classification (Clopper & Pisoni, 2007).

⁶This is not to say that experience cannot affect listeners' ability to cope with energetic masking during speech recognition. Indeed, many studies have shown that even highly proficient non-native speakers of a target language perform worse than monolingual, native speakers of the language on speech recognition tasks in speech-shaped noise (e.g., Cooke, Garcia Lecumberri, & Barker, 2008; Hazan & Simpson, 2000; Van Engen, 2010; Van Wijngaarden et al., 2002).

Kidd, G., Mason, C. R., Richards, V. M., Gallun, F. J. and Durlach, N. I. 2007. “Informational masking”. In Auditory perception of sound sources, Edited by: Yost, W. A., Popper, A. N. and Fay, R. R. 143–189. US: Springer.

Van Engen, K. J. 2010a. Similarity and familiarity: Second language sentence recognition in first- and second-language multi-talker babble. Speech Communication, 52: 943–953.

Brungart, D. S., Simpson, B. D., Ericson, M. A. and Scott, K. R. 2001. Informational and energetic masking effects in the perception of multiple simultaneous talkers. Journal of the Acoustical Society of America, 110(5): 2527–2538.

Lively, S. E., Logan, J. S. and Pisoni, D. B. 1993. Training Japanese listeners to identify English /r/ and /l/. II: The role of phonetic environment and talker variability in learning new perceptual categories. Journal of the Acoustical Society of America, 94(3): 1242–1255.

Logan, J. S., Lively, S. E. and Pisoni, D. B. 1991. Training Japanese listeners to identify English /r/ and /l/: A first report. Journal of the Acoustical Society of America, 89(2): 874–886.

Wang, Y., Jongman, A. and Sereno, J. A. 2003. Acoustic and perceptual evaluation of Mandarin tone productions before and after perceptual training. Journal of the Acoustical Society of America, 113(2): 1033–1043.

Wang, Y., Spence, M. M., Jongman, A. and Sereno, J. A. 1999. Training American listeners to perceive Mandarin tones. Journal of the Acoustical Society of America, 106(6): 3649–3658.

Clopper, C. G. and Pisoni, D. B. 2007. Free classification of regional dialects of American English. Journal of Phonetics, 35: 421–438.

Cooke, M., Garcia Lecumberri, M. L. and Barker, J. 2008. The foreign language cocktail party problem: Energetic and informational masking effects on non-native speech perception. Journal of the Acoustical Society of America, 123(1): 414–427.

Hazan, V. and Simpson, A. 2000. The effect of cue-enhancement on consonant intelligibility in noise: Speaker and listener effects. Language and Speech, 43(3): 273–294.

Van Wijngaarden, S., Steeneken, H. and Houtgast, T. 2002. Quantifying the intelligibility of speech in noise for non-native listeners. Journal of the Acoustical Society of America, 111: 1906–1916.
