Speech Recognition in Adverse Conditions

The effect of energetic and informational masking on the time-course of stream segregation: Evidence that streaming depends on vocal fine structure cues

Pages 1056-1088 | Received 09 Nov 2010, Accepted 18 May 2011, Published online: 25 Oct 2011
 

Abstract

To examine the effect of energetic and informational masking on the time-course of stream segregation, we presented listeners with semantically anomalous but syntactically correct target sentences (e.g., “A house should dash to the bowl”) that were masked by either a two-talker speech masker or a steady-state noise masker. To determine the effect of each masker on the time-course of stream segregation, we measured performance as a function of keyword position (keywords in italics). The results from Experiment 1 showed that performance improved as a function of keyword position under speech masking, but was relatively stable across keyword positions under noise masking. The results of subsequent experiments showed that the variation in performance across keywords under speech masking was primarily due to the vocal similarities between the competing talkers, and that interference from the semantic content of the masker played a secondary role in undermining performance. Taken together, these results indicate that stream segregation takes longer to build up when a speech target is masked by other speech in the absence of cues that aid stream segregation (e.g., spatial separation), but that it takes little time to build up when a speech target is masked by noise or when cues that aid stream segregation are available to listeners.

Acknowledgments

This research was supported by the Canadian Institutes of Health Research [MOP-15359] [STP-53875], Natural Sciences and Engineering Research Council of Canada [RGPIN 138472 and RGPIN 9956] and the National Natural Science Foundation of China [30711120563].

Notes

1. Previous studies have indicated that robust effects using these stimuli in this informational masking paradigm can be obtained with as few as 12–16 participants (e.g., Li, Daneman, Qi, & Schneider, 2004, 12 subjects; Freyman, Balakrishnan, & Helfer, 2004, 16 subjects per condition in the priming conditions). Hence, we were confident that 16 participants would be sufficient to explore the effects of word position in these experiments.

2. The repetitious nature of the speech masker may have allowed participants to become familiar with it over the course of the experiment. Consequently, performance might have improved over time as a result of this familiarity.

3. This spectral difference between the two types of maskers indicates that the amount of energetic masking they produced may have differed.

4. y represents the probability of correctly identifying a keyword, x is the SNR in dB, µ represents the SNR (in dB) corresponding to 50%-correct performance, and σ determines the slope of the fitted function.

5. Keyword 1 vs. Keyword 2, mean difference = 0.469 dB, t(30) = 2.713, p > .05; Keyword 1 vs. Keyword 3, mean difference = 0.063 dB, t(30) = 0.365, p > .05; Keyword 2 vs. Keyword 3, mean difference = 0.406 dB, t(30) = 2.350, p > .05.
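
The symbols defined in Note 4 describe a psychometric function relating keyword identification to SNR. The fitted equation itself is not reproduced in this excerpt, so the following is only a minimal sketch that assumes a logistic form, y = 1 / (1 + exp(-σ(x - µ))), which matches the roles given to µ (the 50%-correct point) and σ (the slope). The function names and parameter values below are hypothetical and are not taken from the article.

```python
import math


def psychometric(snr_db, mu, sigma):
    """Probability of correctly identifying a keyword at a given SNR (dB).

    ASSUMPTION: a logistic form is used here for illustration; the
    article's actual fitted equation is not shown in this excerpt.
    """
    return 1.0 / (1.0 + math.exp(-sigma * (snr_db - mu)))


def snr_for_probability(p, mu, sigma):
    """Invert the assumed logistic function: SNR (dB) at which P(correct) = p."""
    return mu + math.log(p / (1.0 - p)) / sigma


# Hypothetical parameter values, chosen only to exercise the functions.
example_mu = -4.0    # SNR (dB) at 50%-correct performance
example_sigma = 0.5  # slope parameter
example_threshold = snr_for_probability(0.5, example_mu, example_sigma)
```

Under this assumed form, the SNR returned for p = 0.5 equals µ by construction, which is why µ can be read directly as the 50%-correct threshold when comparing keyword positions.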
