Speech Recognition in Adverse Conditions

The effect of energetic and informational masking on the time-course of stream segregation: Evidence that streaming depends on vocal fine structure cues

Pages 1056-1088 | Received 09 Nov 2010, Accepted 18 May 2011, Published online: 25 Oct 2011
 

Abstract

To examine the effect of energetic and informational masking on the time-course of stream segregation, we presented listeners with semantically anomalous but syntactically correct target sentences (e.g., “A house should dash to the bowl”) that were masked by a two-talker speech masker or a steady-state noise masker. To determine the effect of each masker on the time-course of stream segregation, we measured performance as a function of keyword position (keywords in italics). The results from Experiment 1 showed that performance improved as a function of keyword position under speech masking, but was relatively stable across keyword positions under noise masking. The results of subsequent experiments showed that the variation in performance across keywords under speech masking was primarily due to the vocal similarities between the competing talkers, and that interference from the semantic content of the masker played a secondary role in undermining performance. Taken together, these results indicate that stream segregation takes longer to build up when a speech target is masked by other speech in the absence of cues that aid stream segregation (e.g., spatial separation), but that it takes little time to build up when a speech target is masked by noise or when cues that aid stream segregation are available to listeners.

Acknowledgments

This research was supported by the Canadian Institutes of Health Research [MOP-15359] [STP-53875], the Natural Sciences and Engineering Research Council of Canada [RGPIN 138472 and RGPIN 9956], and the National Natural Science Foundation of China [30711120563].

Notes

1. Previous studies have indicated that robust effects using these stimuli in this informational masking paradigm can be obtained with as few as 12–16 participants (e.g., Li, Daneman, Qi, & Schneider, 2004, 12 subjects; Freyman, Balakrishnan, & Helfer, 2004, 16 subjects per condition in the priming conditions). Hence, we were confident that 16 participants would be sufficient to explore the effects of word position in these experiments.

2. The repetitious nature of the speech masker may have allowed participants to become familiar with it over the course of the experiment. Consequently, performance might have improved over time because of this familiarity.

3. This spectral difference between the two types of maskers indicates that the amount of energetic masking they produced may have differed.

4. y represents the probability of correctly identifying a keyword, x is the SNR in dB, µ represents the dB SNR level corresponding to 50%-correct performance, and σ determines the slope of the fitted function.

5. Keyword 1 vs. Keyword 2, mean difference = 0.469 dB, t(30) = 2.713, p > .05; Keyword 1 vs. Keyword 3, mean difference = 0.063 dB, t(30) = 0.365, p > .05; Keyword 2 vs. Keyword 3, mean difference = 0.406 dB, t(30) = 2.350, p > .05.
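
Note 4 names the parameters of the fitted psychometric function, but the function itself does not appear on this page. As a minimal sketch, assuming a logistic form y = 1 / (1 + exp(-σ(x - µ))), the relation between SNR and keyword identification could be computed as below. The logistic form is an assumption: it is consistent with µ being the 50%-correct point and σ setting the slope, but it is not confirmed here. The function names and the example values of µ and σ are hypothetical and are not results from the study.

```python
import math


def psychometric(x_db, mu, sigma):
    """Probability of correctly identifying a keyword at SNR x_db (dB).

    Assumes a logistic psychometric function; the actual fitted form
    is not given on this page.
    """
    return 1.0 / (1.0 + math.exp(-sigma * (x_db - mu)))


def snr_for_proportion(p, mu, sigma):
    """Invert the assumed logistic: SNR (dB) at which proportion correct is p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return mu + math.log(p / (1.0 - p)) / sigma


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    mu, sigma = -4.0, 0.5
    for snr in (-12, -8, -4, 0, 4):
        print(f"SNR {snr:+d} dB -> P(correct) = {psychometric(snr, mu, sigma):.3f}")
    print("SNR at 50% correct:", snr_for_proportion(0.5, mu, sigma))
```

Under this assumed form, a threshold comparison such as the one reported in Note 5 amounts to comparing the fitted µ values across keyword positions.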
