Regular Article

Ham or hamster? Eye-tracking evidence of a clear speech benefit for word segmentation in quiet and in noise

Received 23 Apr 2023, Accepted 24 Mar 2024, Published online: 04 May 2024
 

ABSTRACT

This study examined whether intelligibility-enhancing hyperarticulated clear speaking styles improve word segmentation during real-time speech processing in quiet and in noise. English-speaking listeners heard clearly and conversationally spoken sentences in which the target (e.g. ham) was temporarily ambiguous with a competitor (e.g. hamster) across a word boundary (e.g. ham starting) while their eye fixations to target and competitor images were recorded. Relative to conversational speech, clear speech led listeners to fixate the target image over the competitor image to a greater degree, indicating facilitation of word segmentation. Such facilitation emerged in quiet and in noise even before disambiguating segmental information (e.g. /ɑ/ in starting) was available. A parallel clear speech benefit was not found when the disyllabic word (e.g. hamster) was the target. The findings suggest that improved word segmentation partly underlies the well-documented clear speech perceptual and cognitive benefits and may arise from the enhancements of multiple word boundary cues.

Acknowledgements

We are grateful to the associate editor Dr. Meredith Shafto and two anonymous reviewers for their insightful comments on the previous versions of this paper. We also would like to thank Madeline Smith, Eliana Spradling, and Madison Rider for their assistance with data collection.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 To obtain these measures, we ran the Montreal Forced Aligner (McAuliffe et al., 2017) on the stimuli and then manually corrected the phoneme boundaries in the target words if needed. When it was difficult to demarcate the boundaries in sequences such as vowels followed by the liquid /l/, we treated the sequences as single segments and excluded them from the calculation of F1 range, F2 range, and VSA.
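
As an illustration of the three vowel measures named in Note 1, the following is a minimal sketch, not the authors' procedure. It assumes that F1 and F2 range are the difference between the largest and smallest formant values, and that VSA is the area of the polygon spanned by mean (F1, F2) points of corner vowels. The formant values and the helper names `formant_range` and `polygon_area` are hypothetical.

```python
from typing import List, Tuple


def formant_range(values: List[float]) -> float:
    """Range of a formant: largest value minus smallest value (assumed definition)."""
    return max(values) - min(values)


def polygon_area(points: List[Tuple[float, float]]) -> float:
    """Area of a simple polygon via the shoelace formula.

    The points must be listed in order around the polygon's perimeter.
    """
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical mean (F1, F2) values in Hz, ordered around the vowel space.
corner_vowels = [
    (300.0, 2300.0),  # /i/
    (700.0, 1800.0),  # /ae/
    (750.0, 1100.0),  # /a/
    (350.0, 900.0),   # /u/
]

f1_range = formant_range([f1 for f1, _ in corner_vowels])
f2_range = formant_range([f2 for _, f2 in corner_vowels])
vsa = polygon_area(corner_vowels)  # in Hz^2

print(f1_range, f2_range, vsa)
```

Under these assumptions, a larger polygon area corresponds to a more expanded vowel space. Tokens excluded from the calculation (such as the vowel plus /l/ sequences mentioned in Note 1) would simply be left out of the input lists.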

2 We tested whether speaking style affects target advantage differently for the monolingual and bilingual groups. We repeated all the analyses in Section 3 using models in which Style, Group (monolingual vs. bilingual), and their interaction were included as fixed effects (by coding them as numeric predictors following van Rij et al., 2019). For all these analyses, the smooth representing the interaction effect over time suggested that the impact of speaking style did not differ significantly across the listener groups. We did not find evidence that the monolingual and bilingual participants behaved differently and thus their data were analysed together.

3 Growth Curve Analysis (GCA: Mirman, 2014; Mirman et al., 2008) has also been used to analyse fixation data (e.g. Tremblay, Broersma, et al., 2018; Tremblay et al., 2021). One advantage of GAMMs over GCA lies in their ability to model highly non-linear or wiggly trends (see e.g. Winter & Wieling, 2016, for a discussion). A flip side of this advantage, however, is that the model may contain many parameters and overfit the data. To assess overfitting, we re-ran all the GAMMs described in Section 3 following a 5-fold cross-validation procedure, whereby we fitted the GAMM to the target advantage data from 80% of the trials, used it to predict target advantage values in the remaining 20%, and repeated the process to obtain predictions for all the trials. The predictions from the cross-validation were compared against the predictions by the model trained on the full dataset, using a GAMM that contained the same effect structure as those in Section 3 but with Style replaced by a factor contrasting the two groups of predictions. This factor had no significant effect for any analyses in Section 3, suggesting that the GAMMs reported in the main text should generalise to never-before-seen data and that overfitting might not be an issue here.
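
The trial-wise 5-fold procedure in Note 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis. A cubic polynomial fitted with NumPy stands in for the GAMM, the data are synthetic, and the function names `kfold_indices` and `cross_validated_predictions` are hypothetical. The final comparison is a simple mean absolute difference rather than the GAMM-based comparison described in the note.

```python
import numpy as np


def kfold_indices(n_trials, k=5, seed=0):
    """Shuffle trial positions and split them into k roughly equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_trials), k)


def cross_validated_predictions(time, y, trial_ids, k=5, degree=3, seed=0):
    """Predict every observation from a model that never saw its trial.

    A polynomial in time is a stand-in for the GAMM (an assumption).
    """
    trials = np.unique(trial_ids)
    preds = np.full(y.shape, np.nan)
    for fold in kfold_indices(len(trials), k, seed):
        held_out = np.isin(trial_ids, trials[fold])
        # Fit on the remaining ~80% of trials, predict the held-out ~20%.
        coefs = np.polyfit(time[~held_out], y[~held_out], degree)
        preds[held_out] = np.polyval(coefs, time[held_out])
    return preds


# Synthetic data: 50 trials x 20 time points (time scaled to [0, 1]).
rng = np.random.default_rng(1)
n_trials, n_points = 50, 20
trial_ids = np.repeat(np.arange(n_trials), n_points)
time = np.tile(np.linspace(0.0, 1.0, n_points), n_trials)
y = np.sin(2.0 * time) + rng.normal(0.0, 0.1, size=time.size)

cv_preds = cross_validated_predictions(time, y, trial_ids)
full_preds = np.polyval(np.polyfit(time, y, 3), time)

# Small values suggest the full-data model is not overfitting.
mean_abs_diff = float(np.mean(np.abs(cv_preds - full_preds)))
print(mean_abs_diff)
```

Splitting by trial rather than by individual sample keeps all time points of a trial in the same fold, so that held-out predictions are made for trials the model has not seen.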

4 For by-item results (graphs summarising speaking style difference in target advantage over the analysis window for each sentence), see the OSF repository of this paper.

5 The delayed benefit, however, may be unexpected considering the noise masker, which was the same for all the stimuli but had a spectral energy distribution overlapping more with that of the conversational stimuli, especially in the 1–3 kHz range (see ). The overlap might disproportionately affect the conversational style and, in fact, this seems to be what we found for the accuracy results of the MT trials (). Such a pattern may lead one to expect an earlier or larger clear speech processing benefit in noise than in quiet. Yet, our fixation data did not support this possibility. Rather, given the findings, it is possible that the benefit would further decrease or vanish if the noise were adaptively shaped to match the LTAS of the specific style or stimulus.
