ABSTRACT
While perceptual categories exhibit plasticity following recently heard speech, evidence of effects on production has been mixed. We tested the influences of perceptual plasticity on production with an implicit distributional learning paradigm. In Experiment 1, we exposed participants to an unlabelled bimodal distribution of voice onset time (VOT) using bilabial stop consonants, with a longer category boundary than is typical. Participants’ perceptual category boundaries shifted towards longer VOT, with a congruent increase in production VOT. Experiment 2 found evidence that these shifts transferred in perception to a different speaker and different syllables, and in production to different words. Experiment 3 showed no shifts following exposure to a VOT boundary shorter than typical. We conclude that when listeners adjust their perceptual category boundaries, these changes may affect production categories, consistent with models where speech perception and production categories are linked, but with category boundaries influencing the link between perception and production.
Acknowledgements
We gratefully acknowledge the assistance of Sunil Naran, Claire Blackmore, Elaine Tham, Henness Wong and Mila Baer with data collection and coding, and Arty Samuel and Rachel M. Theodore for their comments.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Notes
1 We group exemplar and parametric models by describing them as distributional theories following Smits et al., as the predictions of these models have been difficult to distinguish empirically (Smits et al., 2006), with Smits et al. describing the key difference as whether distributions are modelled non-parametrically (in exemplar models) or parametrically. It has also been suggested that exemplar models may be a way to implement parametric models (Shi et al., 2010).
2 VOT is a temporal cue to voicing in word-initial position in English, which we focus on in this paper. It measures the latency between the release burst of stop consonants and the onset of voicing, characterised by glottal pulses that lead to periodic high-amplitude waveforms associated with a subsequent vowel. In English the bilabial VOT contrast consists of two overlapping clusters of short-lag VOT voiced stops (/b/), where burst release and voicing occur closely in time, and long-lag VOT voiceless stops (/p/), where burst release and voicing are more separated in time.
3 Although the design of Experiment 1 was not originally intended to test generalisation, a reviewer noted the similarities between the items (labial stops with similar vowels). At the reviewer's suggestion, we also ran separate models for the four item pairs that were not heard in the exposure phase of Experiment 1; these pairs differed between the pre- and post-exposure tasks and were originally intended as fillers (beer/peer & bees/peas pre-exposure, beep/peep & beast/pieced post-exposure). Here we also found that VOT for productions of /b/ words was longer post-exposure (M = 15.1 ms, SD = 6.8 ms) than pre-exposure (M = 13.5 ms, SD = 5.8 ms; b = 1.61, SE = 0.80, z = 2.01, p = .046). However, the significant increase in VOT seen for the repeated items in Experiment 1 was not seen for the non-repeated /p/ productions when comparing pre-exposure (M = 67.5 ms, SD = 14.0 ms) and post-exposure (M = 67.7 ms, SD = 14.8 ms; b = 1.83, SE = 2.52, z = 0.73, p = .47).
4 Across experiments, we found a significant correlation between voiced and voiceless productions, r(70) = .34, p < .01.