ABSTRACT
We examined the relationship between tolerance for audiovisual onset asynchrony (AVOA) and the spectrotemporal fidelity of spoken words and of the speaker's mouth movements. In two experiments that differed only in the temporal order of the sensory modalities, with visual speech leading (Exp. 1) or lagging (Exp. 2) the acoustic speech, participants watched intact and blurred videos of a speaker uttering trisyllabic words and nonwords that were noise vocoded with 4, 8, 16, and 32 channels. They judged whether the speaker's mouth movements and the speech sounds were in sync or out of sync. Participants perceived synchrony (i.e., tolerated AVOA) on more trials when the acoustic speech was more speech-like (8 channels and higher vs. 4 channels) and when the visual speech was intact rather than blurred (Exp. 1 only). These findings suggest that greater spectrotemporal fidelity of the audiovisual (AV) signal prompts the brain to widen the temporal window of integration, promoting fusion of temporally distant AV percepts.
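The noise-vocoding manipulation described above can be sketched in code. The following is a generic, illustrative channel vocoder, not the authors' original MATLAB scripts: the log-spaced band edges (100-8000 Hz), FFT-mask filtering, rectification, and ~10 ms moving-average envelope smoothing are all assumptions made for this sketch.

```python
import numpy as np

def noise_vocode(signal, fs, n_channels, f_lo=100.0, f_hi=8000.0):
    """Toy n-channel noise vocoder (illustrative only).

    Splits the signal into log-spaced frequency bands via FFT masks,
    extracts each band's amplitude envelope, and uses it to modulate
    band-limited noise, then sums the channels.
    """
    n = len(signal)
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)  # log-spaced band edges
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(n)
    sig_f = np.fft.rfft(signal)
    noise_f = np.fft.rfft(noise)
    out = np.zeros(n)
    w = max(1, int(0.01 * fs))  # ~10 ms smoothing window (assumed)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        band = np.fft.irfft(sig_f * mask, n)
        env = np.abs(band)  # crude envelope via rectification
        env = np.convolve(env, np.ones(w) / w, mode="same")  # smooth
        carrier = np.fft.irfft(noise_f * mask, n)  # band-limited noise
        out += env * carrier
    return out

# Example: vocode a 440 Hz tone with 4 channels
fs = 16000
t = np.arange(0, 0.2, 1.0 / fs)
tone = np.sin(2 * np.pi * 440 * t)
vocoded = noise_vocode(tone, fs, n_channels=4)
```

Fewer channels preserve less spectral detail, which is why 4-channel speech sounds less speech-like than 8-, 16-, or 32-channel speech.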
Acknowledgements
We thank Drs. Mark Pitt and Kristina Backer for their insights regarding the experimental design. We thank Dr. Lee Miller for providing the original videos. We thank Dr. Frederic Apoux for sharing the original vocoding Matlab scripts. To access the original individual data, please go to https://figshare.com/articles/Tolerance_for_Audiovisual_Onset_Asynchrony/4579540.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes
1. The blur kernel size followed the default heuristic implemented in Matlab R2015a: hsize = 2 × ceil(2 × sigma) + 1, so that visual fidelity decreased monotonically as sigma increased.
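The kernel-size heuristic in Note 1 can be written out directly. Below is a hedged Python translation of that MATLAB expression; the function name is ours, and only the formula itself comes from the note.

```python
import math

def default_kernel_size(sigma: float) -> int:
    """MATLAB R2015a default Gaussian kernel size: hsize = 2*ceil(2*sigma) + 1.

    Larger sigma yields a wider blur kernel, hence lower visual fidelity.
    """
    return 2 * math.ceil(2 * sigma) + 1

# A few example sizes: sigma = 0.5 -> 3, sigma = 1 -> 5, sigma = 2 -> 9
sizes = {s: default_kernel_size(s) for s in (0.5, 1, 2, 4)}
```

The formula guarantees an odd kernel size (so the kernel has a center pixel) that grows monotonically with sigma.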