ABSTRACT
Speech and music emerge from a spectrum of nested motor and perceptual coordination patterns, across timescales ranging from brief movements to extended actions. Intuitively, this nested clustering in movement should be reflected in sound. We examined similarities and differences in the multimodal, multiscale coordination of speech and music using two complementary measures. First, we computed power spectra of the amplitude envelopes of sound and movement and correlated spectral power across the two modalities as a function of frequency. Second, we correlated the smoothed envelopes directly and examined the peaks in their cross-correlation functions. We analyzed YouTube videos of five different modes of speaking and five different types of music. Speech performances yielded stronger, more reliable relationships between sound and movement than music performances did. Interestingly, a cappella singing patterned more with music, whereas improvisational jazz piano patterned more with speech. These results suggest that the nested temporal structures of sound and movement are coordinated as a function of the communicative aspects of a performance.
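The two measures summarized above can be sketched in a few lines of Python. The sketch below is illustrative only and is not the authors' published pipeline (see the repository linked under Data availability for the actual scripts): the Hilbert-transform envelopes, Welch spectra, 10 s analysis windows, smoothing spans, lag range, and the assumption that audio and motion signals have been resampled to a common rate fs are all choices made here for concreteness.

```python
import numpy as np
from scipy.ndimage import uniform_filter1d
from scipy.signal import correlate, correlation_lags, hilbert, welch
from scipy.stats import pearsonr


def amplitude_envelope(x, smooth_n=1):
    """Magnitude of the analytic signal, optionally smoothed by a moving average."""
    env = np.abs(hilbert(x))
    return uniform_filter1d(env, smooth_n) if smooth_n > 1 else env


def windowed_envelope_spectra(x, fs, win_s=10.0, nperseg=1024):
    """Welch power spectra of the amplitude envelope in consecutive windows."""
    n = int(win_s * fs)
    nperseg = min(nperseg, n)
    rows = []
    for start in range(0, len(x) - n + 1, n):
        freqs, power = welch(amplitude_envelope(x[start:start + n]),
                             fs=fs, nperseg=nperseg)
        rows.append(power)
    return freqs, np.asarray(rows)  # shape: (n_windows, n_freqs)


def crossmodal_power_correlation(audio, motion, fs, win_s=10.0):
    """Measure 1: Pearson r between the log envelope powers of the two
    modalities, computed across windows separately at each frequency."""
    freqs, pa = windowed_envelope_spectra(audio, fs, win_s)
    _, pm = windowed_envelope_spectra(motion, fs, win_s)
    r = np.array([pearsonr(np.log(pa[:, k] + 1e-12),
                           np.log(pm[:, k] + 1e-12))[0]
                  for k in range(len(freqs))])
    return freqs, r


def envelope_xcorr_peak(audio, motion, fs, smooth_s=0.2, max_lag_s=2.0):
    """Measure 2: normalized cross-correlation of the smoothed, z-scored
    envelopes; returns the peak value and its lag in seconds."""
    n = max(1, int(smooth_s * fs))
    a = amplitude_envelope(audio, n)
    m = amplitude_envelope(motion, n)
    a = (a - a.mean()) / a.std()
    m = (m - m.mean()) / m.std()
    cc = correlate(a, m, mode="full") / len(a)
    lags = correlation_lags(len(a), len(m), mode="full") / fs
    keep = np.abs(lags) <= max_lag_s
    k = np.argmax(cc[keep])
    return cc[keep][k], lags[keep][k]
```

Under these assumptions, crossmodal_power_correlation returns a correlation profile over frequency, and envelope_xcorr_peak returns the height and timing of the strongest sound-movement alignment within the chosen lag window.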
Disclosure statement
The authors declare no conflicts of interest.
Data availability
The datasets and scripts used in the analyses are available in the GitHub repository: https://github.com/camialviar/AVCoordMusicSpeech.