Review Articles

Cortical entrainment: what we can learn from studying naturalistic speech perception

Pages 681-693 | Received 20 Feb 2018, Accepted 15 Aug 2018, Published online: 13 Sep 2018

References

  • Ahissar, E., Nagarajan, S., Ahissar, M., Protopapas, A., Mahncke, H., & Merzenich, M. M. (2001). Speech comprehension is correlated with temporal response patterns recorded from auditory cortex. Proceedings of the National Academy of Sciences of the United States of America, 98(23), 13367–13372.
  • Alexandrou, A. M., Saarinen, T., Kujala, J., & Salmelin, R. (2016). A multimodal spectral approach to characterize rhythm in natural speech. The Journal of the Acoustical Society of America, 139(1), 215–226.
  • Alexandrou, A. M., Saarinen, T., Kujala, J., & Salmelin, R. (2018). Cortical tracking of global and local variations of speech rhythm during connected natural speech perception. Journal of Cognitive Neuroscience. https://doi.org/10.1162/jocn_a_01295
  • Alexandrou, A. M., Saarinen, T., Mäkelä, S., Kujala, J., & Salmelin, R. (2017). The right hemisphere is highlighted in connected natural speech production and perception. NeuroImage, 152, 628–638.
  • Arnal, L. H., & Giraud, A.-L. (2012). Cortical oscillations and sensory predictions. Trends in Cognitive Sciences, 16(7), 390–398.
  • Assaneo, M. F., Sitt, J., Varoquaux, G., Sigman, M., Cohen, L., & Trevisan, M. A. (2016). Exploring the anatomical encoding of voice with a mathematical model of the vocal system. NeuroImage, 141, 31–39.
  • Bastos, A. M., & Schoffelen, J.-M. (2016). A tutorial review of functional connectivity analysis methods and their interpretational pitfalls. Frontiers in Systems Neuroscience, 9, 1–23.
  • Blaauw, E. (1994). The contribution of prosodic boundary markers to the perceptual difference between read and spontaneous speech. Speech Communication, 14(4), 359–375.
  • Borges, A. F. T., Giraud, A.-L., Mansvelder, H. D., & Linkenkaer-Hansen, K. (2018). Scale-free amplitude modulation of neuronal oscillations tracks comprehension of accelerated speech. The Journal of Neuroscience, 38(3), 710–722.
  • Bourguignon, M., De Tiege, X., de Beeck, M. O., Ligot, N., Paquier, P., Van Bogaert, P., … Jousmäki, V. (2013). The pace of prosodic phrasing couples the listener's cortex to the reader's voice. Human Brain Mapping, 34(2), 314–326.
  • Brennan, S. E., & Schober, M. F. (2001). How listeners compensate for disfluencies in spontaneous speech. Journal of Memory and Language, 44(2), 274–296.
  • Calderone, D. J., Lakatos, P., Butler, P. D., & Castellanos, F. X. (2014). Entrainment of neural oscillations as a modifiable substrate of attention. Trends in Cognitive Sciences, 18(6), 300–309.
  • Canolty, R. T., Edwards, E., Dalal, S. S., Soltani, M., Nagarajan, S. S., Kirsch, H. E., … Knight, R. T. (2006). High gamma power is phase-locked to theta oscillations in human neocortex. Science, 313(5793), 1626–1628.
  • Chandrasekaran, C., Trubanova, A., Stillittano, S., Caplier, A., & Ghazanfar, A. A. (2009). The natural statistics of audiovisual speech. PLoS Computational Biology, 5(7), 1–18.
  • Chawla, P., & Krauss, R. M. (1994). Gesture and speech in spontaneous and rehearsed narratives. Journal of Experimental Social Psychology, 30(6), 580–601.
  • Clark, H. H., & Wasow, T. (1998). Repeating words in spontaneous speech. Cognitive Psychology, 37(3), 201–242.
  • Cooke, M., Barker, J., Cunningham, S., & Shao, X. (2006). An audio-visual corpus for speech perception and automatic speech recognition. The Journal of the Acoustical Society of America, 120(5), 2421–2424.
  • Crystal, T. H., & House, A. S. (1982). Segmental durations in connected speech signals: Preliminary results. The Journal of the Acoustical Society of America, 72(3), 705–716.
  • Cummins, F. (2012a). Looking for rhythm in speech. Empirical Musicology Review, 7(1–2), 28–35.
  • Cummins, F. (2012b). Oscillators and syllables: A cautionary note. Frontiers in Psychology, 3, 1–2.
  • Cummins, F., & Port, R. (1998). Rhythmic constraints on stress timing in English. Journal of Phonetics, 26(2), 145–171.
  • Debener, S., Herrmann, C. S., Kranczioch, C., Gembris, D., & Engel, A. K. (2003). Top-down attentional processing enhances auditory evoked gamma band activity. Neuroreport, 14(5), 683–686.
  • Di Liberto, G. M., O’Sullivan, J. A., & Lalor, E. C. (2015). Low-frequency cortical entrainment to speech reflects phoneme-level processing. Current Biology, 25(19), 2457–2465.
  • Dilley, L. C., & Pitt, M. A. (2010). Altering context speech rate can cause words to appear or disappear. Psychological Science, 21(11), 1664–1670.
  • Ding, N., Chatterjee, M., & Simon, J. Z. (2014). Robust cortical entrainment to the speech envelope relies on the spectro-temporal fine structure. NeuroImage, 88, 41–46.
  • Ding, N., Melloni, L., Zhang, H., Tian, X., & Poeppel, D. (2016). Cortical tracking of hierarchical linguistic structures in connected speech. Nature Neuroscience, 19(1), 158–164.
  • Ding, N., Patel, A. D., Chen, L., Butler, H., Luo, C., & Poeppel, D. (2017). Temporal modulations in speech and music. Neuroscience & Biobehavioral Reviews, 81(B), 181–187.
  • Ding, N., & Simon, J. Z. (2012). Neural coding of continuous speech in auditory cortex during monaural and dichotic listening. Journal of Neurophysiology, 107(1), 78–89.
  • Ding, N., & Simon, J. Z. (2014). Cortical entrainment to continuous speech: Functional roles and interpretations. Frontiers in Human Neuroscience, 8, 311–317.
  • Doelling, K. B., Arnal, L. H., Ghitza, O., & Poeppel, D. (2014). Acoustic landmarks drive delta–theta oscillations to enable speech comprehension by facilitating perceptual parsing. NeuroImage, 85, 761–768.
  • Engel, A. K., Fries, P., & Singer, W. (2001). Dynamic predictions: Oscillations and synchrony in top-down processing. Nature Reviews Neuroscience, 2(10), 704–716.
  • Ferreira, F. (1991). Effects of length and syntactic complexity on initiation times for prepared utterances. Journal of Memory and Language, 30(2), 210–233.
  • Ferreira, F., & Swets, B. (2002). How incremental is language production? Evidence from the production of utterances requiring the computation of arithmetic sums. Journal of Memory and Language, 46(1), 57–84.
  • Finke, M., & Rogina, I. (1997). Wide context acoustic modeling in read vs. spontaneous speech. In 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing (Vol. 3, pp. 1743–1746). Los Alamitos: IEEE Computer Society Press.
  • Ghazanfar, A. A., Morrill, R. J., & Kayser, C. (2013). Monkeys are perceptually tuned to facial expressions that exhibit a theta-like speech rhythm. Proceedings of the National Academy of Sciences of the United States of America, 110(5), 1959–1963.
  • Ghitza, O. (2011). Linking speech perception and neurophysiology: Speech decoding guided by cascaded oscillators locked to the input rhythm. Frontiers in Psychology, 2, 1–13.
  • Ghitza, O. (2012). On the role of theta-driven syllabic parsing in decoding speech: Intelligibility of speech with a manipulated modulation spectrum. Frontiers in Psychology, 3, 1–12.
  • Ghitza, O. (2013). The theta-syllable: A unit of speech information defined by cortical function. Frontiers in Psychology, 4, 1–5.
  • Ghitza, O., & Greenberg, S. (2009). On the possible role of brain rhythms in speech perception: Intelligibility of time-compressed speech with periodic and aperiodic insertions of silence. Phonetica, 66(1–2), 113–126.
  • Giordano, B. L., Ince, R. A. A., Gross, J., Schyns, P. G., Panzeri, S., & Kayser, C. (2017). Contributions of local speech encoding and functional connectivity to audio-visual speech perception. eLife, 6, e24763.
  • Giraud, A.-L., Kleinschmidt, A., Poeppel, D., Lund, T. E., Frackowiak, R. S. J., & Laufs, H. (2007). Endogenous cortical rhythms determine cerebral specialization for speech perception and production. Neuron, 56(6), 1127–1134.
  • Giraud, A.-L., & Poeppel, D. (2012). Cortical oscillations and speech processing: Emerging computational principles and operations. Nature Neuroscience, 15(4), 511–517.
  • Goswami, U., & Leong, V. (2013). Speech rhythm and temporal structure: Converging perspectives? Laboratory Phonology, 4(1), 67–92.
  • Gross, J., Hoogenboom, N., Thut, G., Schyns, P., Panzeri, S., Belin, P., & Garrod, S. (2013). Speech rhythms and multiplexed oscillatory sensory coding in the human brain. PLoS Biology, 11(12), 1–14.
  • Haegens, S., & Zion Golumbic, E. (2018). Rhythmic facilitation of sensory processing: A critical review. Neuroscience & Biobehavioral Reviews, 86, 150–165.
  • Henry, M. J., & Obleser, J. (2012). Frequency modulation entrains slow neural oscillations and optimizes human listening behavior. Proceedings of the National Academy of Sciences of the United States of America, 109(49), 20095–20100.
  • Hertrich, I., Dietrich, S., & Ackermann, H. (2013). Tracking the speech signal – time-locked MEG signals during perception of ultra-fast and moderately fast speech in blind and in sighted listeners. Brain and Language, 124(1), 9–21.
  • Hertrich, I., Dietrich, S., Trouvain, J., Moos, A., & Ackermann, H. (2012). Magnetic brain activity phase-locked to the envelope, the syllable onsets, and the fundamental frequency of a perceived speech signal. Psychophysiology, 49(3), 322–334.
  • Hickok, G., Farahbod, H., & Saberi, K. (2015). The rhythm of perception: Entrainment to acoustic rhythms induces subsequent perceptual oscillation. Psychological Science, 26(7), 1006–1013.
  • Hirose, K., & Kawanami, H. (2002). Temporal rate change of dialogue speech in prosodic units as compared to read speech. Speech Communication, 36(1), 97–111.
  • Howard, M. F., & Poeppel, D. (2010). Discrimination of speech stimuli based on neuronal response phase patterns depends on acoustics but not comprehension. Journal of Neurophysiology, 104(5), 2500–2511.
  • Iversen, J. R., Repp, B. H., & Patel, A. D. (2009). Top-down control of rhythm perception modulates early auditory responses. Annals of the New York Academy of Sciences, 1169(1), 58–73.
  • Jensen, O., Kaiser, J., & Lachaux, J.-P. (2007). Human gamma-frequency oscillations associated with attention and memory. Trends in Neurosciences, 30(7), 317–324.
  • Kayser, S. J., Ince, R. A., Gross, J., & Kayser, C. (2015). Irregular speech rate dissociates auditory cortical entrainment, evoked responses, and frontal alpha. The Journal of Neuroscience, 35(44), 14691–14701.
  • Keitel, A., & Gross, J. (2016). Individual human brain areas can be identified from their characteristic spectral activation fingerprints. PLoS Biology, 14(6), e1002498.
  • Keitel, A., Ince, R. A., Gross, J., & Kayser, C. (2017). Auditory cortical delta-entrainment interacts with oscillatory power in multiple fronto-parietal networks. NeuroImage, 147, 32–42.
  • Kösem, A., Basirat, A., Azizi, L., & van Wassenhove, V. (2016). High-frequency neural activity predicts word parsing in ambiguous speech streams. Journal of Neurophysiology, 116(6), 2497–2512.
  • Lachaux, J. P., Rodriguez, E., Martinerie, J., & Varela, F. J. (1999). Measuring phase synchrony in brain signals. Human Brain Mapping, 8(4), 194–208.
  • Lakatos, P., Karmos, G., Mehta, A. D., Ulbert, I., & Schroeder, C. E. (2008). Entrainment of neuronal oscillations as a mechanism of attentional selection. Science, 320(5872), 110–113. https://doi.org/10.1126/science.1154735
  • Lakatos, P., Shah, A. S., Knuth, K. H., Ulbert, I., Karmos, G., & Schroeder, C. E. (2005). An oscillatory hierarchy controlling neuronal excitability and stimulus processing in the auditory cortex. Journal of Neurophysiology, 94(3), 1904–1911.
  • Luo, H., Boemio, A., Gordon, M., & Poeppel, D. (2007). The perception of FM sweeps by Chinese and English listeners. Hearing Research, 224(1–2), 75–83.
  • Luo, H., Liu, Z., & Poeppel, D. (2010). Auditory cortex tracks both auditory and visual stimulus dynamics using low-frequency neuronal phase modulation. PLoS Biology, 8(8), 1–13.
  • Luo, H., & Poeppel, D. (2007). Phase patterns of neuronal responses reliably discriminate speech in human auditory cortex. Neuron, 54(6), 1001–1010.
  • Mai, G., Minett, J. W., & Wang, W. S. Y. (2016). Delta, theta, beta, and gamma brain oscillations index levels of auditory sentence processing. NeuroImage, 133, 516–528.
  • Makeig, S., Westerfield, M., Jung, T.-P., Enghoff, S., Townsend, J., Courchesne, E., & Sejnowski, T. J. (2002). Dynamic brain sources of visual evoked responses. Science, 295(5555), 690–694.
  • Mäkelä, J. P., Ahonen, A., Hämäläinen, M., Hari, R., Ilmoniemi, R., Kajola, M., … Salmelin, R. (1993). Functional differences between auditory cortices of the two hemispheres revealed by whole-head neuromagnetic recordings. Human Brain Mapping, 1(1), 48–56.
  • Mathewson, K. E., Gratton, G., Fabiani, M., Beck, D. M., & Ro, T. (2009). To see or not to see: Prestimulus α phase predicts visual awareness. The Journal of Neuroscience, 29(9), 2725–2732.
  • Meyer, L. (2017). The neural oscillations of speech processing and language comprehension: State of the art and emerging mechanisms. European Journal of Neuroscience, 1–13. https://doi.org/10.1111/ejn.13748
  • Meyer, L., Henry, M. J., Gaston, P., Schmuck, N., & Friederici, A. D. (2017). Linguistic bias modulates interpretation of speech via neural delta-band oscillations. Cerebral Cortex, 27(9), 4293–4302.
  • Meyer, L., Sun, Y., & Martin, A. E. (2018). Entrainment in disguise: The exogenous and endogenous cortical rhythms of speech and language processing. PsyArXiv.
  • Miller, J. L., Grosjean, F., & Lomanto, C. (1984). Articulation rate and its variability in spontaneous speech: A reanalysis and some implications. Phonetica, 41(4), 215–225.
  • Millman, R. E., Johnson, S. R., & Prendergast, G. (2015). The role of phase-locking to the temporal envelope of speech in auditory perception and speech intelligibility. Journal of Cognitive Neuroscience, 27(3), 533–545.
  • Morillon, B., & Schroeder, C. E. (2015). Neuronal oscillations as a mechanistic substrate of auditory temporal prediction. Annals of the New York Academy of Sciences, 1337(1), 26–31.
  • Nakajima, S., & Allen, J. F. (1993). A study on prosody and discourse structure in cooperative dialogues. Phonetica, 50(3), 197–210.
  • Neuling, T., Rach, S., Wagner, S., Wolters, C. H., & Herrmann, C. S. (2012). Good vibrations: Oscillatory phase shapes perception. NeuroImage, 63(2), 771–778.
  • Nolan, F., & Jeon, H.-S. (2014). Speech rhythm: A metaphor? Philosophical Transactions of the Royal Society B: Biological Sciences, 369(1658), 20130396. https://doi.org/10.1098/rstb.2013.0396
  • Nourski, K. V., Reale, R. A., Oya, H., Kawasaki, H., Kovach, C. K., Chen, H., … Brugge, J. F. (2009). Temporal envelope of time-compressed speech represented in the human auditory cortex. Journal of Neuroscience, 29(49), 15564–15574.
  • Park, H., Ince, R. A., Schyns, P. G., Thut, G., & Gross, J. (2015). Frontal top-down signals increase coupling of auditory low-frequency oscillations to continuous speech in human listeners. Current Biology, 25(12), 1649–1653.
  • Payton, K. L., Uchanski, R. M., & Braida, L. D. (1994). Intelligibility of conversational and clear speech in noise and reverberation for listeners with normal and impaired hearing. The Journal of the Acoustical Society of America, 95(3), 1581–1592.
  • Peelle, J. E. (2018). Speech comprehension: Stimulating discussions at a cocktail party. Current Biology, 28(2), R68–R70.
  • Peelle, J. E., & Davis, M. H. (2012). Neural oscillations carry speech rhythm through to comprehension. Frontiers in Psychology, 3, 1–17.
  • Peelle, J. E., Gross, J., & Davis, M. H. (2013). Phase-locked responses to speech in human auditory cortex are enhanced during comprehension. Cerebral Cortex, 23(6), 1378–1387.
  • Picheny, M. A., Durlach, N. I., & Braida, L. D. (1985). Speaking clearly for the hard of hearing I: Intelligibility differences between clear and conversational speech. Journal of Speech and Hearing Research, 28(1), 96–103.
  • Puschmann, S., Steinkamp, S., Gillich, I., Mirkovic, B., Debener, S., & Thiel, C. M. (2017). The right temporoparietal junction supports speech tracking during selective listening: Evidence from concurrent EEG-fMRI. The Journal of Neuroscience, 37(47), 11505–11516.
  • Ruspantini, I., Saarinen, T., Belardinelli, P., Jalava, A., Parviainen, T., Kujala, J., & Salmelin, R. (2012). Corticomuscular coherence is tuned to the spontaneous rhythmicity of speech at 2–3 Hz. The Journal of Neuroscience, 32(11), 3786–3790.
  • Schroeder, C. E., & Lakatos, P. (2009). Low-frequency neuronal oscillations as instruments of sensory selection. Trends in Neurosciences, 32(1), 9–18.
  • Schroeder, C. E., Wilson, D. A., Radman, T., Scharfman, H., & Lakatos, P. (2010). Dynamics of active sensing and perceptual selection. Current Opinion in Neurobiology, 20(2), 172–176.
  • Scott, S. K., Rosen, S., Lang, H., & Wise, R. J. S. (2006). Neural correlates of intelligibility in speech investigated with noise vocoded speech—A positron emission tomography study. The Journal of the Acoustical Society of America, 120(2), 1075–1083.
  • Smith, Z. M., Delgutte, B., & Oxenham, A. J. (2002). Chimaeric sounds reveal dichotomies in auditory perception. Nature, 416(6876), 87–90.
  • Steinschneider, M., Nourski, K. V., & Fishman, Y. I. (2013). Representation of speech in human auditory cortex: Is it special? Hearing Research, 305, 57–73.
  • Swerts, M., Strangert, E., & Heldner, M. (1996). F0 declination in read-aloud and spontaneous speech. In H. T. Bunnell & W. Idsardi (Eds.), Proceedings of the fourth international conference on spoken language processing (Vol. 3, pp. 1501–1504). Philadelphia, USA: IEEE.
  • Telkemeyer, S., Rossi, S., Koch, S. P., Nierhaus, T., Steinbrink, J., Poeppel, D., … Wartenburger, I. (2009). Sensitivity of newborn auditory cortex to the temporal structure of sounds. The Journal of Neuroscience, 29(47), 14726–14733.
  • Tilsen, S., & Arvaniti, A. (2013). Speech rhythm analysis with decomposition of the amplitude envelope: Characterizing rhythmic patterns within and across languages. The Journal of the Acoustical Society of America, 134(1), 628–639.
  • Tree, J. E. F. (1995). The effects of false starts and repetitions on the processing of subsequent words in spontaneous speech. Journal of Memory and Language, 34(6), 709–738.
  • Tree, J. E. F. (2001). Listeners’ uses of um and uh in speech comprehension. Memory & Cognition, 29(2), 320–326.
  • Uchanski, R. M., Choi, S. S., Braida, L. D., Reed, C. M., & Durlach, N. I. (1996). Speaking clearly for the hard of hearing IV: Further studies of the role of speaking rate. Journal of Speech, Language, and Hearing Research, 39(3), 494–509.
  • Wang, X.-J. (2010). Neurophysiological and computational principles of cortical rhythms in cognition. Physiological Reviews, 90(3), 1195–1268. https://doi.org/10.1152/physrev.00035.2008
  • Yaruss, J. S. (1999). Utterance length, syntactic complexity, and childhood stuttering. Journal of Speech, Language, and Hearing Research, 42(2), 329–344.
  • Zhou, H., Melloni, L., Poeppel, D., & Ding, N. (2016). Interpretations of frequency domain analyses of neural entrainment: Periodicity, fundamental frequency, and harmonics. Frontiers in Human Neuroscience, 10, 1–8.
  • Zion-Golumbic, E. M., Ding, N., Bickel, S., Lakatos, P., Schevon, C. A., McKhann, G. M., … Simon, J. Z. (2013). Mechanisms underlying selective neuronal tracking of attended speech at a “cocktail party”. Neuron, 77(5), 980–991.
  • Zoefel, B., Archer-Boyd, A., & Davis, M. H. (2018). Phase entrainment of brain oscillations causally modulates neural responses to intelligible speech. Current Biology, 28(3), 401–408.e5.
  • Zoefel, B., ten Oever, S., & Sack, A. T. (2018). The involvement of endogenous neural oscillations in the processing of rhythmic input: More than a regular repetition of evoked neural responses. Frontiers in Neuroscience, 12, 1–13.
  • Zoefel, B., & VanRullen, R. (2015). The role of high-level processes for oscillatory phase entrainment to speech sound. Frontiers in Human Neuroscience, 9, 1–12.