Regular Article

Hearing what is being said: the distributed neural substrate for early speech interpretation

Received 24 Mar 2023, Accepted 26 Mar 2024, Published online: 28 Apr 2024

References

  • Barack, D. L., & Krakauer, J. W. (2021). Two views on the cognitive brain. Nature Reviews Neuroscience, 22(6). https://doi.org/10.1038/s41583-021-00448-6
  • Baroni, M., & Lenci, A. (2010). Distributional memory: A general framework for corpus-based semantics. Computational Linguistics, 36(4), 673–721. https://doi.org/10.1162/coli_a_00016
  • Bhaya-Grossman, I., & Chang, E. F. (2022). Speech computations of the human superior temporal gyrus. Annual Review of Psychology, 73(1), 79–102. https://doi.org/10.1146/annurev-psych-022321-035256
  • Brodbeck, C., Hong, L. E., & Simon, J. Z. (2018). Rapid transformation from auditory to linguistic representations of continuous speech. Current Biology, 28(24), 3976–3983.e5. https://doi.org/10.1016/j.cub.2018.10.042
  • Brodbeck, C., Presacco, A., & Simon, J. Z. (2018). Neural source dynamics of brain responses to continuous stimuli: Speech processing from acoustics to comprehension. NeuroImage, 172, 162–174. https://doi.org/10.1016/j.neuroimage.2018.01.042
  • Broderick, M. P., Anderson, A. J., & Lalor, E. C. (2019). Semantic context enhances the early auditory encoding of natural speech. Journal of Neuroscience, 39(38), 7564–7575. https://doi.org/10.1523/JNEUROSCI.0584-19.2019
  • Canolty, R. T., Soltani, M., Dalal, S. S., Edwards, E., Dronkers, N. F., Nagarajan, S. S., Kirsch, H. E., Barbaro, N. M., & Knight, R. T. (2007). Spatiotemporal dynamics of word processing in the human brain. Frontiers in Neuroscience, 1(1), 185–196. https://doi.org/10.3389/neuro.01.1.1.014.2007
  • Chang, E. F. (2015). Towards large-scale, human-based, mesoscopic neurotechnologies. Neuron, 86(1), 68–78. https://doi.org/10.1016/j.neuron.2015.03.037
  • Chang, E. F., Rieger, J. W., Johnson, K., Berger, M. S., Barbaro, N. M., & Knight, R. T. (2010). Categorical speech representation in human superior temporal gyrus. Nature Neuroscience, 13(11). https://doi.org/10.1038/nn.2641
  • Chung, S., & Abbott, L. F. (2021). Neural population geometry: An approach for understanding biological and artificial neural networks. Current Opinion in Neurobiology, 70, 137–144. https://doi.org/10.1016/j.conb.2021.10.010
  • Cibelli, E. S., Leonard, M. K., Johnson, K., & Chang, E. F. (2015). The influence of lexical statistics on temporal lobe cortical dynamics during spoken word listening. Brain and Language, 147, 66–75. https://doi.org/10.1016/j.bandl.2015.05.005
  • Correia, J. M., Jansma, B. M. B., & Bonte, M. (2015). Decoding articulatory features from fMRI responses in dorsal speech regions. Journal of Neuroscience, 35(45), 15015–15025. https://doi.org/10.1523/JNEUROSCI.0977-15.2015
  • Delorme, A., & Makeig, S. (2004). EEGLAB: An open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. Journal of Neuroscience Methods, 134, 9–21. https://doi.org/10.1016/j.jneumeth.2003.10.009
  • DeWitt, I., & Rauschecker, J. P. (2012). Phoneme and word recognition in the auditory ventral system. Proceedings of the National Academy of Sciences of the United States of America, 109, E505–E514. https://doi.org/10.1073/pnas.1113427109
  • Di Liberto, G. M., O’Sullivan, J. A., & Lalor, E. C. (2015). Low-frequency cortical entrainment to speech reflects phoneme-level processing. Current Biology, 25(19), 2457–2465. https://doi.org/10.1016/j.cub.2015.08.030
  • Dubreuil, A., Valente, A., Beiran, M., Mastrogiuseppe, F., & Ostojic, S. (2022). The role of population structure in computations through neural dynamics. Nature Neuroscience, 25(6). https://doi.org/10.1038/s41593-022-01088-4
  • Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179–211. https://doi.org/10.1207/s15516709cog1402_1
  • Flinker, A., Chang, E. F., Barbaro, N. M., Berger, M. S., & Knight, R. T. (2011). Sub-centimeter language organization in the human temporal lobe. Brain and Language, 117(3), 103–109. https://doi.org/10.1016/j.bandl.2010.09.009
  • Fox, N. P., Leonard, M., Sjerps, M. J., & Chang, E. F. (2020). Transformation of a temporal speech cue to a spatial neural code in human auditory cortex. eLife, 9, e53051. https://doi.org/10.7554/eLife.53051
  • Gaskell, M. G., & Marslen-Wilson, W. (1999). Ambiguity, competition and blending in spoken word recognition. Cognitive Science, 23, 439–462. https://doi.org/10.1207/s15516709cog2304_3
  • Gaskell, M. G., & Marslen-Wilson, W. D. (1996). Phonological variation and inference in lexical access. Journal of Experimental Psychology: Human Perception and Performance, 22, 144–158. https://doi.org/10.1037/0096-1523.22.1.144
  • Gaskell, M. G., & Marslen-Wilson, W. D. (1997). Integrating form and meaning: A distributed model of speech perception. Language and Cognitive Processes, 12, 613–656. https://doi.org/10.1080/016909697386646
  • Gaskell, M. G., & Marslen-Wilson, W. D. (2002). Representation and competition in the perception of spoken words. Cognitive Psychology, 45, 220–266. https://doi.org/10.1016/S0010-0285(02)00003-8
  • Grosjean, F. (1980). Spoken word recognition processes and the gating paradigm. Perception & Psychophysics, 28(4), 267–283. https://doi.org/10.3758/BF03204386
  • Guggenmos, M., Sterzer, P., & Cichy, R. M. (2018). Multivariate pattern analysis for MEG: A comparison of dissimilarity measures. NeuroImage, 173, 434–447. https://doi.org/10.1016/j.neuroimage.2018.02.044
  • Gwilliams, L., & Davis, M. H. (2022). Extracting language content from speech sounds: The information theoretic approach. In L. L. Holt, J. E. Peelle, A. B. Coffin, A. N. Popper, & R. R. Fay (Eds.), Speech perception (pp. 113–139). Springer International Publishing.
  • Gwilliams, L., Linzen, T., Poeppel, D., & Marantz, A. (2018). In spoken word recognition, the future predicts the past. The Journal of Neuroscience, 38(35), 7585–7599. https://doi.org/10.1523/JNEUROSCI.0065-18.2018
  • Hamilton, L. S., Edwards, E., & Chang, E. F. (2018). A spatial map of onset and sustained responses to speech in the human superior temporal gyrus. Current Biology, 28(12), 1860–1871.e4. https://doi.org/10.1016/j.cub.2018.04.033
  • Hamilton, L. S., Oganian, Y., Hall, J., & Chang, E. F. (2021). Parallel and distributed encoding of speech across human auditory cortex. Cell, 184(18), 4626–4639.e13. https://doi.org/10.1016/j.cell.2021.07.019
  • Henson, R. N. A., Mouchlianitis, E., & Friston, K. (2009). MEG and EEG data fusion: Simultaneous localisation of face-evoked responses. NeuroImage, 47(2), 581–589. https://doi.org/10.1016/j.neuroimage.2009.04.063
  • Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews Neuroscience, 8(5), 393–402. https://doi.org/10.1038/nrn2113
  • Hickok, G., & Poeppel, D. (2015). Neural basis of speech perception. In M. J. Aminoff, F. Boller, & D. F. Swaab (Eds.), Handbook of clinical neurology (Vol. 129, pp. 149–160). Elsevier. https://doi.org/10.1016/B978-0-444-62630-1.00008-1
  • Hickok, G., Venezia, J., & Teghipco, A. (2023). Beyond Broca: Neural architecture and evolution of a dual motor speech coordination system. Brain, 146(5), 1775–1790. https://doi.org/10.1093/brain/awac454
  • Keshishian, M., Akkol, S., Herrero, J., Bickel, S., Mehta, A. D., & Mesgarani, N. (2023). Joint, distributed and hierarchically organized encoding of linguistic features in the human auditory cortex. Nature Human Behaviour, 1–14. https://doi.org/10.1038/s41562-023-01520-0
  • Klimovich-Gray, A., Tyler, L. K., Randall, B., Kocagoncu, E., Devereux, B., & Marslen-Wilson, W. D. (2019). Balancing prediction and sensory input in speech comprehension: The spatiotemporal dynamics of word recognition in context. Journal of Neuroscience, 39(3), 519–527. https://doi.org/10.1523/JNEUROSCI.3573-17.2018
  • Kocagoncu, E., Clarke, A., Devereux, B. J., & Tyler, L. K. (2017). Decoding the cortical dynamics of sound-meaning mapping. Journal of Neuroscience, 37(5), 1312–1319. https://doi.org/10.1523/JNEUROSCI.2858-16.2016
  • Kriegeskorte, N., & Diedrichsen, J. (2019). Peeling the onion of brain representations. Annual Review of Neuroscience, 42(1), 407–432. https://doi.org/10.1146/annurev-neuro-080317-061906
  • Kriegeskorte, N., & Kievit, R. A. (2013). Representational geometry: Integrating cognition, computation, and the brain. Trends in Cognitive Sciences, 17(8), 401–412. https://doi.org/10.1016/j.tics.2013.06.007
  • Kriegeskorte, N., Mur, M., & Bandettini, P. (2008). Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience, 2(4). https://doi.org/10.3389/neuro.01.016.2008
  • Leonard, M. K., Bouchard, K. E., Tang, C., & Chang, E. F. (2015). Dynamic encoding of speech sequence probability in human temporal cortex. Journal of Neuroscience, 35(18), 7203–7214. https://doi.org/10.1523/JNEUROSCI.4100-14.2015
  • Leonard, M. K., & Chang, E. F. (2014). Dynamic speech representations in the human temporal lobe. Trends in Cognitive Sciences, 18(9), 472–479. https://doi.org/10.1016/j.tics.2014.05.001
  • Lyu, B., Choi, H. S., Marslen-Wilson, W. D., Clarke, A., Randall, B., & Tyler, L. K. (2019). Neural dynamics of semantic composition. Proceedings of the National Academy of Sciences, 116(42), 21318–21327. https://doi.org/10.1073/pnas.1903402116
  • Maris, E., & Oostenveld, R. (2007). Nonparametric statistical testing of EEG- and MEG data. Journal of Neuroscience Methods, 164, 177–190. https://doi.org/10.1016/j.jneumeth.2007.03.024
  • Marslen-Wilson, W. D. (1973). Linguistic structure and speech shadowing at very short latencies. Nature, 244, 522–523. https://doi.org/10.1038/244522a0
  • Marslen-Wilson, W. D. (1975). Sentence perception as an interactive parallel process. Science, 189(4198), 226–228. https://doi.org/10.1126/science.189.4198.226
  • Marslen-Wilson, W. D. (1987). Functional parallelism in spoken word-recognition. Cognition, 25, 71–102. https://doi.org/10.1016/0010-0277(87)90005-9
  • Marslen-Wilson, W. D. (2019). Explaining speech comprehension: Integrating electrophysiology, evolution, and cross-linguistic diversity. In P. Hagoort (Ed.), Human language: From genes and brains to behavior (pp. 409–427). The MIT Press.
  • Marslen-Wilson, W. D., & Tyler, L. K. (1975). Processing structure of sentence perception. Nature, 257(5529), 784–786. https://doi.org/10.1038/257784a0
  • Marslen-Wilson, W. D., & Warren, P. (1994). Levels of perceptual representation and process in lexical access: Words, phonemes, and features. Psychological Review, 101, 653–675. https://doi.org/10.1037/0033-295X.101.4.653
  • Mesgarani, N., Cheung, C., Johnson, K., & Chang, E. F. (2014). Phonetic feature encoding in human superior temporal gyrus. Science, 343(6174), 1006–1010. https://doi.org/10.1126/science.1245994
  • Miller, J., Ulrich, R., & Schwarz, W. (2009). Why jackknifing yields good latency estimates. Psychophysiology, 46(2), 300–312. https://doi.org/10.1111/j.1469-8986.2008.00761.x
  • Obleser, J., Lahiri, A., & Eulitz, C. (2004). Magnetic brain response mirrors extraction of phonological features from spoken vowels. Journal of Cognitive Neuroscience, 16(1), 31–39. https://doi.org/10.1162/089892904322755539
  • Oganian, Y., Bhaya-Grossman, I., Johnson, K., & Chang, E. F. (2022). Vowel and formant representation in human auditory speech cortex (p. 2022.09.13.507547). bioRxiv. https://doi.org/10.1101/2022.09.13.507547
  • Oganian, Y., & Chang, E. F. (2019). A speech envelope landmark for syllable encoding in human superior temporal gyrus. Science Advances, 5(11), eaay6279. https://doi.org/10.1126/sciadv.aay6279
  • Pinotsis, D. A., & Miller, E. K. (2022). Beyond dimension reduction: Stable electric fields emerge from and allow representational drift. NeuroImage, 253, 119058. https://doi.org/10.1016/j.neuroimage.2022.119058
  • Sassenhagen, J., & Draschkow, D. (2019). Cluster-based permutation tests of MEG/EEG data do not establish significance of effect latency or location. Psychophysiology, 56(6), e13335. https://doi.org/10.1111/psyp.13335
  • Saxena, S., & Cunningham, J. P. (2019). Towards the neural population doctrine. Current Opinion in Neurobiology, 55, 103–111. https://doi.org/10.1016/j.conb.2019.02.002
  • Stephenson, C., Feather, J., Padhy, S., Elibol, O., Tang, H., McDermott, J., & Chung, S. (2019). Untangling in invariant speech recognition. Advances in Neural Information Processing Systems, 32.
  • Su, L., Fonteneau, E., Marslen-Wilson, W., & Kriegeskorte, N. (2012). Spatiotemporal searchlight representational similarity analysis in EMEG source space. 2012 International Workshop on Pattern Recognition in NeuroImaging (PRNI), 97–100. https://doi.org/10.1109/PRNI.2012.26
  • Su, L., Zulfiqar, I., Jamshed, F., Fonteneau, E., & Marslen-Wilson, W. (2014). Mapping tonotopic organization in human temporal cortex: Representational similarity analysis in EMEG source space. Frontiers in Neuroscience, 8(368). https://doi.org/10.3389/fnins.2014.00368
  • Tang, C., Hamilton, L. S., & Chang, E. F. (2017). Intonational speech prosody encoding in the human auditory cortex. Science, 357(6353), 797–801. https://doi.org/10.1126/science.aam8577
  • Travis, K. E., Leonard, M. K., Chan, A. M., Torres, C., Sizemore, M. L., Qu, Z., Eskandar, E., Dale, A. M., Elman, J. L., Cash, S. S., & Halgren, E. (2013). Independence of early speech processing from word meaning. Cerebral Cortex, 23(10), 2370–2379. https://doi.org/10.1093/cercor/bhs228
  • Tyler, L. K., & Wessels, J. (1985). Is gating an on-line task? Evidence from naming latency data. Perception & Psychophysics, 38(3), 217–222. https://doi.org/10.3758/BF03207148
  • Tzourio-Mazoyer, N., Landeau, B., Papathanassiou, D., Crivello, F., Etard, O., Delcroix, N., Mazoyer, B., & Joliot, M. (2002). Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. NeuroImage, 15(1), 273–289. https://doi.org/10.1006/nimg.2001.0978
  • Warren, P., & Marslen-Wilson, W. D. (1987). Continuous uptake of acoustic cues in spoken word recognition. Perception & Psychophysics, 41, 262–275. https://doi.org/10.3758/BF03208224
  • Wingfield, C., Su, L., Liu, X., Zhang, C., Woodland, P., Thwaites, A., Fonteneau, E., & Marslen-Wilson, W. D. (2017). Relating dynamic brain states to dynamic machine states: Human and machine solutions to the speech recognition problem. PLOS Computational Biology, 13(9), e1005617. https://doi.org/10.1371/journal.pcbi.1005617
  • Wingfield, C., Zhang, C., Devereux, B., Fonteneau, E., Thwaites, A., Liu, X., Woodland, P., Marslen-Wilson, W., & Su, L. (2022). On the similarities of representations in artificial and brain neural networks for speech recognition. Frontiers in Computational Neuroscience, 16(1057439). https://doi.org/10.3389/fncom.2022.1057439
  • Yi, H. G., Leonard, M. K., & Chang, E. F. (2019). The encoding of speech sounds in the superior temporal gyrus. Neuron, 102(6), 1096–1110. https://doi.org/10.1016/j.neuron.2019.04.023