Interactive natural language acquisition in a multi-modal recurrent neural architecture

Pages 99-133 | Received 25 Jun 2016, Accepted 01 Feb 2017, Published online: 30 Jan 2018
