
The relational processing limits of classic and contemporary neural network models of language processing

Pages 240-254 | Received 25 Oct 2019, Accepted 03 Sep 2020, Published online: 21 Sep 2020
