ABSTRACT
Whether neural networks can capture relational knowledge is a matter of long-standing controversy. Recently, some researchers have argued that (1) classic connectionist models can handle relational structure and (2) the success of deep learning approaches to natural language processing suggests that structured representations are unnecessary to model human language. We tested the Story Gestalt model, a classic connectionist model of text comprehension, and a Sequence-to-Sequence with Attention model, a modern deep learning architecture for natural language processing. Both models were trained to answer questions about stories based on abstract thematic roles. Two simulations varied the statistical structure of new stories while keeping their relational structure intact. The performance of each model fell below chance under at least one manipulation. We argue that both models fail our tests because they cannot perform dynamic binding. These results cast doubt on the suitability of traditional neural networks for explaining relational reasoning and language processing phenomena.
Acknowledgments
The work of Guillermo Puebla was supported by the PhD Scholarship Program of CONICYT, Chile. Andrea E. Martin was supported by the Max Planck Research Group “Language and Computation in Neural Systems” and by the Netherlands Organization for Scientific Research (Grant 016.Vidi.188.029). We thank Hugh Rabagliati for his comments on earlier versions of the manuscript.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Notes
1 Although these concepts were never used in the context of each specific script, they were seen in the training dataset as a whole. By definition, the output of any traditional neural network to a completely new (unseen) concept depends on its initial weights. Given that these weights are initialised randomly, the behaviour of a neural network regarding an unseen input will be essentially random (Marcus, Citation1998).
2 We actually ran that simulation and, unsurprisingly, obtained perfect “generalisation”.