Research Article

Empiricism, syntax, and ontogeny

Pages 1011-1046 | Received 31 May 2019, Accepted 27 May 2021, Published online: 14 Jun 2021
 

ABSTRACT

Generative grammarians typically advocate for a rationalist understanding of language acquisition, according to which the structure of a developed language faculty reflects innate guidance rather than environmental influence. In developmental linguistics, this proposal is developed in the form of triggering models of language acquisition. Opposing this tradition, various theorists have advocated for empiricist views of language acquisition, according to which the structure of a developed linguistic competence reflects the linguistic environment in which this competence developed. On this picture, linguistic development is accounted for by general statistical learning mechanisms. In this article I shall precisify the debate, clarify what is at stake, and show why an intermediate picture is needed.

Acknowledgments

This paper benefited greatly from discussions with Josh Armstrong, Sam Cumming, Guillermo Del Pinal, Bill Kowalsky, and members of the UCLA Language and Mind Workshop, as well as helpful comments from two anonymous reviewers for this journal.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1. E.g., Descartes (Citation1641/2017).

2. E.g., Locke (Citation1836/1996).

3. E.g., the Logical Empiricists’ extension from epistemology into scientific methodology.

4. It is sometimes suggested that this fact does not show that innateness is not a causal notion. Instead, it shows that innateness is not a categorical notion. That is, there are degrees of innateness: traits are more or less innate to the extent that they are caused by internal forces. However, as internal and external influences are each necessary but individually insufficient for developed traits, it is hard to make sense of this graded notion of causality.

5. The carrying of information (i.e., the reducing of possibilities) can, however, be used as a heuristic: innate traits will typically carry less information about the environment than non-innate traits, due precisely to the fact that the latter (function to) reproduce environmental patterns.

6. Although they may be. See Wexler (Citation2003) for an argument, drawing on a very wide range of evidence, that certain features of a child’s innate knowledge of language are genetic.

7. Note that strictly, the claim is not that, for a rationalist system, the final state doesn’t reflect the environment, but that the system does not function to reflect the environment. Criticisms of rationalist proposals often fail due to a misunderstanding of this point. I will often talk simply of traits reflecting or not reflecting the environment, but it should always be kept in mind that what is meant is a functional claim about how this system develops, not merely a feature of the end-product of this development.

8. Obviously, nothing has been said about the details of E (whether it is, say, a Bayesian or a frequentist system, whether it responds to higher-order regularities or not, etc.), which would determine its response to such strings. The example is merely illustrative.
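
As the note leaves E’s details open, the sketch below is merely one hedged illustration of how such a system might look: a trivial frequentist learner (all names are my own placeholders) that simply reproduces the character frequencies it has observed, without tracking any higher-order regularities.

```python
# Purely illustrative sketch of one way the empiricist system E might be implemented:
# a trivial frequentist learner whose outputs simply reproduce the character
# frequencies observed in its input strings, ignoring any higher-order regularities.
# All names here are my own placeholders; the note leaves E's details open.

import random
from collections import Counter

class FrequentistE:
    def __init__(self) -> None:
        self.counts = Counter()

    def observe(self, s: str) -> None:
        # Accumulate the environmental pattern: raw character frequencies.
        self.counts.update(s)

    def respond(self, length: int = 4) -> str:
        # Sample characters in proportion to their observed frequencies.
        chars, weights = zip(*self.counts.items())
        return "".join(random.choices(chars, weights=weights, k=length))

e = FrequentistE()
for s in ["abba", "abab", "aabb"]:
    e.observe(s)
print(e.respond())  # e.g. 'baab': the output reflects the observed frequencies of 'a' and 'b'
```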

9. O’Neill (Citation2015) rightly points out that this is a weaker notion than ‘canalization’ as introduced into the biological literature by Waddington (Citation1942). Waddington’s notion of canalization was a particular kind of invariance; namely, invariance in development as a result of developmental feedback mechanisms, usually as a product of selection for such invariance. While Ariew often suggests this more complex notion, most of the literature on innateness (e.g., Samuels (Citation2004) and Collins (Citation2005)) seems instead to view canalization as simple invariance, as stated in the body of the text.

10. So-called ‘wild children’, raised in awful conditions in which they receive minimal linguistic input, never seem to fully acquire a language (see Curtiss et al. (Citation1974)). However, the linguistic input can apparently be relatively impoverished, as seen in the case of home sign (see Goldin-Meadow (Citation2005)).

11. Note that similar results could be obtained instead by appealing to a “normal environments condition”, according to which a trait is not innate if it develops only in abnormal environments, as in Samuels (Citation2002). Such a condition faces all the usual worries with spelling out what it takes for an environment to be normal.

12. One difficulty in describing these debates as neutrally as possible is that empiricists sometimes object to using language like ‘competence’ or ‘faculty’, which at least suggests a specialized system. I shall continue using these terms simply to refer to the capacity to use and acquire a language, without presupposing the standard generativist account of what such capacities consist in.

13. A plausible mechanism for this would be the Baldwin effect, wherein genetic dispositions to learn new behaviors are selected for. Glackin (Citation2011) argues for an evolutionary account of language in just these terms.

14. However, Christiansen and Chater (Citation2008) argue in the opposite direction: the rapidity with which language changes, compared with the evolution of the human organism, may plausibly be taken to require that it is language which adapts to the requirements of speakers, and not vice versa. It is unclear what exactly to make of this proposal. Of course, it is agreed by all parties that language must, in order to be passed on from one generation to the next, be learnable, and so any “attempt” to modify language in a way which cannot be learned by the next generation will fail. The dispute, then, is whether the features of the mind which make some languages better adapted are specific to language or not.

15. See e.g., Lewontin et al. (Citation1984), Rose and Rose (Citation2010), Buller (Citation2006), and Lickliter and Honeycutt (Citation2003).

16. Note that pure empiricism is not the claim that no properties of the organism matter for explaining the developed state. This position is indeed a priori false. How the organism responds to its environment will of course depend on what the organism is like. A pure empiricist system will have some structural properties which explain why it responds to the environment in a purely empiricist way, namely some developmental system (e.g., a learning algorithm) which functions to precisely reproduce the environmental patterns it encounters.

17. Those objecting to the perceived excesses of generative grammar tend to adopt a position something like this (e.g., Goldberg (Citation2006), Tomasello (Citation2003), and Onnis et al. (Citation2008)).

18. Hale and Reiss (Citation2008) argue for an analogous view of phonology.

19. I am here assuming that debates about the purview of linguistics (e.g., whether linguistics is, following Chomsky, a science of the mind, a formal theory of Platonic abstracta, as in Katz (Citation1980), or a theory of extra-mental concrete reality, as in Devitt (Citation2006)) do not arise in developmental linguistics. A theory of language acquisition must be a psychological theory, on pain of changing the subject. Of course, however, one’s views about linguistic theory may influence one’s theory of acquisition.

20. E.g., Sampson (Citation1989) and Pullum and Scholz (Citation2002).

21. Note that I am here assuming that both the rationalist and the empiricist models are themselves computational-level models. On the former, there is wide agreement. However, it is sometimes suggested that many paradigm empiricist models, especially Bayesian approaches, are best understood as algorithmic models. I am here following some of the leading voices in Bayesian cognitive science in insisting that they be viewed, just like their rationalist opposite numbers, as computational (Chater et al. (Citation2011)).

22. But see Van Dongen (Citation2006) for worries that the appeal to flat priors itself hides substantive assumptions which themselves influence the end result in often surprising ways. While van Dongen’s point is well taken, his advice on how to avoid bias in Bayesian modeling seems more difficult to adopt. Firstly, he proposes that a certain amount of prior knowledge should be allowed in the selection of priors. This is fine advice in many cases, but doesn’t apply to debates concerning nativism, where the shape of the priors may be exactly what is at issue. So, while his point is a sound one, the “uninformative priors approach” may be plausible in this case, even though it is not applicable across the board.
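
As a purely illustrative aside, not drawn from van Dongen’s paper, the sketch below shows the familiar point that ‘flat’ is relative to a parameterization: a prior that is uniform over a parameter p is far from uniform over a simple transformation of p.

```python
# A standard illustration (not from van Dongen's paper) of the general point: a prior
# that is flat over a parameter p is not flat over simple transformations of p, so
# choosing where to be "uninformative" is itself a substantive modelling assumption.

import random

samples_p = [random.random() for _ in range(100_000)]   # "flat" prior on p in (0, 1)
samples_q = [p ** 2 for p in samples_p]                 # implied prior on q = p**2

# If the prior on q were also flat, about half its mass would lie below 0.5.
# In fact roughly 71% does, since P(p**2 < 0.5) = sqrt(0.5) ≈ 0.707.
print(sum(q < 0.5 for q in samples_q) / len(samples_q))
```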

23. See Crain and Pietroski (Citation2001) for an excellent overview.

24. I am assuming here a model according to which language acquisition is profitably described as the acquisition of hypotheses (alternatively, rules or constraints, which I will use interchangeably). This has itself been challenged (see McClelland and Patterson (Citation2002)).

25. As in my above account of reflecting the environment, it is worth distinguishing the functioning of the system from the actual relationship between the system and the environment. In this case, what matters is whether the learner uses or relies on the available evidence to acquire H over H’. If not, H is innate, even if there is (unused) evidence available in the environment. The absence of such evidence from the environment is, of course, the best possible reason for claiming that learners are not using this evidence.

26. I am sidestepping the question of how these abstract classifications (like ‘auxiliary’ and ‘matrix’) are themselves acquired by the learner, although this is itself part of a powerful argument for nativism.

27. This is an oversimplification. Crucially, it is highly contentious whether only positive evidence of this sort should be included in the data set from which learners generalize. In particular, Bayesian models, such as Perfors et al. (Citation2010), often stress that the absence of certain constructions from the learner’s experience can itself function as evidence that such constructions are not possible. There are, however, significant problems with this kind of argument. See, in particular, Marcus (Citation1993) and Yang (Citation2015) for compelling empirical arguments that any system capable of excluding possible expressions on the basis of indirect negative evidence is liable to massively overgenerate and exclude many perfectly acceptable expressions. This problem is exacerbated in cases where the child’s language environment is very sparse, as in cases of deaf children with non-signing parents (Goldin-Meadow and Yang (Citation2017)). As mentioned above, the poverty of the linguistic data has itself been challenged by e.g., Reali and Christiansen (Citation2005). Gulordava et al. (Citation2018) develop computational models for the acquisition of hierarchical structure from a naturalistic corpus. See Yang et al. (Citation2017) for critical discussion of these kinds of model.
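
The general logic of such indirect negative evidence can be illustrated with a toy Bayesian comparison (the numbers and hypothesis space are my own, not those of the models cited): a more restrictive grammar spreads its probability over fewer constructions, so prolonged absence of a construction increasingly tells against the grammar that licenses it.

```python
# A schematic illustration (toy numbers of my own, not the cited models) of how the
# absence of a construction can serve as indirect negative evidence for a Bayesian
# learner: a grammar licensing fewer constructions assigns each observed one a higher
# probability, so its posterior grows as data accumulate without the "missing" form.

def posterior_restrictive(n_observations: int, prior_restrictive: float = 0.5) -> float:
    # Restrictive grammar licenses 1 construction; permissive grammar licenses 2.
    like_restrictive = 1.0 ** n_observations   # each datum has probability 1 under it
    like_permissive = 0.5 ** n_observations    # each datum has probability 1/2 under it
    numerator = prior_restrictive * like_restrictive
    denominator = numerator + (1 - prior_restrictive) * like_permissive
    return numerator / denominator

for n in (0, 1, 5, 10):
    print(n, round(posterior_restrictive(n), 4))
# 0 0.5, 1 0.6667, 5 0.9697, 10 0.999: prolonged absence increasingly favours the
# restrictive grammar, which is also why such learners risk excluding too much.
```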

28. Various features of language acquisition can be used to strengthen this claim. For example, children must not only be exposed to these crucial data, they must also attend to them. As young children have been shown to be fairly weak at parsing complex sentences, it is not a given that even if they encounter sentences like 4 they will use them as evidence for or against their linguistic hypotheses.

29. Perhaps even R is slightly impure in that the letter which gets represented as output is always found in the input, as the first letter of the string. R was described this way for clarity, but even this reflection of the environment could be dropped. Say instead that R’s behavior was modeled by a set of 26 rules of the form: “if the first character of the input is ‘a’, output ‘bbbb … ’ ”, “if the first character is ‘b’, output ‘kkkk … ’ ”, and so on for all 26 classes of possible inputs.
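
To make this fully internalist version of R concrete, here is a minimal sketch; the particular letter-to-string mapping is arbitrary, as in the note’s own examples.

```python
# Minimal sketch of the fully 'internalist' version of R described in this note: the
# output is fixed entirely by an internal rule keyed to the first input character and
# never reproduces the input. The particular letter-to-string mapping is arbitrary.

import string

# One internally fixed rule per possible first character, e.g. 'a' -> 'bbbb', 'b' -> 'cccc', ...
RULES = {c: string.ascii_lowercase[(i + 1) % 26] * 4
         for i, c in enumerate(string.ascii_lowercase)}

def respond_R(input_string: str) -> str:
    # The environment merely selects which internal rule applies.
    return RULES[input_string[0]]

print(respond_R("apple"))  # 'bbbb': determined by R's internal rule for 'a', not by the input pattern
```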

30. This picture of parameters is somewhat out of date. Contemporary generative theory largely either views parametric variation as restricted to differences in the lexicon as in Borer (Citation2014), or rejects the idea of parameters, in this sense, entirely, as in Boeckx (Citation2010). This debate is highly complex, and so I will skip the details, but note that if a pure rationalism is to be maintained, something like parameters, accounting for linguistic variation, is necessary. It is for this reason that Boeckx advocates for a mixed approach, with an innate (rationalist) core, and variation accounted for by abstracting rules from the environment.

31. See e.g., Sakas (Citation2016) for a recent overview.

32. Fodor’s (Citation1975) position that (almost) all concepts must be innate can be understood analogously. Because there is no (known) procedure by which we can learn, i.e., rationally acquire, the information stored in our concepts, this information must come from within the system. Environmental stimuli can thus serve to trigger the occurrence or development of a concept, but the environment is limited to this causal role, rather than playing its traditional role as the source of this information.

33. I am here focusing on internal debates about how to develop a parametric model, for which this problem is one of the most severe. Those who reject the parametric view entirely are often motivated by precisely the observations about language that I mentioned as favoring empiricist models: that languages display high degrees of variation, and these variations correlate with environmental patterns. The rejection of so-called “micro-parameters”, purported parameters which correspond to these very fine-grained differences, is largely motivated in this way. See Newmeyer (Citation2005) for discussion.

34. E.g., a sentence which seems to be SVO can be the result either of a genuinely (underlying) SVO language (such as English) or of an underlying SOV language with a rule that moves verbs into second position in surface form (such as German). There are complex empirical issues in this area. For example, if Kayne (Citation1994) is right, then all languages are SVO in their underlying structure. This would have important repercussions for a triggering account of language acquisition, as it may make the problem of ambiguous triggers easier to solve.

35. Other complications to the triggering model, such as the distinction between “global” and “local” triggers, i.e., triggers which unambiguously require a particular parameter setting no matter what other parameter settings are selected versus those which unambiguously call for a particular setting only given other settings, can be introduced in order to solve these kinds of issues. See Sakas and Fodor (Citation2012) for a thorough proposal along these lines.
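
The distinction can be made vivid with a toy example (the mini-languages are invented for illustration, not taken from Sakas and Fodor): with two binary parameters, one sentence can demand a particular value of P1 whatever P2 is, while another demands a particular value of P2 only given that P1 has already been set a certain way.

```python
# Toy illustration (invented mini-languages, not Sakas and Fodor's actual system) of the
# global/local trigger distinction with two binary parameters: sentence "c" demands
# P1 = 1 whatever P2 is (a global trigger), while "d" demands P2 = 1 only on the
# assumption that P1 = 1 (a local trigger).

LANGUAGES = {
    (0, 0): {"a", "d"},
    (0, 1): {"a", "d"},
    (1, 0): {"c"},
    (1, 1): {"c", "d"},
}

def compatible_grammars(sentence: str) -> list:
    return [grammar for grammar, lang in LANGUAGES.items() if sentence in lang]

print(compatible_grammars("c"))  # [(1, 0), (1, 1)]: P1 must be 1 regardless of P2
print(compatible_grammars("d"))  # [(0, 0), (0, 1), (1, 1)]: forces P2 = 1 only if P1 = 1
```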

36. Although it is an empirical question whether the contribution is language-specific or not, i.e., whether language is innate in the narrow, domain-specific sense or not.

37. This fact explains something puzzling about the terminological conventions in this debate: it is the empiricists who are committed to the explanation of language acquisition as a rational activity, whereas rationalists view this process as purely causal.

38. I am skimming over significant complications in the story here. These complications should not matter for our purposes.

39. An additional complication in this debate is how we ought to apportion this linguistic competence to the learner’s psychological system. In particular, it may be that semi-productive rules are not acquired in the same way, i.e., by the development of the same psychological system, as the kinds of generally applicable principles and (possibly) parameters discussed in previous sections. A picture of this sort is suggested in Dupre (Citation2019). However, I take it that an account of language acquisition in general must account for all kinds of acquisition, whether this involves the development of just one specifically linguistic system or many. More on this issue in section 8.

40. Yang (Citation2016), Chapter 4.

41. An extra benefit is that such a proposal provides a neat explanation for historical language change. When, for whatever reason, the patterns in the environment are modified (say by the influx of speakers of different languages as a result of mass immigration), the children will pick up on, and reflect, such patterns. It is much more difficult to give a triggering-based account of this phenomenon. See Yang (Citation2000) for discussion.

42. See, for example, Chomsky (Citation2000) (p. 5).

43. An interesting possibility, however, would be that this mixed approach can itself resolve some of the issues with the Principles and Parameters approach. In particular, this approach seemed to flounder because inter-linguistic variation seemed to be too fine-grained, leading to the positing of too many micro-parameters, and too sensitive to the environment, as we saw in the discussion of semi-productive rules. If some of this variation and sensitivity can be viewed as acquired separately by empiricist-style models, it could be possible to revise such a picture and avoid these problems. On such a view, language variation reflects two distinct modes of language acquisition: parameter setting and learning. Of course, this would still fail to be a pure rationalist model. This would perhaps be a slightly messier model than those discussed in the text, but who expected cognitive science to be clean?

Additional information

Funding

This work was supported by the Leverhulme Trust [ECF-2020-424].

Notes on contributors

Gabe Dupre

Gabe Dupre is a Leverhulme Early Career Fellow, in the School of Social, Political, and Global Studies at Keele University. He works on philosophical issues in linguistic theory. Previously, he was a Teaching Fellow at Reading University, and before that he studied at the University of California, Los Angeles, and Bristol University.