
The discontinuity model: Statistical and grammatical learning in adult second-language acquisition

Pages 387-415 | Received 22 Jan 2018, Accepted 09 Nov 2018, Published online: 20 Mar 2019

ABSTRACT

The Discontinuity Model (DM) described in this article proposes that adults can learn part of L2 morphosyntax twice, in two different ways. The same item can be learned as the product of generation by a rule or as a modification of a template already stored in memory. These learning modalities, which are often seen as opposed in language theory, integrate and superpose in adult SLA. Learners resort to grammatical rules and statistical templates under different circumstances during language processing. Ontogenetically, while in L1 acquisition the natural endowment for language constrains statistical learners’ capacity by narrowing the hypothesis space, in adult SLA statistics can reopen the window of opportunity for grammar and drive adult learners to derive part of L2 morphosyntax. This article proposes a computational and psycholinguistic model of how this might occur. According to this model, skewness between transitional probabilities (TPs) represents the triggering factor in both L1 and L2 acquisition. Just as fluctuation in TPs drives children to individuate the words in a speech stream, so skewness between TPs drives adult learners to discover the grammatical features that are hidden in asymmetric chunks.

Acknowledgments

I would like to thank Michael Long for his friendly support and for providing insightful discussion about the DM in the last three years. I would also like to thank Kiel Christianson and another anonymous reviewer for their time and competence.

Disclosure statement

No potential conflict of interest was reported by the author.

Notes

1 There is no agreement about the meanings of the N400 and P600 effects. The N400 has been related not only to frequency and statistical learning (SL) but also to semantic memory retrieval and to the preactivation of information in semantic memory (Kutas & Federmeier Citation2011; Wlotko & Federmeier Citation2013). The P600 has been associated not only with reanalysis processes but also with more general controlled processes (e.g., task effects, context, the frequency of occurrence of anomalies within the experimental task itself) and with strong semantic violations, as well as with violations of event-level predictions (Kolk et al. Citation2003; Schacht et al. Citation2014; Kuperberg Citation2007). The most recent accounts of the P600 also associate it with the P300 component (Sassenhagen et al. Citation2014).

2 Discontinuity is not equivalent to “nonlinearity,” and the DM differs from theories of nonlinear changes, such as Dynamic System Theory (DST). These theories focus on nonlinearity in learner production and try to account for discontinuities, jumps, U-shaped trajectories, and backsliding by using appropriate statistical models (de Bot et al. Citation2007). In these theories, nonlinear development is described by a continuous, rather than discontinuous, function. Nonlinearity—unlike discontinuity—does not imply the presence of superposing developmental trajectories.

3 Superposition—meant as the continuing simultaneous functioning of the two learning procedures—differs from “synchronic variation,” an expression used in SLA studies to refer to the continued, albeit gradually decreasing, coexistence of target-like and non-target-like forms that characterizes even advanced stages of acquisition. The difference lies in the fact that superposition concerns only target-like forms.

4 Unlike in this article, some authors use “chunk” to refer just to the mental representations of all multiword units in long-term memory. Such authors do not distinguish between symmetric and asymmetric chunks.

5 Multiword expressions are often defined as strings of words characterized by different degrees of fixedness and idiomaticity that act as a single unit at some level of linguistic analysis (Wray Citation2017).

6 For details on the statistical procedure utilized to calculate skewness holding between the components of this and of other chunks in written and spoken contemporary Italian, see Appendix section 4.

7 For a list of reference corpora of spoken and written Italian utilized in this study, see Appendix section 1.

8 I adopt Thompson and Newport’s (Citation2007:4) definition: “Transitional probability is a conditional probability statistic that measures the predictiveness of adjacent elements.” Other conditionalized statistics, such as mutual information, conditional entropy, z-scores, and t-scores, include additional information in the formula, such as directionality of the effect and the size of the reference corpus, but they all share the general assumption that the overall probability of co-occurrence of two events equals the ratio between the joint and the disjoint probabilities of the events.
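The quoted definition can be made concrete with a short computation. The sketch below (not part of the original study; the toy word stream and function name are my own) computes forward transitional probabilities, TP(X→Y) = count(XY) / count(X), over a token sequence:

```python
from collections import Counter

def transitional_probabilities(tokens):
    """Forward TP(X -> Y) = count(X followed by Y) / count(X)."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    # Only tokens that have a successor can serve as the conditioning word X.
    word_counts = Counter(tokens[:-1])
    return {(x, y): c / word_counts[x] for (x, y), c in pair_counts.items()}

# Toy stream: "pretty" is always followed by "baby" (TP = 1.0),
# while "the" precedes several different words (TP < 1.0).
stream = "the dog saw the pretty baby and the pretty baby smiled".split()
tps = transitional_probabilities(stream)
print(tps[("pretty", "baby")])  # 1.0
print(tps[("the", "dog")])
```

High-TP pairs like *pretty baby* are the kind of cohesive sequence that statistical learners treat as a unit, whereas the low TP after *the* marks a point of unpredictability.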

9 Light verb constructions are formed by a verb devoid of semantic content (e.g., get, do, take in English) plus an element (e.g., a noun such as criticism, cleaning, exam) that carries the meaning of the whole expression (get harsh criticism, do the cleaning, take an exam; Grimshaw & Mester Citation1988).

10 A reviewer pointed out that the DM fits nicely with Romance and Germanic languages but becomes less convincing when applied to typologically different families—for instance, highly agglutinative languages or Semitic languages with root-and-pattern morphology—because in those cases it is less evident what may work as a cue for chunking (whether bound morphemes or roots) or how the asymmetry between the components of chunks can be calculated, given that such components are perhaps less clearly decipherable and detachable from one another in the input stream by L2 learners. One possibility is that, in such languages, the functioning of procedural memory and the availability of the combinatorial skills it supports become even more crucial for the learning task. The separateness among components that procedural memory needs in order to disentangle chunks and rearrange them into new combinations is warranted by the uneven distribution of these elements in the input. It does not matter whether such uneven distribution is encoded in the stem–affix relationship, as in most Standard Average European languages, or is interpreted at a different layer of linguistic description, such as the alternation between the consonants of the root and the vowels.

11 Tomasello (Citation2003) does not mention the passato prossimo among what he calls “constructions.” However, Joan Bybee (personal communication, April 2010) suggests that aux+PastP does constitute a construction, as “it is a cognitive unit which develops through the speaker’s experience with language.”

12 “A collocation is a word combination whose semantic and/or syntactic properties cannot be fully predicted from those of its components, and which therefore has to be listed in a lexicon” (Evert Citation2005:17). According to Ellis and Ogden (Citation2017:606), collocations are “words with particular selection preference” (see also Matsuno Citation2017).

13 For example, the combination [modal auxiliary verb + budge] ‘will/won’t budge’ is a colligation because the English verb budge is attracted to this specific construction (Sinclair Citation1998:13; Hunston Citation2001:13; Tognini-Bonelli Citation2001).

14 http://corpora.dslo.unibo.it/coris_ita.html.

15 A reviewer observed that people might not be very good at remembering the word immediately preceding the word they are currently hearing or reading. As for reading, there is evidence that people automatically preprocess upcoming words and therefore make shorter fixations on words that were visible in the parafovea during preceding fixations (Li et al. Citation2015). For obvious reasons, the mechanism that would allow speakers to keep track of a preceding word while processing the next one—if any such mechanism exists—must be different when the word is only heard. For instance, I know of no studies using auditory n-back tasks to test speakers’ capacity to retain memory of preceding words in a sequence. I agree that it would be a fascinating series of experiments to run.

16 For example, according to the formula Tn = T1 · n^(−a), with a ≈ 0.4, where Tn = the time to perform a task after n trials; T1 = the time to perform the task on the first trial; n = the number of trials.
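The formula in this note has the familiar power-law shape: speed-up is steep over the first few trials and then flattens. A minimal sketch (the value T1 = 10 seconds and the function name are illustrative assumptions, not figures from the article):

```python
def practice_time(t1, n, a=0.4):
    """Predicted time on trial n under Tn = T1 * n**(-a)."""
    return t1 * n ** -a

# With T1 = 10 s, most of the speed-up happens early:
for n in (1, 10, 100, 1000):
    print(n, round(practice_time(10.0, n), 2))
```

Because the exponent is small (a ≈ 0.4), each tenfold increase in practice multiplies the time by the same factor (10^−0.4 ≈ 0.4), so later gains require ever more trials.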

17 A pivot is the word around which other words in the sentence revolve. The term pivot is more theoretically neutral than the term head, even though some authors (e.g., Malec Citation2010:129) utilize the expression collocational head as an equivalent of statistical pivot. A statistical pivot is always defined by its context. A grammatical pivot is defined by its labels (see the following).

18 In chunk (1b) in section 3.1, the statistical pivot is a meno, in chunk (1c) the statistical pivot is stai, in chunk (1d) it is arrivata, in chunk (1e) it is paura, in chunk (1f) it is colazione, and in chunk (1g) it is ridere.

19 In chunk (1c) in section 3.1, the grammatical pivot is come and the label is “Interrogative”; in chunk (1d), the grammatical pivot is è and the label “Unaccusative”; in chunks (1e) and (1g), the grammatical pivots are mette and fa and the label “Causative.”

20 The DM assumes that statistically learned representations are processed with statistical processing mechanisms and that grammatically learned representations are processed with grammatical processing mechanisms. An anonymous reviewer pointed out that it is not logically necessary that there is a one-to-one mapping between learning and processing mechanisms. I acknowledge that the issue exists and that not even ERP data can disentangle it, given that the P600 is no longer directly associated exclusively with grammatical processing.

21 Native speakers of Italian do possess this kind of implicit knowledge. A sample of 176 Italian native speakers (mean age 21;03) at the University of Pavia (Italy) were asked to select A or E for the predicate correre ‘run’ at passato prossimo. When the predicate was followed by prepositional phrases such as those in the variation sets described previously, 73 participants out of 76 selected A, as expected.

22 In this article, the structurally overlapping sentences that make up a variation set do not necessarily need to be uttered within a very short time span—for instance, in the same speech turn—in order to have a learning effect.

23 As constituting “an obligatory response to input” (Batterink et al. Citation2015).

24 In Ullman’s version of the DP model, the word lexicon corresponds to what Paradis (Citation2009:15‒19) calls “vocabulary.” Paradis (Citation2004, Citation2009) in contrast defines the lexicon as the complete, structural set of variously combining items forming a network of combinable stems and affixes. In more recent articles, Paradis seems to agree with Ullman that lexical items are comparable to vocabulary items defined as “form-meaning pairs,” which are stored in declarative memory (Paradis Citation2013). In this article, by “lexicon” I mean the closed set of form-meaning pairs that are stored as unanalyzed wholes in the declarative memory system, as in Ullman’s version of the declarative/procedural model.

25 Michael Ullman, personal communication, October 2017, Siena, Italy.

26 A reviewer correctly pointed out that studies exist (e.g., Luke & Christianson Citation2016) demonstrating that in language processing the predictive capacity could be either limited or very selective (e.g., semantic and morphosyntactic information can be highly predictable even when word identity is not).

27 Evidence is indirect because only the possible effects of gemination in performance data are disclosed, not how SL and GL impact online processing.

28 This technique of course does not allow us to place the shift from SL to GL at an exact point in the developmental path nor to establish whether such a shift was gradual or instantaneous.

29 Unlike the DM, Pinker (Citation1998) uses the terms combinatorial and non-combinatorial to refer, respectively, to grammatical and to lexical (idiosyncratic, nonderivable) grammar.

30 A reviewer pointed out that take a drink is fine.
