Original Articles

Bilingual Preschoolers’ Speech is Associated with Non-Native Maternal Language Input

ABSTRACT

Bilingual children are often exposed to non-native speech through their parents. Yet, little is known about the relation between bilingual preschoolers’ speech production and their speech input. The present study investigated the production of voice onset time (VOT) by Dutch-German bilingual preschoolers and their sequential bilingual mothers. The findings reveal an association between maternal VOT and bilingual children’s VOT in the heritage language German as well as in the majority language Dutch. By contrast, no input-production association was observed in the VOT production of monolingual German-speaking children and monolingual Dutch-speaking children. The results of this study provide the first empirical evidence that non-native and attrited maternal speech contributes to the often-observed linguistic differences between bilingual children and their monolingual peers.

Introduction

The considerable amount of language input that children receive from their parents is an important factor in children’s language learning (Hart & Risley, 1995; see Snow, 2014 for an overview). In particular, maternal input appears to be crucial in young children’s language development. For example, the amount of maternal language input is positively correlated with monolingual children’s lexical knowledge (Hoff, 2003; Hurtado, Marchman, & Fernald, 2008; Rowe, 2008, 2012; Rowe, Raudenbush, & Goldin-Meadow, 2012). Despite evidence that the speech of children reflects sociophonetic details similar to their mothers’ speech (Foulkes, Docherty, & Watt, 1999), the direct influence of maternal language input on children’s linguistic development beyond the lexicon has received little attention.

While the mother may be one of the most important input providers for a monolingual child, she may be the only input provider for a bilingual child in one language. Such contexts may arise, for example, when a child is born to parents with different native languages, and acquires the mother’s native language as a heritage language.

Crucially, children raised bilingually by parents with different native languages are commonly exposed to non-native language input from their parents in both the majority language and the heritage language: the parents are likely to speak each other’s native language (L1) as a second language (L2), but with non-native phonetic characteristics (e.g., Flege, 1987, 1991; Flege & Eefting, 1987; Flege & Port, 1981; Simon & Leuschner, 2010). Depending on their L2 use and the length of residence outside of their L1 community, parents of bilingual children may also speak their L1 differently than monolingual parents (Bergmann, Nota, Sprenger, & Schmid, 2016; Chang, 2012; De Leeuw, Mennen, & Scobbie, 2012; Flege, 1987; Flege & Hillenbrand, 1984; Major, 1992; Mayr, Price, & Mennen, 2012; Mennen, 2004; Sancier & Fowler, 1997; Ulbrich & Ordin, 2014; Ventureyra, Pallier, & Yoo, 2004). Changes to the L1 that arise, for example, when the L2 largely replaces the L1 after emigration are known as first language attrition (Freed, 1982; Schmid, 2004).

The present study focuses on a largely understudied aspect of bilingual first language acquisition by testing whether differential phonetic aspects in the speech of sequential bilingual mothers are associated with their children’s speech production. The term differential refers to divergences in bilingual speech from the monolingual norm (Kupisch & Rothman, 2016). The focus of the present study is on the production of voice onset time (VOT) of mothers and children speaking Dutch and German in the Netherlands.

Bilingual children often produce VOT differently than their monolingual peers when they acquire two languages with different phonetic implementations of the voicing contrast, such as Dutch and German (Deuchar & Clark, 1996; Fabiano-Smith & Bunta, 2012; Johnson & Wilson, 2002; Kehoe, Lleó, & Rakow, 2004; Khattab, 2000, 2003). VOT is a cue to plosive voicing and describes the time between the release of a plosive’s closure and the onset of vocal fold vibration (Lisker & Abramson, 1964). Voiceless plosives (/p/, /t/, /k/) and voiced plosives (/b/, /d/, /ɡ/) primarily differ by means of VOT. For example, shortening of the VOT of /p/ in the German word [ˈpakən] <packen> (to pack/to grab) can cause it to be perceived as /b/, which also changes the word’s meaning to [ˈbakən] <backen> (to bake). The phonetic realization of voiceless and voiced plosives differs between languages. For example, German and English voiceless plosives have aspiration, which means that they are produced with long VOT values. German and English voiced plosives are realized with short positive VOT or short lag VOT. Occasionally, German and English voiced plosives are produced with negative VOT or prevoicing, which means that the vocal folds start vibrating prior to the release of the plosive. By contrast, voiceless plosives in, for example, Dutch, Arabic, Spanish, and Italian have short lag VOT. Voiced plosives in these languages require prevoicing, but it is nevertheless common that adult native speakers of a prevoicing language fail to produce prevoicing for a small proportion of their voiced plosives (Khattab, 2003; Van Alphen & Smits, 2004).
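The three VOT ranges described above (prevoicing, short lag, and long lag) can be illustrated with a minimal Python sketch. The function name and the 30 ms boundary between short lag and long lag are illustrative assumptions and are not values taken from the study.

```python
def classify_vot(vot_ms, short_lag_max_ms=30.0):
    """Assign a VOT value (in ms) to one of three broad ranges.

    The 30 ms boundary between short lag and long lag is an assumed,
    illustrative value, not a threshold reported in the study.
    """
    if vot_ms < 0:
        return "prevoiced"   # voicing starts before the release
    if vot_ms <= short_lag_max_ms:
        return "short lag"   # voicing starts at or shortly after the release
    return "long lag"        # aspirated: voicing starts well after the release


# Hypothetical VOT values in ms
for vot in (-85.0, 12.0, 64.0):
    print(vot, classify_vot(vot))
```

Under this sketch, a Dutch-like voicing contrast would oppose the first two ranges, whereas a German-like contrast would mostly oppose the last two.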

Differences between the speech of bilingual children and monolingual children have often been interpreted as resulting from cross-linguistic influence (CLI), that is, the interplay between two languages during language processing (Fabiano & Goldstein, 2005; Fabiano-Smith & Bunta, 2012; Kehoe, 2002; Kehoe et al., 2004; Lleó & Kehoe, 2002; Müller & Hulk, 2001; Paradis & Genesee, 1996). However, rather than emerging exclusively from CLI between bilingual children’s two languages, deviations of bilinguals’ speech from the monolingual reference point may also be related to the non-native and attrited language input bilingual children receive in each of their languages (Fish, García-Sierra, Ramírez-Esparza, & Kuhl, 2017).

Similarities between the phonetic properties of bilingual children’s input and their own speech production were previously reported in four case studies of bilingual and trilingual children whose language input in one language was limited to a small number of speakers (Deuchar & Clark, 1996; Khattab, 2003; Klinger, 1962; Mayr & Montanari, 2015). Klinger (1962) describes the influence of atypical speech input on a three-year-old Spanish-English bilingual child’s global accent. The child’s only source of English language input was his older cousin, whose speech was atypical due to a cleft palate. The child adopted his cousin’s cleft palate speech symptoms when he spoke English, while his speech was unaffected in Spanish. This report illustrates the link between children’s language input and their speech production, and also highlights the impact of language input that comes from a single source, rather than from a diverse set of speakers.

Similarities between maternal input and a bilingual child’s speech production based on acoustic measures of VOT were first reported by Deuchar and Clark (1996). The English-Spanish bilingual child they investigated had developed a voicing contrast between short lag VOT and aspiration in English at age 2;3 (years;months), while a covert voicing contrast within the short lag range appeared to be developing in Spanish. The native English-speaking mother made a similar covert voicing contrast in the short lag VOT range when she spoke Spanish, her L2. Given that target-like prevoicing is generally acquired after 2;3 years in monolingual acquisition, the authors did not draw strong conclusions about a possible link between maternal input and the child’s VOT production.

Input effects of maternal VOT on bilingual children’s VOT production were directly addressed by Khattab (2003). Two siblings aged seven and ten years acquired Arabic as a heritage language in the United Kingdom from their parents, and both children resorted to differential phonetic realizations of the voicing contrast in Arabic. The older child made a covert voicing contrast within the short lag VOT range. The younger child avoided prevoicing by producing voiced plosives either prenasalized or as implosives, for which the airstream flows into the mouth. A similar non-target-like production pattern had been observed in the mother’s speech as well. Khattab’s study appears to be the first to demonstrate a link between specific acoustic patterns in bilingual children’s speech and their maternal language input. Moreover, the study suggests that input-production links may be limited to young children.

A nuanced view on input characteristics comes from a study on two trilingual sisters’ VOT production in all three of their languages (Mayr & Montanari, 2015). The speech of the English-Italian-Spanish trilingual sisters and their primary input providers had been recorded when the children were 6;8 and 8;1 years of age. The children lived in the United States where they attended an Italian immersion school, in which both English and Italian were the languages of instruction. In addition, they were exposed to English from their father and to Italian from their mother. Their only source of Spanish input came from their monolingual Spanish-speaking nanny. In English and Spanish, the children produced target-like VOT, which was in line with the VOT produced by their father (English) and their nanny (Spanish). Non-target-like VOT production was only observed in Italian, in which the children produced voiceless plosives with longer than target-like VOT, which can be regarded as more English-like VOT. It is intriguing that CLI seems to be present from English to Italian, while the children’s weakest language, Spanish, appeared to be unaffected by CLI from the majority language, English. The authors suggested that the children’s differential Italian VOT production is related to their exposure to non-native Italian at their Italian immersion school, which was predominantly attended by children for whom Italian was the L2. In Spanish, the sisters were exclusively exposed to a single speaker, who reportedly provided monolingual-like input, and this presumably stable phonetic input may have contributed to the children’s native-like acquisition of Spanish VOT. The study of Mayr and Montanari suggests that exposure to non-native input may be associated with children’s non-native-like speech production.

Taken together, the four studies observed striking similarities between the language input and a bilingual child’s speech production. Given the small sample sizes, these studies were descriptive and did not allow for statistical analyses of association between the input and the children’s speech at an individual level. Such analyses necessarily require a larger sample of children and are essential to provide evidence for the claim that phonetic characteristics of the speech input are indeed associated with a child’s speech production.

In summary, non-native language input appears to be a crucial but largely unexplored factor that should be acknowledged when comparing the linguistic skills of bilingual and monolingual children. The present study is the first to address whether bilingual preschoolers’ speech production is associated with non-native and attrited maternal speech input.

The current study

The current study investigates whether individual variation in VOT production of Dutch-German simultaneous bilingual children reflects variation in the VOT production of their sequential bilingual mothers who speak German as L1 and Dutch as L2. All participants lived in the Netherlands and the children were immersed in a Dutch language environment. They were only exposed to German through their family members, and primarily through their German mothers.

Bilingual children often produce VOT differently from monolingual children in at least one of their languages (Deuchar & Clark, 1996; Fabiano-Smith & Bunta, 2012; Johnson & Wilson, 2002; Kehoe et al., 2004; Khattab, 2000, 2003). Most of the bilingual children examined in the above-mentioned studies were raised in a one-parent-one-language family, and in order to communicate with each other, at least one of their parents spoke the other parent’s native language as L2. Adult L2 VOT typically diverges from monolinguals’ VOT (Flege, 1987, 1991; Flege & Eefting, 1987; Flege & Port, 1981), and non-native VOT production has recently been observed in the speech directed toward bilingual infants (Fish et al., 2017). Furthermore, the parents’ L1-VOT may diverge from monolingual native speakers’ VOT due to language attrition, presumably resulting from reduced language contact with other native speakers (Flege, 1987; Major, 1992; Mayr et al., 2012; Sancier & Fowler, 1997). Consequently, bilingual children are likely exposed to non-native and attrited VOT from their parents.

The present study builds on two recent studies that investigated VOT production of bilingual children (Stoehr, Benders, van Hell, & Fikkert, 2018) and their bilingual parents (Stoehr, Benders, van Hell, & Fikkert, 2017). These two studies, which are summarized below, revealed conspicuous similarities at the group-level between the VOT production of bilingual children and the non-native and attrited VOT production of their mothers.

In a large-scale study, speech analyses of Dutch-German simultaneous bilingual preschoolers aged 3;6 to 6;0 revealed VOT differences between the bilinguals and their age-matched monolingual Dutch-speaking and German-speaking peers (Stoehr et al., 2018). In the heritage language, German, the bilingual children produced VOT for voiced plosives and voiceless plosives differently from their monolingual German-speaking peers: they produced more voiced plosives with prevoicing, and they produced voiceless plosives with shorter VOT. In the majority language, Dutch, the differences between bilinguals and monolinguals were less pronounced. The bilinguals produced voiced plosives with prevoicing less consistently than monolingual Dutch-speaking children. However, even the monolingual Dutch-speaking children did not yet reach adult-like consistency in prevoicing for voiced plosives, which is in line with previous research reporting late mastery of prevoicing in monolingual acquisition (Khattab, 2000; MacLeod, 2016). The bilingual children produced Dutch voiceless plosives only with slightly longer VOT than monolingual children, and these productions still fell within monolingual-like ranges. In sum, the observed differences in VOT production between bilingual and monolingual children in both German and Dutch can be interpreted as resulting from CLI that operates from Dutch to German, and to a lesser extent also from German to Dutch.

A factor contributing to the bilingual children’s differential VOT production may be their exposure to non-native and attrited speech. The bilingual children’s mothers who spoke German as L1 and Dutch as L2 exhibited a VOT production pattern that was strikingly similar to the children’s VOT production at the group-level (Stoehr et al., 2017; see Footnote 1). The mothers had been living in the Netherlands for several years at the time of testing, and it appeared that they were affected by phonetic attrition of L1-VOT: they produced German voiceless plosives with shorter, and therefore more Dutch-like VOT than monolingual speakers of German. Voiced plosives, on the other hand, were produced with variable VOT including prevoicing (about one third of all productions) and short lag VOT by the bilingual mothers and monolingual German-speaking women alike.

Moreover, the bilingual mothers also produced non-native VOT in Dutch, their L2 (see Footnote 2). They produced longer VOT in voiceless plosives and produced fewer voiced plosives with prevoicing in comparison to monolingual Dutch-speaking women.

Given these deviations from monolingual-like VOT production in both the bilingual children and their mothers, the present study investigates whether bilingual children’s differential VOT production is associated with their own mothers’ non-native (L2) or attrited (L1) VOT production. To assess whether such an association is limited to a bilingual acquisition context or arises during language acquisition in general, we also test whether an association between VOT production and language input can be observed in monolingual child–mother dyads who speak either Dutch or German.

Regarding the outcome of this study, we hypothesize that a positive association exists between maternal VOT and the VOT production of both bilingual and monolingual children. This hypothesis is based on previous research, which reported maternal input effects in monolingual children’s lexical growth (Hoff, 2003; Hurtado et al., 2008; Rowe, 2008, 2012; Rowe et al., 2012) and observations of similarities between phonetic aspects of the input and the speech of monolingual children (Foulkes et al., 1999) and bilingual children (Deuchar & Clark, 1996; Khattab, 2003; Klinger, 1962; Mayr & Montanari, 2015). Given that the bilingual children’s regular exposure to German is restricted to their mothers, we hypothesize that the association is stronger in German than in Dutch.

Method

Participants

Seventy-four children aged between 3;6 and 6;0 participated in this study: 23 Dutch-German simultaneous bilinguals (12 female, mean age = 4;8, SD = 9 months), 26 Dutch-speaking monolinguals (13 female, mean age = 4;10, SD = 10 months), and 25 German-speaking monolinguals (16 female, mean age = 4;7, SD = 10 months). In addition, each child’s mother participated.

Based on parental report, all children were typically developing and had no speech impairments or delays, and no neurological, auditory, or cognitive impairments. All bilingual children lived in the Netherlands from birth. The mothers of all bilingual children were native speakers of German and spoke Dutch as L2. Twenty bilingual children had a Dutch father, and the remaining three bilingual children had a German father. Out of the three bilingual children whose parents were both native speakers of German, two were exposed to Dutch from native speakers from birth. The third child’s exposure to Dutch commenced at 0;6 years when she started attending daycare. On average, the bilingual children were exposed to German for 45% of the day (range 11–69%, SD = 15%) at the time of testing, as determined by the Bilingual Language Experience Calculator (BiLEC; Unsworth, 2013).

An additional 23 children (10 bilinguals and 13 monolinguals) had been tested, but they were excluded from the analysis. The bilinguals were excluded either due to exposure to a third language (n = 3) or onset of bilingualism after the first year of life (n = 1). In addition, bilingual children born to a Dutch-speaking mother were also excluded (n = 6) to obtain a more homogeneous data set for this study. The 13 monolingual children were excluded either due to inability to complete the task (n = 1), experimenter error (n = 1), exposure to non-native speakers at home (n = 4), or missing speech data of the mother (n = 7).

The mothers of the bilingual children learned Dutch at an average age of 23 years (range 8–33 years, SD = 6 years) when they moved to the Netherlands. Twenty-one mothers reported frequent use of German and Dutch. One mother reported occasional use of Dutch and one mother did not provide information on her usage of Dutch. The mothers rated themselves as very proficient in speaking Dutch (on a scale from 0 [virtually no fluency] to 5 [native fluency]: M = 4.0, SD = 0.64, range 3–5), and almost native-like in understanding Dutch (on a scale from 0 [almost no understanding] to 5 [native understanding]: M = 4.4, SD = 0.51, range 4–5). The bilingual families lived within a radius of 100 kilometers from Nijmegen in the Central Eastern Netherlands and were tested at their homes.

Four of the monolingual German-speaking mothers knew some Dutch, but none of them reported regular use of a language different from German. The monolingual German-speaking child-mother dyads were tested in Central Western Germany (n = 23) and Northern Germany (n = 2).

Two monolingual Dutch-speaking mothers reported speaking some German, and three reported speaking English sporadically. All monolingual Dutch-speaking child-mother dyads were tested in Nijmegen or its periphery. Although most mothers in both groups did not report speaking additional languages, schooling in Germany and the Netherlands requires English language classes in high school, suggesting that all mothers knew at least some English.

Materials and procedure

The investigated plosives were voiceless /p/, /t/ and /k/ and voiced /b/ and /d/. The voiced dorsal plosive /ɡ/ does not exist in Dutch and was therefore not included in either language in the analyses. For each of the five plosives, six target words per language were selected from developmental vocabulary lists (for German: Grimm & Doil, 2000; Szagun, Stumper, & Schramm, 2009; for Dutch: Zink & Lejaegere, 2002). The 30 target words per language were nouns recognizable on pictures, which started with a singleton onset plosive followed by a vowel, such as the Dutch word “kast” (cupboard). Lists of the target words can be found in Appendix 1A (Dutch) and Appendix 1B (German).

To keep the children engaged, the pictures representing the target words were presented in two different picture-naming tasks. In the story task, the experimenter read a custom-made story to the child, which was set up in Microsoft PowerPoint and presented on a laptop computer. Within the story, the target words were replaced with pictures representing the words. Each picture occurred on a separate slide and the children were prompted to name each picture.

The picture-naming game was designed as a lotto matching game. The child and the experimenter each had one lotto board in DIN A3 format. Half of the pictures were printed on the child’s lotto board and the other half were printed on the lotto board of the experimenter. The same pictures were printed on individual 6 × 6 cm cards, which had to be matched to a lotto board. The player who first collected all matching cards on her lotto board won the game. The child and the experimenter took turns in turning around one card at a time. During the game, the experimenter instructed the child to name the pictures for a hand puppet. A koala hand puppet that was introduced as being unfamiliar with the words was used in the Dutch session, and an allegedly blind mole hand puppet was used in the German session. The order of the cards was randomized with two exceptions: to let the child win the game, the last card on the stack always belonged to the experimenter’s lotto board. In addition, the second-to-last card always belonged to the child’s lotto board, which ensured that the game did not end prematurely.
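The constrained randomization of the card stack can be sketched in Python as follows. This is a minimal illustration of the two exceptions described above, not the procedure the experimenters actually used; the function name and the card labels are hypothetical.

```python
import random


def build_card_stack(child_cards, experimenter_cards, rng=None):
    """Return a card order (first drawn to last drawn) that is random
    except for the two fixed final positions."""
    rng = rng or random.Random()
    child = list(child_cards)
    experimenter = list(experimenter_cards)
    rng.shuffle(child)
    rng.shuffle(experimenter)
    second_to_last = child.pop()   # reserved card from the child's board
    last = experimenter.pop()      # reserved card from the experimenter's board
    rest = child + experimenter
    rng.shuffle(rest)              # all remaining cards in random order
    return rest + [second_to_last, last]


# Hypothetical card labels; the real cards showed the target-word pictures.
child_board = ["child_card_1", "child_card_2", "child_card_3"]
experimenter_board = ["exp_card_1", "exp_card_2", "exp_card_3"]
stack = build_card_stack(child_board, experimenter_board, random.Random(1))
print(stack)
```

Passing a seeded `random.Random` instance makes a given order reproducible, which can be convenient when documenting a session.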

The pictures in both tasks were color photographs and color drawings (see Footnote 3). Each production of the child entered the analysis. If a child did not say the target word, the experimenter gave hints without saying the word or the initial plosive. If the child did not know the word after three elicitation attempts, the experimenter named the picture and continued to the next trial.

Testing took place in a quiet room at the children’s homes. At the beginning of the session, the parents gave informed consent. A German native speaker administered the German session and a Dutch native speaker administered the Dutch session. The bilinguals’ two sessions were scheduled approximately two weeks apart. The order in which the Dutch and German sessions were administered was counterbalanced across children. During these testing sessions, the mothers completed a picture-naming task. Their speech was elicited in an adult-directed register using the same 6 × 6 cm picture cards used in the children’s picture-naming game. Moreover, the mothers completed a language background questionnaire. The questionnaire for bilingual children was based on the BiLEC (Unsworth, 2013). The questionnaire for monolingual children was custom-made and screened for potential exposure to additional languages and foreign accents. Throughout the session, the children were rewarded with stickers. At the end of each testing session, families were compensated with a choice of €10 or a book.

Recordings and VOT measurements

Recordings were made with an Olympus Linear PCM Recorder LS-10 with uncompressed 24-bit/96 kHz recording capability. The first author measured VOT of all participants in Praat (Boersma & Weenink, 2015) taking into account waveforms and spectrograms viewed at 0–5,000 Hz. VOT was defined as the time interval between the onset of the plosive’s burst release and the onset of vocal fold vibration measured at the nearest zero-crossing.
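The VOT definition above can be made concrete with a short Python sketch. It is not a reimplementation of the Praat workflow: the measurements in the study were made by hand, and the function names, the toy waveform, and the 96 kHz sampling rate used in the example are assumptions for illustration only.

```python
def nearest_zero_crossing(samples, index):
    """Return the position i closest to `index` at which the waveform
    changes sign between samples[i] and samples[i + 1], or None."""
    crossings = [
        i for i in range(len(samples) - 1)
        if (samples[i] <= 0 < samples[i + 1]) or (samples[i] >= 0 > samples[i + 1])
    ]
    if not crossings:
        return None
    return min(crossings, key=lambda i: abs(i - index))


def vot_ms(burst_index, voicing_index, sample_rate_hz):
    """VOT in ms: voicing onset minus burst onset (negative = prevoicing)."""
    return (voicing_index - burst_index) * 1000.0 / sample_rate_hz


# Toy waveform: sign changes after positions 1 and 3
wave = [0.2, 0.1, -0.1, -0.3, 0.2, 0.4]
print(nearest_zero_crossing(wave, 4))

# Hypothetical sample positions at an assumed 96 kHz sampling rate
print(vot_ms(burst_index=9600, voicing_index=14400, sample_rate_hz=96000))
```

A voicing onset that precedes the burst yields a negative value, which corresponds to prevoicing.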

Three additional phonetically trained coders measured a total of 25% of the data. Inter-coder reliability indicated 99% agreement. Measurements for voiceless plosives were considered in agreement when they differed by less than 10 ms (Fabiano-Smith & Bunta, 2012). For voiced plosives, measurements were considered in agreement when both coders agreed on the presence or absence of prevoicing.
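The two agreement criteria can be expressed as a small Python sketch. The function names and the example measurements are hypothetical; only the 10 ms tolerance and the prevoicing criterion come from the description above.

```python
def tokens_agree(voicing, vot_a_ms, vot_b_ms, tolerance_ms=10.0):
    """Apply the agreement criterion that matches the plosive's voicing."""
    if voicing == "voiceless":
        # Voiceless plosives: the two measurements differ by less than 10 ms
        return abs(vot_a_ms - vot_b_ms) < tolerance_ms
    # Voiced plosives: both coders agree on presence or absence of prevoicing
    return (vot_a_ms < 0) == (vot_b_ms < 0)


def percent_agreement(tokens):
    """Percentage of double-coded tokens on which the two coders agree."""
    tokens = list(tokens)
    agreed = sum(tokens_agree(*token) for token in tokens)
    return 100.0 * agreed / len(tokens)


# Hypothetical double-coded tokens: (voicing, coder A in ms, coder B in ms)
double_coded = [
    ("voiceless", 62.0, 58.0),   # differ by 4 ms -> agree
    ("voiceless", 45.0, 57.0),   # differ by 12 ms -> disagree
    ("voiced", -80.0, -95.0),    # both prevoiced -> agree
    ("voiced", 12.0, 9.0),       # both without prevoicing -> agree
]
print(percent_agreement(double_coded))
```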

In total, 11% of the child data were excluded from the analysis because VOT could not be unambiguously measured, for example due to background noise or coarticulation when target words were not produced in isolation (see Footnote 4). In addition, 8% of the child data could not be included in the analysis because the corresponding maternal production data could not be measured unambiguously, for example because of background noise.

Results

Table 1 presents the mean VOT durations and standard deviations for the bilingual and monolingual children and their mothers, separated by language, voicing, and consonantal place of articulation. Because the production of voiced plosives was bimodally distributed in all groups and both languages, Table 1 displays mean VOT duration (standard deviations in parentheses) for prevoiced and devoiced voiced plosives separately, followed by the proportion of prevoiced voiced plosives. A more detailed overview of the mean VOT durations (voiceless plosives) and the proportion of prevoiced productions (voiced plosives) for each individual child-mother dyad is presented in Appendices 2A (bilinguals), 2B (Dutch monolinguals), and 2C (German monolinguals).

Table 1. Mean VOT durations over participants by language, group, and consonantal place of articulation (SD in parentheses). Information for voiced plosives is displayed separately for prevoiced and devoiced productions, followed by the proportion of prevoiced voiced plosives.

Statistical models

Mixed effects linear and logistic regression models were fitted in R (R Core Team, 2013) using an alpha level of .05. Separate models were run for bilingual and monolingual children. Per group, three analyses were conducted. The main model combined data of voiceless and voiced plosives. Because VOT of voiced plosives was bimodally distributed, violating the assumption of normality in linear regression, data were also analyzed separately for voiceless and voiced plosives. The sub-model on voiceless plosives was designed as a mixed effects linear regression model in line with the main model. The sub-model on voiced plosives was designed as a mixed effects logistic regression model. These models are described in detail below.

Main models

In the main mixed effects linear regression models, the dependent variable was the children’s VOT for each target word they produced in the study. As fixed effects, the bilingual model used Maternal VOT (continuous, in ms), Voicing (voiced = –1, voiceless = 1), Language (Dutch = –1, German = 1), Exposure to German (continuous, in percent; centered around zero; inversely related to Exposure to Dutch), Place of Articulation (labial = 1 vs. coronal = 0; and coronal = 0 vs. dorsal = 1), and Word Length (monosyllabic = –1, disyllabic = 1).
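The coding scheme for the fixed effects can be illustrated with a minimal Python sketch. The function and key names are hypothetical, the example values are invented, and two details are assumptions: Exposure to German is centered by subtracting a supplied mean, and a dorsal plosive receives 0 on the labial-versus-coronal contrast (with coronal as the reference level for both place contrasts).

```python
VOICING = {"voiced": -1, "voiceless": 1}
LANGUAGE = {"Dutch": -1, "German": 1}
WORD_LENGTH = {1: -1, 2: 1}   # number of syllables -> code


def code_predictors(maternal_vot_ms, voicing, language, exposure_pct,
                    mean_exposure_pct, place, syllables):
    """Turn one token's description into numeric predictor values."""
    return {
        "maternal_vot": maternal_vot_ms,                     # continuous, in ms
        "voicing": VOICING[voicing],
        "language": LANGUAGE[language],
        # Centering on a supplied mean is an assumption of this sketch
        "exposure_german": exposure_pct - mean_exposure_pct,
        # Two place contrasts with coronal = 0 as the assumed reference level
        "poa_labial_vs_coronal": 1 if place == "labial" else 0,
        "poa_dorsal_vs_coronal": 1 if place == "dorsal" else 0,
        "word_length": WORD_LENGTH[syllables],
    }


# Dutch "kast": voiceless dorsal plosive, monosyllabic.
# Maternal VOT and exposure are invented values for illustration.
row = code_predictors(maternal_vot_ms=25.0, voicing="voiceless",
                      language="Dutch", exposure_pct=60.0,
                      mean_exposure_pct=45.0, place="dorsal", syllables=1)
print(row)
```

With the binary predictors coded as –1 and 1, each of their coefficients reflects half the difference between the two levels rather than the effect at one reference level.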

Based on the results of Stoehr et al. (2017, 2018), the following interaction terms including lower-level interactions were entered: a four-way interaction term between Maternal VOT * Voicing * Language * Exposure to German to test the hypothesis that Exposure to German is associated with bilingual children’s VOT production only in German voiceless plosives. This interaction term also allows for testing the hypothesis that the magnitude of the input-production association in bilinguals is stronger in German than in Dutch. In addition, a three-way interaction between Language * Voicing * Place of Articulation (labial vs. coronal) was included to test the hypothesis that bilingual and monolingual children and adults produce more voiced labial plosives than voiced coronal plosives with prevoicing, with a larger magnitude of this effect in German than in Dutch. A two-way interaction between Language * Place of Articulation (coronal vs. dorsal) was added to test whether bilingual adults produce longer VOT in dorsal than in coronal plosives in German. Finally, a three-way interaction between Language * Voicing * Word Length was included to examine whether bilingual children’s and adults’ VOT production is associated with Word Length in German voiceless plosives.

As random effects, the model included intercepts for Child and Target Word, as well as by-Child random slopes for Language and Voicing. The model for monolingual children was the same with the exceptions that it excluded Exposure to German as fixed effect and as interaction component, and it moreover excluded the random slopes for Language.

Sub-models voiceless plosives

The sub-models on voiceless plosives were based on the main model, but excluded the fixed effect Voicing and the by-Child random slopes for Voicing. Moreover, Voicing was removed from the interaction terms that were included in the main model.

Sub-models voiced plosives

The sub-models on voiced plosives were mixed effects logistic regressions accounting for the bimodal distribution of VOT in voiced plosives. In these analyses, the dependent variable was the absence (coded as 0) or presence (coded as 1) of prevoicing for each token. Tokens were coded as prevoiced when vocal fold vibration was present prior to burst release. As fixed effects, the bilingual model included Maternal VOT (prevoiced = 1, devoiced = 0), Language (Dutch = –1, German = 1), Exposure to German (in percent, centered around zero), and Place of Articulation (labial = 1, coronal = 0).
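The binary coding used in the voiced-plosive sub-models can be sketched in Python as follows. Pairing each child token with the mother's production of the same target word is an assumption of this sketch, and the function names, word labels, and VOT values are hypothetical.

```python
def prevoicing_code(vot_ms):
    """1 if voicing precedes the burst release (negative VOT), else 0."""
    return 1 if vot_ms < 0 else 0


def pair_tokens(child_vots, mother_vots):
    """Pair child and mother productions of the same target word.

    Words that are missing for either speaker are dropped.
    Returns rows of (word, child_prevoiced, mother_prevoiced).
    """
    shared_words = sorted(child_vots.keys() & mother_vots.keys())
    return [
        (word, prevoicing_code(child_vots[word]), prevoicing_code(mother_vots[word]))
        for word in shared_words
    ]


# Hypothetical VOT values (ms) for voiced-plosive target words
child = {"word_a": -75.0, "word_b": 14.0, "word_c": -60.0}
mother = {"word_a": -90.0, "word_b": -40.0, "word_d": 10.0}
rows = pair_tokens(child, mother)
print(rows)
```

Each row then supplies one observation: the child's code as the binary outcome and the mother's code as the Maternal VOT predictor.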

A three-way interaction term between Maternal VOT * Language * Exposure to German was included to test whether the effect of Maternal VOT on bilingual children’s VOT is stronger in the heritage language German, and whether this association is dependent on Exposure to German. Based on the results of Stoehr et al. (2017, 2018), a two-way interaction term between Language * Place of Articulation was entered to test their finding that prevoicing occurs more frequently at the labial than the coronal place of articulation, with a larger effect in German than in Dutch.

As random effects, the model included intercepts for Child and Target Word and by-Child random slopes for Language. The model for monolingual children was the same with the exception that it excluded the fixed effect Exposure to German and also the random slopes for Language. Moreover, Exposure to German was removed from the interaction terms that were included in the bilingual model.

In the following, the results regarding the effect of Maternal VOT and interactions involving Maternal VOT are reported. Other significant effects and interactions regarding the other independent variables included in the analyses are presented in Appendix 3.

Bilinguals

Main model

A mixed effects linear regression model based on 2974 observations was conducted to test if maternal VOT production is associated with bilingual children’s VOT production in voiceless and voiced plosives. The results revealed the predicted positive association between maternal VOT and the bilingual child’s own VOT production (β = 0.124, SE = 0.045, t = 2.513, p = .012). There was no significant interaction between Maternal VOT and Language, suggesting that the association is present in both Dutch and German (β = 0.042, SE = 0.042, t = 1.000, p > .250).

Sub-model voiceless plosives

A mixed effects linear regression model based on 1843 observations was conducted to test if maternal VOT production is associated with bilingual children’s VOT production in voiceless plosives. This sub-model confirmed the positive association between maternal VOT and the bilingual child’s own VOT production in voiceless plosives (β = 0.132, SE = 0.0543, t = 2.436, p = .015). No significant interaction between Maternal VOT and Language was detected, suggesting that the association is present in both Dutch and German voiceless plosives (β = 0.005, SE = 0.053, t = 0.100, p > .250).

Sub-model voiced plosives

A mixed effects logistic regression model based on 1131 observations was conducted to confirm that the effect of maternal VOT holds in a model that accounts for the bimodal distribution of VOT in voiced plosives. However, the model did not detect a significant association between maternal VOT and bilingual children’s VOT production in voiced plosives (β = 0.205, SE = 0.231, z = 0.889, p > .250). Moreover, no significant interaction between Maternal VOT and Language was detected (β = –0.392, SE = 0.228, z = –1.718, p = .086).
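Because the coefficients of a logistic regression are on the log-odds scale, the Maternal VOT estimate above can be easier to interpret as an odds ratio. The sketch below converts the reported estimate and standard error into an odds ratio with an approximate 95% Wald interval. The interval is an illustration based on a normal approximation; it is not reported in the study.

```python
import math


def odds_ratio_with_ci(beta, se, z_crit=1.96):
    """Convert a log-odds coefficient to an odds ratio with a Wald interval.

    Returns (odds_ratio, lower, upper). The interval uses a normal
    approximation and is only a rough guide.
    """
    return (
        math.exp(beta),
        math.exp(beta - z_crit * se),
        math.exp(beta + z_crit * se),
    )


if __name__ == "__main__":
    # Estimate and standard error for Maternal VOT in the bilingual
    # voiced-plosive sub-model, as reported in the text.
    odds_ratio, lower, upper = odds_ratio_with_ci(0.205, 0.231)
    print(f"OR = {odds_ratio:.2f}, approx. 95% CI [{lower:.2f}, {upper:.2f}]")
```

An interval that spans 1 corresponds to the non-significant association reported for this sub-model.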

In summary, the results suggest that the VOT production of the mothers, who speak German as L1 and Dutch as L2, is associated with their children’s VOT production in both Dutch and German. However, this effect was detected only in bilingual children’s production of voiceless plosives; it could not be detected in a model that accounts for the bimodal distribution of VOT in voiced plosives.

Monolinguals

Main model

A mixed effects linear regression model based on 3172 observations was conducted to test if maternal VOT is associated with monolingual children’s VOT production in voiceless and voiced plosives. The model did not detect a significant association between maternal VOT and the monolingual child’s own VOT production (β = 0.052, SE = 0.046, t = 1.149, p < .250), but a significant interaction between maternal VOT and Voicing was detected (β = 0.099, SE = 0.046, t = 2.184, p = .029). This interaction was investigated by analyzing the data for voiceless and voiced plosives separately. In these post-hoc analyses, an effect of maternal VOT on monolingual children’s VOT production was detected neither in voiceless plosives (β = 0.092, SE = 0.062, t = 1.481, p = .139) nor in voiced plosives (β = –0.046, SE = 0.033, t = –1.372, p = .170). Instead, the interaction seems to reflect differences in the direction of the association between maternal VOT and monolingual children’s VOT for voiceless plosives (positive β) and voiced plosives (negative β), neither of which is significant in its own right. In addition, no significant interaction between Maternal VOT and Language was detected (β = –0.061, SE = 0.045, t = –1.333, p = .183).

Sub-model voiceless plosives

A mixed effects linear regression model based on 1985 observations was used to test if maternal VOT production is associated with monolingual children’s VOT production in voiceless plosives. As reported in the post-hoc test accompanying the main model above, no association was detected between maternal VOT and the monolingual child’s own VOT production in voiceless plosives. Moreover, no significant interaction between Maternal VOT and Language was detected (β = –0.116, SE = 0.062, t = –1.853, p = .064).

Sub-model voiced plosives

To account for the bimodal distribution of VOT in voiced plosives, a mixed effects logistic regression model based on 1187 observations was additionally conducted to test whether maternal VOT production is associated with monolingual children’s VOT production in voiced plosives. Like the main model, the logistic regression model did not detect a significant association between maternal VOT and the monolingual children’s VOT production in voiced plosives (β = –0.468, SE = 0.288, z = –1.627, p = .104). No significant interaction between Maternal VOT and Language was detected (β = –0.060, SE = 0.288, z = –0.208, p > .250).

Discussion

The primary aim of the present study was to test whether individual variation in Dutch-German bilingual preschoolers’ VOT production is related to individual variation of VOT in their mothers’ native language (L1) and second language (L2) speech. The bilingual children acquired German as a heritage language predominantly from their mothers, who were L1-speakers of German. Dutch was the bilingual children’s majority language, and was spoken by their fathers, in the broad social environment, as well as by the bilingual children’s mothers who were L2-speakers of Dutch. In addition, this study sought to answer whether such an input-production association arises during language acquisition in general and is, thus, also present between the speech of monolingual children and their monolingual mothers. The results of this study represent the first statistical evidence that differential speech input contributes significantly to a bilingual child’s differential speech production.

We hypothesized that maternal VOT is associated with the VOT production of bilingual and monolingual children, and this hypothesis was partially confirmed. An association between maternal input and the bilingual children’s production was present in the heritage language German, in which the mother was the primary input provider. Moreover, an input-production association was also observed in the bilinguals’ majority language Dutch, in which the input was provided by many speakers besides the mother. The input-production associations in German and Dutch, however, were only detected in the production of voiceless plosives. Against our hypothesis, no input-production association was detected in the speech of monolingual children. In this section, we first discuss reasons that may contribute to the presence of an input-production association in bilingual acquisition, and to the apparent absence of this association in monolingual acquisition. We then discuss why maternal VOT was only detectably associated with the bilingual children’s VOT production of voiceless plosives.

The majority of the bilingual children in this study were raised by parents with different native languages. The mothers of the bilingual children in the present study all moved from Germany to the Netherlands before their children were born and used their L2, Dutch, as means of communication in everyday life. Yet, the mothers produced non-native VOT in Dutch (Stoehr et al., 2017). Moreover, the mothers’ restricted use of their L1, German, caused their VOT for voiceless plosives in German to become more Dutch-like (Stoehr et al., 2017).

The observed association between maternal VOT and the bilingual children’s VOT production suggests that bilingual children are affected by their mothers’ non-native speech in the majority language Dutch as well as by their mothers’ attrited speech in the heritage language German. This finding represents important evidence against the implicit assumption that bilingual children and monolingual children are exposed to similar input. The direct influence of phonetic aspects in maternal speech on bilingual children’s speech production puts into perspective the often-observed differences in VOT production between bilingual and monolingual children (Deuchar & Clark, 1996; Fabiano-Smith & Bunta, 2012; Johnson & Wilson, 2002; Kehoe et al., 2004; Khattab, 2000, 2003). The results of the present study specifically show that there are factors beyond CLI that can contribute to these differences in the speech of bilingual and monolingual children. The present study furthermore provides statistical support for previous case studies describing striking similarities between the input and bilingual children’s speech production (Deuchar & Clark, 1996; Khattab, 2003; Klinger, 1962; Mayr & Montanari, 2015). In sum, when phonetic aspects of the speech of bilingual children differ from those of monolingual children, this can in part be attributed to differences in their input.

One crucial difference between monolingual and bilingual children in this study is the bilingual children’s exposure to their mothers’ non-native Dutch at home. By contrast, none of the monolingual children were exposed to non-native speakers of their native language in their immediate social environment, as confirmed by parental report. An association between maternal VOT and children’s VOT is possibly present in bilingual and monolingual children alike, but there may not be sufficient individual variation in the VOT production of monolingual mothers and children to detect such an association. In particular, the VOT of monolingual Dutch-speaking children and their mothers is the most consistent (SD = 11 ms and 5 ms, respectively). In German, monolingual children and mothers show a similar degree of consistency in VOT production (SD = 14 ms and 11 ms, respectively), while the bilingual children produce more variable VOT (SD = 23 ms). Future research is required to test whether input-production associations are detected in a monolingual acquisition context that is likely to yield more individual variation, for example involving dialectal or sociophonetic variation.
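The restricted-variance argument can be made concrete with a simple linear model. If child VOT equals a fixed slope times maternal VOT plus independent residual noise, the expected correlation is r = b·σ_input / sqrt(b²·σ_input² + σ_residual²), which shrinks as the spread of maternal VOT shrinks even when the slope stays the same. In the sketch below, the slope, the residual SD, and the 20 ms input SD are hypothetical values chosen for illustration; only the 5 ms and 11 ms maternal SDs come from the text.

```python
import math


def expected_correlation(slope, sd_input, sd_residual):
    """Expected Pearson correlation under child = slope * mother + noise.

    Assumes a linear relation with residual noise that is independent
    of the input; all arguments are in the same units (here ms).
    """
    signal = slope * sd_input
    return signal / math.sqrt(signal ** 2 + sd_residual ** 2)


if __name__ == "__main__":
    slope = 0.5         # hypothetical input-production slope
    sd_residual = 15.0  # hypothetical residual SD in ms
    # 5 ms and 11 ms are the maternal SDs reported for the monolingual
    # groups; 20 ms is a hypothetical, more variable input.
    for sd_input in (5.0, 11.0, 20.0):
        r = expected_correlation(slope, sd_input, sd_residual)
        print(f"SD of maternal VOT = {sd_input:4.1f} ms -> expected r = {r:.2f}")
```

Under these assumptions, the same underlying slope yields a much weaker expected correlation when maternal VOT varies little, which is consistent with the possibility that an association exists in monolinguals but is harder to detect.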

It appeared in the present data that only the bilingual children’s production of voiceless plosives, but not voiced plosives, was affected by maternal VOT production in both German and Dutch. This finding raises the question why bilingual children seem to adopt the VOT of their mothers for voiceless plosives, but not for voiced plosives. One explanation is that maternal VOT of voiceless plosives is a target that children can easily match, as short lag VOT and aspiration are relatively less complex in their articulatory gestures than prevoicing, and they are also acquired earlier in production (Kewley-Port & Preston, 1974; Macken & Barton, 1980a, 1980b). For German voiced plosives, alternations between short lag VOT and prevoicing were common in the speech of the bilingual mothers and their children, and at the group-level, both mothers and children produced approximately one third of all German voiced plosives with prevoicing. Given that prevoicing occurs as free variation in German (Jessen, 1998), the production of prevoicing in word-initial singleton plosives does not seem to follow predictable rules in German. These variable input patterns make it unlikely that word-specific similarities in mothers’ and children’s production of prevoicing would be observed in German.

In Dutch, the bilingual children also produced about one third of all voiced plosives with prevoicing at the group-level, whereas the mothers produced almost two thirds of all Dutch voiced plosives with prevoicing at the group-level. Two factors are likely to hinder direct input-production associations between maternal VOT and child VOT of voiced plosives in Dutch. First, the mothers are non-native speakers of Dutch, and do not produce all of their Dutch voiced plosives with prevoicing. Their use of prevoicing may therefore also be to some extent idiosyncratic in Dutch. As in German, a variable production pattern of voiced plosives with either prevoicing or short lag VOT makes it unlikely to observe a word-specific match in the production of voiced plosives between mothers and their children in Dutch. Second, prevoicing requires complex velopharyngeal adjustments (Kewley-Port & Preston, 1974), which do not appear to be completely mastered by children between 3;6 and 6;0 years of age (Khattab, 2000; MacLeod, 2016). For this reason, the children in this study may not yet have the motor control needed to match their mothers’ VOT production for voiced plosives in Dutch.

Conclusions

The current study provided novel evidence that individual variation in maternal language input is associated with individual variation in bilingual children’s speech production. No input–production association was detected in monolingual children’s production of VOT, which may result from limited variance in monolinguals’ VOT production. The association between maternal VOT production and bilingual children’s VOT production suggests that linguistic differences between simultaneous bilingual children and their monolingual peers are not exclusively driven by CLI between a bilingual child’s linguistic systems. The speech of the bilingual children’s mothers, which diverges from monolingual mothers’ speech because the bilinguals’ mothers are L2-speakers and L1-attriters, contributes to differential speech production of three-and-a-half to six-year-old bilingual children.

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

1 The participants investigated in Stoehr et al. (2017) also included six fathers who spoke German as L1 and Dutch as L2. For the purpose of the present study, we verified that the same pattern of results holds after exclusion of those additional six participants.

2 In Stoehr et al. (2017), the parents’ L2-Dutch VOT was compared to the VOT of Dutch native-speakers who spoke German as L2 to address their acquisition outcomes. Given that the present study is concerned with differences in the input of bilingual and monolingual children, the data of Stoehr et al. (2017) were reanalyzed using monolingual native speakers as a reference group. These findings are reported here.

3 Six out of 30 pictures in the Dutch task and 4 out of 30 pictures in the German task were color drawings because they represented the words better than photographs.

4 The tasks were designed to elicit isolated nouns: the gaps in the story task were bare nouns, and during the picture naming game, the experimenter named the determiner right away when a card was turned over. Because of these strategies, the majority of the data was not affected by coarticulation.

References

Appendix 1A

Dutch target words

Appendix 1B

German target words

Appendix 2A

Data overview bilingual child-mother dyads

Mean VOT duration in ms (voiceless plosives) and proportion of prevoiced productions (voiced plosives) for each bilingual child-mother-dyad by language and consonantal place of articulation (SD in parentheses)

Appendix 2B

Data overview monolingual Dutch-speaking child-mother dyads

Mean VOT duration in ms (voiceless plosives) and proportion of prevoiced productions (voiced plosives) for each monolingual Dutch-speaking child-mother-dyad by consonantal place of articulation (SD in parentheses)

Appendix 2C

Data overview monolingual German-speaking child-mother dyads

Mean VOT duration in ms (voiceless plosives) and proportion of prevoiced productions (voiced plosives) for each monolingual German-speaking child-mother-dyad by consonantal place of articulation (SD in parentheses)

Appendix 3

Statistics reporting significant effects of and interactions with variables other than Maternal VOT (see Results section for more details).

Bilinguals

Main model: The main model revealed a significant effect for Voicing (β = 26.68, SE = 7.869, t = 3.390, p < .001), showing that the bilingual children produced longer VOT for voiceless than for voiced plosives. A significant effect for Place of Articulation (labial vs. coronal; β = -14.50, SE = 3.587, t = -4.043, p < .001) shows that the bilingual children produced labial plosives with shorter VOT than coronal plosives. A significant three-way interaction between Language, Voicing, and Exposure to German (β = 0.206, SE = 0.101, t = 2.038, p = .041) reflects the recent finding that more exposure to German (the heritage language) is associated with longer VOT in German voiceless plosives, while no such effect was observed for German voiced plosives and Dutch voiced and voiceless plosives (Stoehr et al., 2018). No other significant effects or interactions were observed.

Sub-model voiceless plosives: The model revealed an effect for Language (β = 13.46, SE = 4.507, t = 2.987, p = .003), which shows that the bilingual children produced longer VOT in German voiceless plosives than in Dutch voiceless plosives. A significant effect for Place of Articulation (labial vs. coronal; β = -16.58, SE = 4.823, t = -3.437, p < .001) shows that the bilingual children produced /p/ with shorter VOT than /t/. No other significant effects or interactions were observed.

Sub-model voiced plosives: The model detected an effect for Place of Articulation (β = 0.607, SE = 0.222, z = 2.739, p = .006), which shows that the bilingual children produced /b/ with prevoicing more frequently than /d/. No other significant effects or interactions were observed.

Monolinguals

Main model: The main model detected a significant effect for Voicing (β = 29.966, SE = 3.27, t = 9.161, p < .001), which shows that the monolingual children produced longer VOT for voiceless plosives than for voiced plosives. A significant effect for Language (β = 25.203, SE = 3.323, t = 9.161, p < .001) confirms that monolingual German-speaking children produced longer VOT than monolingual Dutch-speaking children.

In addition, the main model detected a significant effect for Place of Articulation (labial vs. coronal; β = -11.412, SE = 2.773, t = -4.115, p < .001) and a three-way interaction between Voicing, Language, and Place of Articulation (labial vs. coronal; β = -7.422, SE = 2.773, t = -2.676, p = .007). Post hoc analyses conducted on the data split by voicing and language investigated this interaction and found that the monolingual German-speaking children produced longer VOT in coronal plosives than in labial plosives (voiced: β = -4.208, SE = 1.851, t = -2.273, p = .023; voiceless: β = -23.545, SE = 6.179, t = -3.810, p < .001), while no effect for Place of Articulation was detected in the VOT productions of the monolingual Dutch-speaking children (voiced: β = -16.545, SE = 8.99, t = -1.84, p = .066; voiceless: β = -4.402, SE = 3.242, t = -1.358, p = .174). The main model furthermore detected significant interactions between Voicing and Word Length (β = -4.627, SE = 1.432, t = -3.231, p = .001) and between Language and Word Length (β = -3.493, SE = 1.432, t = -2.440, p = .015). These interactions were further investigated by post hoc tests based on split data. Analyses on the data split by voicing suggest that both monolingual Dutch-speaking and German-speaking children produced longer VOT for voiceless plosives when they occurred in monosyllabic words than in disyllabic words (β = -3.934, SE = 1.371, t = -2.869, p = .004), but shorter VOT for voiced plosives when they occurred in monosyllabic words than in disyllabic words (β = 5.309, SE = 2.591, t = 2.049, p = .040). Analyses on the data split by language suggest that there is a non-significant trend towards longer VOT in disyllabic words in Dutch (β = -2.748, SE = 1.641, t = -1.675, p = .094), and a non-significant trend towards longer VOT in monosyllabic words in German (β = 4.176, SE = 2.332, t = 1.791, p = .073). No other significant effects or interactions were observed.

Sub-model voiceless plosives: A significant effect for Language (β = 31.679, SE = 3.583, t = 8.841, p < .001) shows that monolingual German-speaking children produced voiceless plosives with longer VOT than monolingual Dutch-speaking children. The model detected a significant effect for Place of Articulation (labial vs. coronal; β = -13.996, SE = 3.502, t = -3.996, p < .001), and an interaction between Language and Place of Articulation (labial vs. coronal; β = -9.423, SE = 3.502, t = -2.691, p = .007). This interaction was investigated in post hoc analyses conducted on the data split by language, and showed that only monolingual German-speaking children produced shorter VOT in /p/ than in /t/ (β = -23.545, SE = 6.181, t = -3.810, p < .001), while this effect was non-significant for monolingual Dutch-speaking children (β = -4.402, SE = 3.242, t = -1.358, p = .175). Moreover, the model detected a significant effect for Word Length (β = -3.934, SE = 1.372, t = -2.868, p = .004), as well as an interaction between Language and Word Length (β = -4.004, SE = 1.372, t = -2.920, p = .004). This interaction was investigated in post hoc analyses conducted on the data split by language, and showed that only monolingual German-speaking children produced voiceless plosives with longer VOT in monosyllabic words than in disyllabic words (β = -8.092, SE = 2.411, t = -3.356, p < .001), while this effect was non-significant for monolingual Dutch-speaking children (β = 0.018, SE = 1.275, t = 0.014, p > .250). No other significant effects or interactions were observed.

Sub-model voiced plosives: The model detected a significant effect for Language (β = -1.887, SE = 0.285, z = -6.629, p < .001), which shows that monolingual Dutch-speaking children produced more voiced plosives with prevoicing than monolingual German-speaking children. In addition, the model detected a significant effect for Place of Articulation (β = 1.030, SE = 0.224, z = 4.592, p < .001), showing that the monolingual Dutch-speaking and German-speaking children produced /b/ with prevoicing more frequently than /d/. No other significant effects or interactions were observed.