Research Article

Learning to Read Interacts with Children’s Spoken Language Fluency

ABSTRACT

Until at least the end of adolescence, children articulate speech differently than adults. While this discrepancy is often attributed to the maturation of the speech motor system, we sought to demonstrate that the development of spoken language fluency is shaped by complex interactions across motor and cognitive domains. In this study, we specifically tested for a relationship between reading proficiency and coarticulatory organization, a fundamental correlate of spoken language fluency, used for both reading aloud and conversational speech. We conducted reading assessments and ultrasound-based kinematic measurements of intersegmental coarticulation in a group of 32 German children. In German, a language which supports rather consistent grapheme-to-phoneme relationships, reading aloud relies on phoneme-to-speech-motor-gesture correspondences and coarticulatory mechanisms similar to those of conversational speech. Using generalized additive modeling, we found that better readers exhibited lower degrees of intersegmental coarticulation than poorer readers. This study therefore provides evidence that reading proficiency interacts with coarticulatory patterns in beginning readers. It suggests that in addition to maturational factors, interactions between speech motor ability and other co-developing skills must be considered to fully account for spoken language fluency.

Introduction

Learning to read, a societal necessity, is a crucial milestone in child development. While the facilitating role of reading acquisition has been well established for the development of comprehension, working memory, vocabulary and phonological awareness, it is not yet clear whether it also helps the development of spoken language fluency. The question remains intriguing: Why would a cognitively demanding skill, acquired to respond to societal conventions, interact with speech production, which develops organically regardless of societal, cultural and linguistic factors? A main motivation for examining this relationship comes from the observation that in alphabetic languages with consistent grapheme-to-phoneme relationships (e.g., German, Romanian, Spanish), speaking and reading aloud both rely on similar phoneme-to-speech-motor-gesture correspondences. In such languages, children therefore learn to read by building correspondences between printed letters, individual speech sounds and their speech motor realizations. Furthermore, in Chinese, a language with a non-alphabetic writing system, Read et al. (1986) have shown that while adults literate in alphabetic spelling (Hanyu Pinyin) could easily manipulate individual phonemes in spoken Chinese words, adults familiar with only Chinese characters could not. It is therefore possible that the progressive automatization of grapheme-to-phoneme-to-speech-motor correspondences may not only serve the development of reading fluency, but also stimulate children’s spoken language skills altogether.

The present study takes a step toward addressing this question by investigating whether reading and speech fluency interact in German beginning readers after one year of explicit reading instruction in primary school. We focused on a fundamental property of spoken language fluency – coarticulation – which allows speakers to assemble various speech-sized units (individual phonemes, syllables, words) into continuous intelligible streams. More specifically, we examined children’s patterns of intersegmental coarticulation, which characterizes the temporal overlap of speech motor gestures of neighboring phonemes (e.g., gestures for a target vowel /ɛ/ may overlap temporally with those for the preceding consonant /b/ in the word “bet”). In the domain of speech development, intersegmental coarticulation degree has been widely used as a metric for estimating spoken language fluency in typically developing children, as well as in children with speech-related disfluency (e.g., stuttering, developmental apraxia of speech, speech sound disorders: Maas & Mailend, 2017; Nijland et al., 2002; review in Noiray et al., 2019a).

Overall, it was found that very early on typically developing children produce a wide range of phonological and phonetic contrasts (e.g., Song et al., 2013), but at least through adolescence, their coarticulatory patterns differ significantly from those of adults. Children often exhibit greater degrees of intersegmental coarticulation (e.g., in English: Nittrouer et al., 1996, 1989; Noiray et al., 2013; Zharkova et al., 2011; in German: Rubertus et al., 2018; Noiray et al., 2018, 2019a). However, this difference is not systematic (Barbier et al., 2020; Goffman et al., 2008a; see also a review in Noiray et al., 2018).

Developmental discrepancies in coarticulatory organization have often been related to children’s neuroanatomical immaturity, weaker control over their speech motor system (e.g., Abakarova et al., 2020; Kent, 1976; Maas & Mailend, 2017; Nittrouer et al., 1989; Zharkova et al., 2011), or to the developing ability to quickly and efficiently convert phonemic units into speech motor gestures (e.g., Sereno et al., 1987). However, broader aspects of spoken language development may also be relevant. For instance, children initially tend to process and organize their speech into larger lexically driven units (e.g., words) tied to meaning and communicative intent rather than phonological properties or linguistic proficiency (e.g., Beckman & Edwards, 2000; Studdert-Kennedy & Goodell, 1995; Vihman & Keren-Portnoy, 2013; but see discussion in M. M. Vihman, 2017). This organizational scheme may result in “word recipes” (Vihman & Velleman, 1989), articulatory “gestalts” or routines (Goffman et al., 2008a; Menn, 1983; Redford, 2015) centered around vocalic gestural cores largely overlapping with neighboring consonantal gestures (e.g., in the anticipatory right-to-left direction: Noiray et al., 2018; in the perseveratory left-to-right direction: Rubertus & Noiray, 2020). With increasing and more varied linguistic exposure, as well as with greater communicative opportunities, children gradually develop their receptive and expressive vocabulary (e.g., Mahr & Edwards, 2018) and the ability to extract individual phonemes from lexical compounds (e.g., Fowler, 1991; Mayo et al., 2003; Nittrouer et al., 1996).
This would in turn result in greater phonemic differentiation of children’s speech motor organization (e.g., Studdert-Kennedy, 1998) and lower degrees of intersegmental coarticulation in the direction of adults’ patterns. While the construction of those lexical-to-phonemic-to-speech-motor translations seems essential for developing spoken language fluency, there is good evidence that these developments mutually influence each other over time (e.g., motor, lexical and/or phonological developments: Menn, 1983; Heisler et al., 2010; expressive vocabulary and production ability: M. M. Vihman, 1996; speech production capability and its processing: Depaolis et al., 2013; lexical and phonological development: Stoel-Gammon, 2011).

A recent investigation of coarticulatory patterns in German 3-, 4- and 5-year-old preschool children and 7-year-old primary school children lends further support to this view (Noiray et al., 2019a). In this study, it was found that coarticulatory patterns in preschool children did not differ among the 3 age groups, but all showed greater intra- and inter-syllabic coarticulation degree compared to 7-year-old children. From this intriguing result, we hypothesized that a main difference between preschoolers and primary school children, aside from their age, may lie in their ability to decode and encode written language. It is possible that by stimulating correspondences between graphemic and phonetic units, extensive reading instruction in primary school may influence children’s coarticulatory patterns toward greater phonetic differentiation for individual phonemes and lower intersegmental coarticulation degree. There are several reasons for expecting this relationship in orthographically consistent languages. First, conversational speech and reading aloud both entail the ability to combine small units (e.g., individual speech sounds) into larger compounds (e.g., syllables, words) that are then combined into even larger chunks (e.g., sentences, see a discussion of the “particulate principle” in Studdert-Kennedy, 1998; Studdert-Kennedy & Goldstein, 2003). Both skills therefore build upon a principle of combinatoriality. Second, speaking and reading aloud require the ability to plan upcoming speech chunks ahead of time and anticipate their corresponding speech motor gestures, so that speech or text can be produced fluently (anticipatory and perseveratory coarticulation, see discussion in Whalen, 1990). From the very first babbling phrases (e.g., [dada]) infants coarticulate vowel- and consonant-like sounds and their underlying speech motor schemes.
Similarly, from the first attempt at reading a word aloud, children coarticulate individual speech sounds corresponding to printed letters so words can be read fluently as opposed to being spelled out. Coarticulatory processes are therefore essential to both speaking and reading aloud. For a general discussion on the links between reading and speech production, we recommend Shankweiler and Fowler (2019).

In an alphabetical writing system, reading fluency, unlike speech, requires children to develop an explicit awareness of words being composed of smaller meaningless units (e.g., individual phonemes /b/, /ɛ/ and /t/ in “bet”), which may map more or less systematically onto letters of their native alphabet (e.g., Bryant & Goswami, 1987). Research has shown that this metalinguistic knowledge is in fact not only necessary for developing fluent reading (review in English: Goswami & Bryant, 2016; in French: Alegria & Mousty, 2004; in German: Fricke et al., 2016; Landerl & Wimmer, 2008), it also contributes to the development of fine-grained perceptual abilities (review in Mayo et al., 2003), and interacts with speech production fluency (Noiray et al., 2019b; Saletta et al., 2016). Noiray et al. (2019b) recently found that German children enrolled in the first grade performed better in tasks probing the manipulation of small phonemic units as compared to preliterate children, who only performed well with larger phonological units (e.g., syllables, rimes, see similar findings in preliterate French children: Caudrelier et al., 2019, in English: Morais, 2003; review in Shankweiler & Fowler, 2004). This suggests that phonemic decoding/encoding training in school stimulates children’s awareness of the structural combinatoriality of their native language and improves their ability to manipulate compounds of various sizes (e.g., Studdert-Kennedy & Goldstein, 2003; Ziegler & Goswami, 2005). However, in Noiray et al.’s study (Noiray et al., 2019b) children’s phonological awareness was highly correlated with age. It is therefore difficult to disentangle the role of phonological awareness from general maturational effects.
In more opaque alphabetical writing systems such as English, Saletta (2019) also showed that children with better reading proficiency exhibited less lip movement variability when reading non-words aloud compared to poorer readers. Taken collectively, these findings therefore point to a link between reading aloud and speech production.

Capitalizing on these recent findings, the present study tests the hypothesis that reading skill interacts with children’s coarticulatory organization in German, a language with fairly consistent grapheme-to-phoneme correspondences. While previous studies looked at labial patterns (Saletta, 2019), we focused on the tongue, a crucial articulator used for the production of all vowels and most consonants. Overall, we expected both coarticulatory patterns and reading proficiency to vary, with a tendency for more proficient readers to exhibit lower intersegmental coarticulation, compared to poorer readers.

Method

Thirty-two monolingual German children at the end of their first year of primary school (age span: 6.9–7.4; mean age 7.02; 19 females), all living in the Brandenburg region of Germany, participated in the study. None of the participants reported any history of hearing or language impairment. Ethics approval was obtained from the Ethics Committee of the University of Potsdam prior to the study.

Assessment of speech production and reading proficiency

The study was organized in two sequential tasks: a speech production task followed by an assessment of children’s reading proficiency.

For the production task, participants were instructed to repeat prerecorded disyllabic target pseudo-words (C1VC2ǝ) embedded in a carrier phrase consisting of the German indefinite article /aɪnə/ (e.g., “eine bude”). Consonants (C1/C2 positions) were selected to vary in place of articulation and degree of involvement of the tongue articulator: the labial consonant /b/ has no tongue involvement, but instead requires a primary gesture from the lips and jaw; the alveolar consonant /d/ recruits the entire tongue to achieve a constriction in the alveolar region of the oral cavity; the velar consonant /g/ involves a tongue body gesture in the palatal to velar region of the oral cavity depending on its vocalic neighbor. For each target pseudo-word, C1 was always different from C2. Vowels were chosen to reflect variations in the horizontal position of the tongue in relation to the front-back phonological dimension, from more anterior tongue positions (i.e., /i/, /y/, /e/) to more posterior tongue positions (/u/, /o/). Six repetitions presented in randomized blocks were elicited. This resulted in a total of 180 tokens per participant (3 consonants × 2 C1/C2 combinations per fixed C1 × 5 vowels × 6 repetitions). Movement of the tongue was recorded during the speech production task using an ultrasound imaging device (Sonosite Edge, frame rate: 48 Hz). The procedure was embedded in a playful space journey scenario to stimulate children’s attention and motivation to complete the task. Children were seated in a mock spaceship, which included the ultrasound probe fixed in a custom-made probe holder representing a “steering wheel.” It allowed the jaw to move naturally in the vertical dimension but prevented lateral and horizontal motions. The probe was positioned under the chin, between the arches of the mandibular body. The tongue surface contour was captured in the midsagittal plane.
To maximize the naturalness of speech and make the recording more comfortable for children, no additional head-to-probe stabilization was employed. Trials during which participants moved were discarded after visual inspection of the data (see Noiray et al., 2020 for a full description of the experimental platform). The acoustic speech signal was simultaneously recorded using a Shure microphone (fps: 48 Hz).
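
The stimulus design described above can be checked with a short enumeration. This is only an illustrative sketch: the spelling of the pseudo-words (phoneme symbols written directly as letters, final schwa written as "e") is an assumption made for readability and is not taken from the study materials.

```python
from itertools import permutations, product

CONSONANTS = ["b", "d", "g"]        # labial, alveolar, velar
VOWELS = ["i", "y", "e", "u", "o"]  # front vowels first, then back vowels
REPETITIONS = 6


def build_items():
    """Return every C1-V-C2-schwa pseudo-word in which C1 differs from C2."""
    items = []
    # permutations(..., 2) yields ordered consonant pairs without repeats,
    # i.e. 2 possible C2 values for each fixed C1.
    for (c1, c2), vowel in product(permutations(CONSONANTS, 2), VOWELS):
        items.append(f"{c1}{vowel}{c2}e")  # assumed spelling, e.g. "bige"
    return items


items = build_items()
tokens = [(item, rep) for item in items for rep in range(1, REPETITIONS + 1)]
print(len(items), len(tokens))  # 30 distinct pseudo-words, 180 tokens
```

The count of 180 follows from 3 × 2 × 5 = 30 distinct pseudo-words, each elicited six times.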

Reading proficiency was assessed with a standard battery of German reading tests (SLRT I, Salzburger Lese- und Rechtschreibtest; Landerl et al., 1997). This battery of tests consisted of reading 30 real-words (RWs) and 30 non-words (NWs). Both accuracy (maximal score (30) minus the number of mistakes) and speed (time measured in seconds) were measured. Mistakes consisted of elisions (e.g., /ta/ instead of /tag/ in “Tag,” “day”), additions (e.g., /bʁaʊnen/ instead of /bʁaʊne/ in “braune,” “brown”), substitutions (e.g., /klaɪn/ instead of /kleid/ in “Kleid,” “dress”) and/or vowel length errors (e.g., lax instead of tense vowel in /liːd/ Lied “song”). Multiple errors per word were counted as a single error. Speed was measured per task: the time it took to read the complete set of 30 words was recorded. An additional composite score called “reading fluency” was defined by dividing the total accuracy score by the time recorded for reading the target words for both tasks. This additional measure aimed to distinguish accurate readers who are also fast from those who are accurate but slow. Reading fluency scores ranged from 0.1 to 2.7, with higher scores indicating better fluency.
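
As a rough illustration of the composite score, the sketch below assumes that "total accuracy" is the sum of the real-word and non-word accuracy scores and that "time" is the sum of both reading times in seconds. The text does not spell out how the two tasks are pooled, so this pooling, the function names and the example child are all hypothetical.

```python
def accuracy(n_mistakes, n_words=30):
    """Accuracy as defined in the text: maximal score minus number of mistakes."""
    if not 0 <= n_mistakes <= n_words:
        raise ValueError("number of mistakes must lie between 0 and n_words")
    return n_words - n_mistakes


def reading_fluency(mistakes_rw, time_rw, mistakes_nw, time_nw):
    """Composite score: pooled accuracy divided by pooled reading time (s).

    Pooling by summation over both tasks is an assumption of this sketch.
    """
    total_time = time_rw + time_nw
    if total_time <= 0:
        raise ValueError("total reading time must be positive")
    return (accuracy(mistakes_rw) + accuracy(mistakes_nw)) / total_time


# Hypothetical child: 2 mistakes in 40 s on real words, 5 mistakes in 60 s on non-words.
print(reading_fluency(2, 40.0, 5, 60.0))  # (28 + 25) / 100 = 0.53
```

Under this reading, a child who makes no mistakes and needs about 22 seconds for both lists would score close to the reported maximum of 2.7.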

Speech production measure: coarticulation degree

Intersegmental coarticulation, the degree of temporal overlap between adjacent consonantal and vocalic lingual gestures within the syllable, is commonly computed from different syllable structures as a regression between the tongue position during the consonant and that of the following vowel (review in Noiray et al., 2019a). The acoustic signal (hand segmented in Praat; Boersma & Weenink, 2019) was used to determine the midpoint of the target consonants (C1) and vowels (V) within each target syllable. Tongue contours were semi-automatically detected and hand-corrected for each frame of the ultrasound videos. The tongue configurations corresponding to the acoustic midpoints of C1 and V were then extracted. For each target tongue contour, spatial coordinates of the highest point of the tongue body, which is the primary speech articulator used for vowel production, were extracted to estimate the horizontal anterior-posterior position of the tongue body (Figure 1). The x-coordinates (normalized per subject) of each CV syllable were then used for the regression. A 0.5 value for the x-coordinate indicates a central position. While front vowels (e.g., /i, e/) are characterized by anterior tongue body positions (values < 0.5), back vowels (e.g., /u, o/) have more posterior positions (values > 0.5).
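
The regression logic can be sketched as follows. The per-subject normalization is assumed here to be a min-max rescaling to the 0–1 range, which the text does not specify, and the data points are invented for illustration. A slope near 0 means the consonant's tongue position ignores the upcoming vowel; a slope near 1 means it tracks the vowel almost completely.

```python
def normalize(xs):
    """Min-max rescaling of one speaker's raw x-coordinates (assumed method)."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        raise ValueError("cannot normalize constant data")
    return [(x - lo) / (hi - lo) for x in xs]


def regression_slope(vowel_x, consonant_x):
    """Least-squares slope of consonant position regressed on vowel position."""
    n = len(vowel_x)
    mean_v = sum(vowel_x) / n
    mean_c = sum(consonant_x) / n
    sxx = sum((v - mean_v) ** 2 for v in vowel_x)
    sxy = sum((v - mean_v) * (c - mean_c) for v, c in zip(vowel_x, consonant_x))
    return sxy / sxx


# Invented normalized vowel positions, from front (/i/) to back (/u/).
vowel_x = [0.2, 0.3, 0.5, 0.7, 0.8]

no_coarticulation = [0.6] * len(vowel_x)  # consonant position never changes
full_coarticulation = list(vowel_x)       # consonant position equals the vowel's

print(regression_slope(vowel_x, no_coarticulation))    # close to 0
print(regression_slope(vowel_x, full_coarticulation))  # close to 1
```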

Figure 1. Ultrasound midsagittal tongue contours at the midpoint of the vowel /i/ (left) and /u/ (right) with the horizontal positions of the highest point on the tongue body – represented by the vertical arrows and the corresponding x mark on the horizontal axis.

Statistical analyses

To address our research question, Generalized Additive Models (GAMs; Wood, 2017) were used. This statistical method allows for measuring, comparing, visualizing and detecting both linear and non-linear effects between factors and their interactions. In this study, we modeled the effect of reading proficiency on coarticulation degree. For each model, the response variable was the position of the tongue body at the midpoint of the consonant. Tensor product smooths were used to model (non-)linear effects across the different predictors: position of the tongue during the vowel and reading fluency, taken as a continuous variable. Furthermore, because consonants may exhibit different degrees of coarticulation based on their place of articulation (adults: Recasens, 2018; children: review in Noiray et al., 2019a), the tensor smooths were set for each level of the C1 variable (i.e., consonant type was integrated as an interaction factor in the model). Data visualization, an essential feature of GAM statistical modeling, was performed with the fvisgam function of the itsadug package (Van Rij et al., 2020). For a full description of GAM procedures and functions we recommend Wieling’s (2018) tutorial.
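
One way to write down a model of this kind is sketched below. The notation is introduced here for illustration and is not taken from the study: $x^{C}_{i}$ and $x^{V}_{i}$ are the normalized tongue body positions at the consonant and vowel midpoints of token $i$, $r_{p[i]}$ is the reading fluency score of the participant who produced it, and $c[i]$ is its C1 consonant. The Gaussian error term is an assumption, and any random-effect terms the authors may have used are not described in the text and are therefore omitted.

```latex
x^{C}_{i} = \beta_{c[i]} + f_{c[i]}\!\left(x^{V}_{i},\, r_{p[i]}\right) + \varepsilon_{i},
\qquad \varepsilon_{i} \sim \mathcal{N}(0, \sigma^{2}),
\qquad
f_{c}(v, r) = \sum_{j=1}^{J} \sum_{k=1}^{K} \theta_{cjk}\, a_{j}(v)\, b_{k}(r)
```

Here $a_{j}$ and $b_{k}$ are the marginal basis functions for vowel position and reading fluency, and each consonant $c$ receives its own coefficient set $\theta_{cjk}$, which is what fitting the tensor smooth separately for each level of C1 amounts to. If the models were fitted with R's mgcv package, which itsadug builds on, such a term would typically be written as `te(vowel_position, fluency, by = C1)`; the variable names are placeholders.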

Results

Intersegmental coarticulation is participant and context dependent

Children’s degree of coarticulation varied substantially within and across participants. Tongue positions during consonants varied with repetition, vowel context and participant. Figure 2 illustrates three such cases of variation. The left and middle plots illustrate three repetitions of the bilabial consonant /b/ in “bige” and “buge,” produced by a single participant. For both disyllabic tokens, tongue contours do not perfectly overlap, indicating a certain level of flexibility in articulation with each repetition. The front-back position of the tongue is represented by contours of varying color. Tongue contours are more front in “bige” than in “buge” (gray-scale contours are overall more to the left of the central value 0 than the red-scale contours), indicating that the tongue position of the subsequent vowel is anticipated during the production of the consonant. The right-most plot is a representation of inter-speaker variability illustrating productions of /b/ in “bige” for three different participants (participants 23, 31 and 32). Participant 23 (black contour) produces /b/ with a more central position of the tongue, whereas participants 31 and 32 (red and gray contours) exhibit more fronted positions, indicating more anticipation of the upcoming /i/ for these two participants.

Figure 2. Intra- (left) and inter-participant (right) variability in the production of the labial consonant /b/. The two plots on the left represent three non-normalized repetitions of “bige” and “buge” by the same participant. Gray-scale colors correspond to anterior and central positions of the tongue, red-scale colors correspond to posterior positions. The rightmost plot illustrates the tongue contours at the midpoint of /b/ in “bige” for three different participants (participants 23, 31 and 32).

Figure 3 provides an overview of the variability in children’s tongue body (TB) positions for each consonant, across all five vocalic contexts. For each distribution curve, all participants and repetitions are represented. The black vertical line delimits normalized anterior (left of the line, values < 0.5) from posterior (right of the line, values > 0.5) tongue positions. Coarticulation degree is illustrated by the degree of overlap of the distribution curves across vocalic contexts. If the five distribution curves overlapped to form a single curve, it would indicate an absence of vocalic anticipatory coarticulation on the preceding consonant. The degree of coarticulatory overlap varied across consonantal contexts.

Figure 3. Distribution of the normalized horizontal position of the highest tongue body (TB) at the midpoint of the Consonant per Vowel contexts (colors) for all participants and all repetitions. The black vertical line represents the most central position.

More specifically, syllables including the alveolar consonant /d/ exhibited the highest degree of overlap between distribution curves, indicating a resistance to coarticulation with subsequent vowels (i.e., less vocalic anticipation). In contrast, the degree of overlap between the tongue contour distributions for front and back vowels in the bilabial /b/ context was low, which indicates that the bilabial consonant was very prone to coarticulation with adjacent vowels. The tongue body position at the midpoint of the consonant was anterior in the context of front vowels (i, e, y) and posterior in the context of back vowels (o, u). Syllables including the velar consonant /g/ were characterized by an intermediate degree of curve overlap, with the means of the curves ranging from central to more posterior positions, but not anterior ones. /g/ was more resistant to coarticulation than the bilabial consonant, but less so than the alveolar.

The width of the curves illustrates the dispersion of the data for each vocalic context. The wider the curve, the higher the variability in tongue positions across participants and repetitions. Table 1 reports the standard deviation (s) and the mean (μ) for all three consonants in each consonant-vowel syllable (all participants and repetitions included).

Table 1. Standard deviation (SD) and mean (μ) of the normalized measures of the horizontal position of the TB at the midpoint of the consonant for each vowel context

Overall, the bilabial consonant /b/ exhibited the highest degree of variability (s = 0.202) across vocalic context, with distribution means varying between 0.41 and 0.67 depending on the target vowel. The velar /g/ and the alveolar /d/ had similar standard deviations across vocalic context (s ~ 0.16). The distribution means were however more stable in the case of the alveolar consonant, indicating tongue positions centered around a specific anterior position. We provide interpretations for those contextual variations in coarticulation degrees in the discussion.

Reading proficiency is speaker and task dependent

Positive correlations were found between real and non-word reading tasks for number of mistakes (r = 0.39, t = 2.3821, df = 30, p-value = 0.02375) and reading time (r = 0.76, t = 6.46, df = 30, p-value = 3.83e-07), indicating that participants fared similarly in both tasks. However, several differences were found between the two tasks. In the real-word production task, accuracy and reading time (i.e., speed) were positively correlated (r = 0.59, t = 4.0387, df = 30, p-value = 0.0003433) which shows that more accurate readers were also faster. In contrast, time and accuracy were not correlated for non-words (p ~ 0.17).
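
The t statistics reported with these correlations follow from the standard relation between Pearson's r and t with n − 2 degrees of freedom. The sketch below recomputes t from the rounded r values given in the text; because r is rounded to two decimals, the results are only approximately equal to the reported statistics. The function name is introduced here for illustration.

```python
import math


def t_from_r(r, n):
    """Return (t, df) for a Pearson correlation r computed on n paired observations."""
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    df = n - 2
    return r * math.sqrt(df / (1 - r * r)), df


# Rounded r values from the text, with n = 32 participants (df = 30).
for r in (0.39, 0.76, 0.59):
    t, df = t_from_r(r, 32)
    print(f"r = {r:.2f}, df = {df}, t = {t:.2f}")
```

With the rounded inputs this yields t values of roughly 2.32, 6.40 and 4.00, close to the reported 2.38, 6.46 and 4.04.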

Further differences in children’s reading skill between real and non-words were reflected in their reading time: it took participants approximately 20 seconds more to read the list of non-words as compared to real-words (Welch t-test: t = −2.0584, df = 61.219, p-value = 0.04381).
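
For readers unfamiliar with the Welch t-test, the sketch below shows how the statistic and its fractional degrees of freedom (Welch-Satterthwaite approximation) are obtained from two samples with possibly unequal variances. The reading times used here are invented toy values, not data from the study.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    # Squared standard error of each sample mean.
    se2_a = variance(sample_a) / len(sample_a)
    se2_b = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Toy reading times in seconds (hypothetical values).
real_word_times = [40.0, 55.0, 62.0, 48.0]
non_word_times = [60.0, 80.0, 75.0, 70.0]

t, df = welch_t(real_word_times, non_word_times)
print(t < 0, df)  # negative t: the first sample has the smaller mean
```

The degrees of freedom can never exceed n1 + n2 − 2, so with two sets of 32 reading times the reported value of 61.219 lies just below the ceiling of 62.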

Lastly, a by-participant analysis highlighted substantial individual variability. Figure 4 reports differences in reading time (y-axis) as a function of differences in mistakes (x-axis) between the two tasks (real-words – non-words) for each of the 32 participants (represented by distinct numeric labels, e.g., 3, 14, 20). Participants’ proficiency varied, with a few extreme cases (participants 2, 18, 8 and 6). Four types of reading proficiency profiles emerge, as illustrated by the four quadrants (I, II, III and IV) in Figure 4. Participants in the lower two quadrants read real-words faster than non-words but differed in their mistake pattern: participants in quadrant II made more mistakes reading non-words, while participants in quadrant IV made more mistakes reading real-words. Participants in the upper two quadrants are faster at reading non-words, making more mistakes either for non-words (quadrant I) or real-words (quadrant III). Overall, the large majority of participants were faster at reading real words than non-words (n = 26, quadrants II and IV). Of these, some made more mistakes reading non-words (n = 14, quadrant II), others made more mistakes reading real-words (n = 8, quadrant IV), and some (n = 4) made the same number of mistakes in both tasks. The remaining participants (participants 2, 13, 15, 19 and 25) read non-words faster than real-words (quadrants I and III).
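
The quadrant logic described above can be written as a small classification rule. The sign convention (real-word value minus non-word value for both time and mistakes) follows the text; the function name and the "boundary" label for participants with a zero difference are introduced here for illustration.

```python
def reading_profile(time_rw, time_nw, mistakes_rw, mistakes_nw):
    """Assign a participant to quadrant I-IV from task differences (RW - NW)."""
    d_time = time_rw - time_nw
    d_mistakes = mistakes_rw - mistakes_nw
    if d_time == 0 or d_mistakes == 0:
        return "boundary"  # e.g. same number of mistakes in both tasks
    if d_time < 0:
        # Lower quadrants: real-words were read faster than non-words.
        return "II" if d_mistakes < 0 else "IV"
    # Upper quadrants: non-words were read faster than real-words.
    return "I" if d_mistakes < 0 else "III"


# Hypothetical participant: 45 s and 1 mistake on real-words,
# 70 s and 4 mistakes on non-words.
print(reading_profile(45.0, 70.0, 1, 4))  # II
```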

Figure 4. Difference in Time as a function of the difference of Mistakes between the two tasks per participant ID (represented by the numeric labels). The four quadrants represent four categories of readers based on the values of these differences.

Negative correlation between children’s coarticulatory patterns and reading proficiency

The separate analyses of coarticulation and reading proficiency have shown that both skills differ across children at the end of their first primary school year. We employed generalized additive models (GAMs) to elucidate whether the two skills interact. Figure 5 provides three contour plots illustrating the outputs of the interaction between reading fluency and coarticulation per consonantal context /b/ (left), /d/ (middle) and /g/ (right). There are three variables on each plot: the reading fluency score (values on the x-axis: higher values indicate better fluency), the tongue body position at the midpoint of the vowel (values on the y-axis: values < 0.5 indicate anterior positions and values > 0.5 indicate posterior tongue positions) and the position of the tongue body at the midpoint of the consonant (color pattern: black for anterior/front positions; gray for central positions and red for posterior/back positions). The black contour lines represent tongue body positions of the same value, much like those in topographic maps.

Figure 5. Contour plots illustrating the interaction between the composite reading fluency score (x-axis) and the position of the tongue body at the midpoints of the vowel (y-axis) and consonant (colors: black – anterior; gray – central; red – posterior).

Before describing the empirical results in Figure 5, we introduce Figure 6 as an interpretation guide. Figure 6 illustrates two extreme hypothetical cases of minimal or maximal coarticulation degree between vowels and a velar consonant /g/ in the context of several vowels and in the absence of any reading fluency effect.

Figure 6. Interpretation guide for GAM contour plots: the two color squares represent two hypothetical extreme cases of coarticulation degree for the velar consonant /g/ with no variation as a function of reading fluency (no coarticulation – left, maximal coarticulation – right).

On the left, the uniform red color (indicating a posterior tongue position) does not change as a function of vocalic context or reading fluency score. This color scheme therefore characterizes an absence of V-to-C coarticulation (the tongue position during the production of /g/ is the same in the context of /i/ (values of 0.2 on the y-axis) or /u/ (values of 0.8 on the y-axis)) and no effect of reading fluency. The opposite pattern of maximal coarticulation is represented in the right square. Here, the color scheme varies as a function of the subsequent vowel. It indicates that in the context of front vowels (values < 0.3 on the y-axis), the tongue position during the consonant is front (dark gray), for central vowels (values ~ 0.5 on the y-axis) it is also central (light gray) and when followed by back vowels (values > 0.7 on the y-axis) the consonantal tongue position is also back (red). In this extreme case of complete V-to-C coarticulation, the tongue positions for the consonant fully overlap with those of the following vowel and are not affected by reading fluency (parallel color blocks with no variation along the x-axis).
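
The two hypothetical extremes can also be expressed numerically. The toy function below is not the fitted model: it simply interpolates between a fixed consonant target (the value 0.8, standing for a posterior /g/, is an assumed illustration) and the vowel position, and it deliberately ignores reading fluency, so the predicted values do not change along the x-axis.

```python
CONSONANT_TARGET = 0.8  # assumed posterior target for /g/ (hypothetical value)


def consonant_position(vowel_x, fluency, coarticulation):
    """Toy prediction of consonant tongue position.

    coarticulation = 0 gives the left square (no V-to-C coarticulation),
    coarticulation = 1 gives the right square (complete coarticulation).
    The fluency argument is intentionally unused: in both extreme cases
    reading fluency has no effect.
    """
    return CONSONANT_TARGET + coarticulation * (vowel_x - CONSONANT_TARGET)


fluency_scores = [0.2, 1.0, 2.0]   # x-axis of the contour plot
vowel_positions = [0.2, 0.5, 0.8]  # y-axis: front, central, back vowels

for label, degree in (("no coarticulation", 0.0), ("maximal coarticulation", 1.0)):
    print(label)
    for v in vowel_positions:
        row = [round(consonant_position(v, f, degree), 2) for f in fluency_scores]
        print(f"  vowel at {v}: {row}")
```

Each printed row is constant across fluency scores, which corresponds to the parallel color blocks in the guide; in the first case the rows are also identical across vowels, and in the second they equal the vowel position.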

Coming back to the empirical contour plots, note that neither of those two extremes is observed. The colors in the three panels vary as a function of reading fluency (the color blocks are not parallel), and the variation is consonant specific. Altogether, the color and contour line patterns depict a negative correlation between children’s reading fluency and their coarticulation degree in all three consonantal contexts. Children with good reading fluency exhibited lower degrees of intersegmental coarticulation and higher degrees of gesture differentiation for individual phonemes than poorer readers. This is illustrated by the “fan-like” structure of the colors and the contour lines. More proficient readers (e.g., a score of 1) exhibited more central tongue positions (more shades of light gray) prior to the production of both front and back vowels than poor readers (e.g., a score of 0.2). Less proficient readers produced consonants with more anterior tongue positions in syllables including anterior vowels (darker shades of gray), and more posterior tongue positions (shades of red) in anticipation of posterior vowels.
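The fan-like pattern can be mimicked with a toy surface in which the influence of the vowel on the consonant shrinks as the reading score grows. This is only an illustrative sketch under invented assumptions: the linear weighting in `toy_consonant_position`, the central position of 0.5 and the helper `coarticulation_slope` are hypothetical and do not reproduce the fitted GAM.

```python
def toy_consonant_position(vowel_pos, reading_score, center=0.5):
    """Hypothetical fan: vowel influence fades linearly with reading score."""
    vowel_weight = 1.0 - 0.6 * reading_score   # invented weighting, illustration only
    return center + vowel_weight * (vowel_pos - center)


def coarticulation_slope(reading_score, vowel_positions):
    """Least-squares slope of consonant position on vowel position.

    A slope near 1 means the consonant follows the vowel (high coarticulation);
    a slope near 0 means a vowel-independent consonant position.
    """
    xs = vowel_positions
    ys = [toy_consonant_position(v, reading_score) for v in xs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


vowels = [0.2, 0.35, 0.5, 0.65, 0.8]             # front to back vowel positions
poor_reader = coarticulation_slope(0.2, vowels)  # low reading score
good_reader = coarticulation_slope(1.0, vowels)  # high reading score
print(round(poor_reader, 2), round(good_reader, 2))
```

In this toy setting the slope is larger for the low reading score than for the high one, which is the numerical counterpart of contour lines that spread apart on the left of a panel and converge toward central positions on the right.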

To understand the interaction between children’s reading fluency and their coarticulation patterns, consider the case of the velar consonant /g/ (right-most plot). In syllables including back vowels such as /u, o/ (e.g., values > 0.7 on the y-axis), children with low reading scores (x-axis values < 1) exhibit more posterior tongue positions during the preceding consonant (0.7 on the contour line and darker shades of red), thus showing more anticipation of upcoming back vowels. More proficient readers (x-axis values > 1) instead exhibit a different behavior: they produce the velar /g/ with a more central (palatal) position (< 0.7 on the x-axis and lighter shades of red). For vowels associated with anterior tongue positions (e.g., /i, y, e/), similar patterns of greater vocalic anticipation are observed for poorer readers in comparison to more proficient readers, who exhibit more central, vowel-independent tongue positions. In syllables including the alveolar consonant /d/, which imposes strong articulatory demands on the tongue, an interaction between reading fluency and coarticulation was also noticeable (central plot). Posterior tongue body positions (illustrated by red shades) were found for back vowels in children with low reading fluency scores (value < 0.8 on the x-axis), which reflects children’s substantial anticipation of back vowels. This pattern was not observed in children with higher reading scores, who instead produced the alveolar consonant /d/ with an expected alveolar-like fronted tongue position independently of the subsequent vowel (no red shades for scores > 0.9 on the x-axis). The bilabial consonant /b/ (left-most plot) showed the least amount of coarticulatory variation based on reading fluency scores. This is apparent from the more regular coloring (black – gray – red) across reading fluency scores (left to right on the x-axis).
The differences in coarticulation degree based on consonant type corroborate the patterns identified beforehand. Coarticulation degree varies depending on the speech articulators involved in the production of the consonant (CD/b/ > CD/g/ > CD/d/). Table 2 provides the statistical output of the GAM testing for an interaction between coarticulation degree and reading fluency across our three consonantal contexts. All patterns differ significantly from zero (p < 2e-16), which suggests that the two skills interact independently of phonetic context. Furthermore, the interaction patterns in each consonantal context are non-linear. This is indicated by the effective degrees of freedom (edf) greater than 1 associated with the tensor smooths. The highest degree of non-linearity is found for the alveolar consonant /d/ (edf = 11.82) and the lowest for the velar /g/ (edf = 5.7). The non-linearity of the patterns indicates that a given increase in reading proficiency does not translate into a linear decrease in coarticulation degree. Instead, the interaction is more complex, reflecting a more staggered development.
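What a non-linear relationship means in practice can be shown with a toy example. The sketch below does not compute edf values (those are produced by the GAM-fitting software); it only contrasts a hypothetical linear decrease in coarticulation degree with a hypothetical staggered, S-shaped decrease. Both functions and all their parameters are invented for illustration and are not estimates from our data.

```python
import math


def linear_cd(score):
    """Hypothetical linear decrease in coarticulation degree (CD)."""
    return 0.9 - 0.5 * score


def staggered_cd(score, midpoint=0.6, steepness=12.0):
    """Hypothetical S-shaped decrease: little change, then a rapid drop."""
    return 0.4 + 0.5 / (1.0 + math.exp(steepness * (score - midpoint)))


def steps(cd, scores):
    """Change in CD between consecutive, equally spaced reading scores."""
    return [cd(b) - cd(a) for a, b in zip(scores, scores[1:])]


scores = [0.2, 0.4, 0.6, 0.8, 1.0]
linear_steps = steps(linear_cd, scores)
staggered_steps = steps(staggered_cd, scores)

print([round(d, 3) for d in linear_steps])     # the same decrease at every step
print([round(d, 3) for d in staggered_steps])  # small, large, large, small decreases
```

Under the linear toy function, each equal gain in reading score yields the same drop in coarticulation degree; under the staggered one, the drop is concentrated in the middle of the score range, which is the kind of pattern a smooth with edf well above 1 can capture.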

Table 2. Generalized additive model tensor smooth terms testing the interaction between the reading fluency score and coarticulation degree for different consonantal contexts. The degree of non-linearity is indicated by effective degrees of freedom (edf) values higher than 1.

Discussion

The overarching goal of this study was to test whether spoken language fluency interacts with reading fluency at the end of the first year of reading instruction in primary school. Combining children’s kinematic measures with their reading assessments, we made three important findings.

First, we found that both reading proficiency and coarticulatory patterns differed greatly across children, despite the limited age span. Second, individual variability in reading proficiency was correlated with individual variability in coarticulatory patterns, which suggests that the two skills may not develop independently but in relation to one another over the course of development. Last, this correlation was negative: beginning readers with better reading fluency exhibited lower degrees of intersegmental coarticulation. This result provides evidence of a relationship between reading and speech fluency in German beginning readers. Below we expand on the implications of these findings, which we then frame within an integrative-interactive approach to spoken language development.

Variability in children’s reading proficiency and coarticulatory patterns

Despite all 32 children in this study being of similar age and receiving similar reading training in school, reading levels varied significantly across children. This result corroborates previous findings drawn from German beginning readers (e.g., Fricke et al., 2016, 2008), and converges with reports of individual variability in lexical, phonological and speech motor domains (e.g., Edwards et al., 2011; Grigos, 2009; Smith & Goffman, 1998; Sosa & Stoel-Gammon, 2012).

In addition, children’s reading proficiency was task dependent. Twenty-six children were faster reading real-words than non-words while 5 children presented the inverse pattern. One child performed similarly in both tasks. Interestingly, reading real-words faster than non-words was not systematically associated with greater reading accuracy for those words. Among the 26 fast real-word readers, 8 produced more mistakes for real-words than for non-words, while 14 produced more mistakes for non-words. Four children made the same number of mistakes for real-words and non-words. These findings indicate that reading proficiency is not strictly uniform across children of similar age and instruction period. Yet, overall, when accuracy and speed are considered together, children read real-words more fluently than non-words. This suggests that at the end of the first year in primary school, children are not yet proficient enough decoders to perform equally well on non-words and real-words. This is the case even though the non-words included in our standard reading test had a relatively simple trisyllabic CVCVCV structure (with no complex phonotactics involving consonant clusters). Several explanations are possible. First, children may benefit from the facilitating effect of lexical representations when reading real-words (e.g., Cychosz et al., 2021a; Munson, Edwards et al., 2005). Reading unfamiliar non-words, by contrast, is cognitively more demanding; it taps into children’s ability to sequentially decode phonemic components. Second, as mentioned in the introduction, both reading aloud and speaking require speakers to elaborate both a phonological and a speech motor plan (see, for instance, the referential planning model of Levelt & Wheeldon, 1994; Whalen, 1990).
Because children at the beginning of primary school are not fluent readers yet, it is conceivable that their ability to plan speech motor gestures in anticipation of upcoming phonemic targets is not yet as efficient and automatized as in adults (Barbier et al., 2015; Rubertus & Noiray, 2020; Rubertus et al., 2018). This is particularly relevant in the case of non-word reading. First, speech motor plans may be computed on the go compared to well-known words belonging to children’s mental lexicon. Second, the execution of speech motor gestures for non-words is less practiced than for well-practiced real-words, hence potentially resulting in longer production times and/or more inaccuracy. Last, differences between children may also reflect discrepancies in individual reading practice and/or literacy strategies used in school and at home (Goulandris, 2003; Wimmer et al., 2000). To summarize, the variability observed in our sample of 32 children suggests that the development of reading acquisition is idiosyncratic.

Substantial variability was also found in children’s coarticulatory patterns across phonetic contexts. Coarticulation degree was highest in syllables including the onset bilabial consonant /b/, which does not require a primary gesture from the tongue body articulator. In this context, tongue positions for the subsequent vowel can be anticipated during the production of the labial consonant. Syllables including the alveolar /d/ consonant were instead associated with the least degree of vocalic anticipation. Because the tongue body contributes to the necessary front position of the tongue tip to achieve the alveolar constriction target, it cannot simultaneously be recruited to anticipate subsequent vocalic positions. Last, an intermediate degree of coarticulation was noted in the velar /g/ context, which can be produced with relative articulatory flexibility (e.g., C. A. Fowler & Brancazio, 2000) without altering its intelligibility.

Coarticulatory gradients have repeatedly been observed in adults (e.g., Abakarova et al., 2018; Noiray et al., 2018, 2019a; Recasens, 2018). In children, context-specific coarticulatory organizations seem to develop gradually (e.g., Goffman & Smith, 1999; Walsh & Smith, 2002). Interestingly, this process aligns well with children’s developing speech perception (e.g., Kolozsvári et al., 2021; Krüger & Noiray, to appear; Mayo et al., 2003; Mayo & Turk, 2004; Nijland et al., 2002; Nittrouer & Miller, 1997; Noiray et al., 2019b). For instance, Nittrouer (1996) found an effect of phonemic awareness in 7-to-8-year-old children’s perceptual processing of phonetic detail. This finding was later substantiated in Mayo and colleagues’ longitudinal investigation of 5-to-6-year-old children (C. J. Mayo, 1999; Mayo et al., 2003). They argued for a developmental change in the processing of acoustic cues conveyed in the speech signal, from sensitivity to global cues (tailored to large chunks – e.g., syllables), to greater awareness of phonemic units (e.g., “Developmental Weighting Shift,” Nittrouer et al., 1993, 1996). Note, however, that different individual profiles were identified in both Nittrouer’s and Mayo’s studies, whereby children’s perceptual weighting of phonemic cues did not systematically go hand in hand with more advanced phonemic awareness. This suggests that perceptual abilities – like reading and coarticulatory organization – develop non-uniformly across children.

To summarize, the period encompassing preschool to primary school entails important changes in children’s perceptual, phonological and production abilities, which result in substantial individual variability. Thus, if anything, future research should include larger samples of children and in-depth assessments of environmental (e.g., socio-economic situation), experiential (e.g., reading practices at home and in school) and language-related factors (e.g., phonological awareness, vocabulary growth), to better understand individual reading acquisition trajectories, beyond averaged behavior.

Directionality of the speech and reading interaction

The main motivation for examining a relationship between speech production and reading aloud resulted from the observation that in German the relation between graphemic and phonemic units is fairly consistent. Hence, we assumed that reading acquisition may come to interact with spoken language acquisition at the time beginning readers develop correspondences between graphemes, their associated phonemes and articulatory-acoustic expression. This assumption was validated. However, reading fluency involves more complex relationships than speech (e.g., Caudrelier et al., 2019; Mayo et al., 2003; Noiray et al., 2019b; Saletta, 2019). Figure 7 provides a conceptual illustration of those differences.

Figure 7. Visual conceptualization of some of the main processes underlying the acquisition of spoken and reading fluency. Their interaction is represented by the dotted double arrow; the question marks refer to possible causal directions.


In its initial phase, two main types of experience seem to contribute to the development of spoken language: 1) audiovisual exposure to language (review in Danielson et al., 2017; Kuhl, 2011; Lewkowicz et al., 2015), and 2) speech motor practice (e.g., Goffman & Smith, 1999; Green et al., 2010; M. M. Vihman, 2017). During the first years of life, with increasing exposure to their native language, children organize their speech motor schemes around their developing lexical repertoire. Children may then progressively depart from linguistic organizations dominated by their small lexicon to develop more phonologically grounded representations (e.g., Ainsworth et al., 2016; Menn, 1983; M. M. Vihman, 2017). In this process, children would build correspondences between individual speech sounds and speech motor gestures allowing them to gradually produce an increasing number of new words.

In languages with consistent orthographies such as German, reading fluency, acquired subsequently, not only relies on knowledge of individual speech sounds and their gestural specificities; it also requires at least two additional types of knowledge, which are not directly available in children’s daily interactions. First, it necessitates explicit training in letter knowledge, that is, the ability to differentiate individual letters and associate them with distinct speech sounds (review in Hulme & Snowling, 2015), although letter knowledge alone is not sufficient (see Shankweiler & Fowler, 2019). Second, it requires developing the awareness that various sized units of sound have a phonological structure (e.g., phonemes, syllables). As illustrated in Figure 7, reading acquisition therefore builds upon a triangular relationship between graphemes, phonemes and speech motor gestures. While we do not expect this relationship to be fully automatized by the end of the first year of primary school, it should be well initiated after a full year of explicit reading instruction, for children to be able to read words and sentences aloud.

Given that spoken language precedes reading on the developmental timeline, children may well rely on the former to build the triangular grapheme-to-phoneme-to-gesture correspondence, necessary to read fluently. But the inverse may also be true: a year of formal practice deciphering graphemes and building associations between graphemes, phonemes and speech motor gestures may impact children’s preexisting speech production patterns. By accommodating this new “particulate system” (Studdert-Kennedy, 1998), coarticulatory processes may evolve toward greater phonemic distinction of speech motor gestures. There is already evidence that reading acquisition enhances spoken language production (e.g., greater accuracy and speech movement stability: Saletta et al., 2016), its processing (e.g., segmental representations: Pattamadilok et al., 2010; Perre et al., 2009) and even affects lexical decisions (e.g., Pattamadilok et al., 2009). Drawing on that evidence, it is therefore not unreasonable to hypothesize that coarticulatory processes may attune to reading requirements and, in this process, change speech motor organization altogether.

Empirical investigations conducted with adults provide support for this view. For instance, literate adults are more proficient in non-word repetition tasks than illiterate adults (review in Kolinsky et al., 2012). Neuroimaging research (e.g., Carreiras et al., 2009; Dehaene et al., 2010) provides further compelling evidence underscoring the instrumental role of reading acquisition for spoken language fluency: “Literacy, whether acquired in childhood or through adult classes, enhances brain responses in at least three distinct ways. (…) literacy allows practically the entire left-hemispheric spoken language network to be activated by written sentences. Thus reading, a late cultural invention, approaches the efficiency of the human species’ most evolved communication channel, namely speech. Third, literacy refines spoken language processing by enhancing a phonological region, the planum temporale, and by making an orthographic code available in a top-down manner.” (Dehaene et al., 2010, p. 1364).

These findings do not invalidate the hypothesis that preexisting coarticulatory processes in casual speech may serve reading development. In fact, in languages supporting consistent orthographies, coarticulatory patterns practiced during the first 5 to 6 years of speech production may well facilitate the initial stage of reading acquisition. At first, children may read words based on the identification of the first syllable. Once associations with well-known lexemes are made, they may read words as holistic units (with high intersegmental coarticulation). If no word can be retrieved, children may instead use a spelling approach to decipher the written word. After reaching a certain level of letter knowledge, phonemic awareness and reading proficiency, their coarticulatory organization may in turn be influenced by reading skill and evolve to accommodate their native language’s “particulate” phonemic system. During this period, coarticulatory organization may change toward greater distinction of speech motor gestures for consecutive segments and lower intersegmental coarticulation degree. If this developmental scenario were to be empirically confirmed, it would have important implications for both reading and speech production theories. It would indeed show that a communicative skill (speech production) and a socially acquired skill (reading) are dynamically coupled, and that their individual developmental trajectory may be conditioned by the evolution of this relationship over time.

Yet, to understand the dynamics of this relationship, and in particular whether it involves unidirectional or instead bidirectional changes over time, longitudinal research is necessary. With such an approach, we will be able to evaluate to what extent speech-reading interactions develop differently across children, without an indication of atypicality. Furthermore, findings from longitudinal examinations may also contribute to building empirically grounded predictive models of speech or reading (dis)abilities that may originate from the initial connection between speech and reading.

An integrative-interactive approach to spoken language

Adding to existing evidence of individual variability, the present study has highlighted an important relationship between reading and spoken fluency in German beginning readers. It expands on previous research showing that greater phonemic awareness interacts with the refinement of children’s coarticulatory organization (Noiray et al., 2019b) and with the increasing ability to process phonetic cues (C. J. Mayo, 1999; Mayo et al., 2003). The observed individual patterns of skill interaction may reflect the integration of both reading and speech motor control in children’s broader language system.

Drawing on the increasing evidence of multiple interactions across domains (to cite a few examples: C. J. Mayo, 1999; Majorano et al., 2014; Noiray et al., 2019b; M. M. Vihman, 1996; Wang et al., 2021), we endorse an integrative-interactive approach to spoken language development (Noiray et al., 2019b). This approach is not novel; it aligns with other comparable theoretical views in developmental science (e.g., emergentist theory: Hirsh-Pasek et al., 2004; the Developmentally Sensitive Theory and Core model: Davis & Redford, 2019; Redford, 2015, 2019; Dynamical System theories of development: Thelen & Smith, 1994). In this approach (illustrated in Figure 8), spoken language fluency develops through the integration of various language-related skills (e.g., perceptual, lexical, phonological, speech motor), which are interdependent, their individual growth interacting dynamically over time. Their trajectories also converge toward more refined differentiations at the perceptual, representational and production level (e.g., greater sensitivity to phonetic cues, richer lexical repertoire, phonemic organization of speech motor gestures). Furthermore, there is much evidence in infant and child studies that linguistic exposure is a fundamental catalyst to this developmental process (e.g., DePaolis et al., 2013; Gervain, 2015; Mayo et al., 2003; Nittrouer et al., 1996; Nittrouer & Miller, 1996; Vihman & Wauquier, 2018). Not only does the quantity of input increase over time, its nature diversifies as well (e.g., from parents to multiple interlocutors, from home to school settings). This increasing and more diverse exposure to their native language provides children with essential material to develop their perceptual, lexical, phonological and speech motor abilities.
Regarding coarticulatory organization specifically, recent research has illuminated a facilitating effect of daily speech practice on intra-syllabic coarticulation degree (Cychosz et al., 2021b) in addition to vocabulary (Cychosz et al., 2021a; Noiray et al., 2019b).

Figure 8. Illustrative sketch of an integrative-interactive perspective to spoken language development.


To fully understand how the cognitive and motor domains come to interact dynamically over time, future research will need to conceptualize spoken language ontogenesis as an evolving dynamical system (e.g., Fogel & Thelen, 1987; Thelen & Smith, 1994). This theory must explain how children integrate various organizational schemes (e.g., speech gestures, syllables, words) and types of knowledge (e.g., perceptual, lexical, phonological) in their speech. More importantly, it must decipher how interactions across skills change over time (e.g., with varying developmental paces, whether staggered or continuous) as children gain new skills (e.g., reading) or consolidate existing ones. These research avenues are becoming increasingly relevant in developmental psycholinguistics (e.g., discussion in DePaolis et al., 2013; M. M. Vihman, 2017). In a new project, we have started investigating differences in coarticulatory organization in read as compared to repeated speech in first to third graders as well as adults (Rubertus & Popescu, 2020). We also test for an effect of phonemic awareness and reading proficiency on coarticulatory organization for both modalities.

Another promising research avenue would be to test illiterate children and adults. If coarticulatory organization changes in contact with reading acquisition, we would expect children and adults who have not received any reading instruction to exhibit similar degrees of coarticulation. It has already been reported that illiterate adults have difficulties manipulating segment-sized phonological units compared to their literate counterparts (Lukatela et al., 1995). If, however, the development of coarticulatory patterns is mostly influenced by speech motor control maturation, we would expect illiterate children to exhibit similar coarticulation degrees to those of age-matched children with reading proficiency. Likewise, illiterate adults should exhibit similar coarticulatory patterns to those of proficient adult readers. Another extension of this research could address the relation between coarticulatory organization, phonemic awareness and reading proficiency in speakers of non-alphabetic languages (e.g., Cherokee, Tamil, Chinese). In Chinese, whose writing system represents syllables, we might expect coarticulation patterns to reflect syllable-sized phonology, that is, greater intrasyllabic coarticulation. However, investigations into non-alphabetic languages should account for the possible familiarization with alphabetic spelling (e.g., pinyin in Chinese, alphabetic second language), which would confound the results (Read et al., 1986).

Conclusion

In summary, we found that reading proficiency correlates with coarticulatory patterns in first grade children. Drawing upon this evidence, as well as previous research, we propose an integrative-interactive approach to account for the development of spoken language fluency. The gradual integration of various co-developing skills (e.g., lexicon, phonology, reading) and their dynamical interaction over time should provide a unifying account of spoken language development, as an alternative to only considering maturational factors (e.g., neuroanatomical development). This approach also implies moving away from normative, average-driven analyses of speech development and instead considering individual variability as reflecting idiosyncratic interactions between skills. If the finding of a tight developmental interaction between reading and spoken fluency were to be extended, it would have important implications for advancing our understanding of skill interactions in typical development and potentially predicting speech and language disorders (e.g., developmental dyslexia).

Acknowledgments

We thank students at LOLA lab in Potsdam for their valuable assistance in the recording and processing of the data collected. We are also grateful to Carol Fowler for stimulating discussion, and to the two anonymous scholars and editor who have provided very useful feedback on previous versions of this manuscript. Special thanks to all the children (and their parents) who participated in the study.

Data availability

If accepted for publication to Language Learning and Development, the dataset used for this article can be made publicly available on an online platform.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This research was generously supported by the Deutsche Forschungsgemeinschaft (DFG) grant N° 255676067 and 1098 and PredictAble (Marie Skłodowska-Curie Actions, H2020-MSCA-ITN-2014, N° 641858).

References