Abstract
The type of sublexical correspondences employed during non-word reading has been a matter of considerable debate in the past decades of reading research. Non-words may be read either via small units (graphemes) or large units (orthographic bodies). In addition, grapheme-to-phoneme conversion may involve context-sensitive correspondences, such as pronouncing an “a” as /ɔ/ when it is preceded by a “w”. Here, we use an optimisation procedure to explore the reliance on these three types of correspondences in non-word reading. In Experiment 1, we use vowel length in German to show that all three sublexical correspondences are necessary and sufficient to predict the participants' responses. We then quantify the degree to which each correspondence is used. In Experiment 2, we present a similar analysis in English, which has a more complex orthographic system.
Keywords:
We are grateful to Angela Heine and Tila Brink for collecting the Berlin data for Experiment 1B. We also thank Petra Schienmann and Reinhold Kliegl for their help with organising data collection at Potsdam University for Experiment 1B. We thank Johannes Ziegler for providing a list of German consistent words. Further thanks are due to Stephen Lupker, James Adelman and three anonymous reviewers for their helpful comments on earlier versions of this paper.
This article was written as part of XS's doctoral dissertation under supervision of EM, AC and MC. SR conducted the data analyses and contributed to the write-up and revision of the manuscript, and SP scored the English data.
Notes
1 It is not always true that graphemes are smaller (i.e., contain fewer letters) than bodies; for example, the grapheme “igh” is larger than the body of the word “cat” (“-at”). For the sake of clarity, we follow the terminology of Ziegler and Goswami (Citation2005) and refer to graphemes as small units and bodies as large units.
2 There are some differences associated with dialects. Here, we use the pronunciations given by the DRC's vocabulary and the Macquarie Essential Dictionary (5th edition) as representative of Australian English, and the IPA as illustrated by Cox and Palethorpe (Citation2007).
3 Although we refer to the reliance on different types of correspondences as a “strategy”, we do not mean to imply that readers consciously choose the type of correspondence that maximises the chance of correctly reading an unfamiliar word, in a way that optimally fits the empirical data.
4 In standard linear regression, only one of these two formulae would be required, since they are entirely dependent (i.e., P(short) = 1 − P(long), etc.). In traditional regression, the only difference between the first and second equations would be the location of the estimated intercept and the sign of the slope. However, by removing the intercept term, our modelling strategy undermines this interdependence. Since the intercept is not free to vary (it is forced to be 0), the parameter estimates for P(short) would not match those for P(long). As a result, we must simultaneously fit both vowel pronunciations. Although it is useful to use the language of regression to describe some of the procedures, it is very important to remember that the βs here do not represent regression slopes but weights. Also, if this were a regression problem, it would be more properly treated as a logistic regression problem. However, this would be incompatible with our interpretation of the weights as “the probability that a certain strategy is adopted”.
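The dependence described in this note, and the way a zero intercept breaks it, can be illustrated numerically. The following is a minimal sketch with simulated data (not the study's materials or fitting procedure): with an intercept, regressing P(long) and P(short) on the same predictors yields slopes that are exact sign-flips of each other; with the intercept forced to 0, the two sets of estimates no longer mirror one another, so both must be fitted.

```python
import numpy as np

# Hypothetical data: 20 non-words, 3 predictors standing in for the
# predictions of three correspondence types, and observed P(long).
rng = np.random.default_rng(0)
X = rng.random((20, 3))
p_long = rng.random(20)
p_short = 1 - p_long

# With an intercept column, the two fits are interdependent:
# the slopes for P(short) are exactly the negated slopes for P(long).
Xi = np.column_stack([np.ones(20), X])
b_long, *_ = np.linalg.lstsq(Xi, p_long, rcond=None)
b_short, *_ = np.linalg.lstsq(Xi, p_short, rcond=None)
slopes_mirror = np.allclose(b_long[1:], -b_short[1:])   # True

# With the intercept forced to 0, this interdependence disappears:
# the weight estimates for P(short) are not recoverable from P(long).
w_long, *_ = np.linalg.lstsq(X, p_long, rcond=None)
w_short, *_ = np.linalg.lstsq(X, p_short, rcond=None)
weights_mirror = np.allclose(w_long, -w_short)          # False

print(slopes_mirror, weights_mirror)
```

This is why, in the note's terms, both vowel pronunciations must be fitted simultaneously once the intercept is fixed at 0.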
5 It is noteworthy that Perry, Ziegler, Braun, and Zorzi (Citation2010) report data with a similar set of non-words to the current study (although the study was conducted with different aims): the authors manipulated the number of consonants in the coda, but rather than controlling for the consistency of the base word, their non-words differed in terms of the existence of the body in real words: the body either occurred in real German words or it did not. In other words, they did not independently manipulate the predictions of BRCs and CSCs; the predictions of super-rules and body analogy were heavily correlated, as were the predictions of super-rules and GPCs. This means that Perry et al.'s data are unsuitable for our purposes: the analysis would be unreliable, as it is impossible to disentangle reliance on bodies versus super-rules and super-rules versus GPCs.