Abstract
We propose a quantitative operationalisation of the complexity of a writing system. This complexity, also referred to as orthographic depth, plays a crucial role in psycholinguistic modelling of reading aloud (and learning to read aloud) in several languages. We express the complexity of a writing system by two measures: the complexity of letter‐phoneme alignment and the complexity of grapheme‐phoneme correspondences. We present the alignment problem and the correspondence problem as tasks to three different data‐oriented learning algorithms, which are trained and tested on English, French and Dutch material. From the generalisation performance of these algorithms we derive, for each corpus, a two‐dimensional writing system complexity value.
Notes
Please address correspondence to: Antal van den Bosch, Department of Computer Science, University of Maastricht, PO Box 616, NL‐6200 MD Maastricht, The Netherlands, phone +31.43.882018, fax +31.43.252392, email: [email protected]
This research was partly supported by a grant from the Human Frontier of Science Programme, "Processing consequences of contrasting language phonologies". During this research, the first author was affiliated with the Institute for Language Technology and AI (ITK) at Tilburg University, with the Department of Psychology at Tilburg University, and with the Laboratoire de Psychologie Expérimentale at the Université Libre de Bruxelles. We would like to thank Terrence Sejnowski and Henk Kempff for making available the NETtalk corpus and the Dutch corpus, respectively. Thanks are also due to Jaap van den Herik for his valuable comments on the text.