Abstract
Metalinguistic awareness (awareness of the structure of the orthography) has been considered vital for reading acquisition. Recent research has found awareness of phonological regularity and consistency in advanced readers. Evidence from simplified Chinese suggests an effect of semantic transparency on reading in school‐age readers. Studies based on traditional Chinese have also reported that reading acquisition, including the development of metalinguistic awareness, is affected by script, by the properties of the characters in school curricula, and by the approaches and strategies used in reading training. This paper compares two corpora: one of simplified Chinese characters based on primary school textbooks, and the updated Hong Kong Corpus of Primary School Chinese (HKCPSC). The comparison covers the proportion of characters in the total curriculum, the ratio of phonetic‐semantic compounds, visual complexity (defined by the number of strokes), and the levels of phonetic regularity and semantic transparency of Chinese characters across grades. The two marked differences found are in the frequency‐weighted proportion of regular characters and in the proportion of semantically transparent characters across grades. The relationship between these data and recent findings on reading development in Chinese is discussed.
Notes
1. The term “frequency” here refers to frequency of occurrence, i.e. the number of times a character occurs in a given context (e.g. a database or a series of textbooks). For example, character X occurs 156 times while character Y occurs 3 times, so X is the more frequent of the two. There are different systems for classifying high‐, mid‐, and low‐frequency characters. In our system, the frequency values of characters are ranked in descending order. The top 33% of characters are classified as high frequency, the following 33% as mid frequency and the last 33% as low frequency. Characters which do not occur at all are classified as “unfamiliar”.
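The classification described in this note can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used to build the corpora: the function name `classify_by_frequency` and the `inventory` argument (the full set of characters under consideration) are hypothetical, the three bands are taken to be equal thirds of the ranked character list, and ties in frequency are broken by order of first occurrence, which the note does not specify.

```python
from collections import Counter


def classify_by_frequency(text, inventory):
    """Assign each character in `inventory` to a frequency band.

    Assumptions (not stated in the source): bands are equal thirds of
    the ranked list of occurring characters, and tied characters keep
    their order of first occurrence in `text`.
    """
    # Count occurrences of inventory characters only.
    counts = Counter(ch for ch in text if ch in inventory)

    # Rank characters by descending frequency. sorted() is stable and
    # Counter preserves insertion order, so ties keep first-seen order.
    ranked = sorted(counts, key=lambda ch: -counts[ch])
    n = len(ranked)

    bands = {}
    for rank, ch in enumerate(ranked):
        if rank * 3 < n:
            bands[ch] = "high"   # first third of the ranking
        elif rank * 3 < 2 * n:
            bands[ch] = "mid"    # second third
        else:
            bands[ch] = "low"    # final third

    # Characters that never occur in the text are "unfamiliar".
    for ch in inventory:
        bands.setdefault(ch, "unfamiliar")
    return bands


# Latin letters stand in for Chinese characters in this toy example.
bands = classify_by_frequency("aaaabbbccdef", "abcdefz")
```

In the toy example, six of the seven inventory characters occur, so each band holds two of them, and `z`, which never occurs, is labelled "unfamiliar". With real data the tie-breaking rule matters, because many low-frequency characters share the same count; a different rule would move characters across the band boundaries.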