Abstract
Megastudies with processing efficiency measures for thousands of words allow researchers to assess the quality of the word features they are using. In this article, we analyse reading aloud and lexical decision reaction times and accuracy rates for 2,336 words to assess the influence of subjective frequency and age of acquisition on performance. Specifically, we compare newly presented word frequency measures with the existing frequency norms of Kučera and Francis (1967), HAL (Burgess & Livesay, 1998), Brysbaert and New (2009), and Zeno, Ivens, Millard, and Duvvuri (1995). We show that the Kučera and Francis word frequency measure accounts for much less variance than the other word frequencies, which leaves more variance to be “explained” by familiarity ratings and age-of-acquisition ratings. We argue that subjective frequency ratings are no longer needed if researchers have good objective word frequency counts. The effect of age of acquisition remains significant and has an effect size that is of practical relevance, although it is substantially smaller than that of the first phoneme in naming and the objective word frequency in lexical decision. Thus, our results suggest that models of word processing need to utilize these recently developed frequency estimates during training or setting baseline activation levels in the lexicon.
Acknowledgments
The authors thank Melvin Yap and James Adelman for kindly giving access to their frequency ranks. They also thank the reviewers, Harald Baayen and Wayne Murray, for many helpful comments.
Notes
1 The results remained the same when restricted cubic splines were used instead of second-degree polynomials (there were only differences in the third digit of the percentages of variance accounted for).
2 The same results are obtained when the inverse values of the reaction time (RT) means are used or when the raw RT means are used.
3 To make sure that we gave frequency ranks the same priority as raw frequencies, we additionally checked the frequency ranks based on Kučera and Francis (1967), Zeno et al., Celex, and the British National Corpus, collected by Adelman and Brown (2008). A further advantage of these frequency ranks is that they have been corrected for words unlikely to be known to university students. The correlations between the new frequency ranks and the dependent variables were always lower than those between CORA and the dependent variables.
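
As an illustration of the kind of comparison mentioned in Note 1, the following is a minimal sketch of how the percentage of variance accounted for by a second-degree polynomial of a frequency predictor might be computed. It is not the authors' analysis code: the function name `r_squared_polynomial`, the variable names, and the synthetic data are all assumptions introduced here for illustration, and no value it produces corresponds to a result of the study.

```python
import numpy as np


def r_squared_polynomial(x, y, degree=2):
    """Fit y on a polynomial in x by ordinary least squares and return R^2.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with columns 1, x, ..., x**degree.
    design = np.vander(x, degree + 1, increasing=True)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


# Synthetic, made-up data (assumption): a curved relation between a
# log-frequency-like predictor and a reaction-time-like outcome.
rng = np.random.default_rng(0)
log_freq = rng.uniform(0.0, 5.0, size=500)
rt = 800.0 - 60.0 * log_freq + 6.0 * log_freq**2 + rng.normal(0.0, 30.0, size=500)

r2_linear = r_squared_polynomial(log_freq, rt, degree=1)
r2_quadratic = r_squared_polynomial(log_freq, rt, degree=2)
print(f"linear: {100 * r2_linear:.1f}%  quadratic: {100 * r2_quadratic:.1f}%")
```

Because the linear model is nested in the quadratic one, the quadratic fit can never account for less variance on the same data. Comparing the two percentages shows how much the curvature term adds, and the same function could be applied to each frequency measure in turn to compare them.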