ABSTRACT
Modern vocabulary size tests are generally based on the notion that the more frequent a word is in a language, the more likely a learner is to know that word. However, this assumption has seldom been questioned in the literature on vocabulary size tests. Using the Vocabulary of American-English Size Test (VAST), which is based on the Corpus of Contemporary American English (COCA), 403 English language learners were tested on a 10% systematic random sample of the 5,000 most frequent words in that corpus. The Pearson correlation between Rasch item difficulty (the probability that test-takers will know a word) and frequency was only r = 0.50 (r² = 0.25). This moderate correlation indicates that the frequency of a word can predict which words are known with only a limited degree of accuracy and that other factors also affect the order in which vocabulary is acquired. Additionally, the use of vocabulary levels/bands of 1,000 words as part of the structure of vocabulary size tests is shown to be questionable. These findings call into question the construct validity of modern vocabulary size tests. However, future confirmatory research is necessary to determine comprehensively the degree to which word frequency and learners' vocabulary size are related.
Acknowledgements
I would like to thank Dan Dewey, who advised me extensively on this project, as well as the rest of my thesis committee: Dee Gardner, Troy Cox, and Mark Davies. Thank you also to Jesse Egbert, Soo Jung Youn, Luke Plonsky, and the editors and reviewers for their incredible feedback.
Disclosure statement
No potential conflict of interest was reported by the author.
Notes
1 In this paper, the term ‘frequency’ refers to corpus frequency unless otherwise indicated.