Original Articles

An “Unabridged” Word-Frequency Count of American English: A Proposal for an Integrated System

Pages 294-302 | Published online: 16 Jun 2015

  • The bibliography of books and articles relating to word-frequency study is small: in John W. Black and Marian Ausherman, The Vocabulary of College Students in Classroom Speeches (Columbus, Ohio, 1955), only 30 bibliographical items are listed, and, while a number of these pertain to word-frequency, most are concerned with other issues or treat word-frequency only cursorily. [Addendum, December, 1966: In a paper delivered to the English 13 Group of the Modern Language Association, New York, December 27, 1966, William Card cited the following as his chief sources: Brown University Standard Corpus, 1961, ca. 1,000,000 running words; Godfrey Dewey's count, spring of 1918, ca. 100,000; Horn's Basic Writing Vocabulary, 1923 and ante, ca. 5,000,000; Rinsland's count of elementary schoolchildren's writing, 1937, ca. 6,000,000; Thorndike's Teacher's Word Book, ante 1921, ca. 4,500,000; Thorndike-Lorge combined count, 1943 and ante, ca. 18,000,000.]
  • Charles H. Voelker, “The One-Thousand Most Frequent Spoken-Words,” Quarterly Journal of Speech, XXVIII (1942), 193.
  • Edward L. Thorndike and Irving Lorge, The Teacher's Word Book of 30,000 Words (New York, 1944).
  • For the purposes of this article, word and term are used interchangeably. Both mean anything the investigator wishes, for, once text has been introduced, it can be sorted out on the basis of the subsets of characters between spaces or a group of such subsets. Context refers to a set of such subsets, which set can be as large as the investigator wishes to make it.
  • Black and Ausherman, pp. 1–3.
  • For the purpose of this study, it is assumed that the circulation of a periodical is the total number of individuals who read every word of every article of every issue. Although we know that such a condition does not, in fact, exist, still the statistics for any one issue, for any one article, for any one word, and for any one reader will be proportional, and the percentage of error can therefore be dismissed as insignificant.
  • For his help in constructing this formula, the author is indebted to Dr. Theodore Newman of the Arma Division, American Bosch Company, Inc. Readers will undoubtedly recognize the basic and significant similarity of this formula to the frequency formula:
  • At the time of the original computation of this article, the circulation of the Reader's Digest was about 15,000,000 per month; that of Word (published three times a year), about 2,000 per issue. For the purposes of analysis these figures will suffice.
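The note above on word and term defines both as the subsets of characters between spaces, or a group of such subsets. A minimal sketch of that sorting step follows, in Python. The function names (`tokens`, `terms`, `count_terms`), the `size` parameter, and the sample sentence are assumptions made for illustration and do not come from the article; the sketch also splits on any whitespace rather than on the space character alone, and it does not attempt the circulation weighting mentioned in the later notes, since the formula itself is not reproduced here.

```python
from collections import Counter


def tokens(text):
    """Split text into the character subsets that lie between whitespace."""
    # str.split() with no argument splits on any run of whitespace,
    # which is slightly broader than "between spaces" (an assumption).
    return text.split()


def terms(text, size=1):
    """Return every run of `size` adjacent tokens, joined by single spaces.

    size=1 gives single words; size=2 gives two-word groups, and so on.
    """
    toks = tokens(text)
    return [" ".join(toks[i:i + size]) for i in range(len(toks) - size + 1)]


def count_terms(text, size=1):
    """Count how often each term of the given size occurs in the text."""
    return Counter(terms(text, size))


# Hypothetical sample text, not taken from the article.
sample = "the cat saw the dog and the cat ran"
single = count_terms(sample)
pairs = count_terms(sample, size=2)

print(single["the"])     # 3
print(pairs["the cat"])  # 2
```

Treating a term as a run of adjacent tokens keeps one counting routine for both single words and word groups; an investigator who wanted a different definition of a term would only need to change `terms`.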
