Journal of Quantitative Linguistics, Volume 22, 2015, Issue 1

Type-token models: a comparative study

David Mitchell
Emyvale Medical Centre, Emyvale Monaghan, Republic of Ireland

Pages 1-21 | Published online: 17 Dec 2014

https://doi.org/10.1080/09296174.2014.974456

References

Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716–723. https://doi.org/10.1109/TAC.1974.1100705
Alföldi, A. (1966). Early Rome and the Latins. Michigan: University of Michigan Press.
Box, G., & Pierce, D. A. (1970). Distribution of residual autocorrelations in autoregressive-integrated moving average time series models. Journal of the American Statistical Association, 65, 1509–1526. https://doi.org/10.1080/01621459.1970.10481180
Brainerd, B. (1972). On the relation between types and tokens in literary text. Journal of Applied Probability, 9, 507–518. https://doi.org/10.2307/3212322
Brainerd, B. (1981). Some elaborations upon Gani's model for the type-token relationship. Journal of Applied Probability, 18, 452–460. https://doi.org/10.2307/3213291
Breusch, T., & Pagan, A. R. (1979). Simple test for heteroscedasticity and random coefficient variation. Econometrica, 47, 1287–1294. https://doi.org/10.2307/1911963
Brunet, E. (1978). Vocabulaire de Jean Giraudoux: Structure et évolution. Geneva: Slatkine.
Burnham, K., & Anderson, D. R. (1998). Model selection and inference: A practical information-theoretical approach. New York, NY: Springer-Verlag. https://doi.org/10.1007/978-1-4757-2917-7
Carlos, A., & Sanchez, P. (1997). Predictability of word forms (types) and lemmas in linguistic corpora. A case study based on the analysis of the CUMBRE corpus: An 8-million-word corpus of contemporary Spanish. International Journal of Corpus Linguistics, 2, 259–280.
Carroll, J. (1939). Diversity of vocabulary and the harmonic series law of word-frequency distribution. Psychological Record, 2, 377–386.
Carroll, J. (1964). Language and thought. Englewood Cliffs, NJ: Prentice-Hall.
Čelakovský, F. (1853). Ctení o srovnavací mluvnici slovanské na Universitě pražskě. Prague: V komisí u F. Řivnáče.
Chao, M. (1992). From animal trapping to type-token. Statistica Sinica, 2, 189–201.
Chotlos, J. (1944). Studies in language behaviour: IV. A statistical and comparative analysis of individual written language samples. Psychological Monographs, 56, 75–111. https://doi.org/10.1037/h0093511
Condon, E. (1928). Statistics of vocabulary. Science, 67, 300. https://doi.org/10.1126/science.67.1733.300
Cornell, T. (1995). The beginnings of Rome. London, New York: Routledge.
Covington, M., & McFall, J. D. (2010). Cutting the Gordian Knot: The moving average type-token ratio (MATTR). Journal of Quantitative Linguistics, 17, 94–100. https://doi.org/10.1080/09296171003643098
Crossley, S., Salsbury, T., & McNamara, D. S. (2010). The development of semantic relations in second language speakers: A case for Latent Semantic Analysis. Vigo International Journal of Applied Linguistics, 7, 55–74.
Daller, H., van Hout, R., & Treffers-Daller, J. (2003). Lexical richness in the spontaneous speech of bilinguals. Applied Linguistics, 24, 197–222. https://doi.org/10.1093/applin/24.2.197
Dugast, D. (1979). Vocabulaire et stylistique. Théâtre et dialogue. Travaux de linguistique quantitative. Geneva: Slatkine-Champion.
Ejiri, K., & Smith, A. E. (1993). Proposal of a new “constraint measure” for text. Paper presented at the Proceedings of the First International Conference on Quantitative Linguistics, Qualico.
Estoup, J. (1916). Gammes sténographiques. Technical report. Paris: Institut Sténographique de France.
Fox, J., & Weisberg, S. (2011). An R companion to applied regression (2nd ed.). California: Sage.
Grothendieck, G. (2013). nls2: Non-linear regression with brute force. R package version, 2.
Guiraud, P. (1954). Les caractères statistiques du vocabulaire. Paris: Presses Universitaires de France.
Ha, L., Stewart, D. W., Hanna, P., & Smith, F. J. (2006). Zipf and type-token rules for the English, Spanish, Irish and Latin languages. Web Journal of Formal, Computational and Cognitive Linguistics, 1(8), 1–12.
Herdan, G. (1960). Type-token mathematics. The Hague: Mouton.
Hidaka, S. (2009). A sample-size-invariant estimation of lexical diversity. Paper presented at the 31st Annual Meeting of the Cognitive Science Society, Amsterdam.
Hidaka, S. (2013). General type token distribution. Biometrika, 101(4), 999–1002.
Johnson, W. (1939). Language and speech hygiene. Chicago, IL: Institute of General Semantics.
Johnson, W. (1944). Studies in language behavior: A program of research. Psychological Monographs, 56, 1–15.
Kelih, E. (2010). The type-token relationship in Slavic parallel texts. Glottometrics, 20, 1–11.
Köhler, R., & Galle, M. (1993). Dynamic aspects of text characteristics. In: L. Hrebícek & G. Altmann (Eds.), Quantitative text analysis (pp. 46–53). Trier: Wissenschaftlicher Verlag Trier.
Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. Annals of Mathematical Statistics, 22, 79–86. https://doi.org/10.1214/aoms/1177729694
Kuraszkiewicz, W. (1951). Obocznosc -'ev- // -'ov- w dawnej polszczyznie i w dzisiejszych gwarach. Wroclaw: Nakladen Wroclawskiego Towarzystwa Naukowego.
Lavrenko, V. (2001). A mathematical model of vocabulary growth. Technical Report. Massachusetts: Center for Intelligent Information Retrieval.
Li, W., Miramontes, P., & Cocho, G. (2010). Fitting ranked linguistic data with two-parameter functions. Entropy, 12, 1743–1764. https://doi.org/10.3390/e12071743
Li, J., & Zhang, F. (2011). Inter-textual vocabulary growth patterns for marine engineering English. Beijing: Editorial office for contemporary foreign languages.
Liu, H., & Xu, C. (2012). Quantitative typological analysis of Romance languages. Poznan Studies Contemp Ling, 48, 97–625.
Lottner, C. (1858). Zeitschrift für vergleichende Sprachforschung auf dem Gebiete des Deutschen, Griechischen und Lateinischen. Ueber die Stellung der Italer innerhalb des indoeuropäischen Stammes, 7, 18–49.
Maas, H.-D. (1972). Zusammenhang zwischen Wortschatzumfang und Länge eines Textes. Zeitschrift für Literaturwissenschaft und Linguistik, 8, 73–79.
Mačutek, J., & Wimmer, G. (2013). Evaluating goodness-of-fit of discrete distribution models in quantitative linguistics. Journal of Quantitative Linguistics, 20, 227–240. https://doi.org/10.1080/09296174.2013.799912
Orlov, J. (1983). Ein Modell der Häufigkeitsstruktur des Vokabulars. In: H. Guiter & M. V. Arapov (Eds.), Studies on Zipf's Law (pp. 154–233). Bochum: Brockmeyer.
Panas, E. (2001). The Generalized Torquist: Specification and estimation of a new vocabulary-text size function. Journal of Quantitative Linguistics, 8, 233–252. https://doi.org/10.1076/jqul.8.3.233.4097
Pāṇini. (4th century BC). Ashtadhyayi.
Parry-Fielder, B., Collins, K., Fisher, J., Keir, E., Anderson, V., Jacobs, R., Scheffer, I. E., & Nolan, T. (2009). Electroencephalographic abnormalities during sleep in children with developmental speech-language disorders: A case-control study. Developmental Medicine & Child Neurology, 51, 228–234. https://doi.org/10.1111/dmcn.2009.51.issue-3
Pearson, K. (1897). On a form of spurious correlation that may arise when indices are used for the measurement of organs. Proceedings of the Royal Society of London, 60, 498.
Peirce, C. (1906). Prolegomena to an apology for pragmaticism. The Monist, 16, 492–546. https://doi.org/10.5840/monist190616436
Pinheiro, J., Bates, D., Deb Roy, S., Sarkar, D., & the R Development Core Team. (2013). nlme: Linear and nonlinear mixed effects models. R package version 3.1-111.
Plato. (c. 360 BC). Politeia.
Popescu, I.-I., Altmann, G., Grzybek, P., Jayaram, B. D., Köhler, R., Krupa, V., Macutek, J., Mehler, A., Pustet, R., Uhlírová, L., & Vidya, M. N. (2009). Word frequency studies. Berlin, New York: de Gruyter.
Ringe, D., Warnow, T., & Taylor, A. (2002). Indo-European and computational cladistics. Transactions of the Philological Society, 100, 59–129. https://doi.org/10.1111/trps.2002.100.issue-1
Salsbury, T., Crossley, S. A., & McNamara, D. S. (2011). Psycholinguistic word information in second language oral discourse. Second Language Research, 27, 343–360. https://doi.org/10.1177/0267658310395851
Sankoff, D., & Lessard, R. (1975). Vocabulary richness: A sociolinguistic analysis. Science, 190, 689–690. https://doi.org/10.1126/science.190.4215.689
Satorra, A., & Bentler, E. M. (1988). Scaling corrections for chi-square statistics in covariance structure analysis. Paper presented at the American Statistical Association Proceedings: Business and Economic Statistics Section, Alexandria, VA.
Schleicher, A. (1853). Die ersten Spaltungen des Indogermanischen Urvolkes. Allgemeine Monatsschrift für Wissenschaft und Literatur, (August), 786–787.
Schmidt, M., & Lipson, H. (2009). Distilling free-form natural laws from experimental data. Science, 324(5923), 81–85. https://doi.org/10.1126/science.1165893
Schmidt, M., & Lipson, H. (2013). Eureqa.
Shapiro, S., & Wilk, M. B. (1965). An analysis of variance test for normality (complete samples). Biometrika, 52, 591–611. https://doi.org/10.1093/biomet/52.3-4.591
Sichel, H. (1986). Word frequency distributions and type-token characteristics. Mathematical Scientist, 11, 45–72.
Sigler, L. (2002). Fibonacci’s Liber Abaci, Leonardo Pisano’s book of calculations. New York, NY: Springer. https://doi.org/10.1007/978-1-4613-0079-3
Smith, F., & Devine, K. (1985). Storing and retrieving word phrases. Information Processing & Management, 21, 215–224. https://doi.org/10.1016/0306-4573(85)90106-2
Somers, H. (1959). Analyse mathématique du langage. Louvain: Nauwelaert.
Suetonius. (1st century AD). De grammaticis et rhetoribus.
Team, R. D. C. (2011). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
Templin, M. (1957). Certain language skills in children. Minneapolis, MN: University of Minnesota Press.
Teuffel, W. (1870). Geschichte der Römischen Literatur. Leipzig: CH Beck.
Thomson, G., & Thompson, J. R. (1915). Outline of a measure for the quantitative analysis of writing vocabularies. British Journal of Psychology, 8, 52–69.
Trissino, G. (1524). Epistola del Trissino de le lettere nuovamente aggiunte ne la lingua italiana: Vicentino.
Tuldava, J. (1974). O statisticeskoj strukture teksta. Sovetskaja pedagogika i škola, 9, 5–33.
Tuldava, J. (1980). A mathematical model of the vocabulary-text relation. Paper presented at the Proceedings of the 8th Conference on Computational Linguistics, Tokyo.
Tuldava, J. (Ed.). (1993). The statistical structure of a text and its readability. Trier: Wissenschaftlicher Verlag Trier.
Tuldava, J. (1995). On the relation between text length and vocabulary size. In: J. Tuldava (Ed.), Methods in quantitative linguistics (pp. 131–150). Trier: Wissenschaftlicher Verlag Trier.
Tuldava, J. (1996). The frequency spectrum of text and vocabulary. Journal of Quantitative Linguistics, 3, 38–50. https://doi.org/10.1080/09296179608590062
Tweedie, F., & Baayen, H. R. (1998). How variable may a constant be? Measures of lexical richness in perspective. Computers and the Humanities, 32, 323–352. https://doi.org/10.1023/A:1001749303137
Venables, W., & Ripley, B. D. (2002). Modern applied statistics with S (4th ed.). New York, NY: Springer. https://doi.org/10.1007/978-0-387-21706-2
Vulanovic, R., & Köhler, R. (2005). Syntax units and structures. In: R. Köhler, G. Altmann, & R. G. Piotrowski (Eds.), Quantitative linguistics: An international handbook (pp. 274–291). Berlin, New York: de Gruyter.
Wimmer, G. (2005). The type-token relation. In: R. Köhler, G. Altmann, & R. G. Piotrowski (Eds.), Quantitative linguistics: An international handbook (pp. 361–368). Berlin, New York: Walter de Gruyter.
Yāska. (6th century BC). Nirukta.
Yule, G. (1944). The statistical study of literary vocabulary. Cambridge: Cambridge University Press.
Zipf, G. (1932). Selective studies and the principle of relative frequency in language. Cambridge, MA: Harvard University Press. https://doi.org/10.4159/harvard.9780674434929
