Type-token models: a comparative study
