References
- Akaike, H. (1974). A new look at the statistical model identification. IEEE Trans Automatic Control, 19, 716–723.10.1109/TAC.1974.1100705
- Alföldi, A. (1966). Early Rome and the Latins. Michigan: University of Michigan Press.
- Box, G., & Pierce, D. A. (1970). Distribution of residual autocorrelations in autoregressive-integrated moving average time series models. Journal of the American Statistical Association, 65, 1509–1526.10.1080/01621459.1970.10481180
- Brainerd, B. (1972). On the relation between types and tokens in literary text. Journal of Applied Probability, 9, 507–518.10.2307/3212322
- Brainerd, B. (1981). Some elaborations upon Gani”s model for the type-token relationship. Journal of Applied Probability, 18, 452–460.10.2307/3213291
- Breusch, T., & Pagan, A. R. (1979). Simple test for heteroscedasticity and random coefficient variation. Econometrica, 47, 1287–1294.10.2307/1911963
- Brunet, E. (1978). Vocabulaire de Jean Giraudoux: Structure et évolution Geneva: Slatkine.
- Burnham, K., & Anderson, D. R. (1998). Model selection and inference: A practical information-theoretical approach New York, NY: Springer-Verlag.10.1007/978-1-4757-2917-7
- Carlos, A., & Sanchez, P. (1997). Predictability of word forms (types) and lemmas in linguistic corpora. A case study based on the analysis of the CUMBRE corpus: An 8-million-word corpus of contemporary Spanish. International Journal of Corpus. Linguistics, 2, 259–280.
- Carroll, J. (1939). Diversity of vocabulary and the harmonic series law of word-frequency distribution. Psychological Record, 2, 377–386.
- Carroll, J. (1964). Language and thought Englewood Cliffs, NJ: Prentice-Hall.
- Čelakovský, F. (1853). Ctení o srovnavací mluvnici slovanské na Universitě pražskě Prague: V komisí u F. Řivnáče.
- Chao, M. (1992). From animal trapping to type-token. Statistica Sinica, 2, 189–201.
- Chotlos, J. (1944). Studies in language behaviour: IV. A statistical and comparative analysis of individual written language samples. Psychological Monographs, 56, 75–111.10.1037/h0093511
- Condon, E. (1928). Statistics of vocabulary. Science, 67, 300.10.1126/science.67.1733.300
- Cornell, T. (1995). The beginnings of Rome London, New York: Routledge.
- Covington, M., & McFall, J. D. (2010). Cutting the Gordian Knot: The moving average type-token ratio (MATTR). Journal of Quantitative Linguistics, 17, 94–100.10.1080/09296171003643098
- Crossley, S., Salsbury, T., & McNamara, D. S. (2010). The development of semantic relations in second language speakers: A case for Latent Semantic Analysis. Vigo International Journal of Applied Linguistics, 7, 55–74.
- Daller, H., van Hout, R., & Treffers-Daller, J. (2003). Lexical richness in the spontaneous speech of bilinguals. Applied Linguistics, 24, 197–222.10.1093/applin/24.2.197
- Dugast, D. (1979). Vocabulaire et stylistique Théâtre et dialogue. Travaux de linguistique quantitative Geneva: Slatkine-Champion.
- Ejiri, K., & Smith, A. E. (1993). Proposal of a new “constraint measure” for text. Paper presented at the Proceedings of the First International Conference on Quantatitive Lingustics, Qualico.
- Estoup, J. (1916). Gammes sténographiques. Technical report Paris: Institut Sténographique de France.
- Fox, J., & Weisberg, S. (2011). An R companion to applied regression (2nd ed.) California: Sage.
- Grothendieck, G. (2013). nls2: Non-linear regression with brute force. R package version, 2.
- Guiraud, P. (1954). Les caractères statistiques du vocabulaire Paris: Presses Universitaires de France.
- Ha, L., Stewart, D. W., Hanna, P., & Smith, F. J. (2006). Zipf and type-token rules for the English, Spanish, Irish and Latin languages. Web Journal of Formal, Computational and Cognitive Linguistics, 1(8), 1–12.
- Herdan, G. (1960). Type-token mathematics The Hague: Mouton.
- Hidaka, S. (2009). A sample-size-invariant estimation of lexical diversity. Paper presented at the 31st Annual Meeting Cognitative Science Society, Amsterdam.
- Hidaka, S. (2013). General type token distribution. Biometrika, 101(4), 999–1002.
- Johnson, W. (1939). Language and speech hygiene Chicago, IL: Institute of General Sematics.
- Johnson, W. (1944). Studies in language behavior: A program of research. Psychological Monographs, 56, 1–15.
- Kelih, E. (2010). The type-token relationship in Slavic parallel texts. Glottometrics, 20, 1–11.
- Köhler, R., & Galle, M. (1993). Dynamic aspects of text characteristics. In: L. Hrebícek & G. Altmann (Eds.), Quantitative text analysis (pp. 46–53). Wissenschaftlicher Verlag Trier: Trier.
- Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. Annals of Mathematical Statistics, 22, 79–86.10.1214/aoms/1177729694
- Kuraszkiewicz, W. (1951). Obocznosc -”Ev- // -”Ov- W Dawnej Polszczyznie I W Dzisiejszych Gwarach Wroclaw: Nakladen Wroclawskiego Towarzystwa Naukowego.
- Lavrenko, V. (2001). A mathematical model of vocabulary growth Technical Report Massachusetts: Center for Intelligent Information Retrieval.
- Li, W., Miramontes, P., & Cocho, G. (2010). Fitting ranked linguistic data with two-parameter functions. Entropy, 12, 1743–1764.10.3390/e12071743
- Li, J., & Zhang, F. (2011). Inter-textual vocabulary growth patterns for marine engineering English Bejing: Editorial office for contemporary foreign languages.
- Liu, H., & Xu, C. (2012). Quantitative typological analysis of Romance languages. Poznan Studies Contemp Ling, 48, 97–625.
- Lottner, C. (1858). Zeitschrift für vergleichende Sprachforschung auf dem Gebiete des Deutschen, Griechischen und Lateinischen. Ueber die Stellung der Italer innerhalb des indoeuropäischen Stammes, 7, 18–49.
- Maas, H.-D. (1972). Zusammenhang zwischen Wortschatzumfang und Länge Eines Textes. Zeitschrift für Literaturwissenschaft und Linguistik, 8, 73–79.
- Mačutek, J., & Wimmer, G. (2013). Evaluating goodness-of-fit of discrete distribution models in quantitative linguistics. Journal of Quantitative Linguistics, 20, 227–240.10.1080/09296174.2013.799912
- Orlov, J. (1983). Ein Modell der Häufigkeitsstruktur des Vokabulars. In: H. Guiter & M. V. Arapov (Eds.), Studies on Zipf “s Law (pp. 154–233). Bochum: Brockmeyer.
- Panas, E. (2001). The Generalized Torquist: Specification and estimation of a new vocabulary-text size function. Journal of Quantitative Linguistics, 8, 233–252.10.1076/jqul.8.3.233.4097
- Pāṇini. ( 4th century BC). Ashtadhyayi.
- Parry-Fielder, B., Collins, K., Fisher, J., Keir, E., Anderson, V., Jacobs, R., Scheffer, I. E., & Nolan, T. (2009). Electroencephalographic abnormalities during sleep in children with developmental speech-language disorders: a case-control study. Developmental and Medical Child Neurology, 51, 228–234.10.1111/dmcn.2009.51.issue-3
- Pearson, K. (1897). On a form of spurious correlation that may arise when indices are used for the measurement of organs. Proceedings of the Royal Society of London, 60, 498.
- Peirce, C. (1906). Prolegomena to an apology for pragmaticism. The Monist, 16, 492–546.10.5840/monist190616436
- Pinheiro, J., Bates, D., Deb Roy, S., Sarkar, D., & the R Development Core Team. (2013). nlme: Linear and nonlinear mixed effects models. R package version 3.1-111.
- Plato. ( c360 BC). Politeia.
- Popescu, I.-I., Altmann, G., Grzybek, P., Jayaram, B. D., Köhler, R., Krupa, V., Macutek, J., Mehler, A., Pustet, R., Uhlírová, L., & Vidya, M. N. (2009). Word frequency studies Berlin, New York: de Gruyter.
- Ringe, D., Warnow, T., & Taylor, A. (2002). Indo-European and computational cladistics. Transactions of the Philosophical Society, 100, 59–129.10.1111/trps.2002.100.issue-1
- Salsbury, T., Crossley, S. A., & McNamara, D. S. (2011). Psycholinguistic word information in second language oral discourse. Second Language Research, 27, 343–360.10.1177/0267658310395851
- Sankoff, D., & Lessard, R. (1975). Vocabulary richness: a sociolinguistic analysis. Science, 190, 689–690.10.1126/science.190.4215.689
- Satorra, A., & Bentler, E. M. (1988). Scaling corrections for chi-square statistics in covariance structure analysis Paper presented at the American StatisticalA Proceedings: Business and Economic Statistics Section, Alexandria, VA.
- Schleicher, A. (1853). Die ersten Spaltungen des Indogermanischen Urvolkes. Allgemeine Monatsschrift für Wissenschaft und Literatur, ( August), 786–787.
- Schmidt, M., & Lipson, H. (2009). Distilling free-form natural laws from experimental data. Science, 324(5923), 81–85.10.1126/science.1165893
- Schmidt, M., & Lipson, H. (2013). Eureqa.
- Shapiro, S., & Wilk, M. B. (1965). An analysis of variance test for normality (complete samples). Biometrika, 52, 591–611.10.1093/biomet/52.3-4.591
- Sichel, H. (1986). Word frequency distributions and type-token characteristics. Mathematical Scientist, 11, 45–72.
- Sigler, L. (2002). Fibonacci’s Liber Abaci, Leonardo Pisano’s book of calculations.: New York, NY: Springer.10.1007/978-1-4613-0079-3
- Smith, F., & Devine, K. (1985). Storing and retrieving word phrases. Inform Process management, 21, 215–224.10.1016/0306-4573(85)90106-2
- Somers, H. (1959). Analyse mathématique du langage Louvain: Nauwelaert.
- Suetonius. ( 1st century AD). De grammaticis et rhetoribus.
- Team, R. D. C. (2011). R: A language and environment for statistical computing Vienna, Austria: R Foundation for Statistical Computing.
- Templin, M. (1957). Certain language skills in children Minneapolis, MN: University of Minneapolis Press.
- Teuffel, W. (1870). Geschichte der Römischen Literatur Leipzig: CH Beck.
- Thomson, G., & Thompson, J. R. (1915). Outline of a measure for the quantitative analysis of writing vocabularies. British Journal of Psychology, 8, 52–69.
- Trissino, G. (1524). Epistola del Trissino de le lettere nuovamente aggiunte ne la lingua italiana: Vicentino.
- Tuldava, J. (1974). O statisticeskoj strukture teksta. Sovetskaja pedagogika i škola, 9, 5–33.
- Tuldava, J. (1980). A mathematical model of the vocabulary-text relation. Paper presented at the Proceedings of the 8th Conference on Compational Lingustics, Tokyo.
- Tuldava, J. (Ed.). (1993). The statistical structure of a text and its readability Trier: Wissenschaftlicher Verlag Trier.
- Tuldava, J. (1995). On the relation between text length and vocabulary size. In: J. Tuldava (Ed.), Methods in quantitative linguistics (pp. 131–150). Trier: Wissenschaftlicher Verlag Trier.
- Tuldava, J. (1996). The frequency spectrum of text and vocabulary. Journal of Quantitative Linguistics, 3, 38–50.10.1080/09296179608590062
- Tweedie, F., & Baayen, H. R. (1998). How variable may a constant be? Measures of lexical richness in perspective. Computers and the Humanities, 32, 323–352.10.1023/A:1001749303137
- Venables, W., & Ripley, B. D. (2002). Modern applied statistics with S (4th ed.): New York, NY: Springer.10.1007/978-0-387-21706-2
- Vulanovic, R., & Köhler, R. (2005). Syntax units and structures. In R. Köhler, G. Altmann & R. G. Piotrowski (Eds.), Quantitative Linguistics. An international handbook (pp. 274–291). Berlin, New York: de Gruyter.
- Wimmer, G. (2005). The type-token relation. In: R. Köhler, G. Altmann, & R. G. Piotrowski (Eds.), Quantitative Linguistics, An International Handbook (pp. 361–368). Berlin, New York: Walter de Gruyter.
- Yāska. ( 6th century BC). Nirukta.
- Yule, G. (1944). The statistical study of literary vocabulary Cambridge: Cambridge University Press.
- Zipf, G. (1932). Selective studies and the principle of relative frequency in language Cambridge, MA: Harvard University Press.10.4159/harvard.9780674434929