References
- Aminikhanghahi, S., & Cook, D. J. (2017). A survey of methods for time series change point detection. Knowledge and Information Systems, 51(2), 339–367. https://doi.org/10.1007/s10115-016-0987-z
- Bentz, C., Alikaniotis, D., Cysouw, M., & Ferrer-i-Cancho, R. (2017). The entropy of words — Learnability and expressivity across more than 1000 languages. Entropy, 19(6), 275. https://doi.org/10.3390/e19060275
- Čech, R. (2015). Text length and the lambda frequency structure of a text. In G. K. Mikros & J. Mačutek (Eds.), Sequences in language and text (pp. 73–87). Walter de Gruyter GmbH.
- Chen, R., Liu, H., & Altmann, G. (2017). Entropy in different text types. Digital Scholarship in the Humanities, 32(3), 528–542. https://doi.org/10.1093/llc/fqw008
- Covington, M. A., & McFall, J. D. (2010). Cutting the Gordian knot: The moving-average type–token ratio (MATTR). Journal of Quantitative Linguistics, 17(2), 94–100. https://doi.org/10.1080/09296171003643098
- Daller, H., Milton, J., & Treffers-Daller, J. (Eds.). (2007). Modelling and assessing vocabulary knowledge. Cambridge University Press.
- Daller, H., Van Hout, R., & Treffers-Daller, J. (2003). Lexical richness in the spontaneous speech of bilinguals. Applied Linguistics, 24(2), 197–222. https://doi.org/10.1093/applin/24.2.197
- Fan, F., Yang, Y., & Wang, Y. (2016). The probability distribution of textual vocabulary in the English language. Journal of Quantitative Linguistics, 23(1), 49–70. https://doi.org/10.1080/09296174.2015.1071149
- Grabchak, M., Zhang, Z., & Zhang, D. T. (2013). Authorship attribution using entropy. Journal of Quantitative Linguistics, 20(4), 301–313. https://doi.org/10.1080/09296174.2013.830551
- Guiraud, P. (1954). Les Caractères Statistiques du Vocabulaire. Presses Universitaires de France.
- Hausser, J., & Strimmer, K. (2009). Entropy inference and the James-Stein estimator, with application to nonlinear gene association networks. Journal of Machine Learning Research, 10(50), 1469–1484. http://jmlr.csail.mit.edu/papers/volume10/hausser09a/hausser09a.pdf
- Heaps, H. S. (1978). Information retrieval: Computational and theoretical aspects. Academic Press.
- Herdan, G. (1960). Type-token mathematics: A textbook of mathematical linguistics. Mouton.
- Herdan, G. (1966). The advanced theory of language as choice and chance. Springer.
- Johansson, V. (2008). Lexical diversity and lexical density in speech and writing: A developmental perspective. Lund Working Papers in Linguistics, 53, 61–79.
- Johnson, W. (1944). Studies in language behaviour: A program of research. Psychological Monographs, 56(2), 1–15. https://doi.org/10.1037/h0093508
- Juola, P. (2008). Assessing linguistic complexity. In M. Miestamo, K. Sinnemäki, & F. Karlsson (Eds.), Language complexity: Typology, contact, change (pp. 89–108). John Benjamins Press.
- Juola, P. (2013). Using the Google N-Gram corpus to measure cultural complexity. Literary and Linguistic Computing, 28(4), 668–675. https://doi.org/10.1093/llc/fqt017
- Koplenig, A., Wolfer, S., & Müller-Spitzer, C. (2019). Studying lexical dynamics and language change via generalized entropies: The problem of sample size. Entropy, 21(5), 464. https://doi.org/10.3390/e21050464
- Kubát, M. (2014). Moving window type-token ratio and text length. In G. Altmann, R. Čech, J. Mačutek, & L. Uhlířová (Eds.), Empirical approaches to language and text analysis (pp. 105–113). RAM-Verlag.
- Kubát, M., Mačutek, J., & Čech, R. (2020). Communists spoke differently: An analysis of Czechoslovak and Czech annual presidential speeches. Digital Scholarship in the Humanities. Retrieved March 4, 2020, from https://doi.org/10.1093/llc/fqz089
- Kubát, M., & Milička, J. (2013). Vocabulary richness measure in genres. Journal of Quantitative Linguistics, 20(4), 339–349. https://doi.org/10.1080/09296174.2013.830552
- Liu, Z. (2016). A diachronic study on British and Chinese cultural complexity with Google Books N-grams. Journal of Quantitative Linguistics, 23(4), 361–373. https://doi.org/10.1080/09296174.2016.1226431
- Lozano, A., Casas, B., Bentz, C., & Ferrer-i-Cancho, R. (2016). Fast calculation of entropy with Zhang’s estimator. In E. Kelih, R. Knight, J. Mačutek, & A. Wilson (Eds.), Issues in quantitative linguistics 4 (pp. 273–285). RAM-Verlag. Dedicated to Reinhard Köhler on the occasion of his 65th birthday. No. 23 of the series “Studies in Quantitative Linguistics”. Retrieved July 26, 2019, from https://arxiv.org/abs/1707.08290
- Lu, X. (2012). The relationship of lexical richness to the quality of ESL learners’ oral narratives. The Modern Language Journal, 96(2), 190–208. https://doi.org/10.1111/j.1540-4781.2011.01232_1.x
- Malvern, D., & Richards, B. (1997). A new measure of lexical diversity. In A. Ryan & A. Wray (Eds.), Evolving models of language: Papers from the annual meeting of the British Association for Applied Linguistics held at the University of Wales, Swansea, September 1996. British studies in applied linguistics (pp. 58–71). Multilingual Matters.
- Malvern, D., & Richards, B. (2013). Measures of lexical richness. In C. Chapelle (Ed.), The encyclopedia of applied linguistics (pp. 3968–3972). Blackwell Publishing Ltd.
- Mason, O. (2000). Parameters of collocation: The word in the centre of gravity. In J. Kirk (Ed.), Corpora galore: Analyses and techniques in describing English (pp. 267–280). Papers from the Nineteenth International Conference on English Language Research on Computerised Corpora (ICAME 1998). Amsterdam: Rodopi–Atlanta.
- McCarthy, P. M., & Jarvis, S. (2007). Vocd: A theoretical and empirical evaluation. Language Testing, 24(4), 459–488. https://doi.org/10.1177/0265532207080767
- McKee, G., Malvern, D., & Richards, B. (2000). Measuring vocabulary diversity using dedicated software. Digital Scholarship in the Humanities, 15(3), 323–338. https://doi.org/10.1093/llc/15.3.323
- Miranda-García, A., & Calle-Martín, J. (2005). The validity of lemma-based lexical richness in authorship attribution: A proposal for the old English gospels. ICAME Journal, 29, 115–129. http://korpus.uib.no/icame/ij29/ij29-page115-130.pdf
- Nemenman, I., Shafee, F., & Bialek, W. (2002). Entropy and inference, revisited. In T. G. Dietterich, S. Becker, & Z. Ghahramani (Eds.), Advances in neural information processing systems 14 (pp. 472–478). MIT Press.
- Popescu, I.-I., Čech, R., & Altmann, G. (2011). The lambda-structure of texts. RAM-Verlag.
- Popescu, I.-I., Mačutek, J., & Altmann, G. (2008). Word frequency and arc length. Glottometrics, 17, 18–42. https://www.ram-verlag.eu/wp-content/uploads/2018/08/g17zeit.pdf
- Popescu, I.-I., Mačutek, J., & Altmann, G. (2010). Word forms, style and typology. Glottotheory, 3(1), 89–96. https://doi.org/10.1515/glot-2010-0006
- Popescu, I.-I., Mačutek, J., Kelih, E., Čech, R., Best, K.-H., & Altmann, G. (2010). Vectors and codes of text. RAM-Verlag.
- Popescu, I.-I., Vidya, M. N., Uhlířová, L., Pustet, R., Mehler, A., Mačutek, J., Krupa, V., Köhler, R., Jayaram, B. D., Grzybek, P., & Altmann, G. (2009). Word frequency studies. Mouton de Gruyter.
- Rajput, N. K., Ahuja, B., & Riyal, M. K. (2018). A novel approach towards deriving vocabulary quotient. Digital Scholarship in the Humanities, 33(4), 894–901. https://doi.org/10.1093/llc/fqy014
- Read, J. (2000). Assessing vocabulary. Cambridge University Press.
- Richards, B. (1987). Type/token ratios: What do they really tell us? Journal of Child Language, 14(2), 201–209. https://doi.org/10.1017/S0305000900012885
- Sadeghi, K., & Dilmaghani, S. K. (2013). The relationship between lexical diversity and genre in Iranian EFL learners’ writings. Journal of Language Teaching and Research, 4(2), 328–334. https://doi.org/10.4304/jltr.4.2.328-334
- Shah, S. K., Gill, A. A., Mahmood, R., & Bilal, M. (2013). Lexical richness, a reliable measure of intermediate L2 learners’ current status of acquisition of English language. Journal of Education and Practice, 4(6), 42–47. https://www.iiste.org/Journals/index.php/JEP/article/view/4811/4890
- Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27(3), 379–423. https://doi.org/10.1002/j.1538-7305.1948.tb01338.x
- Shannon, C. E. (1951). Prediction and entropy of printed English. Bell System Technical Journal, 30(1), 50–64. https://doi.org/10.1002/j.1538-7305.1951.tb01366.x
- Smith, J. A., & Kelly, C. (2002). Stylistic constancy and change across literary corpora: Using measures of lexical richness to date works. Computers and the Humanities, 36(4), 411–430. https://doi.org/10.1023/A:1020201615753
- Somers, H. H. (1966). Statistical methods in literary analysis. In J. Leeds (Ed.), The computer and literary style (pp. 128–140). Kent State University Press.
- Štajner, S., & Zampieri, M. (2013). Stylistic changes for temporal text classification. In I. Habernal & V. Matousek (Eds.), Text, speech, and dialogue: 16th international conference, TSD 2013, Pilsen, Czech Republic, September 2013 proceedings. Berlin: Springer-Verlag.
- Treffers-Daller, J., Parslow, P., & Williams, S. (2018). Back to basics: How measures of lexical diversity can help discriminate between CEFR levels. Applied Linguistics, 39(3), 302–327. https://doi.org/10.1093/applin/amw009
- Wang, Y., & Liu, H. (2018). Is Trump always rambling like a fourth-grade student? An analysis of stylistic features of Donald Trump’s political discourse during the 2016 election. Discourse & Society, 29(3), 299–323. https://doi.org/10.1177/0957926517734659
- Yule, G. U. (1944). The statistical study of literary vocabulary. Cambridge University Press.
- Zhang, Y. (2014). A corpus-based analysis of lexical richness of Beijing Mandarin speakers: Variable identification and model construction. Language Sciences, 44, 60–69. https://doi.org/10.1016/j.langsci.2013.12.003
- Zhang, Y. (2015). Entropic evolution of lexical richness of homogeneous texts over time: A dynamic complexity perspective. Journal of Language Modelling, 3(2), 569–599. https://doi.org/10.15398/jlm.v3i2.111
- Zhang, Z. (2012). Entropy estimation in Turing’s perspective. Neural Computation, 24(5), 1368–1389. https://doi.org/10.1162/NECO_a_00266
- Zhu, H., & Lei, L. (2018a). British cultural complexity: An entropy-based approach. Journal of Quantitative Linguistics, 25(2), 190–205. https://doi.org/10.1080/09296174.2017.1348014
- Zhu, H., & Lei, L. (2018b). Is modern English becoming less inflectionally diversified? Evidence from entropy-based algorithm. Lingua, 216, 10–27. https://doi.org/10.1016/j.lingua.2018.10.006