Research Article

Part-of-Speech Sequences in Literary Text: Evidence From Ukrainian
