Research Article

An Information-Theoretic Approach to Morphosyntactic Complexity in English, Dutch and German

Published online: 10 Jul 2024
 

ABSTRACT

Though equi-complexity of languages has long been an assumption in the study of language, recent research has argued that languages differ with regard to their morphological complexity. Some languages rely more on bound morphology to express grammatical meanings, whereas other languages rely more on lexical or word-order-based strategies. It remains an open question to what extent these differences correlate with demographic factors such as population size and language contact, and how complexity should be measured. In this study, we use information theory, more specifically Kolmogorov complexity, to assess morphological and word-order-based strategies and apply the procedure to three West Germanic languages, English, Dutch and German, which have been argued to form a continuum along the morphological complexity cline, plausibly due to different rates of demographic upheaval and concomitant language contact. Tracing morphological and word-order-based complexity through time in parallel Bible translations, our results show that English is consistently less morphologically complex than its sisters, continuing on its path of morphological simplification. We see a statistically robust trade-off between morphological and word order complexity. We find support for earlier findings in the literature, with the exception of the difference between Dutch and German, which does not emerge in our results.

Disclosure Statement

No potential conflict of interest was reported by the author(s).

Data Availability Statement

The data that support the findings of this study are openly available in Zenodo at https://doi.org/10.5281/zenodo.8112777.

Notes

1. Transparency-based complexity is sometimes also classified within the relative approach because it conflates both irregularity-based and efficiency-based complexity.

2. For convenience, the concise description is provided in the programming language Python rather than in natural language.

3. However, it does not encompass redundancy-based complexity, nor any of the metrics within the relative approach (for an in-depth explanation see Ehret, 2017, pp. 2–5).

4. We tested whether the higher morphological complexity measures for Dutch are indeed due to differences in word segmentation. We applied the morphological distortion algorithm explained in section 2.3 after first removing all whitespace from the texts. The resulting morphological complexity measures do not differ markedly from those obtained when the texts are distorted normally (i.e. without removing whitespace beforehand); in particular, the positions of Dutch and German are not reversed.
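The comparison in note 4 can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: gzip as the compressor, the 10% deletion rate, and the function names `compressed_size`, `distort_characters` and `morphological_ratio` are all assumptions made here, and the actual procedure is the one described in section 2.3.

```python
import gzip
import random


def compressed_size(text: str) -> int:
    """Byte length of the gzip-compressed UTF-8 text (a proxy for Kolmogorov complexity)."""
    return len(gzip.compress(text.encode("utf-8")))


def distort_characters(text: str, rate: float = 0.10, seed: int = 0) -> str:
    """Delete each character independently with probability `rate` (assumed value)."""
    rng = random.Random(seed)
    return "".join(ch for ch in text if rng.random() >= rate)


def morphological_ratio(text: str, strip_whitespace: bool = False,
                        rate: float = 0.10, seed: int = 0) -> float:
    """Compressed size after distortion divided by compressed size before distortion."""
    if strip_whitespace:
        # Variant from note 4: remove all whitespace before distorting.
        text = "".join(text.split())
    distorted = distort_characters(text, rate=rate, seed=seed)
    return compressed_size(distorted) / compressed_size(text)
```

Calling `morphological_ratio(text)` and `morphological_ratio(text, strip_whitespace=True)` on the same translation gives the two figures that the note compares.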

5. The Dutch Hernse Bijbel (1360), Delftse Bijbel (1477), and Professorenbijbel (1911) consist solely of a transcript of the Old Testament, while the Dutch Noord-Nederlandse Vertaling van het Nieuwe Testament (1399) and Hamelsveld Bijbel (1790) as well as the German Mentelin Bibel (1460), Kölner Bibel (1478), Mainzer Bibel (1661), and Rosalino Bibel (1781) are transcripts of the New Testament.

6. It is important to note that the word-final approach does not completely exempt roots and stems from being distorted. For instance, words like candy are more than three characters long and are therefore still the target of the distortion algorithm, as any of the characters n, d and y could be deleted. While this revised approach is more restrictive than the original one, it is still not aware of any specific morphological rules. Still, it seems interesting to compare the two approaches to find out which renders better results.
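A word-final restriction of this kind might look like the sketch below. It assumes that only the last three characters of words longer than three characters are eligible for deletion, which is one reading consistent with the candy example; the deletion rate, the `tail` parameter and the function name `distort_word_final` are likewise hypothetical.

```python
import random
import re


def distort_word_final(text: str, rate: float = 0.10, tail: int = 3, seed: int = 0) -> str:
    """Randomly delete characters, but only within the last `tail` characters
    of words that are longer than `tail` characters (assumed reading of note 6)."""
    rng = random.Random(seed)

    def distort(match: re.Match) -> str:
        word = match.group(0)
        if len(word) <= tail:
            return word  # short words are left untouched
        head, end = word[:-tail], word[-tail:]
        # Each character in the word-final window is dropped with probability `rate`.
        return head + "".join(ch for ch in end if rng.random() >= rate)

    return re.sub(r"\w+", distort, text)
```

With `rate=1.0` every eligible character is removed, so `candy` is reduced to `ca` while `the` is left intact. This shows that roots and stems are not fully protected, as the note points out.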

7. From a processing point of view (relative complexity), a language with strict word order (such as English) is easier to decode than a language with free word order. However, as mentioned earlier in the paper (in Section 2.2), the present methodology is embedded within the absolute view on complexity. This absolute view is mainly concerned with the number of word order rules. In this view, a language with rigid word order is more complex, because such a language has more constraints on the word order rules. This line of reasoning goes back at least to Greenberg (1960), who calculates, as one of his metrics, the proportion of word order links over the total number of nexus.

8. We first considered a model with morphological complexity as the dependent variable and year in interaction with language as the independent variables. However, the size of our dataset (47 observations) does not allow such an elaborate model with two variables and interaction effects.

9. The small number for the estimate is due to the measurement scales: differences in the complexity ratio are noticeable at two decimal places, and the independent variable scales per year. This scale-dependent measure means that the low estimates should not be taken as an indication of a vanishingly small effect. This is corroborated by the sizable R² value for the significant trend in English.
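The scale dependence described in note 9 can be illustrated with a small calculation. The values below are invented purely for illustration and are not taken from the study's data; `ols_slope_r2` is a hypothetical helper for a simple one-predictor least-squares fit.

```python
def ols_slope_r2(x, y):
    """Slope and R-squared of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx, sxy ** 2 / (sxx * syy)


# Invented illustrative values, NOT the study's measurements.
years = [1400, 1500, 1600, 1700, 1800, 1900, 2000]
ratios = [0.93, 0.92, 0.92, 0.91, 0.90, 0.90, 0.89]

slope_per_year, r2_year = ols_slope_r2(years, ratios)
slope_per_century, r2_century = ols_slope_r2([y / 100 for y in years], ratios)
```

Expressing the predictor in centuries multiplies the slope by 100 but leaves R² unchanged, so a tiny per-year estimate can coexist with a sizable share of explained variance.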

10. We considered testing the claims in the literature on the demographic differences, working with average urban growth data (see also De Smet et al., 2017), using Granger Causality (see Moscoso Del Prado Martín, 2014; Rosemeyer & Van de Velde, 2021). Due to the low granularity of the time series (47 texts and only 7 centuries as time steps), this did not lead to any reliable results.

Additional information

Funding

This work was supported by the Research Foundation Flanders (FWO) under Grant number G071719N. We wish to thank Eloisa Ruppert for preliminary work.
