Research Article

On an Interaction Model of General Language Change


ABSTRACT

In the following article, we construct an interaction model of general language change. This contributes in particular to quantitative studies on reversible language change initiated by G. Altmann by adding explanatory character in tracing global features of general language change back to the individual interaction of speakers. Although the corresponding coupled differential equations are (presumably) non-integrable, we use methods from the theory of dynamical systems to deduce the long-term behaviour (depending on four interaction parameters) of the model for any given initial constellation of speakers. Subsequent numerical analysis of real data on language change is used to justify the relevance of the constructed model for the practicing quantitative linguist. We show how data-fitting methods can be used to determine the four interaction parameters and predict from them the long-term behaviour of the system, i.e. if complete language change or reversible language change will take place.

1. Introduction and Historical Context

Since its discovery, Piotrowski’s law (Piotrovskaja & Piotrovskij, 1974) has been tested and verified in various contexts of Quantitative Linguistics. The widespread use of this law, often called the ‘S-law’, has several reasons. Firstly, the similarity to the simplest mathematical model of the spread of contagious diseases, the SIR model (for an explanation see below), is evident (more precisely, the Piotrowski-Altmann law can even be deduced from that analogy) and puts language change on a sound footing: the interaction of individual speakers. Secondly, the corresponding dynamical system is exactly soluble; thus there exists a mathematical function with two parameters (later three parameters) which can be fitted to an empirically obtained dataset, making it possible to judge the quality of the modelling procedure immediately. There has been some effort to apply the Piotrowski-Altmann law to situations like reversible language change, where for example a word initially gains popularity before later growing out of fashion (cf. the well-known paper by Altmann (1983)). Another situation where variants of the Piotrowski-Altmann law have been successfully applied is the two-stage change (Vulanović & Baayen, 2007). Since Piotrowski’s law (Piotrovskaja & Piotrovskij, 1974) is a priori either monotonically increasing or decreasing, it is in principle not suitable for modelling such situations. In Altmann (1983) and Vulanović and Baayen (2007), the constant coefficient in the logistic differential equation is replaced by a linear or a higher-order polynomial in the time variable to account for reversible language change or the two-stage change. This modification is therefore of quantitative value but not of explanatory value for the underlying processes.
As Altmann (1983, pp. 87–89) states, the next step should be a model of language change which comprises reversible language change in such a way that it can be traced back to the genuine interactions of speakers. This is a first step towards connecting language change to social, economic or historic reasoning.

The motivation of this paper lies in the fact that we wish to construct such an interaction model for language change (the so-called PLC model) for speakers of some language, which contains Piotrowski-Altmann’s law as a special case but which also contains other forms of language change like the reversible language change.

If we assume that the widespread use of the Piotrowski-Altmann law rests on the two criteria that it explains language change as an interaction process and that it is completely integrable, we meet the first of these two criteria. This is a starting point for the PLC model to become a model of value to a practising quantitative linguist. Even though our model is not exactly soluble, it can be analysed with fairly straightforward mathematical methods, leading to a complete classification of the resulting scenarios of language change depending on four interaction parameters. The model is also accessible to numerical analysis, which we carry out in the last part of the paper to show how empirical data-fitting can be employed to obtain the interaction parameters and thus to make predictions about language change in special situations. We provide the code used (in Python) in the form of a Jupyter notebook so that the interested reader has an interactive format to check the validity of the individual steps or to apply the described methods to their own datasets.

Crucial to showing the described classification of language change is the non-existence of periodic orbits of the underlying dynamical system, which we show in this paper. This is of mathematical interest in its own right.

For the sake of completeness, we wish to point out that another approach, in spirit very similar to ours, is Wheeler (2007), who proposes similar techniques for a different model and with a different focus.

2. The Description of the PLC Model

2.1. The Motivational Case: SIR Model

The PLC model (short for Progressive, Liberal, Conservative) is set up in analogy to the SIR model, which is often used to study the spread of contagious diseases like measles. Here S stands for the number of ‘Susceptible Individuals’, I for the number of ‘Infectious Individuals’ and R for the number of ‘Resistant Individuals’. The SIR model is a system of coupled differential equations which mimic the interaction of the described groups. If for example a susceptible and an infected individual meet, there is a certain probability that the number of infected individuals increases by one and the number of susceptible individuals decreases by one (i.e. the susceptible individual has been infected). The model gained some popularity during the Corona pandemic, cf. Dönges et al. (n.d.). We are interested in a slightly different model adapted to the current situation of language change; hence we are not going to delve into the SIR model, which has only been mentioned for motivational reasons (the interested reader is referred to Wikipedia (2023b)). Language change bears a strong resemblance: if two speakers meet (an ‘infected’ one who already uses some new linguistic construct like a new word and a ‘susceptible’ one, who is prone to learn the new word), there is a certain probability that the susceptible speaker will adopt the new word. We wish to point out that Piotrowski-Altmann’s law can be deduced from a special case of the SIR model (where no ‘Resistant’ individuals are present) and results in the so-called ‘logistic equation’, which arises in various areas of mathematical modelling.

2.2. Assumptions of the PLC Model

In the following, we discuss the assumptions made to obtain a meaningful model. Firstly, as in the case of the Piotrowski-Altmann law, we take no local variations like borders or other language boundaries into account. There are also models that do, but they are of a very different flavour, cf. Vogl and Prochazka (2020). We therefore assume a homogeneous speaker community of total size N distributed evenly over some fictional country. We further assume that our country is divided into equally sized ‘interaction’ spaces (imagine an equally spaced grid), meaning that if two individuals accidentally meet in such a space, there is a certain probability of interaction (expressed below by a Greek letter). Since we assume evenly distributed populations, the probability for an individual of group A to be in such an interaction space is proportional to the number N(A) of individuals of population A. Assuming the independence of the spatial distribution of two groups A and B, the probability of an individual of A and one of B meeting in a certain interaction space is proportional to N(A)N(B). We assume further that the total population of speakers has been exposed to a new linguistic construct, like a new word which exists as an alternative to a pre-existing one in the considered language (hence there is an old and a new version: ward, wurde in German, cf. Altmann, 1983).

Instead of the above-mentioned three groups (S,I,R) in the context of infectious diseases, we consider in the context of language change:

• (P): progressive speakers, who only use the new feature (e.g. word) and try to spread its use.

• (L): liberal speakers, who are indifferent towards using the new feature (but do not use it).

• (C): conservative speakers, who refrain from using the new feature and try to convince others to use the old feature.

As in the SIR model, P(t) describes the number of individuals in the population of the progressive speakers at time t. Analogously for L(t),C(t). From the assumption about the total size, it follows for all times: P(t)+L(t)+C(t)=N.

Now we describe the interactions:

  1. If a P and an L meet: There is a certain probability that L is converted to P (encoded in $\tilde{\alpha}\ge 0$).

  2. If a C and an L meet: There is a certain probability that L is converted to C (encoded in $\tilde{\gamma}\ge 0$).

  3. If a P and a C meet: There is a certain, individual probability that each is converted to L (encoded in $\tilde{\beta},\tilde{\delta}\ge 0$).

Remark 2.1. Please note that the Greek parameters introduced above are not actually the probabilities, since for the sake of simplicity additional aspects are incorporated into these parameters (e.g. number of interaction spaces and the like). Nevertheless, they are proportional to the interaction probabilities.

Thus, the coupled system of differential equations of the PLC model reads:

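(1) $P' = \tilde{\alpha}\,LP - \tilde{\beta}\,PC$
(2) $L' = -\tilde{\alpha}\,LP - \tilde{\gamma}\,LC + (\tilde{\beta}+\tilde{\delta})\,PC$
(3) $C' = \tilde{\gamma}\,LC - \tilde{\delta}\,PC$
(4) $N = L + C + P$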

A prime indicates differentiation with respect to time (the independent variable in the above system of coupled differential equations). In the following, we are going to deduce properties of the described dynamical system and set them into perspective with regard to different scenarios of language change.
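As a rough illustration of how the three-group system can be explored numerically, the following sketch integrates equations (1)–(3) with a hand-written classical Runge-Kutta step and checks that the total N = P + L + C stays constant. The signs of the terms are taken as implied by the conservation law (4). The function names, the step size and the parameter values are hypothetical choices made for this example only.

```python
from typing import List, Tuple

State = Tuple[float, float, float]           # (P, L, C)
Params = Tuple[float, float, float, float]   # (alpha~, beta~, gamma~, delta~)


def plc_rhs(state: State, params: Params) -> State:
    """Right-hand side of equations (1)-(3) of the PLC model."""
    P, L, C = state
    a, b, g, d = params
    dP = a * L * P - b * P * C
    dL = -a * L * P - g * L * C + (b + d) * P * C
    dC = g * L * C - d * P * C
    return (dP, dL, dC)


def rk4_step(state: State, params: Params, dt: float) -> State:
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(k: State, h: float) -> State:
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = plc_rhs(state, params)
    k2 = plc_rhs(shifted(k1, dt / 2), params)
    k3 = plc_rhs(shifted(k2, dt / 2), params)
    k4 = plc_rhs(shifted(k3, dt), params)
    return tuple(
        s + dt / 6 * (q1 + 2 * q2 + 2 * q3 + q4)
        for s, q1, q2, q3, q4 in zip(state, k1, k2, k3, k4)
    )


def simulate(state0: State, params: Params, dt: float, steps: int) -> List[State]:
    """Return the list of states visited, starting with state0."""
    trajectory = [state0]
    for _ in range(steps):
        trajectory.append(rk4_step(trajectory[-1], params, dt))
    return trajectory


if __name__ == "__main__":
    # Illustrative values only: N = 1000 speakers, mostly liberal at the start.
    traj = simulate((10.0, 980.0, 10.0), (2e-4, 1e-4, 1e-4, 1e-4), dt=0.1, steps=1000)
    P, L, C = traj[-1]
    print(f"P={P:.1f}  L={L:.1f}  C={C:.1f}  total={P + L + C:.6f}")
```

Because the three right-hand sides sum to zero, the total is preserved by the scheme up to rounding, which gives a quick sanity check on any implementation of the model.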

3. Mathematical Analysis of the Model

A priori, the PLC model is three-dimensional, but since the total number of speakers is preserved, it can be reduced to a two-dimensional dynamical system.

3.1. Reduction to Two Dimensions

Applying N=P+L+C to eliminate the variable L, we obtain the system:

(5) $P' = \tilde{\alpha}\,P(N-P) - (\tilde{\alpha}+\tilde{\beta})\,PC$
(6) $C' = \tilde{\gamma}\,C(N-C) - (\tilde{\gamma}+\tilde{\delta})\,PC$

For convenience, it is best to normalize the involved quantities and express the system in terms of the lower-case letters and in the renamed parameters:

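(7) $x = P/N;\quad y = C/N;\quad \alpha = N\tilde{\alpha};$
(8) $\beta = N(\tilde{\alpha}+\tilde{\beta});\quad \gamma = N\tilde{\gamma};\quad \delta = N(\tilde{\gamma}+\tilde{\delta}).$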

A short calculation shows that the PLC model is equivalent to the following dynamical system:

(9) $\dot{x} = \alpha x(1-x) - \beta xy$
(10) $\dot{y} = \gamma y(1-y) - \delta xy$

The dynamical variables x,y describe the fraction of the total population of progressive and conservative speakers, respectively. Hence the range of x,y is restricted to the unit interval, $x,y\in[0,1]$, and $\alpha,\beta,\gamma,\delta\in\mathbb{R}_0^+$. Moreover, since the sum of the fractions cannot exceed 1, we have $x+y\le 1$. Thus, the dynamics of the system is restricted to the triangle:

(11) $\Delta = \{(x,y)\in\mathbb{R}^2 \mid 0\le x,y \ \wedge\ x+y\le 1\}.$

In the sequel, we refer to the above dynamical system (Δ together with the equations) as the PC system.
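The ‘short calculation’ behind the normalisation can be spot-checked numerically. The sketch below assumes the reduced equations in the form P′ = α̃P(N−P) − (α̃+β̃)PC and C′ = γ̃C(N−C) − (γ̃+δ̃)PC and compares P′/N, C′/N with the right-hand side of the normalised system. The function names and the test values are hypothetical.

```python
from typing import Tuple


def normalise(a_t: float, b_t: float, g_t: float, d_t: float,
              N: float) -> Tuple[float, float, float, float]:
    """Map the tilde parameters to (alpha, beta, gamma, delta) as in (7)-(8)."""
    return (N * a_t, N * (a_t + b_t), N * g_t, N * (g_t + d_t))


def reduced_rhs(P: float, C: float, a_t: float, b_t: float, g_t: float,
                d_t: float, N: float) -> Tuple[float, float]:
    """Right-hand side of the reduced system in absolute numbers (P, C)."""
    dP = a_t * P * (N - P) - (a_t + b_t) * P * C
    dC = g_t * C * (N - C) - (g_t + d_t) * P * C
    return (dP, dC)


def pc_rhs(x: float, y: float, alpha: float, beta: float, gamma: float,
           delta: float) -> Tuple[float, float]:
    """Right-hand side of the normalised PC system in fractions (x, y)."""
    return (alpha * x * (1 - x) - beta * x * y,
            gamma * y * (1 - y) - delta * x * y)


if __name__ == "__main__":
    N = 1000.0
    tilde = (2e-4, 1e-4, 1e-4, 1e-4)   # illustrative values only
    P, C = 200.0, 300.0
    dP, dC = reduced_rhs(P, C, *tilde, N)
    dx, dy = pc_rhs(P / N, C / N, *normalise(*tilde, N))
    print(dP / N - dx, dC / N - dy)    # both differences should be ~0
```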

3.1.1. Remark on the Redefined Parameters

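As can be seen directly from the definition, $\alpha<\beta \Leftrightarrow \frac{\alpha}{\beta}<1$ as well as $\gamma<\delta \Leftrightarrow \frac{\gamma}{\delta}<1$ (if generically $\tilde{\alpha},\tilde{\beta},\tilde{\gamma},\tilde{\delta}>0$). Since this is crucial for the analysis in the following, we note that generically $\alpha-\beta<0$ and $\gamma-\delta<0$. Further, it follows straight away that $D:=\alpha\gamma-\beta\delta<0$. For a discussion of the non-generic cases, see Section 3.5.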

3.1.2. Remark on the Relation to the Piotrowski-Altmann Law

What immediately springs to mind is the relation to the logistic growth. In each of the two equations, the first term precisely describes the logistic growth which in the second term is reduced proportionally to interactions.

Thus, we can deduce instantaneously that the Piotrowski-Altmann law is comprised within the PC model. To be more precise, if we start out with no conservative speakers at all, then their number remains zero for all times and we end up with:

(12) $P' = \tilde{\alpha}\,P(N-P)$

which is the logistic differential equation or in relative terms:

(13) $\dot{x} = \alpha x(1-x)$

For the sake of completeness, we include the solution to the differential equation (13) (the usual form of the well-known Piotrowski-Altmann law):

(14) $x(t) = \dfrac{1}{1+e^{-\alpha t+b}}$

To include incomplete language change, the law is generalized to:

(15) $x(t) = \dfrac{c}{1+e^{-\alpha t+b}}$
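A quick numerical check can confirm that the closed form, read as x(t) = c / (1 + e^(−αt+b)), solves the logistic equation. The sketch below compares a central finite-difference derivative with αx(1 − x/c), which for c = 1 is the right-hand side of (13). The factor 1 − x/c for general c follows from differentiating the closed form and is not stated in the text. The function names and the step size h are assumptions of this example.

```python
import math


def piotrowski_altmann(t: float, alpha: float, b: float, c: float = 1.0) -> float:
    """Closed-form curve c / (1 + exp(-alpha*t + b)); c = 1 gives the basic law."""
    return c / (1.0 + math.exp(-alpha * t + b))


def logistic_residual(t: float, alpha: float, b: float, c: float = 1.0,
                      h: float = 1e-5) -> float:
    """Finite-difference derivative of the curve minus alpha*x*(1 - x/c)."""
    x = piotrowski_altmann(t, alpha, b, c)
    dxdt = (piotrowski_altmann(t + h, alpha, b, c)
            - piotrowski_altmann(t - h, alpha, b, c)) / (2.0 * h)
    return dxdt - alpha * x * (1.0 - x / c)


if __name__ == "__main__":
    for t in (-5.0, 0.0, 3.0, 10.0):
        print(t, logistic_residual(t, alpha=0.8, b=1.5))  # residuals close to 0
```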

3.1.3. Remark on Possible Generalizations and Their Implications

It is not hard to show that the model can also be applied to certain situations where the interaction parameters are also negative. A necessary condition is given by:

(16) $\delta+\beta \ge \alpha+\gamma$

This condition assures that the flow of the dynamical system points inwards of Δ. The interpretation of the interactions is then as follows: A negative $\tilde{\beta}$ for example can be interpreted as the possibility that progressive speakers can turn conservative speakers into progressive ones without having to change them to liberal speakers first. The actual process is more involved but can be seen in exactly that way. For example, if $\tilde{\beta}$ is negative and $\tilde{\delta}=-\tilde{\beta}$, it is assured on the one hand that the number of liberal speakers will not change in time if progressive and conservative speakers meet. On the other hand, meeting of progressive and conservative speakers results in a decline in the number of conservative speakers and in an increase in the number of progressive speakers.
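The inward-pointing condition can be made explicit on the hypotenuse x + y = 1 of Δ, where the outward normal is (1,1). The following short derivation is a sketch that reads the PC system as ẋ = αx(1−x) − βxy, ẏ = γy(1−y) − δxy and uses 1 − x = y and 1 − y = x on that edge:

```latex
\begin{align*}
(\dot{x}+\dot{y})\big|_{y=1-x}
  &= \alpha x(1-x) + \gamma y(1-y) - (\beta+\delta)\,xy \\
  &= \alpha xy + \gamma xy - (\beta+\delta)\,xy \\
  &= xy\,(\alpha+\gamma-\beta-\delta) \;\le\; 0
  \quad\Longleftrightarrow\quad \beta+\delta \;\ge\; \alpha+\gamma
  \qquad (xy>0).
\end{align*}
```

In terms of the original parameters, β + δ − α − γ = N(β̃ + δ̃), so the condition amounts to β̃ + δ̃ ≥ 0, i.e. a meeting of a progressive and a conservative speaker must not decrease the number of liberal speakers.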

Linguistically, this case may reflect a new loanword that does not seem really alien to a conservative speaker from the very beginning because its phonological and phonotactic structure is also typical of words of their own language, e.g. German cool, Trend, Flirt or Trick, taken in from English. (So in a way, the conservative speaker behaves progressively, possibly without intending to.)

A very interesting issue arises if one allows more than one negative parameter. As it turns out in this case, given the right choice of parameters, the dynamical system even allows periodic orbits. This gives rise to a periodically repeating process! We plan to investigate this matter further in future research but refrain from doing so in the current paper.

A possible numerical example with one negative parameter (curve-fitting for the e-epithesis 4) is given by: α=0.1076;β=2.3732;γ=0.0377;δ=1.1806. More details about the mathematical analysis in this case are given in Section 3.5.

The methods described in the subsequent sections are also applicable to a much wider class of models describing the interactions of speakers. To give an idea of possible generalizations, we wish to point out that the conclusions of the paper carry over almost unaltered to models governed by systems of coupled differential equations of the form:

(17) $\dot{x} = \alpha(x)\,(ax+by+C)$
(18) $\dot{y} = \beta(y)\,(cx+dy+D)$

where $\alpha,\beta$ are arbitrary positive, smooth functions and where $a,b,c,d,C,D$ are real numbers.

3.2. The Critical Points

A short calculation shows that the critical points of the PC system (where both x˙, y˙ vanish, i.e. where the dynamical system becomes stationary) lie at:

(19) $C_0=(0,0);\quad C_x=(1,0);\quad C_y=(0,1);$
(20) $C=\left(\dfrac{\gamma(\alpha-\beta)}{D},\ \dfrac{\alpha(\gamma-\delta)}{D}\right).$

It is not hard to see that the point C always lies within Δ. This corresponds to the following scenarios:

  1. At C0: in our speaker community, there are only liberal speakers, but with no exposure to the new feature (so only the old feature is used).

  2. At Cx: in our speaker community, there are only progressive speakers, so everyone adopts the new feature (complete language change).

  3. At Cy: in our speaker community, there are only conservative speakers, so nobody adopts the new feature.

  4. At C: depending on the parameters, there is a fixed share of speakers using the new feature and the rest not using it (incomplete language change).
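The location of the fourth critical point can be verified numerically. The sketch below assumes the reading C = (γ(α−β)/D, α(γ−δ)/D) with D = αγ − βδ and checks that both right-hand sides of the PC system vanish there. The function names and the parameter values are illustrative only.

```python
from typing import Tuple


def pc_rhs(x: float, y: float, alpha: float, beta: float, gamma: float,
           delta: float) -> Tuple[float, float]:
    """Right-hand side of the PC system."""
    return (alpha * x * (1 - x) - beta * x * y,
            gamma * y * (1 - y) - delta * x * y)


def interior_critical_point(alpha: float, beta: float, gamma: float,
                            delta: float) -> Tuple[float, float]:
    """Intersection of the two nullclines, assuming D = alpha*gamma - beta*delta != 0."""
    D = alpha * gamma - beta * delta
    if D == 0:
        raise ValueError("D = 0: the nullclines are parallel or coincide")
    return (gamma * (alpha - beta) / D, alpha * (gamma - delta) / D)


if __name__ == "__main__":
    params = (0.2, 0.3, 0.1, 0.2)   # illustrative generic parameters
    x_c, y_c = interior_critical_point(*params)
    print("C =", (x_c, y_c), " rhs at C =", pc_rhs(x_c, y_c, *params))
    for vertex in ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0)):
        print(vertex, pc_rhs(*vertex, *params))   # all zero at the vertices
```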

3.3. Outlook and Putting into Context

We are going to show that in the long term, the scenarios described above are the only possible outcomes allowed by the PC model. By this we mean that given the parameters α,β,γ,δ, any imaginable speaker constellation will for long times approach one of the described scenarios and we are able to tell what the outcome will be. We therefore expect great predictive power of our model when combined with suitable data-fitting procedures. To be more precise, we expect that given a dataset of how the usage of some new feature has developed over some period of time, after applying a data-fitting procedure to obtain estimates for the parameters, we can predict how the usage of the feature under consideration will terminate, i.e. if it becomes extinct, will be used by all speakers or only by a certain share of speakers. This gives a precise meaning to the classification of scenarios of language change claimed in the introduction.

3.4. Classification of Long-Term Behaviour Depending on Generic Parameters

Dynamical systems, even in the supposedly easiest cases, show an intricate variety of possible behaviour. The study of dynamical systems started with famous researchers like Newton, Lagrange and Poincaré trying to understand problems of celestial mechanics. Even two-dimensional dynamical systems can show very complicated behaviour and are in general hopelessly difficult to analyse (cf. Hilbert’s 16th problem (Wikipedia, 2023a)). A recurring tool which often proves to be successful in the analysis of dynamical systems is the study of periodic orbits and their relation to each other. A periodic orbit describes a specific configuration which returns to its initial position after evolving for some time T (the period) according to the rules of the dynamical system. A famous example in the solar system is given by the trajectory of the Earth around the Sun due to the gravitational law. Its period is one year. We are going to show in the following that the PC system does not allow any periodic orbits. This will be the starting point for further analysis of the long-term behaviour of the PC system. This section deals with the generic case in which all interaction parameters α˜,β˜,γ˜,δ˜ are greater than zero. In Section 3.5 we will discuss the model in a singular setting. In the following, it is therefore understood that: α˜,β˜,γ˜,δ˜>0.

Theorem 3.1.

The PC System does not allow periodic orbits.

The idea of the proof of Theorem 3.1 is as follows: we will argue by contradiction and assume that there exists such a periodic orbit p(t)=(x(t),y(t)) of period T. We proceed by showing that according to the rules of the PC system, the average position

(21) $\bar{x}=\dfrac{1}{T}\displaystyle\int_0^T x(t)\,dt;\quad \bar{y}=\dfrac{1}{T}\displaystyle\int_0^T y(t)\,dt$

of the periodic orbit coincides with the critical point C. The desired contradiction is then obtained by showing that a potentially existing periodic orbit cannot encircle the critical point but has to stay on one side of the critical point and would thus pull the average in x,y away from C towards that side.

3.4.1. Fish-Trapping in the PC System

In order to formalize the last step in the outlined sketch of the proof, we start by proving a proposition called the fish trap in the following. To this end, we firstly calculate the locus of vanishing derivative in the x-direction and the y-direction, respectively. Setting x˙,y˙ to zero, a short calculation shows that

(22) $g_x:\ y=\dfrac{\alpha}{\beta}(1-x);\quad g_y:\ y=1-\dfrac{\delta}{\gamma}x,$

are the lines of vanishing x˙ and y˙, respectively.

Obviously, along the line gx, the dynamical system has to flow in the y-direction (any flow in the x-direction is prohibited by its very definition) and along the line gy the dynamical system flows in the x-direction. The intersection of both lines is the critical point C, at which all derivatives vanish. More can be said about the flow: above the line gx, the x-derivative is negative; below the line gx, the x-derivative is positive. Mutatis mutandis, the same is true for gy. The area between the two lines together with the critical lines gx,gy except the point C (cf. grey area in Figure 1) makes up the fish trap, which traps any flow-line for all times (hence the nomenclature).

Figure 1. The fish-trap.


Definition 3.2.

The sectors II and IV (Figure 1) together with the lines gx,gy but without the critical point C make up an area referred to as the fish trap.

Proposition 3.3

(Fish trap). If a trajectory υ enters the fish trap, it will stay there for all times.

Proof.

In order to avoid distraction from the main line of argumentation, the proof is deferred to the appendix.□

3.4.2. Non-Existence of Periodic Orbits

Now we can turn towards proving theorem 3.1.

Proof.

Assume to the contrary that there exists a non-trivial periodic orbit of period T, denoted by $p(t)=(x(t),y(t))$. By non-trivial we mean that it lies completely in the interior of Δ and that there exist times $t,t'$ such that $p(t)\neq p(t')$. In particular, no critical point lies on the trajectory of the periodic orbit, since it would have to stay there for all times. By the assumption of non-triviality, $0<x,y<1$; we can thus divide the first equation of the PC system by $x$ and integrate over one period:

(23) $\displaystyle\int_0^T \frac{\dot{x}}{x}\,dt=\int_0^T\bigl(\alpha(1-x)-\beta y\bigr)\,dt.$

Since an antiderivative of $\dot{x}/x$ is $\ln(x)$, it follows straight away from the periodicity that the left-hand side vanishes. Hence, we find:

(24) $0=\alpha T-\alpha\displaystyle\int_0^T x\,dt-\beta\int_0^T y\,dt.$

Dividing by T yields:

(25) $\alpha=\alpha\bar{x}+\beta\bar{y},$

where $\bar{x}=\frac{1}{T}\int_0^T x\,dt$ denotes the mean value of x, analogously for $\bar{y}$. The same can be done with $\dot{y}/y$, yielding:

(26) $\gamma=\gamma\bar{y}+\delta\bar{x}.$

Solving the two equations for $\bar{x},\bar{y}$, we can conclude that the point with coordinates $(\bar{x},\bar{y})$ coincides with the critical point C. By non-triviality, $P=p(0)$ is not any of the critical points; thus assume first that P lies on or above the line gy.

If p(t) stays above gy for all times (for example if it stays in sector I), then $y(t)>1-\frac{\delta}{\gamma}x(t)$ for all t. Integrating over the period T and dividing by T gives

(27) $\gamma\bar{y}>\gamma-\delta\bar{x}$

showing that the second equation obtained above for the mean values xˉ,yˉ is violated. Thus p(t) has to cross gy eventually. By proposition 3.3, p(t) is then confined to sector II for all times coming. Since it cannot leave the fish trap anymore, being periodic, it already had to start there (it cannot reach the potential starting point anymore). But then it had been below gy already for all times, leading to an analogous contradiction as before. This shows that the assumption about the existence of a periodic orbit had been wrong from the beginning, proving the assertion of the theorem.□
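The step from the two mean-value relations to the critical point C is a 2×2 linear solve. As a worked sketch, reading the relations as α = αx̄ + βȳ and γ = δx̄ + γȳ and writing D = αγ − βδ, Cramer's rule gives:

```latex
\begin{align*}
\begin{pmatrix}\alpha & \beta\\ \delta & \gamma\end{pmatrix}
\begin{pmatrix}\bar{x}\\ \bar{y}\end{pmatrix}
 &= \begin{pmatrix}\alpha\\ \gamma\end{pmatrix},
 \qquad
 \det\begin{pmatrix}\alpha & \beta\\ \delta & \gamma\end{pmatrix}
 = \alpha\gamma-\beta\delta = D, \\[4pt]
\bar{x} &= \frac{\alpha\gamma-\beta\gamma}{D} = \frac{\gamma(\alpha-\beta)}{D},
 \qquad
\bar{y} = \frac{\alpha\gamma-\alpha\delta}{D} = \frac{\alpha(\gamma-\delta)}{D}.
\end{align*}
```

These are exactly the coordinates of the critical point C, which is why the time average of a hypothetical periodic orbit would have to sit at C.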

3.4.3. Convergence of Trajectories

Due to H. Poincaré and I. Bendixson, there is a strong result about the behaviour of two-dimensional dynamical systems related to their periodic orbits. In the case we are concerned with, it basically asserts that any trajectory converges either towards a periodic orbit or to a critical point. Having previously excluded the existence of periodic orbits, we can use the Poincaré-Bendixson theorem to deduce strong conclusions about the long-term behaviour of the PC system:

Theorem 3.4.

Every trajectory in the PC system converges to one of the critical points.

Proof.

In order to avoid distraction from the intended application to linguistics, we defer the proof together with the required prerequisites to the appendix.

We summarize what we have obtained: Given any point $Q\in\Delta$, if we follow the trajectory through Q in the PC system long enough, we will end up arbitrarily close to one of the critical points $C_0,C_x,C_y,C$. Which one is determined by the parameters α,β,γ,δ. Hence, the long-term behaviour of the language change modelled by the PC system is completely determined by the parameters.

We are going to describe this dependence more precisely. To this end, consider the flow of the dynamical system in the vicinity of a critical point Cp at a specific time t0 and at a time t0+dt infinitesimally later. Since the flow fixes the critical point Cp (the flow is stationary there), points near Cp flow to points near Cp. Hence, the flow between time t0 and time t0+dt amounts to a linear map of neighbourhoods of Cp, the linearization of the flow, which captures its essential features. The linear map is given by the 2×2 matrix A of partial derivatives of the equations of motion. Now a positive eigenvalue of A corresponds to a point keeping its direction but flowing away from the critical point (trajectories close to this one are thus repelled by the critical point), whereas a negative eigenvalue of A corresponds to a point keeping its direction but flowing towards the critical point (thus nearby trajectories are attracted). To briefly describe the remaining cases: if an eigenvalue is zero, then the flow stagnates in the corresponding direction (consists of fixed points), whereas for a complex eigenvalue of A the flow in the vicinity of the corresponding critical point would show rotatory character. Thus by calculating the eigenvalues of A for all critical points of the PC system, we can determine the behaviour of the flow. More details can be found in any standard textbook on dynamical systems or differential equations, for example in Zill and Cullen (2001).

Therefore, we differentiate both defining equations of the PC system with respect to x,y and form the matrix:

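(28) $A=\begin{pmatrix}\dfrac{\partial\dot{x}}{\partial x} & \dfrac{\partial\dot{x}}{\partial y}\\[6pt] \dfrac{\partial\dot{y}}{\partial x} & \dfrac{\partial\dot{y}}{\partial y}\end{pmatrix}=\begin{pmatrix}\alpha-2\alpha x-\beta y & -\beta x\\ -\delta y & \gamma-2\gamma y-\delta x\end{pmatrix}$

We need to substitute the critical points for x,y into A and then find the eigenvalues. A positive eigenvalue corresponds to a repelling eigendirection, a negative eigenvalue to an attracting eigendirection.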

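1. $C_0=(0,0)$:

(29) $A=\begin{pmatrix}\alpha & 0\\ 0 & \gamma\end{pmatrix},$

obviously has two repelling eigendirections (the x- and y-axes).

2. $C_x=(1,0)$:

(30) $A=\begin{pmatrix}-\alpha & -\beta\\ 0 & \gamma-\delta\end{pmatrix},$
the eigenvalues are $-\alpha$ and $\gamma-\delta$. By the discussion above $\gamma-\delta<0$. Thus, there exist two attractive eigendirections (one along the x-axis). The critical point is thus a sink and attracts all trajectories in the vicinity.

3. $C_y=(0,1)$:

(31) $A=\begin{pmatrix}\alpha-\beta & 0\\ -\delta & -\gamma\end{pmatrix},$
the eigenvalues are $-\gamma$ and $\alpha-\beta$. This corresponds to two attractive eigendirections (one along the y-axis). The critical point is again a sink.

4. $C=(x_{crit},y_{crit})=\left(\dfrac{\gamma(\alpha-\beta)}{D},\ \dfrac{\alpha(\gamma-\delta)}{D}\right)$:

(32) $A=\begin{pmatrix}-\alpha x_{crit} & -\beta x_{crit}\\ -\delta y_{crit} & -\gamma y_{crit}\end{pmatrix},$
We begin by showing that the matrix A must have a negative and a positive eigenvalue. If a 2×2 matrix with real coefficients has a non-real eigenvalue $\lambda$, it must have a second eigenvalue which equals the complex conjugate $\bar{\lambda}$ of the first, but then $\det A=\lambda\bar{\lambda}>0$. Now the determinant equals (cf. discussion on parameters above):
(33) $\det A=D\,x_{crit}\,y_{crit}=\dfrac{\gamma(\alpha-\beta)\,\alpha(\gamma-\delta)}{D}<0.$
Since the determinant is negative, both eigenvalues are real and their product is negative. Hence we must have a positive and a negative real eigenvalue of A. Therefore, we must have a repelling and an attracting eigendirection.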
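The linearization at each critical point can also be evaluated numerically. The following sketch builds the matrix of partial derivatives of the PC system, read as ẋ = αx(1−x) − βxy and ẏ = γy(1−y) − δxy, computes its eigenvalues from trace and determinant, and labels the point accordingly. The function names, the tolerance and the parameter values are hypothetical choices for illustration.

```python
import cmath
from typing import Tuple

Matrix = Tuple[Tuple[float, float], Tuple[float, float]]


def jacobian(x: float, y: float, alpha: float, beta: float, gamma: float,
             delta: float) -> Matrix:
    """Matrix of partial derivatives of the PC system at the point (x, y)."""
    return ((alpha - 2 * alpha * x - beta * y, -beta * x),
            (-delta * y, gamma - 2 * gamma * y - delta * x))


def eigenvalues(A: Matrix) -> Tuple[complex, complex]:
    """Eigenvalues of a 2x2 matrix via its trace and determinant."""
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + root) / 2, (trace - root) / 2)


def classify(A: Matrix, tol: float = 1e-9) -> str:
    """Label a critical point by the signs of the eigenvalues of A."""
    l1, l2 = eigenvalues(A)
    if abs(l1.imag) > tol:
        return "rotatory"
    r1, r2 = l1.real, l2.real
    if r1 < -tol and r2 < -tol:
        return "sink"
    if r1 > tol and r2 > tol:
        return "source"
    if r1 * r2 < 0:
        return "saddle"
    return "degenerate"


if __name__ == "__main__":
    params = (1.0, 2.0, 1.0, 2.0)   # illustrative generic parameters
    D = params[0] * params[2] - params[1] * params[3]
    C = (params[2] * (params[0] - params[1]) / D,
         params[0] * (params[2] - params[3]) / D)
    for name, point in (("C0", (0.0, 0.0)), ("Cx", (1.0, 0.0)),
                        ("Cy", (0.0, 1.0)), ("C", C)):
        print(name, classify(jacobian(*point, *params)))
```

For these illustrative parameters the labels reproduce the pattern derived above: a source at C0, sinks at Cx and Cy, and a saddle at C.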

3.4.4. Description of the Resulting Scenario

Please note that C0 is a repelling critical point. Thus we discard it from the list below, because the trajectories will never end there. Further note that away from the x- and y-axes, the trajectories are pushed into Δ. From the previous deduction, we find the following scenario: each of the critical points Cx,Cy is attracting. Moreover, C is a saddle point (it has a repelling and an attracting eigendirection). Thus, except for two trajectories coming in exactly in the attracting eigendirection, all other trajectories end in either Cx or Cy. These two trajectories constitute the so-called separatrix of the PC system, since they separate the space of trajectories into those converging to Cx and those converging to Cy.
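The role of the separatrix can be illustrated by integrating the PC system from two nearby starting points on either side of it. The sketch below uses a hand-written Runge-Kutta step and a symmetric, purely illustrative parameter set (α = γ = 1, β = δ = 2), for which the separatrix is the diagonal y = x. The function names, the step size and the tolerance are assumptions of this example.

```python
from typing import Tuple

Params = Tuple[float, float, float, float]  # (alpha, beta, gamma, delta)


def pc_rhs(x: float, y: float, params: Params) -> Tuple[float, float]:
    """Right-hand side of the PC system."""
    alpha, beta, gamma, delta = params
    return (alpha * x * (1 - x) - beta * x * y,
            gamma * y * (1 - y) - delta * x * y)


def rk4_step(x: float, y: float, params: Params, dt: float) -> Tuple[float, float]:
    """One classical Runge-Kutta step of size dt."""
    k1x, k1y = pc_rhs(x, y, params)
    k2x, k2y = pc_rhs(x + dt / 2 * k1x, y + dt / 2 * k1y, params)
    k3x, k3y = pc_rhs(x + dt / 2 * k2x, y + dt / 2 * k2y, params)
    k4x, k4y = pc_rhs(x + dt * k3x, y + dt * k3y, params)
    return (x + dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            y + dt / 6 * (k1y + 2 * k2y + 2 * k3y + k4y))


def long_term_outcome(x0: float, y0: float, params: Params, dt: float = 0.01,
                      t_end: float = 80.0, tol: float = 1e-3) -> str:
    """Integrate up to t_end and report which attracting vertex was reached."""
    x, y = x0, y0
    for _ in range(int(round(t_end / dt))):
        x, y = rk4_step(x, y, params, dt)
    if abs(x - 1) < tol and abs(y) < tol:
        return "complete change (Cx)"
    if abs(x) < tol and abs(y - 1) < tol:
        return "reversible change (Cy)"
    return "undecided"


if __name__ == "__main__":
    params = (1.0, 2.0, 1.0, 2.0)
    print(long_term_outcome(0.4, 0.3, params))  # slightly more progressive speakers
    print(long_term_outcome(0.3, 0.4, params))  # slightly more conservative speakers
```

For fitted parameters the separatrix is generally not a straight line, so a numerical integration of this kind is a simple way to decide on which side a given initial constellation lies.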

3.5. Classification of Long-Term Behaviour in the Singular Cases

In this section, we are going to repeat the analysis above for special cases of the interaction parameters and discuss their implications. We therefore recall the definitions

$\alpha=N\tilde{\alpha};\quad \beta=N(\tilde{\alpha}+\tilde{\beta});\quad \gamma=N\tilde{\gamma};\quad \delta=N(\tilde{\gamma}+\tilde{\delta}).$

In general, the critical point C is situated in the interior $\mathring{\Delta}$ of the triangle Δ. We call this the generic situation. The situation where the critical point C does not exist or is situated on the boundary ∂Δ of the triangle Δ is called singular. The above analysis therefore deals with the generic case. To analyse the singular cases, we discuss the different possible loci which the critical point C can have on the boundary ∂Δ. For the nomenclature, consider:

Figure 2. Nomenclature of ∂Δ.


1. Generic case

We included this point only to show that $C\in\mathring{\Delta}$ (our definition of being generic) is equivalent to the conditions on the parameters used in the previous section. Indeed, from $C\in\mathring{\Delta}$ it follows straight away that:

(34) $0<\dfrac{\gamma(\alpha-\beta)}{D}<1\ \wedge\ 0<\dfrac{\alpha(\gamma-\delta)}{D}<1.$

From this we directly conclude: $0<\gamma$ and $\alpha<\beta$ (by definition of α,β it is clear that $\alpha\le\beta$) and analogously $0<\alpha$ and $\gamma<\delta$. Thus, the generic case is equivalent to $0<\alpha<\beta$ and $0<\gamma<\delta$ together with $D<0$.

2. One negative parameter

It must be assured that the flow of the dynamical system does not point outwards of Δ. Otherwise the flow could generate a negative proportion of liberal speakers. The condition $\beta+\delta\ge\alpha+\gamma$ (obtained by calculating the dot product $(\dot{x},\dot{y})^T\cdot(1,1)^T\le 0$ on the edge $x+y=1$) ensures this. A short inspection reveals that the critical point C has at least one negative component, so that it is not situated within Δ. Thus, the flow of the dynamical system in Δ is reduced to two of the sectors I, II, III, IV in Figure 1. The conclusions are the same as before in such a situation (convergence to one of the critical points as in proposition 7.8).

3. $C=(q,0)\in a$

The line $g_y:\ y=1-\frac{\delta}{\gamma}x$ is bound to go through $C_y=(0,1)$ whereas the line $g_x:\ y=\frac{\alpha}{\beta}(1-x)$ is bound to go through $C_x=(1,0)$. Since $C=g_x\cap g_y$, we see that $C=(q,0)$ forces $g_x$ to be identical with the x-axis. Hence $\alpha=0$ and $\dot{x}\le 0$ in all of Δ (in our terminology from above, only the sectors I, II survive in Δ since III, IV lie beneath the line $g_x$). On the x-axis ($y=0\Rightarrow\dot{y}=0$) the flow can in general only be horizontal, but since $g_x$ is defined as the line where $\dot{x}$ vanishes, it follows straight away that all points on a are critical. The methods of the generic case go through unaltered, except that we have to show that any limit-set can contain only one critical point (which follows straight away from a monotonicity argument). Thus we get the same conclusions: any trajectory ends at one of the critical points (here $C_y$ or a point of the segment $\{(p,0)\mid 0\le p\le 1\}$). As before consider the matrix A obtained from linearization of the flow. At the point $C_p=(p,0)$ we find:

(35) $A_p=\begin{pmatrix}0 & -\beta p\\ 0 & \gamma-\delta p\end{pmatrix}.$

For $p<q$, the quantity $\gamma-\delta p$ is positive, whereas for $p>q$ it is negative. Hence, to the left of the critical point C, the critical points of the x-axis are repelling whereas to the right of C they are attracting. Again, the fish-trapping lemma remains valid; thus any trajectory entering sector II can only converge to the critical point $C_y$, thus realizing reversible language change as no progressive speakers exist. All other trajectories, which never enter sector II, will end up at one of the attractive critical points with $p>q$. This is an instance of incomplete language change since in the long term, we end up with a percentage p of progressive speakers and the rest being liberal speakers (no conservative speakers left) not using the new feature.

4. $C=(0,q)\in b$

This case is exactly analogous to the previous one, with x and y exchanged. The interpretation is as follows: any trajectory either ends at $C_x$, in which case we have complete language change, or it ends at one of the critical points $(0,p)$ with $p>q$. In the latter case, no progressive speakers are present and hence we have reversible language change (the new feature becomes extinct).

5. $q\to 0$ in the third case

This means that $C\to C_0$ and that $\gamma\to 0$ (we already established $\alpha=0$ before). Thus, we get the implications of both previous singular cases: All points on either a,b are critical and attracting. A short calculation shows that the convergence to these critical points flows along the straight lines with equations:

$y=\frac{\delta}{\beta}x+\text{constant}$. The interpretations from above carry over: either incomplete or reversible language change.
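The ‘short calculation’ can be sketched as follows, assuming that in this limit both α and γ vanish, so that only the interaction terms of the PC system remain:

```latex
\begin{align*}
\dot{x} &= -\beta xy, \qquad \dot{y} = -\delta xy
\quad\Longrightarrow\quad
\frac{dy}{dx} = \frac{\dot{y}}{\dot{x}} = \frac{\delta}{\beta}
\quad (xy\neq 0), \\[4pt]
y &= \frac{\delta}{\beta}\,x + \text{constant}.
\end{align*}
```

Both x and y decrease along such a line until the trajectory reaches one of the axes, where xy = 0 and the flow stops.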

6. $q\to 1$ in the fourth case

Now all trajectories are lying in sector II and are thus converging to Cy. This is another instance of reversible language change with the new feature becoming extinct.

7. q1 in third case

Now all trajectories are converging to Cx and we find complete language change in all cases.

8. Points on c become critical

In this case, both $g_x$ and $g_y$ must equal the line with equation $y=1-x$. This implies that $\alpha=\beta$ and $\gamma=\delta$. But then all points on $y=1-x$ are critical (being on $g_x$ they satisfy $\dot{x}=0$, and by the same argument $\dot{y}=0$). An argument similar to the one before shows that the points $C_p=(p,1-p)$ are attractive critical points and that all trajectories end at one of these critical points. Again this is an instance of incomplete language change: in this case we find only progressive speakers (percentage $p$) and conservative speakers (percentage $1-p$), whereas no liberal speakers are present anymore.
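That every point of the line $y=1-x$ is critical under these conditions can be verified directly from the right-hand side of the PLC equations (58) and (59). The following sketch uses assumed illustrative values with $\alpha=\beta$ and $\gamma=\delta$; the function name `plc_rhs` is hypothetical.

```python
def plc_rhs(x, y, alpha, beta, gamma, delta):
    """Right-hand side of the PLC equations (58) and (59)."""
    dx = alpha * x * (1.0 - x) - beta * x * y
    dy = gamma * y * (1.0 - y) - delta * x * y
    return dx, dy


# Assumed illustrative values with alpha = beta and gamma = delta.
alpha = beta = 0.8
gamma = delta = 1.5

for k in range(11):
    p = k / 10.0
    dx, dy = plc_rhs(p, 1.0 - p, alpha, beta, gamma, delta)
    # Both components vanish (up to rounding) at every point (p, 1 - p).
    print(f"p={p:.1f}  dx={dx:.1e}  dy={dy:.1e}")
```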

4. Application to General Language Change

4.1. General Conclusions from the Mathematical Section

In this section, we only discuss the generic case, since for convenience we included the interpretation of the singular cases in section 3.5. The application to language change is now clear if we recall the meaning of the variable x (the percentage of progressive speakers in the speaking community) and the variable y (the percentage of conservative speakers in the speaking community). We find a very specific choice of initial data (ratios of progressive to conservative speakers) which results in an unstable equilibrium. Even the slightest alteration of this ratio leads to a totally different long-term behaviour: either complete language change or the extinction of the new feature (reversible language change). Thus, we have now qualitatively solved the PLC model for language change completely, given the interaction parameters α,β,γ,δ. Provided that the model proves successful in describing real data of language change, it is a considerable improvement over older approaches: as soon as any data-fitting procedure has produced estimates for the interaction parameters, the PLC model allows for predictions about the long-term behaviour of the system without any further calculations. In the great majority of modelled situations, fairly strong conclusions about the stability of the predictions can be drawn from the interaction parameters.
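The sensitivity to the initial ratio can be illustrated numerically. The sketch below is not taken from the authors' notebook: it assumes symmetric, purely illustrative parameters ($\alpha=\gamma$, $\beta=\delta$), for which the diagonal $x=y$ separates the two basins of attraction, and it uses a hand-written classical Runge-Kutta (RK4) integrator with hypothetical function names.

```python
def plc_rhs(state, alpha, beta, gamma, delta):
    """Right-hand side of the PLC equations (58) and (59)."""
    x, y = state
    return (alpha * x * (1.0 - x) - beta * x * y,
            gamma * y * (1.0 - y) - delta * x * y)


def rk4_final_state(state, params, t_end, h=0.01):
    """Integrate with the classical fourth-order Runge-Kutta scheme up to t_end."""
    steps = int(round(t_end / h))
    x, y = state
    for _ in range(steps):
        k1 = plc_rhs((x, y), *params)
        k2 = plc_rhs((x + 0.5 * h * k1[0], y + 0.5 * h * k1[1]), *params)
        k3 = plc_rhs((x + 0.5 * h * k2[0], y + 0.5 * h * k2[1]), *params)
        k4 = plc_rhs((x + h * k3[0], y + h * k3[1]), *params)
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x, y


# Assumed symmetric parameters: the diagonal x = y then separates the two basins.
params = (1.0, 2.0, 1.0, 2.0)  # alpha, beta, gamma, delta
more_progressive = rk4_final_state((0.201, 0.200), params, t_end=60.0)
more_conservative = rk4_final_state((0.200, 0.201), params, t_end=60.0)
print(more_progressive)   # ends near C_x = (1, 0): complete language change
print(more_conservative)  # ends near C_y = (0, 1): reversible language change
```

Two initial constellations that differ by a tenth of a percentage point thus end at opposite corners of $\Delta$.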

4.2. Testing the PLC Model on Empirical Data

The aim of this section is to describe how we approached the numerical testing of the PLC model. We are not going to repeat well-known facts about data-fitting procedures, which have been described extensively, for example in the context of language change in Altmann (1983). We use the relevant packages of the Python modules scipy and numpy. To make the analysis more accessible to the reader, we use a Jupyter notebook, which provides a hybrid environment consisting of executable code and explanations thereof. The files used below can be downloaded from GitLab under the following link: https://gitlab.fosbos-rosenheim.de/pub/.

The only aspect which, to our knowledge, is not (obviously) standard material on data-fitting is that, due to the (presumed) non-integrability of the PLC model, we had to obtain the family of functions admitted to the data-fitting procedure by numerical integration. To this end, we applied another Python module using a suitable Runge-Kutta solver. We start the numerical analysis by comparing the PLC model to the well-studied Piotrowski-Altmann law.
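How such a family of functions can be obtained by numerical integration is sketched below. This is an illustration under stated assumptions, not the code of the notebook: instead of a library solver it uses a hand-written fixed-step classical Runge-Kutta scheme (RK4), and the names `plc_curve` and `sum_of_squares` are hypothetical. The six quantities $\alpha,\beta,\gamma,\delta,x_0,y_0$ play the role of the fitting parameters listed in Appendix 7.3.

```python
import math


def plc_curve(times, alpha, beta, gamma, delta, x0, y0, h=0.001):
    """Return x(t) of the PLC model at the given increasing sample times.

    Equations (58) and (59) are integrated with a fixed-step classical
    Runge-Kutta scheme (RK4), starting from (x0, y0) at time 0.
    """
    def rhs(x, y):
        return (alpha * x * (1.0 - x) - beta * x * y,
                gamma * y * (1.0 - y) - delta * x * y)

    def rk4_step(x, y, step):
        k1 = rhs(x, y)
        k2 = rhs(x + 0.5 * step * k1[0], y + 0.5 * step * k1[1])
        k3 = rhs(x + 0.5 * step * k2[0], y + 0.5 * step * k2[1])
        k4 = rhs(x + step * k3[0], y + step * k3[1])
        return (x + step * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
                y + step * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)

    values, t, x, y = [], 0.0, x0, y0
    for target in times:
        while target - t > 1e-12:
            step = min(h, target - t)  # shorten the last step to hit the sample time
            x, y = rk4_step(x, y, step)
            t += step
        values.append(x)
    return values


def sum_of_squares(params, times, observed):
    """Least-squares objective that a fitting routine would minimize over params."""
    model = plc_curve(times, *params)
    return sum((m - o) ** 2 for m, o in zip(model, observed))


# Sanity check with assumed values: for y0 = 0 the conservative group stays absent
# and x(t) must coincide with the logistic curve with rate alpha.
times = [0.0, 2.0, 5.0, 10.0]
numeric = plc_curve(times, 0.5, 2.0, 1.0, 2.0, 0.05, 0.0)
exact = [1.0 / (1.0 + (1.0 / 0.05 - 1.0) * math.exp(-0.5 * t)) for t in times]
print(max(abs(n - e) for n, e in zip(numeric, exact)))
```

The check at the end reflects the relation between the two models: without conservative speakers the first PLC equation is the logistic differential equation, so the integrated curve reproduces the closed-form logistic function.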

4.2.1. Comparison to the Well-Studied Piotrowski-Altmann Law

In the following, we compare the Piotrowski-Altmann law to the PLC model. We take the data from Best and Kohlhase (1983, p. 97), which were also used by Altmann (1983). The data show the development over time of the percentage of usage of the (new) word wurde as opposed to the (old) word ward in German.

Using Piotrowski’s law, Altmann obtained as optimal function $f(t)=\frac{1}{1+70.4286\,e^{-0.4642t}}$. Figure 3 shows the solution from the PLC model (red) in comparison to Altmann’s optimal function (blue) on the dataset described above. The fitting parameters for both models can be found in Appendix 7.3.
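For readers who want to reproduce the blue curve, Altmann's optimal function can be evaluated directly. The sketch assumes the logistic form $f(t)=c/(1+a\,e^{-bt})$ with a negative exponent (as required for an increasing curve) and the values $a$, $b$, $c$ listed in Appendix 7.3; the function name and the chosen time grid are illustrative assumptions.

```python
import math


def piotrowski_altmann(t, a=70.4286, b=0.4642, c=1.0):
    """Logistic curve f(t) = c / (1 + a * exp(-b * t)) with Altmann's reported values."""
    return c / (1.0 + a * math.exp(-b * t))


# The time grid below is an assumption; t is counted in the periods of the dataset.
for t in range(0, 21, 5):
    print(t, round(piotrowski_altmann(t), 4))

# The curve reaches half of its saturation value c at t = ln(a) / b.
t_half = math.log(70.4286) / 0.4642
print(t_half, piotrowski_altmann(t_half))
```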

4.2.2. Testing on the e-Epithesis

The e-epithesis is, according to Imsiepen (1983), an example of reversible language change. By e-epithesis one understands the phenomenon in Early Modern High German of adding an additional -e to the end of strong verbs in the preterite, for example sahe as opposed to sah. This tendency initially increased before eventually going completely out of fashion. We use the data from Imsiepen (1983).

In Figure 4, we compare the fits of the model from Altmann (1983), the generalized logistic curve from Vulanović and Baayen (2007) using a third-order polynomial in the time variable, and the PLC model. The graph of Altmann’s model is orange, the generalized logistic curve is green, and ours is red. The parameters of each model are given in Appendix 7.3. Although the data are quite scattered, one can see that our model does show an improvement over the older models.

4.2.3. Testing on the Development of Periphrastic Do

According to Vulanović and Baayen (2007), there is another form of language change, exemplified by the proportion of periphrastic-do constructions around 1560, which can be viewed as a two-stage change. By this it is meant that the data show a slow-down (or even a decrease) before they start increasing again. Vulanović and Baayen (2007) used a generalized logistic function, where they applied (besides another approach) a polynomial of order k=3 in the time variable. We compare the generalized logistic function to the PLC model. Even though the PLC model cannot cope with an actual decrease (under the assumption of time-independent coefficients!), the following example shows how the fitting of a slow-down is realized by the PLC model. In Vulanović and Baayen (2007), six different sentence types are analysed; here we consider only the two most interesting ones for our application and refer the interested reader to https://gitlab.fosbos-rosenheim.de/pub/ for the fitting of the other examples. Originally, the data were collected by Ellegård (1953, Table 7). We use the data as presented in Vulanović and Baayen (2007, Table 1). There, the 13 periods in Ellegård (1953) have been reduced to 11 to compensate for the different number of texts in the periods considered (cf. Vulanović & Baayen, 2007, p. 2, for further discussion).

The graphs show good fitting results for both models (parameters are given in Appendix 7.3), but the PLC model shows less tendency to overshoot at the (time-)ends of the datasets (Figures 5 and 6).

4.2.4. Prediction Capabilities of the PLC Model

In this section, we examine the capability of the PLC model to predict the future development of processes of language change. To this end, we left out the last six data points in the case of the e-epithesis and the last three data points in the case of affirmative declarative do, and applied the fitting procedures to the remaining datasets (parameters are given in Appendix 7.3). In the graphics, the omitted data points are displayed in red, so that one can judge how well the model approximates the future development of the process under consideration.
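The hold-out procedure described above can be sketched as follows. The helper names `split_holdout` and `rmse` are hypothetical, the data are a synthetic stand-in generated from a logistic curve (not one of the article's datasets), and the re-fitting step itself is only indicated by a comment.

```python
import math


def split_holdout(times, values, n_holdout):
    """Split a time series into a fitting part and the last n_holdout points."""
    cut = len(times) - n_holdout
    return (times[:cut], values[:cut]), (times[cut:], values[cut:])


def rmse(model, times, values):
    """Root-mean-square error of a fitted curve on the given points."""
    squared = [(model(t) - v) ** 2 for t, v in zip(times, values)]
    return math.sqrt(sum(squared) / len(squared))


def curve(t):
    """Synthetic stand-in model (NOT fitted here): a logistic curve."""
    return 1.0 / (1.0 + 70.4286 * math.exp(-0.4642 * t))


# Synthetic stand-in data sampled at t = 0, ..., 11.
times = list(range(12))
values = [curve(t) for t in times]

(fit_t, fit_v), (test_t, test_v) = split_holdout(times, values, n_holdout=3)
# A real analysis would re-fit the model on (fit_t, fit_v) at this point and
# then evaluate the re-fitted curve on the held-out points (test_t, test_v).
print(len(fit_t), len(test_t), rmse(curve, test_t, test_v))
```

Reporting an error measure on the held-out points alone makes the visual judgement from the graphics quantifiable; which measure is most appropriate is a design choice left open here.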

5. Conclusion

As stated, the PLC model is not limited to the context of linguistics. Since it abstractly describes the interplay between progressive, liberal and conservative influences, it should also be applicable to various settings in sociology, economics or political science. This is subject to further study.

Acknowledgments

We are deeply grateful to Professor Gabriel Altmann, without whose positive feedback we would never have considered preparing this work for publication. This article grew out of a semester project in 2020 by one of us (E.S.) and an email exchange with Prof. Altmann. Sadly, during the preparation, Prof. Altmann passed away. We would also like to thank the anonymous referees for valuable suggestions which considerably improved the content and the presentation of the manuscript.

Disclosure Statement

No potential conflict of interest was reported by the author(s).

Supplementary Material

Supplemental data for this article can be accessed online at https://doi.org/10.1080/09296174.2024.2344946

Additional information

Funding

The publication of this article was financially supported by the Kultur- und Sozialstiftung des Oberbürgermeisters der Stadt Rosenheim Dr. Michael Stöcker, Rosenheim Technical University of Applied Sciences and Verein der Freunde und Förderer der Staatlichen Fachoberschule und Berufsoberschule Rosenheim e. V.

References

  • Altmann, G. (1983). Das Piotrowski-Gesetz und seine Verallgemeinerungen. In K. H. Best & J. Kohlhase (Eds.), Exakte Sprachwandelforschung (pp. 59–90). Göttinger Schriften zur Sprach- und Literaturwissenschaft, GöSSL 2.
  • Beardon, A. (1997). Limits, a new approach to Real Analysis. Springer Verlag.
  • Best, K. H., & Kohlhase, J. (1983). Der Wandel von ward zu wurde. In K. H. Best & J. Kohlhase (Eds.), Exakte Sprachwandelforschung (pp. 91–102). Göttinger Schriften zur Sprach- und Literaturwissenschaft, GöSSL 2.
  • Dönges, P., Götz, T., Krüger, T., Niedzielewski, K., Priesemann, V., & Schäfer, M. SIR-Model for households. https://arxiv.org/pdf/2301.04355.pdf
  • Ellegård, A. (1953). The auxiliary do: The establishment and regulation of its use in English. Almqvist & Wiksell.
  • Imsiepen, U. (1983). Die e-Epithese bei starken Verben im Deutschen. In K. H. Best & J. Kohlhase (Eds.), Exakte Sprachwandelforschung. Göttinger Schriften zur Sprach- und Literaturwissenschaft, GöSSL 2.
  • Piotrovskaja, A. A., & Piotrovskij, R. G. (1974). Mathematičeskie modeli v diachronii i tekstoobrazovanii. In Statistika reči i avtomatičeskij analiz teksta (pp. 361–400). Leningrad, Nauka 1974.
  • Teschl, G. (2012). Ordinary differential equations and dynamical systems (Vol. 140). Graduate Studies in Mathematics, Amer. Math. Soc.
  • Verhulst, F. (1996). Nonlinear differential equations and dynamical systems, Universitext (2nd ed.). Springer-Verlag.
  • Vogl, G., & Prochazka, K. (2020). Was treibt den Sprachwechsel? Physik Journal, 19(2), 35–39.
  • Vulanović, R., & Baayen, H. (2007). Fitting the development of periphrastic do in all sentence types. In P. Grzybek & R. Köhler (Eds.), Exact methods in the study of language and text (pp. 679–688). Mouton de Gruyter.
  • Wheeler, E. S. (2007). Language change in a communication network. In P. Grzybek & R. Köhler (Eds.), Exact methods in the study of language and text (pp. 689–698). Mouton de Gruyter.
  • Wikipedia. (2023a). Hilberts 16.Tes Problem. Retrieved June 18, 2023, from https://de.wikipedia.org/wiki/Hilbertsche_Probleme#Hilberts_sechzehntes_Problem
  • Wikipedia. (2023b). SIR-Modell. Retrieved September 5, 2023, from https://de.wikipedia.org/wiki/SIR-Modell
  • Zill, D. G., & Cullen, M. R. (2001). Differential equations with Boundary-Value Problems (5th ed.). Brooks/Cole.

Appendix

In this appendix, we give the detailed proofs of some results in the sections above which are necessary for exactness but which are not needed on a first reading to grasp the main ideas this work is based upon. Furthermore, we present the parameters of the fitting curves of Section 4.2.

7.1. Proof of the Fish-trap

Proposition 7.1

(Fish trap). If a trajectory υ enters the fish trap, it will stay there for all times.

Proof.

Without loss of generality, the trajectory υ(t) starts in sector I and enters the fish trap on the boundary of sector II at time 0. Hence

(36) $\upsilon(0)\in g_y;\quad \dot{\upsilon}(0)=\begin{pmatrix}-c\\ 0\end{pmatrix},\quad c>0.$

Now assume the opposite of the statement of the proposition, i.e. there exists a time $t$ such that $\upsilon(t)$ is outside the fish trap, either in sector I or in sector III. By the Intermediate Value Theorem (IVT; see Beardon, 1997, p. 52), there exists a time $\tau$ at which $\upsilon$ leaves the fish trap into the relevant sector, say, without loss of generality, sector I. Thus $\upsilon(\tau)\in g_y$. To leave the fish trap, the velocity vector $\dot{\upsilon}(\tau)=\begin{pmatrix}\dot{x}\\ \dot{y}\end{pmatrix}$ has to point out of the fish trap and thus has to satisfy:

$\begin{pmatrix}\dot{x}\\ \dot{y}\end{pmatrix}\cdot\begin{pmatrix}\delta\\ \gamma\end{pmatrix}>0.$

Hence $\delta\dot{x}+\gamma\dot{y}>0$. Now on the line $g_y$, it follows by definition that $\dot{y}=0$, hence $\delta\dot{x}>0$. With $\delta>0$, this yields $\dot{x}>0$. But this contradicts the fact that $\dot{x}$ is negative along that part of the line $g_y$ lying above $g_x$. Hence, the assumption was wrong, and $\upsilon(t)$ cannot leave the fish trap into sector I. The argument carries over verbatim to the case of entering into sector III or of the trajectory starting in sector III. If it started already in sector II or IV, it could not leave by the same arguments.
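The proposition can also be observed numerically. The sketch below is an illustration only: it assumes that sector II is the region above $g_x$ and below $g_y$ (as the argument above suggests), uses purely illustrative parameter values, and follows one trajectory with a hand-written classical Runge-Kutta (RK4) scheme; all function names are hypothetical.

```python
def plc_rhs(x, y, alpha, beta, gamma, delta):
    """Right-hand side of the PLC equations (58) and (59)."""
    return (alpha * x * (1.0 - x) - beta * x * y,
            gamma * y * (1.0 - y) - delta * x * y)


def in_sector_two(x, y, alpha, beta, gamma, delta, tol=1e-12):
    """Assumed definition: above g_x (x decreases) and below g_y (y increases)."""
    above_gx = y >= alpha / beta * (1.0 - x) - tol
    below_gy = y <= 1.0 - delta / gamma * x + tol
    return above_gx and below_gy


def follow(x, y, params, t_end=10.0, h=0.01):
    """Integrate with RK4; return the final state and whether sector II was never left."""
    stayed = in_sector_two(x, y, *params)
    for _ in range(int(round(t_end / h))):
        k1 = plc_rhs(x, y, *params)
        k2 = plc_rhs(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1], *params)
        k3 = plc_rhs(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1], *params)
        k4 = plc_rhs(x + h * k3[0], y + h * k3[1], *params)
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        stayed = stayed and in_sector_two(x, y, *params)
    return (x, y), stayed


# Illustrative (assumed) parameters and a starting point inside sector II.
params = (1.0, 2.0, 1.0, 2.0)  # alpha, beta, gamma, delta
final_state, stayed = follow(0.1, 0.6, params)
print(final_state, stayed)  # the trajectory remains trapped and approaches C_y = (0, 1)
```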

7.2. Proof of the Main Theorem

To state the theorem (and deduce our conclusions), we must first fix some notation.

Definition 7.2

(Dynamical system). Let $f: M\to\mathbb{R}^2$ be a smooth function from an open set $M\subset\mathbb{R}^2$ into $\mathbb{R}^2$. Then, the differential equation

(37) $\dot{x}(t)=f(x(t))$

defines a dynamical system on M.

Definition 7.3

(Trajectory of a dynamical system). Let a dynamical system on the set $M$ be given. A smooth curve $\upsilon:\,]a,b[\,\to M$ is called an integral of the dynamical system if its velocity vector $\dot{\upsilon}(t)$ coincides with $f(\upsilon(t))$ for all times. The set of points traced out in $M$ by $\upsilon$ is called the corresponding trajectory.

Remark 7.4. By abuse of notation, if there is no danger of confusion, we will sometimes use the terminology trajectory and integral interchangeably.

Definition 7.5

(Limit set). Let a dynamical system on the set $M$ be given. A point $P\in M$ belongs to the limit set of the point $Q\in M$ if there exists a sequence of times $t_j$ going to $\infty$ such that the integral $\upsilon$ of the dynamical system through $Q$ satisfies

(38) $\lim_{j\to\infty}\upsilon(t_j)=P.$

We denote the limit set (the set of all points sharing the property above) for $Q$ by

(39) $\omega(Q)\subset M.$

Now we can state:

Theorem 7.6

(Poincaré-Bendixson). (Theorem 7.16 on page 223 in Teschl, 2012) Let a dynamical system on $M$ be given, fix a point $x\in M$ and suppose $\omega(x)$ is compact, connected and contains finitely many critical points. Then one of the following cases holds:

  1. $\omega(x)$ is a critical point

  2. $\omega(x)$ is a non-trivial periodic orbit

  3. $\omega(x)$ consists of finitely many critical points $\{x_j\}$ and non-closed trajectories connecting them.

To deal with the assumptions of theorem 7.6, we quote:

Theorem 7.7.

(Theorem 4.2 in Verhulst, 1996) If a trajectory through $Q\in M$ is confined to some bounded region, then the set $\omega(Q)$ is compact, connected and non-empty.

We need one more result for the proof:

Proposition 7.8.

If $P\in\Delta$ is not a critical point, then there does not exist an integral $\upsilon:[0,\infty[\,\to\Delta$ and a sequence $t_j\to\infty$ such that

(40) $\lim_{j\to\infty}\upsilon(t_j)=P.$

Proof.

Assume that there is such a point $P$. Since the argument carries over mutatis mutandis to the other sectors, we will assume without loss of generality that $P$ is in sector I. It is clear that $\upsilon$ has to stay in sector I for all times, because once it entered sector II or IV, it would never be able to return close to $P$ in sector I due to proposition 3.3. But then $\upsilon$ will be above $g_y$ and $g_x$ for all times. By the defining property of $g_x,g_y$, this implies that for $\dot{\upsilon}=(\dot{x},\dot{y})$ we have:

(41) $\dot{x}<0,\quad \dot{y}<0.$

Hence $x(t_j)$ as well as $y(t_j)$ are monotonically decreasing sequences of real numbers bounded below by zero. By the Archimedean axiom and the assumptions made about $\upsilon$, they converge as follows:

(42) $x(t_j)\to x(P);\quad y(t_j)\to y(P).$

Note that we must have

(43) $x(t_j)\ge x(P);\quad y(t_j)\ge y(P)\quad \forall j.$

Due to the monotonicity, it also follows that

(44) $t_j<t<t_i \Rightarrow x(P)\le x(t_i)\le x(t)\le x(t_j).$

Hence for $t>t_j$ it follows that

(45) $\|\upsilon(t)-P\|^2=|x(t)-x(P)|^2+|y(t)-y(P)|^2$
(46) $\le |x(t_j)-x(P)|^2+|y(t_j)-y(P)|^2,$
(47) $\Rightarrow \|\upsilon(t)-P\|\le\|\upsilon(t_j)-P\|.$

Now we are going to evaluate the mean values as in the proof of theorem 3.1 and deduce the desired contradiction. Given any $\varepsilon>0$, choose a $\nu>0$ such that for $x$ within distance $\nu$ of $x(P)$ and $y$ within distance $\nu$ of $y(P)$ we have:

(48) $|\ln(x)-\ln(x(P))|<\varepsilon;\quad |\ln(y)-\ln(y(P))|<\varepsilon.$

Such a $\nu$ exists by the continuity of $\ln$ at $x(P),y(P)$.

Now choose $j$ so large that

(49) $\|\upsilon(t_j)-P\|<\nu,$

hence we find for all $t>t_j$:

(50) $|x(t)-x(P)|<\nu \Rightarrow |\ln(x(t))-\ln(x(P))|<\varepsilon$
(51) $|y(t)-y(P)|<\nu \Rightarrow |\ln(y(t))-\ln(y(P))|<\varepsilon.$

As before consider the integrals

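(52) $\left|\frac{1}{t-t_j}\int_{t_j}^{t}\frac{\dot{x}}{x}\,dt\right|=\frac{1}{t-t_j}\left|\ln(x(t))-\ln(x(t_j))\right|\le\frac{\varepsilon}{t-t_j}.$

On the other hand, we have from the PC system:

(53) $\frac{\dot{x}}{x}=\alpha(1-x)-\beta y.$

Integrating as before shows

(54) $\left|\alpha-\alpha\frac{1}{t-t_j}\int_{t_j}^{t}x\,dt-\beta\frac{1}{t-t_j}\int_{t_j}^{t}y\,dt\right|\le\frac{\varepsilon}{t-t_j}.$

Consider $\frac{1}{t-t_j}\int_{t_j}^{t}x\,dt$ and note, as was discussed before, that

(55) $|x(t)-x(P)|<\nu \Rightarrow x(P)\le x(t)\le x(P)+\nu,$

which yields

(56) $x(P)\le\frac{1}{t-t_j}\int_{t_j}^{t}x\,dt\le x(P)+\nu.$

We abbreviate $\bar{x}(j,t)=\frac{1}{t-t_j}\int_{t_j}^{t}x\,dt$; $\bar{y}(j,t)=\frac{1}{t-t_j}\int_{t_j}^{t}y\,dt$. Hence, we find that $(\bar{x}(j,t),\bar{y}(j,t))$ lies in the $\nu$-neighborhood of $P$.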

By choosing $\varepsilon$ small enough and $t$ large enough, we find that the right-hand side $\frac{\varepsilon}{t-t_j}$ of the inequality (54) becomes arbitrarily small. This in turn implies that $(\bar{x}(j,t),\bar{y}(j,t))$ closely satisfies the linear equation

(57) $\frac{\alpha}{\beta}(1-\bar{x}(j,t))=\bar{y}(j,t),$

which is the line $g_x$. The analogous argument for $\frac{\dot{y}}{y}$ implies that $(\bar{x}(j,t),\bar{y}(j,t))$ closely satisfies the equation of the line $g_y$; hence $(\bar{x}(j,t),\bar{y}(j,t))$ is arbitrarily close to the critical point $C$. However, we have shown that $(\bar{x}(j,t),\bar{y}(j,t))$ converges to the point $P$, which is different from $C$ by assumption; hence for $\varepsilon$ small enough and $t$ large enough we get the desired contradiction.□

With these prerequisites we can now turn towards the classification:

Theorem 7.9.

Every trajectory in the PC system converges to one of the critical points.

Proof.

Any trajectory of the PC system is bounded since it is confined to $\Delta$; theorem 7.7 then implies that the limit set of any point is non-empty, compact and connected. Together with the fact that there exist only four critical points, we can apply theorem 7.6. Since theorem 3.1 prohibits the existence of periodic orbits, the limit set of any point in $\Delta$ is either a critical point or a connected set consisting of some critical points together with trajectories between them. But proposition 7.8 also excludes non-critical points on a trajectory connecting the critical points. Thus, we are left with only critical points as limit sets. The same argument as in the proof of proposition 7.8 guarantees that the convergence actually holds in the continuous sense (this excludes the possibility that a trajectory might recede from a limit point between times $t_i,t_j$). We briefly recall the argument. If $\upsilon$ is completely contained in sector I or III, then it is either above or below both of $g_x,g_y$ and the convergence is monotonic. If it enters sector II or IV, then by proposition 3.3 it will stay there for all times and also experience monotonic convergence.

7.3. The parameters of the Fitting Functions

For the sake of completeness, we present the parameters of the fitting functions used in Section 4.2. First recall the definition of the parameters of the fitting functions:

  1. Piotrowski-Altmann

    $f(t)=\frac{c}{1+a\,e^{-bt}}$

  2. Altmann, k=2

    $f(t)=\frac{c}{1+a\,e^{-(bt+dt^2)}}$

  3. Altmann, k=3

    $f(t)=\frac{c}{1+a\,e^{-(k_0+k_1t+k_2t^2+k_3t^3)}}$

  4. PLC model

    $x(t)$ solving

    (58) $\dot{x}=\alpha x(1-x)-\beta xy$
    (59) $\dot{y}=\gamma y(1-y)-\delta xy$

Piotrowski-Altmann;

Figure 3. PLC and Piotrowski-Altmann.

- Piotrowski-Altmann

a=70.4286
b=0.4642
c=1.0000

- PLC model

α=0.6142±0.1381
β=2.6240±50.0777
γ=2.1509±5.4645
δ=4.2129±7.2251
x0=0.0067±0.0044
y0=0.0000±0.0000

e-epithesis;

Figure 4. e-epithesis.

- Altmann k=2

a=174.7431±323.4998
b=0.6895±0.3641
c=0.4139±0.1840
d=0.0186±0.0098

- Altmann k=3

c=0.3973±0.1320
k0=1.5936±1.2400
k1=0.1060±0.3113
k2=0.0336±0.0303
k3=0.0010±0.0007

- PLC model

α=0.1076±0.0256
β=2.3732±144.0511
γ=0.0377±1.4282
δ=1.1806±4.1077
x0=0.0618±0.0199
y0=0.0001±0.0034

affirmative declarative (AD);

Figure 5. Affirmative declarative.

- Altmann k=2

a=1.3349±29.0541
b=3.4284±11.6307
c=0.0508±0.0123
d=0.6198±1.5044

- Altmann k=3

c=4288.0894±281522975.0661
k0=58.0758±65802.4601
k1=16.7839±15.5784
k2=1.9223±1.8064
k3=0.0701±0.0671

- PLC model

α=2.1602±2.3368
β=2.5054±2.1382
γ=3.5855±8.9612
δ=0.3424±12.2549
x0=0.0000±0.0000
y0=0.0000±0.0000

negative declarative (ND);

Figure 6. Negative declarative.

- Altmann k=2

a=159.0507±2610.5674
b=0.8750±3.9980
c=0.5660±4.0071
d=0.0272±0.4874

- Altmann k=3

c=0.6009±0.2411
k0=44.1164±33.5010
k1=15.7658±13.1213
k2=1.8539±1.6519
k3=0.0720±0.0683

- PLC model

α=1.7308±10.8865
β=8.2517±366.9127
γ=1.1318±17.8137
δ=3.1689±31.6397
x0=0.0000±0.0010
y0=0.0003±0.0160

e-epithesis-prediction;

Figure 7. e-epithesis.

- Altmann k=3

c=0.3973±0.1320
k0=1.5937±1.2400
k1=0.1059±0.3113
k2=0.0336±0.0303
k3=0.0010±0.0007

- Altmann k=3-predict

c=1.9536±113.1033
k0=3.8067±60.6009
k1=0.1604±0.4607
k2=0.0006±0.0969
k3=0.0001±0.0026

- PLC model

α=0.1076±0.0256
β=2.3732±144.0511
γ=0.0377±1.4282
δ=1.1806±4.1077
x0=0.0618±0.0199
y0=0.0001±0.0034

- PLC model-predict

α=0.1109±0.0814
β=2.7888±585.9505
γ=0.0614±15.6047
δ=0.9330±43.0411
x0=0.0602±0.0244
y0=0.0001±0.0327

affirmative declarative-prediction;

Figure 8. Affirmative declarative.

- Altmann k=3

c=4288.0894±281522975.0661
k0=58.0758±65802.4601
k1=16.7839±15.5784
k2=1.9223±1.8064
k3=0.0701±0.0671

- Altmann k=3-predict

c=0.0962±0.0184
k0=15.4560±50.7514
k1=27.1872±30.2628
k2=6.8091±6.0113
k3=0.4529±0.3692

- PLC model

α=2.1602±2.3368
β=2.5054±2.1382
γ=3.5855±8.9612
δ=0.3424±12.2549
x0=0.0000±0.0000
y0=0.0000±0.0000

- PLC model-predict

α=2.0863±16.9947
β=2.6160±197.3765
γ=2.6927±222.9634
δ=1.1746±2518.9027
x0=0.0000±0.0000
y0=0.0000±0.0001