620
Views
1
CrossRef citations to date
0
Altmetric
Original Articles

Approximate reduction of linear population models governed by stochastic differential equations: application to multiregional models

ORCID Icon &
Pages 461-479 | Received 10 Jul 2017, Accepted 11 Sep 2017, Published online: 04 Oct 2017

ABSTRACT

In this work we develop approximate aggregation techniques in the context of slow-fast linear population models governed by stochastic differential equations and apply the results to the treatment of populations with spatial heterogeneity. Approximate aggregation techniques allow one to transform a complex system involving many coupled variables and in which there are processes with different time scales, by a simpler reduced model with a fewer number of ‘global’ variables, in such a way that the dynamics of the former can be approximated by that of the latter. In our model we contemplate a linear fast deterministic process together with a linear slow process in which the parameters are affected by additive noise, and give conditions for the solutions corresponding to positive initial conditions to remain positive for all times. By letting the fast process reach equilibrium we build a reduced system with a lesser number of variables, and provide results relating the asymptotic behaviour of the first- and second-order moments of the population vector for the original and the reduced system. The general technique is illustrated by analysing a multiregional stochastic system in which dispersal is deterministic and the rate growth of the populations in each patch is affected by additive noise.

2010 MATHEMATICS SUBJECT CLASSIFICATION:

1. Introduction

Nature offers many examples of systems with an inherent complexity whose study leads to mathematical models with a large number of state variables whose analytical study is, in most cases, not feasible. In order to be able to extract important information about the behaviour of some of these complex models, one can resort to ‘approximate aggregation methods’. These are mathematical techniques, which are usually applied in systems governed by processes with different time scales, in which appropriate approximations are introduced in order to transform the system under consideration into a reduced system with a lower number of variables, called ‘global variables’. In this way, the behaviour of the original system can be approximated, but not known with exactitude, in terms of the knowledge of the behaviour of the reduced system.

Approximate aggregation techniques in population dynamics have been widely studied in the context of deterministic systems with different time scales both in continuous and discrete time (see the review in [Citation2] and the references therein), as well as in discrete stochastic models incorporating either environmental [Citation19] or demographic stochasticity [Citation21].

Stochastic differential equations (SDEs) can be thought of as resulting from the introduction of environmental stochasticity in the coefficients of deterministic ODEs. In spite of the difficulty of their analytical treatment, they have become popular as population modeling tools (see for example [Citation6, Citation8] for the scalar case and [Citation13, Citation24] for applications to competition models and epidemiology respectively).

The aim of this work is to formulate an approximate aggregation technique valid to reduce fast-slow linear population models governed by SDEs, to give sufficient conditions for the solutions of the models to be positive and to relate the asymptotic behaviour of the first- and second-order moments of the population vector for the original and the reduced system. The original model is built by considering a fast deterministic process that converges to an equilibrium, together with a slow process in which the parameters are affected by additive noise. In the resulting system the quotient between the rate of diffusion squared and the speed of drift is O(ε), ε0, where ϵ is a measure of the difference of time scales between the fast and the slow process. This is in contrast with most previous works in the field [Citation5, Citation10, Citation11, Citation16] which are valid in other situations. An exception is [Citation14] which covers our case of interest but which only deals with the relationships between the original and the reduced systems in finite time intervals.

The existing literature on the field of the approximate reduction of SDEs has been developed mainly in the context of physics and control theory. Our approach, although general in nature, is aimed to population dynamics applications. Specifically, we will employ it to study a multiregional model consisting of a single population living in a multipatch environment in such a way that dispersal is deterministic and the growth rate of the population in each patch is affected by additive noise. Models with spatial heterogeneity have been widely studied through aggregation techniques (see [Citation3] for the continuous time case and [Citation2] for discrete time), normally making use of the fact that in many practical situations the dispersal of individuals amongst the different patches is fast with respect to other processes like demography or competition (an exception to this is [Citation20]). In this work we contemplate a situation that has not been dealt with so far: dispersal between some spatial patches can be fast with respect to demography whereas migration between some other patches can happen at the same time scale of demography. For example we can think of a population of birds living in different islands in such a way that inter-island movements happen at the scale of demography but, in comparison, intra-island migrations are fast.

The manuscript is organized as follows: in Section 2 we present the general formulation of linear SDE models and, in order to be able to use them in population dynamics applications, we give sufficient conditions that guarantee that if initial conditions are positive, the solution to such a system remains positive for all times. Section 3 presents the general formulation of a linear two time scale population model with stochasticity affecting the slow process. Section 4 carries out the reduction of the system, which is accomplished by letting the fast process reach equilibrium and defining and adequate set of new variables. In Section 5 we apply the general technique to the case of a stochastic multiregional model, with special attention to the case in which all migrations are fast with respect to demography, since it is the most frequent situation in practice and moreover in that case the reduced system is scalar. Section 6 presents a result that allows one to relate the asymptotic behaviour of the first and second statistical moments of the population vector for the original and the reduced system, and applies it to study the asymptotic behaviour of the multiregional models of Section 5. A brief discussion of results and the Appendix with mathematical proofs complete the manuscript.

2. Linear population models with two time scales

Throughout the paper we assume we are working in a complete probability space (Ω,F,{Ft}t0,P) where the filtration {Ft}t0 satisfies the usual conditions [Citation18]. Let us consider a structured population modelled by an linear autonomous homogeneous stochastic differential equation. The model has the form (1) dX(t)=AX(t)dt+h=1mBhX(t)dWh(t),X(0)=X0,(1) where X(t)=(X1(t),X2(t),,Xd(t))Rd is the population vector, A=[aij]Rd×d, Bh=[bijh]Rd×d, h=1,2,,m and W(t)=(W1(t),W2(t),,Wm(t))T is a m-dimensional standard Wiener process defined in the previous probability space. Moreover, we assume that X0Rd is a non-random vector. Models of the kind (Equation1) are obtained from a linear deterministic model (2) dX(t)dt=AX(t),X(0)=X0(2) if we add noise to the population vital rates aij. Indeed, system (Equation1) can be written in the form dXi(t)=j=1d(aijdt+bij1dW1(t)++bijmdWm(t))Xj(t),i=1,,d from where we see that each coefficient bijh characterizes the intensity of the noise dWh affecting aij. Note that the case in which the noises dWj(t) are correlated can be reduced to this setting through an appropriate transformation [Citation1, p. 126]. Systems of the kind (Equation1) can be interpreted in the sense of Ito or in the sense of Stratonovich, and the results obtained in the two cases differ [Citation23]. However, one can choose any of the two interpretations as long as one defines appropriately the parameters of the model [Citation7]. In this work we will make use of the Ito interpretation. It is well known [Citation1, p. 126] that in the previous conditions there exists a unique solution to (Equation1) which is continuous with probability one.

In order to be a valid model for a population, the solution of (Equation1) for any positive initial condition must remain non-negative for all times, so we turn our attention to this property. Given a vector or matrix x we will write x0 (resp. x>0) to denote that all the components of x are non-negative (resp. positive). Let us recall that a square matrix A is said to be a Metzler matrix or a essentially non-negative matrix when it has non-negative off-diagonal entries, that is, aij0 for ij [Citation22]. It is well known that if A is a Metzler matrix then solutions of the deterministic system (Equation2) meet the desired property. However, in model (Equation1) additional requirements are needed in order to ensure positivity of solutions. The next result gives sufficient conditions:

Theorem 2.1

If matrix A is a Metzler matrix and matrices B1,,Bk are diagonal, then given X(0)=X0>0, the solution of (Equation1) verifies that P(X(t)>0 for all t0)=1.

Proof.

See Appendix.

3. A model with two time scales

We suppose a stage-structured population in which individuals are classified into stages or groups attending to any characteristic of the life cycle. Moreover, each of these groups is divided into several subgroups that can correspond to different spatial patches, different individual activities or any other characteristic that could change the life cycle parameters. The model is therefore general in the sense that we do not state in detail the nature of the population or the subpopulations.

We consider the population being subdivided in q populations (or groups). Each group is subdivided in subpopulations (subgroups) in such a way that for each i=1,2,,q, group i has Ni subgroups. Therefore, the total number of subgroups is N:=N1+N2++Nq. We denote the fast time as τ, while the slow time will be denoted by t. In this way we have t=ετ where ε<<1 is a small number that represents the ratio between the slow and the fast times.

Le xij(τ) be the density of subpopulation j of population i at time τ, with i=1,2,,q and j=1,2,,Ni. In order to describe the population of group i at time τ we will use vector xi(τ)=(xi1(τ),xi2(τ),,xiNi(τ))TRNi, i=1,2,,q, where T denotes transposition. The composition of the total population is then given by vector X(τ)=(x1(τ)T,x2(τ)T,,x1(τ)T)TRN.

In the evolution of the population we will consider two linear processes whose corresponding characteristic time scales, are very different from each other. In order to include in our model both time scales we will model these two processes, to which we will refer as the fast and the slow dynamics, by two different matrices.

In principle, we will make no special assumptions regarding the characteristics of the slow dynamics other than linearity and restrictions to guarantee positivity of solutions to the slow process. Thus, we will assume that the parameters of the slow process are defined by (3) S+h=1kBhdWh(t)dt,(3) where: S.1. Matrix SRN×N models the deterministic part of the slow process and has non-negative off-diagonal entries, that is, S is a Metzler matrix. We consider S divided into blocks Sij, 1i,jq in such a way that S=S11S1qSq1Sqq. Each block Sij=[sijαβ] has dimensions Ni×Nj and characterizes the rates of transference of individuals from the subgroups of group j to the subgroups of group i. More specifically, for each α=1,2,,Ni and each β=1,2,,Nj, entry sijαβ represents the rate of transference of individuals from subgroup β of group j to subgroup α of group i.

S.2. W1,,Wk are independent standard Wiener processes (so that by dW1(t)/dt,,dWk(t)/dt we denote the associated weakly defined gaussian white noises) and, for each h=1,,k, matrix BhRN×N models the contribution of noise dWh(t) to the dynamics of the slow process. In order to be able to guarantee positivity of solutions (Theorem 2.1), we assume that matrices Bh are diagonal, that is, Bh=diag(B1h,,Bqh),h=1,,kwith Bih=diag(bi1h,,biNih),i=1,,q. Note from Equation (Equation3) that biαh models the intensity of the noise dWh(t) affecting coefficient siiαα.

In this context, we say that an eigenvalue λ of a certain square matrix A is strictly dominant when the real part of λ is strictly larger than the real part of the rest of the eigenvalues of A.

As far as the behaviour of the fast dynamics is concerned, we will make the following three assumptions:

F.1. The fast process is deterministic.

F.2. The fast dynamics is an internal process for each group, that is, there is no transference of individuals from one group to a different one. Therefore, for each i=1,,q, the fast dynamics of group i will be represented by a Metzler matrix Fi of dimensions Ni×Ni. We will assume that Fi is irreducible in the sense that there exists r>0 such that rI+Fi is a primitive non-negative matrix [Citation22]. Therefore, the matrix that governs the fast dynamics for the whole population is (4) F=diag(F1,F2,,Fq).(4)

F.3. The fast process in each group has a non-trivial equilibrium point which is asymptotically stable. More specifically, for each i=1,,q matrix Fi has eigenvalue 0 and it is strictly dominant so that the rest of the eigenvalues of Fi have negative real parts. Since Fi is a Metzler irreducible matrix, eigenvalue 0 is simple [Citation22, Theorem 2.6] and moreover, there exist positive right and left eigenvectors vi and ui of Fi associated to eigenvalue 0 for which we choose the following normalization conditions (5) Fivi=0;uiTFi=0Tvi1=1;uiTvi=1,(5) where 1 denotes the 1-vector norm.

In order to incorporate both time scales in our model, we will make use of parameter ε=t/τ. The model, to which we will refer as original system, has the following form in the slow time (6) dX(t)=1εFX(t)dt+SX(t)dt+h=1kBhX(t)dWh(t).(6) Alternatively, using the fast time τ we have (7) dX(τ)=FX(τ)dτ+εSX(τ)dτ+εh=1kBhX(τ)dWh(τ),(7) where we have made an abuse of notation and kept the same notation for X and the Wh in the new time, and we have used [Citation1, p. 47] that if W(t) is a standard Wiener process then so is εW(t/ε). We stress that the quotient between the rate of diffusion squared and the speed of drift is O(ε), ε0, which is in contrast with most approaches in the analysis of two-time scales systems [Citation5, Citation10, Citation11, Citation16] which are valid in other situations.

4. Approximate reduction of the model

In order to reduce the original system (Equation7), we will use the fact that the fast process has an asymptotic stable equilibrium, and we will approximate this system by another one in which the fast process has reached equilibrium.

Let i=1,,q be fixed and let Fˆi:=viuiT. From Equation (Equation5), we have that if the system were governed by the fast process exclusively, for any initial condition xiR+Ni the population of each group i would tend to vector xˆi:=Fˆixi=(uiTxi)vi. From this expression we note that vector vi defines the equilibrium population structure for group i and ui is a vector of reproductive values, that is, the larger uij, the higher the contribution of the j-th subgroup of the ith group to the equilibrium population. Therefore ui characterizes the size of the equilibrium population.

We define matrix Fˆ in the following way, Fˆ=diag(Fˆ1,Fˆ2,,Fˆq), and then for any initial condition XR+N, the equilibrium population for the fast dynamics in the whole system is given by Xˆ=FˆX. Let us define the non-negative matrices V:=diag(v1,v2,,vq)RN×q,U:=diag(u1T,u2T,,uqT)Rq×N whose interpretation is immediate bearing in mind what we pointed out about vi and ui.

Some of the properties of these matrices are gathered in the following lemma, whose proof is straightforward:

Lemma 4.1

Matrices F,Fˆ,V and U verify:

  1. Fˆ=VU, UV=Iq and the columns of V are independent and constitute a basis of ImFˆ.

  2. FFˆ=0.

Now, from Equation (Equation6) we will build an auxiliary system replacing the state variables in the right side by its equilibrium values for the fast process and use that FFˆ=0, (8) dX(t)=1εFFˆX(t)dt+SFˆX(t)dt+h=1kBhFˆX(t)dWh(t)=SFˆX(t)dt+h=1kBhFˆX(t)dWh(t).(8) Now we define the vector of global variables as (9) Y=(y1,,yq)T:=UXRq(9) and, multiplying Equation (Equation8) on the left by U and using that Fˆ=VU and Equation (Equation9) we obtain the aggregated system (10) dY(t)=S¯Y(t)dt+h=1kB¯hY(t)dWh(t),(10) where we have defined (11) S¯=[s¯ij]:=USV=u1TS11v1u1TS1qvquqTSq1v1uqTSqqvqRq×q(11) (12) B¯h:=UBhV=diagα=1N1v1αu1αb1αh,,α=1NqvqαuqαbqαhRq,h=1,,k.(12) Note that the global variables Y(t) defined by Equation (Equation9) have the following expression in terms of the variables X(t) of the auxiliary system: yi(t)=uiTxi(t)=ui1xi1(t)+ui2xi2(t)++uiNixiNi(t),i=1,,q. Note that:

  1. yi(t) is a linear combination of the variables corresponding to group i, being the coefficients of the combination the components of vector ui. Recall that ui is a vector of reproductive values for the fast process in group i. Therefore, for each j=1,,Ni, variable xij(t) has a relative weight in yi(t) which is proportional to uij, that is, proportional to the contribution to the total equilibrium population that an individual initially present in group i and subgroup j would have in the case that the system were governed by the fast process exclusively.

  2. The global variables are conservative for the fast process. Indeed, suppose that the fast process is the only one acting in the system. Then we would have X˙(t)=FX(t)/ε and using that UF=0, Y˙(t)=UX˙(t)=UFX(t)/ε=0.

  3. The components of the matrices representing the drift and the diffusion for the reduced system are certain linear combinations of their analogues for the original system, where the coefficients of the combination are determined by the equilibrium characteristics of the fast process.

The next result together with Theorem 2.1 guarantees that the original and the aggregated systems have positive solutions for any positive initial conditions.

Proposition 4.2

F+εS and S¯ are Metzler matrices and matrices B¯h, h=1,,k are diagonal. Therefore, according to Theorem 2.1 both the original system (Equation7) and the aggregated system (Equation10) verify that for any positive initial condition the solution remains positive for all t>0 with probability one.

Proof.

F+εS is clearly a Metzler matrix for it is the sum of Metzler matrices. Now let ij. Since S is a Metzler matrix we have that Sij0 and using the fact that ui and vj are positive vectors, from Equation (Equation11) it follows that s¯ij=uiTSijvj0 and so S¯ is a Metzler matrix. Moreover, from Equation (Equation12) it is clear that B¯h is a diagonal matrix for each h=1,,k.

5. Multiregional models with two time scales

In this section we will illustrate the reduction technique by applying it to a multiregional model.

5.1. Model setting

We consider a population living in a multipatch system. We assume that there are a number N of different patches among which the individuals can migrate. The growth of the population in each patch is linear and is affected by stochasticity. Migration among patches is assumed to be deterministic.

We number the patches in the form (i,α), i=1, 2, α=1,,Ni, where N:=N1+N2. Coming back to the terminology of Section 3, the first index, i, defines the ‘group’ of patches and the second, α, the ‘subgroup’ within that group. Note that in our setting we have chosen that the number q of groups is 2 just for the sake of simplicity in the expression of the matrices involved, but the generalization of the model to an arbitrary number q of groups is straightforward. Let xiα(t) denote the population in group i and subgroup j at time t and let X=(x11,,x1N1,x21,,x2N2)TRN be the population vector.

We assume that migration is fast within each group of patches, that is, from any patch (i,α) to any other patch of the form (i,β), βα. Migration is assumed to be slow between patches belonging to different groups, that is, from any patch (i,α) to any other patch of the form (j,β), with ji. We can think of each group of patches as spatial regions located close to each other so that migration between them is easy, whereas different groups of patches correspond to regions amongst which migration is more difficult. For example, in a population of birds each group of patches can correspond to an island and the subgroups can correspond to the different spatial locations within an island, so that intra-island migrations are fast with respect to inter-island movements.

Let τ and t denote, respectively, the times corresponding to the fast and the slow migrations and let ε:=τ/t be the ratio between both. We assume that the growth of the population takes place in the slow time scale. For each pair (i,α), i=1, 2, α=1,,Ni, let riα be the deterministic population growth rate in patch (i,α) and let us assume that this growth rate is affected by a noise defined by a certain linear combination σiα1dW1(t)/dt++σiαkdWk(t)/dt of (weakly defined) independent white noise processes, where σiαh0 for each h=1,,k. Now we define matrices Ri:=diag(ri1,,riNi)RNi×Ni,i=1,,qR:=diag(R1,R2)RN×NGih:=diag(σi1h,,σiNih)RNi×Ni,i=1,,q,h=1,,kGh:=diag(G1h,G2h)RN×N,h=1,,k. Regarding the slow migration between different groups of patches, for each ij and each α=1,,Ni, β=1,,Nj, we define lijaβ0 as the (slow) migration coefficient from patch (j,β) to patch (i,α). Similarly we define liiaβ=0 for each α,β=1,,Ni with αβ (as there is no slow migration within a group of patches) and liiαα:=j=1, jiqβ=1, βαNjljiβα,i=1,2,α=1,,Ni. Let us define matrices Lij:=[lijαβ]RNi×Nj, i,j=1,,q and L:=L11L12L21L22. Then the slow process, that is, the joint effect of the growth process and of the slow migration between patches of different groups, can be modelled by the following system of SDEs dX(t)=(L+R)X(t)dt+h=1kGhX(t)dWh(t),X(0)=X0. Regarding the fast migration between patches of the same group, for each i=1, 2 and α,β=1,,Ni with αβ, let miαβ0 be the (fast) migration rate from patch (i,β) to patch (i,α). For β=α we define (13) miαα:=β=1, βαNimiβα.(13) Now let Mi:=[miαβ]RNi×Ni, i=1, 2. We assume that the miαβ are such that Mi is irreducible for each i=1, 2. From Equation (Equation13) we have that the columns of each Mi add up to zero and so matrix Mi+I is a non-negative primitive column stochastic matrix. Therefore 0 is the (strictly) dominant eigenvalue of Mi and moreover it is simple. Let ui:=1i=(1,,1)TRNi and vi=(vi1,,viNi)TRNi be its associated positive left and right eigenvectors, where we assume the normalization condition 1iTvi=1. Note that vector vi defines the equilibrium distribution between the different patches of group i when we consider the fast migration as the only process acting on the system. Now we define matrices M:=diag(M1,M2)RN×NV:=diag(v1,v2)RN×2,U:=diag(11T,12T)R2×NMˆi:=vi1iTRNi×Ni,i=1,2,Mˆ:=diag(Mˆ1,Mˆ2)RN×N. Then, the complete model that takes into account the joint effect of the slow population growth in each patch and the slow and fast migrations between patches is given by dX(t)=1εM+L+RX(t)dt+h=1kGhX(t)dWh(t) or, using the fast time τ, (14) dX(τ)=(M+ε(L+R))X(τ)dτ+εh=1kGhX(τ)dWh(τ),X(0)=X0,(14) which constitutes a system of N linear SDEs.

Note that under the above conditions we are in the general setting of Section 3 by taking S=L+R, F=M, Fˆ=Mˆ, Bh=Gh,h=1,,k and Hypotheses S1, S2, F1, F2 and F3 for the slow and fast processes are met.

5.2. Model reduction

Therefore we can proceed to the aggregation of the original system following the procedure developed in Section 4. We define the global variables Y:=UX, that is, yi(t)=1iTxi(t)=xi1(t)+xi2(t)++xiNi(t),i=1,2, so that yi is the total population in all patches of group i, and the reduced aggregated system is the two dimensional SDE (15) dY(t)=S¯Y(t)dt+h=1kG¯hY(t)dWh(t),Y(0)=UX0,(15) where S¯:=U(L+R)V=11T(L11+R1)v111TL12v212TL21v111T(L22+R2)v2R2×2G¯h:=UGhV=diag(11TG1hv1,12TG2hv2),h=1,,k. In order to be more specific we will consider the particular case in which N1=N2=2, that is, there are 4 patches (1,1), (1,2), (2,1) and (2,2), and the population vector is X=(x11,x12,x21,x22)T. Dispersal is fast between patches (1,1) and (1,2) and between (2,1) and (2,2), and is slow in the rest of cases.

Let i=1, 2 be fixed. We will keep the notation introduced above except in the case of the fast migrations, where in order to simplify it we will denote by pi and qi the (fast) migration rates from (i,1) to (i,2) and from (i,2) to (i,1) respectively. Therefore Mi=piqipiqi,i=1,2. We assume 0<pi,qi<1 so that matrix Mi is irreducible and vector vi has the form vi=(qi,pi)T/(pi+qi). Then the complete model has the form (Equation14) with Gh=diag(σ11h,σ12h,σ21h,σ22h), h=1,,k and M+ε(L+R) given by p1ε(l2111+l2121r11)q1εl1211εl1212p1q1ε(l2112+l2122r12)εl1221εl1222εl2111εl2112p2ε(l1211+l1221r21)q2εl2121εl2122p2q2ε(l1212+l1222r22). The global variables are Y=(y1,y2)T=(x11+x12,x21+x22)T and the reduced system has the form (Equation15) with (16) S¯=s¯11s¯12s¯21s¯22=q1(l2111l2121+r11)+p1(l2112l2122+r12)p1+q1q1(l1211+l1221)+p1(l1212+l1222)p2+q2q1(l2111+l2121)+p1(l2112+l2122)p1+q1q2(l1211l1221+r21)+p2(l1212l1222+r22)p2+q2G¯h=diag(σ¯1h,σ¯2h)=diagq1σ11h+p1σ12hp1+q1,q2σ21h+p2σ22hp2+q2,h=1,,k.(16)

5.3. Case in which migration is fast

Let us now consider the particular but relevant case in which there is only one group of patches, and therefore all the migrations among the patches are fast with respect to demography. This has been the usual setting in the literature of approximate aggregation techniques [Citation2, Citation3]. We can consider this case as a particular instance of the setting in Section 5.1 when we take q equal to 1 instead of equal to 2. In order to simplify the notation, we omit the index 1 regarding the group of patches. Therefore, we denote the patches as j=1,,N, and the vectors and matrices have the form X=(x1,,xN)T, R:=diag(r1,,rN), L=0 (there are no slow migrations), M:=[mαβ] and Gh:=diag(σ1h,,σNh). Therefore, the original system (Equation14) takes the form dX(τ)=(M+εR)X(τ)dτ+εh=1kGhX(τ)dWh(τ). In this case there is only one global variable y(t)=x1(t)+x2(t)++xN(t) and if v=(v1,,vN)T and 1=(1,,1)T are the positive right and left eigenvectors of M associated to eigenvalue 0 scaled so that 1Tv=1, we have V=v, U=1T. Therefore, the reduced system is the scalar SDE dy(t)=s¯y(t)dt+h=1kg¯hy(t)dWh(t),y(0)=j=1Nxj(0), where (17) s¯:=1TRv=j=1Nvjrj,g¯h=1TGhv=j=1Nvjσjh.(17)

6. Relationships between the original and the reduced systems

The aim of this Section is to relate the asymptotic behaviour of the first- and second-order moments of the solution of the original and the reduced systems (Equation8) and (Equation10) introduced in Section 3. The two systems fit in the framework of [Citation14] in which the quotient between the rate of diffusion squared and the speed of drift is O(εp), ε0, with p>1/2, but the results obtained in that reference, essentially a stochastic extension of the Tikhonov theory of singular perturbations, are valid only for finite time intervals.

The first- and second-order moments of the solution of general linear stochastic equations with a deterministic initial condition are finite and can be calculated as the solution of certain linear ordinary differential equations. Specifically [Citation1], for system (Equation1) EX(t) is the unique solution to the equation ddtEX(t)=AEX(t),EX(0)=X0, whereas the matrix of second-order moments H(t):=E(X(t)X(t)T)Rd×d is the unique non-negative-definite symmetric solution of the equation (18) dH(t)dt=AH(t)+H(t)AT+h=1kBhH(t)(Bh)T,H(0)=X0X0T.(18) Regarding Equation (Equation18), in order to work with a more tractable expression, we will make use of the Kronecker matrix product and the ‘vec’ operator (see [Citation9] for details). The Kronecker matrix product for two matrices A=[aij]Cm×n and B=[bij]Cr×s is defined as a matrix of size mr×ns with mn blocks in which the block in position (i,j) has the form aijB. For any matrix ACm×n, vecA is defined as the column vector that contains, in order, the columns of A. For any matrices A, Y, and B for which the product AYB makes sense, one has vec(AYB)=(BTA)vecY [Citation9, Theorem 6.8], and so applying the ‘vec’ operator to both sides of Equation (Equation18) we obtain that the second-order moments verify ddtvecH(t)=A2vecH(t),vecH(0)=vec(X0X0T),

where A2=INA+AIN+h=1kBhBhRd2×d2 and IN denotes the N×N identity matrix.

Therefore, the asymptotic behaviour of the first- and second-order moments of system (Equation1) can be characterized in terms of the dominant eigenvalue and eigenvectors of matrices A and A2. We will now use this fact to study the moments of systems (Equation8) and (Equation10).

In the case of the original system (Equation8), the analogous to matrices A and A2 in the previous reasonings are matrices CRN×N and C2RN2×N2 given by (19) C(ε):=F+εS,C2(ε):=F2+εS2,(19) where we have defined F2:=INF+FIN, S2:=INS+SIN+h=1kBhBh. In the case of the aggregated system (Equation10) the corresponding matrices are C¯(ε)=εS¯Rq×q,C¯2(ε)=εS¯2Rq2×q2 where S¯2:=IqS¯+S¯Iq+h=1kB¯hB¯h. We introduce the following two hypotheses:

Hypothesis 6.1

Matrix S¯ has a simple and strictly dominant real eigenvalue μ associated to non-negative right and left eigenvectors r and l, respectively.

Hypothesis 6.2

Matrix S¯2 has a simple and strictly dominant real eigenvalue μ2 associated to non-negative right and left eigenvectors r2 and l2, respectively.

From Hypothesis 6.1 we have that the first-order moments of the reduced system have the following asymptotic behaviour limtEY(t)exp(μt)=l,y0l,rr, where y0:=UX0 and , denotes the standard vector scalar product. Analogously, from Hypothesis 6.2 we have, for the second-order moments, limtvecE(Y(t)Y(t)T)exp(μ2t)=l2,y0y0l2,r2r2. The next result relates the asymptotic behaviour of the first- and second-order moments of the original system (Equation8) with that of the reduced system when ϵ is small, that is, when the separation between the time scales of migration and of demography is large enough:

Theorem 6.1

(a) Let Hypothesis 6.1 hold. Then for small enough ε>0, matrix C(ε)=F+εS has a simple and strictly dominant eigenvalue λ(ε) that can be written in the form λ(ε)=εμ+O(ε2) with associated right and left non-negative eigenvectors r(ε) and l(ε) such that r(ε)=Vr+O(ε);l(ε)=UTl+O(ε). Consequently, for the original system (Equation8) we have: (20) limτEX(τ)exp(εμ+O(ε2))=l,UX0l,rVr+O(ε).(20) (b) Let Hypothesis 6.2 hold. Then for small enough ε>0, matrix C2(ε)=F2+εS2 has a simple and strictly dominant eigenvalue λ2(ε) that can be written in the form (21) λ2(ε)=εμ2+O(ε2)(21) with associated right and left non-negative eigenvectors r2(ε) and l2(ε) such that (22) r2(ε)=(VV)r2+O(ε);l2(ε)=(UU)Tl2+O(ε).(22) Consequently, the asymptotic behaviour of the second-order moment of the original system is characterized by: (23) limτvecE(X(t)X(t)T)exp(εμ2+O(ε2))=l2,(UU)vec(X0X0T)l2,r2(VV)r2+O(ε).(23)

Proof.

See Appendix.

From the last theorem it follows in particular that if μ<0 (resp. μ>0), then λ(ε)<0 (resp. λ(ε)>0) for small enough ϵ, so that if the expected value of the population vector tends to zero (resp. infinity) in the reduced system the same happens for the original one when ϵ is small. Something analogous happens for the second-order moments regarding μ2 and λ2(ε).

Let us apply this result to relate the behaviour of the multiregional models of Section 5. In the first place we will consider the model with q=2, N1=N2=2. Note from (Equation16) that if there is at least a non-zero coefficient lijαβ, matrix S¯ is irreducible and then Hypothesis 6.1 holds. Its dominant eigenvalue is given by μ=s¯11+s¯22+(s¯11s¯22)2+4s¯12s¯212 and so the rate of growth of the first-order moments for the original system is εμ+O(ε2) for small enough ϵ. This expression allows one to study how the different combinations of the migration and growth parameters affect the asymptotic behaviour of the expected value of the population. In particular, μ<0 if and only if trS¯<0 and detS¯>0. Since S¯ has order two we can calculate right and left eigenvectors r and l associated to μ and then apply Equations (Equation20) and (Equation23) to obtain the full information about the asymptotic behaviour of the first-order moments of the system.

We could argue analogously for the second-order moments: in this case matrix S¯2 is of order four and so an analytical computation of μ2, r2 and l2 is non-feasible, but for instance we can use the Routh–Hurwitz criterion [Citation17] to obtain conditions for μ2 to be negative, so that εμ2+O(ε2) will also be negative for small ϵ.

In the case of the system of Section 5.3, the analysis is simpler cause the aggregated system is scalar and so are S¯ and S¯2, so that Hypotheses 6.1 and 6.2 hold trivially. Moreover, μ=s¯ and μ2=2s¯+h=1k(g¯h)2 where s¯ and the g¯h are given by Equation (Equation17) and we can take r=l=r2=l2=1. Therefore from Equations (Equation20) and (Equation23) we have limτEX(τ)exp(εμ+O(ε2))=y(0)v+O(ε),limτvecE(X(t)X(t)T)exp(εμ2+O(ε2))=y(0)2(vv)+O(ε). Note in particular that, when ϵ is small enough, the expected value of the population vector tends to zero if s¯<0, and the second-order moments also tend to zero if s¯<h=1k(g¯h)2/2.

7. Discussion

In this work we have presented a technique to carry out the reduction of linear two-scale population models governed by SDEs, therefore allowing one to simplify the treatment of these models. The reduction of the original model with N variables is carried out by letting the fast process reach equilibrium and defining an appropriate set of q global variables, q<N, which are linear combinations of the state variables and are conservative for the fast process. Moreover, we have obtained conditions that guarantee the positivity of solutions to the model.

We have also presented a result that allows one to know the the asymptotic behaviour of the first- and second-order statistical moments of the population vector of the original system through the computation of the dominant eigenvalues and eigenvectors of certain matrices of dimensions q×q and q2×q2 associated to the reduced system.

The aggregation technique has been applied to study stochastic multiregional models in which migration between some patches can be fast with respect to demography whereas dispersal between other patches happens at the scale of demography. In the simpler case in which all migrations are fast, the reduced system is a scalar SDE whose analysis is straightforward, therefore allowing one to easily characterize the asymptotic behaviour of those moments for the original multiregional model.

This work suggests possible lines of future development: on the first hand, and still within the linear setting, work needs to be done to try to relate the stochastic stability [Citation15] of the origin in the original and the reduced system, for this will provide more information regarding the persistence-extinction of the population. Secondly, in order to be able to study more realistic population models, the technique should be extended to nonlinear settings.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

Authors are supported by MINECO (Ministerio de Economía, Industria y Competitividad), Spain, project MTM2014-56022-C2-1-P.

References

  • L. Arnold, Stochastic Differential Equations: Theory and Applications, Wiley Interscience, New York, 1974.
  • P. Auger, R.B. de La Parra, J.-C. Poggiale, E. Sánchez, and L. Sanz, Aggregation methods in dynamical systems and applications in population and community dynamics, Phys. Life Rev. 5(2) (2008), pp. 79–105. doi: 10.1016/j.plrev.2008.02.001
  • P. Auger, J. Poggiale, and E. Sánchez, A review on spatial aggregation methods involving several time scales, Ecol. Complex. 10 (2012), pp. 12–25. doi: 10.1016/j.ecocom.2011.09.001
  • H. Baumgärtel, Analytic Perturbation Theory for Matrices and Operators, Vol. 15, Springer, New York, 1985.
  • N. Berglund and B. Gentz, Geometric singular perturbation theory for stochastic differential equations, J. Differential Equations 191(1) (2003), pp. 1–54. doi: 10.1016/S0022-0396(03)00020-2
  • C.A. Braumann, Variable effort fishing models in random environments, Math. Biosci. 156(1) (1999), pp. 1–19. doi: 10.1016/S0025-5564(98)10058-5
  • C.A. Braumann, Harvesting in a random environment: Itô or stratonovich calculus? J. Theoret. Biol. 244(3) (2007), pp. 424–432. doi: 10.1016/j.jtbi.2006.08.029
  • C. Carlos and C.A. Braumann, General population growth models with allee effects in a random environment, Ecol. Complex. 30 (2016), pp. 26–33. doi: 10.1016/j.ecocom.2016.09.003
  • M. Fiedler, Special Matrices and their Applications in Numerical Mathematics, Kluwer Boston, Inc., 1986.
  • N. Herath and D. Del Vecchio, Model order reduction for linear noise approximation using time-scale separation, in 2016 IEEE 55th Conference on Decision and Control (CDC), IEEE, Chicago, IL, USA, 2016, pp. 5875–5880.
  • N. Herath and D. Del Vecchio, Model reduction for a class of singularly perturbed stochastic differential equations: Fast variable approximation, in American Control Conference (ACC), 2016, IEEE, Boston, MA, USA, 2016, pp. 3674–3679.
  • R.A. Horn and C.R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, 2012.
  • D. Jiang, J. Yu, C. Ji, and N. Shi, Asymptotic behavior of global positive solution to a stochastic sir model, Math. Comput. Modelling 54(1) (2011), pp. 221–232. doi: 10.1016/j.mcm.2011.02.004
  • Y. Kabanov and S. Pergamenshchikov, Two-Scale Stochastic Systems: Asymptotic Analysis and Control, Vol. 49, Springer, Berlin, 2003.
  • R. Khasminskii, Stochastic Stability of Differential Equations, Vol. 66, Springer, Berlin, 2012.
  • H.J. Kushner, Weak Convergence Methods and Singularly Perturbed Stochastic Control and Filtering Problems, Vol. 3, Birkhäuser, Boston, 1990.
  • J.A. Linda, An Introduction to Mathematical Biology, Pearson, Upper Saddle River, NJ, 2007.
  • X. Mao, Stochastic Differential Equations and their Applications, Woodhead Publishing; 2nd ed. ( January 13, 2008), Oxford, 1997.
  • L. Sanz and J. Alonso, Approximate aggregation methods in discrete time stochastic population models, Math. Model. Nat. Pheno. 5(6) (2010), pp. 38–69. doi: 10.1051/mmnp/20105603
  • L. Sanz and R. Bravo de la Parra, Variables aggregation in a time discrete linear model, Math. Biosci. 157(1) (1999), pp. 111–146. doi: 10.1016/S0025-5564(98)10079-2
  • L. Sanz, A. Blasco, and R. Bravo de la Parra, Approximate reduction of multi-type Galton–Watson processes with two time scales, Math. Models Methods Appl. Sci. 13(04) (2003), pp. 491–525. doi: 10.1142/S0218202503002659
  • E. Seneta, Non-Negative Matrices and Markov Chains, Springer Science & Business Media, New York, 2006.
  • M. Turelli, Random environments and stochastic calculus, Theoret. Popul. Biol. 12(2) (1977), pp. 140–178. doi: 10.1016/0040-5809(77)90040-5
  • C. Xu and S. Yuan, Competition in the chemostat: A stochastic multi-species model and its asymptotic behavior, Math. Biosci. 280 (2016), pp. 1–9. doi: 10.1016/j.mbs.2016.07.008

Appendix

Proof

Proof of Theorem 2.1.

Let us fix X0>0 and let k0 be sufficiently large so that for every i=1,,d, the ith component of X0 verifies X0i>1/k0. For each integer kk0 we define the stopping time τk:=inf{t0:Xi(t)1/k for some i=1,,d}. Clearly τk is increasing as k. Let us set τ:=limkτk. We want to show that τ= with probability one, and to do so we will proceed by contradiction. Let us assume that the previous statement is false. Then there exists T>0 and ε(0,1) such that P(τT)>ε, and therefore there exists an integer k1k0 such that (A1) P(τkT)εforall kk1.(A1) Let R˚+d=(0,)×(d)×(0,), let v:R˚+dR be any C2 function and let V(t):=v(X(t)). From Ito's theorem [Citation1] we have (A2) dV(t)=Lv(X(t))dt+vx(X(t))h=1mBhX(t)dWh(t),(A2) where Lv(x)=i,j=1dvxiaijxj+12h=1mi,j=1d(BhxxT(Bh)T)ijxixj2vxixj=i,j=1dvxiaijxj+12h=1mi=1dbihbjhxixj2vxixj and we have used that the Bh are diagonal, Bh=diag(b1h,,bdh). Let us choose v(x):=i=1dlogxi/(1+xi), which is positive and C2(R˚+d). Straightforward calculations show that (A3) Lv(x)=i=1daii1+xii=1djiaijxj1+xi+12h=1mi=1d(bih)21+2xi(1+xi)2.(A3) Note that, if xi>0 for all i=1,,d, then the first and the third term in Equation (EquationA3) are bounded from above in R˚+d, and moreover, since A is a Metzler matrix, the second term is negative in R˚+d. Therefore there exists K0 such that Lv(x)K in R˚+d. Clearly X(tτk), where we define ab:=min{a,b}, belongs to R˚+d for all t0, and therefore, using the previous bound we have from Equation (EquationA2) V(X(Tτk))V(X0)+K(Tτk)+0Tτkvx(X(t))h=1mBhX(t)dWh(t) and therefore, using that the expected value of the stochastic integral is zero (A4) EV(X(Tτk))V(X0)+KE(Tτk)V(X(0))+KT.(A4) On the other hand, let us define Ωk={τkT} for kk1, so that Equation (A1) means that P(Ωk)ε. Now for all ωΩk there exists some i=1,,d such that Xi(t,ω)=1/k and, since v is a sum of positive summands, V(X(τk,ω))log(1/k)/(1+1/k). Since V is non-negative, if we denote by 1Ωk the indicator function of Ωk we can write EV(X(Tτk))E[1ΩkV(X(τk))]εlog(1/k)/(1+1/k) and taking the limit k we obtain limkEV(X(Tτk))= which clearly contradicts (EquationA4) and therefore it must be τ= a.s. as we wanted to prove.

We state as a Theorem a collection of results from Baumgärtel [Citation4] about perturbation of semisimple eigenvalues of matrices that we will use in the proof of Theorem 6.1:

Theorem A.1

Essentially Theorems 2 and 3 in [Citation4], pp. 267–269

Let E be a complex linear space of finite dimension N and let A(ε) be a linear operator defined on E that admits the following holomorphic expansion A(ε)=A0+εA1+ε2A2+,|ε|<R. Let λ0 be a semi-simple eigenvalue of A0 and let P be its associated eigenprojection.

  1. Let λ(ε) be a perturbed eigenvalue such that λ(ε)λ0 when ε0. Then λ(ε) has the form λ(ε)=λ0+μjε+o(ε), ε0, for some j=1,,q, where μj, j=1,,q are the eigenvalues of the restriction TImP of operator T:=PA1P to ImP. Moreover, associated to λ(ε) we can choose a (right) eigenvector x(ε) that is continuous at ε=0 and such that x0:=x(0)0 verifies Tx0=μjx0.

  2. In the case that μ is a simple eigenvalue of T, λ(ε) and x(ε) are holomorphic functions of ϵ.

Proof

Proof of Theorem 6.1

For small ϵ we will consider matrix C(ε)=F+εS as a perturbation of matrix F. From Equation (Equation4) and the properties of each matrix Fi, we have that 0 is a semisimple eigenvalue of F and that it is strictly dominant. Therefore (A5) F=(VV)diag(0q,Σ)UU(A5) is a Jordan canonical decomposition of F [Citation12] where VRN×(Nq), URN×(Nq) and ΣR(Nq)×(Nq) are certain matrices and Σ is associated to eigenvalues with strictly negative real part. From Equation (EquationA5) and Lemma 4.1, we have that Fˆ=VU is the eigenprojection matrix of F associated to eigenvalue 0.

Using the continuity of eigenvalues on matrix entries, the dominant eigenvalue of C(ε) for small enough ϵ necessarily must belong to the set of eigenvalues of C(ε) that tend to 0 when ε0. Using part 1 of Theorem A.1 applied to C(ε)=F+εS with A0=F, A1=S, λ0=0 and P=Fˆ, we conclude that this set of eigenvalues has the form (A6) 0+εμj+o(ε),j=1,,q,(A6) where the μj are the eigenvalues of the restriction TImFˆ of T:=FˆSFˆ to ImFˆ. Clearly ImFˆ is invariant for T and, using Equation (Equation11) and Lemma 4.1, we have T=FˆSFˆ=VUSVU=VS¯U, so that S¯ is the matrix of the restriction TImFˆ expressed in the basis of ImFˆ defined by the columns of V. Therefore, TImFˆ and S¯ have the same eigenvalues and, moreover, if x and y are, respectively, right and left eigenvectors of S¯ associated to a certain eigenvalue λ, then Vx0 and UTy0 are right and left eigenvectors of TImFˆ associated to eigenvalue λ.

Therefore, using Hypothesis 6.1 it follows that μ is a simple eigenvalue for TImFˆ and is associated to right and left eigenvectors Vr0 and UTl0, respectively. Moreover, μ is strictly dominant for TImFˆ so that, using Equation (EquationA6), μ(ε):=εμ+o(ε) is a simple eigenvalue for C(ε) and strictly dominant when ϵ is small enough.

Now, using part 2 of Theorem A.1 and the fact that μ is simple for TImFˆ, it follows that μ(ε) has the form μ(ε):=εμ+O(ε2) and that is associated to right and left eigenvectors which can be written in the form Vr+O(ε) and UTl+O(ε) respectively. Then, for small enough ϵ limτEX(τ)exp(εμ+O(ε2))=(lTU+O(ε))X0(lTU+O(ε))(VrT+O(ε))(Vr+O(ε))=l,UX0l,rVr+O(ε) and so Equation (Equation20) is proved.

(b) The proof is analogous to that of part (a). A notable property of the Kronecker product that we will use in the sequel is [Citation9, Theorem 6.1] (A1A2Am)(B1B2Bm)=(A1B1)(A2B2)(AmBm). We will use Equation (Equation19) and for small ϵ will consider matrix C(ε) as a perturbation of matrix F2=INF+FIN. First, let us study the spectral properties of F2. Let vˆi, i=1,,N, denote the ith. column of matrix (VV), that is, vˆi is an eigenvector of F associated to eigenvalue 0 if 1iq or to an eigenvalue with strictly negative real part if q+1iN.

From [Citation9, Theorem 6.5] we know that the eigenvalues of INF+FIN are the set {λi+λj:λiσ(F), i=1,,N} and that, associated to each eigenvalue λi+λj we have right and left eigenvectors vˆivˆj and uˆiuˆj where vˆi and uˆi are, respectively, right and left eigenvectors of F associated to eigenvalue λi, i,j=1,,N. Taking into account the spectrum of F we conclude that σ(F2) consists of eigenvalue 0 with multiplicity q2 associated to right (resp. left) left eigenvectors vˆivˆj, (resp. uˆiuˆj) i,j=1,,q, and N2q2 eigenvalues with strictly negative real parts associated to right (resp. left) eigenvectors vˆivˆj (resp. uˆiuˆj), where either i or j belong to the set {q+1,,N}. Clearly, eigenvalue λ=0 is semisimple and is strictly dominant for F2.

It is easy to check that if A and B are any matrices with the block structure A=(a1ak), B=(B1Bs) where the ai are column vectors, then AB=(a1B1a1BsakB1akBs). Applying this result we can write (vˆ1vˆ1vˆ1vˆqvˆqvˆ1vˆqvˆq)=VV(uˆ1uˆ1uˆ1uˆquˆquˆ1uˆquˆq)=UTUT so that we can write a Jordan canonical decomposition of F2 in the form F2=(VVV2)diag(0q2,Σ2)UUU2, where Σ2, V2 and U2 are appropriate matrices. From this expression we see that Fˆ2:=(VV)(UU)=(VU)(VU)=FˆFˆ is the eigenprojection of F2 corresponding to eigenvalue 0.

Like in part (a), we know that, for small enough ϵ, the dominant eigenvalue of C2(ε) necessarily must belong to the set of eigenvalues of C2(ε) that tend to 0 when ε0. Let us consider the restriction T2ImFˆ2of matrix T2:=Fˆ2S2Fˆ2 to ImFˆ2. Clearly ImFˆ2 is invariant for T2 and, using Equation (Equation11), Lemma 4.1 and the properties of the Kronecker product we have T2=Fˆ2S2Fˆ2=(VV)S¯2(UU) so that S¯2 is the matrix of T2ImFˆ2 expressed in the basis of ImFˆ2 defined by the columns of VV. Therefore, T2ImFˆ2 and S¯2 have the same eigenvalues and, moreover, if x and y are, respectively, right and left eigenvectors of S¯2 associated to a certain eigenvalue λ, then (VV)x0 and (UU)Ty0 are right and left eigenvectors of T2ImFˆ2 associated to eigenvalue λ.

Now let us assume Hypothesis 6.2. Then we can reason exactly as we did in part (a), substituting C(ε), F, S, S¯, Fˆ, V and U for C2(ε), F2, S2, S¯2, Fˆ2, VV and UU, respectively, and obtain that for small enough ϵ, C2(ε) has a simple and strictly dominant eigenvalue that can be written in the form (Equation21) and associated to eigenvectors with the form (Equation22), from where the asymptotic behaviour (Equation23) follows.