1,041
Views
2
CrossRef citations to date
0
Altmetric
Research Article

Evaluating different epidemiological models with the identical basic reproduction number ℛ0

Pages 849-870 | Received 09 Apr 2020, Accepted 09 Nov 2020, Published online: 30 Nov 2020

Abstract

Epidemiological models with the identical basic reproduction number R0 may behave differently on both short time scale and long time scale. In this paper, we compare the predicted final sizes for several deterministic epidemic models and estimate the probabilities of a minor/major outbreak for continuous-time Markov chain (CTMC) models, all epidemic models have the identical R0. It is proved that the final size predicted by the epidemic model with homogeneous mixing is larger than with heterogeneous mixing. For CTMC models with heterogeneous mixing, the probabilities of a minor outbreak initiated by superspreaders and non-superspreaders are calculated and compared. For both deterministic modelling and stochastic modelling, numerical simulations are performed to support the mathematical analysis.

AMS Subject Classifications:

1. Motivation and introduction

Mathematical epidemiological models are widely used to study the outbreaks of infectious diseases. Such models are important in aiding public health authorities to assess the risks associated with the epidemics, especially for emerging infectious diseases. It has been mentioned in [Citation13–15,Citation26] that the standard practice to analyse epidemic outbreaks is to formulate a compartmental model, then to apply observed early epidemiological data to infer the key parameters in the model, finally to use this model to predict the long-term behaviour of the transmission and compare different public health management strategies. The key variable basic reproduction number R0 is generally inferred from the early observed exponential growth rate which is indicated by daily data [Citation32]. However, starting from the fixed value of R0, there are many choices of epidemiological models. Therefore, in order to make relatively accurate prediction of how an epidemic will develop, it is crucial to select reliable mathematical models to correctly reflect the initial behaviours of the outbreaks and calculate the final sizes based on these mathematical models.

As discussed in [Citation12,Citation13,Citation18,Citation31], the progress of the epidemic can be significantly slower than expected and the real final epidemic size can be much smaller than the estimated final size. The general growth model was proposed as (1) C(t)=rCp(t),(1) where C(t) is the cumulative cases up to time t and 0p1 is the ‘deceleration of growth’ parameter. If p = 1 the formula has the exponential solution, while if p<1 the growth is sub-exponential. In [Citation18], it was pointed out that the sub-exponential growth in the early phase of an epidemic is a rule rather the exception. The phenomenon of smaller actual final sizes is more common for diseases which are considered very dangerous [Citation12,Citation13]. One reasonable explanation for the discrepancy between the estimated final size and the real epidemic size is the sub-exponential early growth of the epidemic. Sub-exponential growth has been observed in variety of epidemiological situations [Citation12,Citation13,Citation18]. Some possible factors to cause sub-exponential growth include the spatial structure of the population under study such as clustering [Citation17], the reactive behavioural changes of individuals in the population [Citation12,Citation13], the consideration of nonlinear incidence rates [Citation17] and other possible contact patterns [Citation12,Citation13,Citation19]. Therefore, the concept of effective reproduction numbers has been introduced in [Citation18] to accurately track the development of transmissions. The second possible explanation is the heterogeneous mixing of the population in a single location [Citation12,Citation13]. It has been argued in [Citation13] that heterogeneity of mixing can be modelled by the autonomous dynamical models and the linearization theory shows that the early growth is always exponential. However, the consideration of heterogeneous mixing indeed has an impact on the final epidemic size. Early efforts have been made in [Citation12,Citation13] in order to prove that, with the identical basic reproduction number R0, the epidemic model with heterogeneous mixing always predicts a smaller final size than the model with homogeneous mixing. Although the analytical proof was not obtained in [Citation12,Citation13], several numerical simulations were performed in [Citation12,Citation13] to support this statement. Thus, the first aim of this paper is to provide more detailed analysis of different deterministic epidemic models with regard to final size estimations and comparison. It is suggested in [Citation13] that the research on how to derive and analyse mechanistic models to determine the situations that have smaller final epidemic sizes and slower early growths will be one of the important and challenging directions for epidemic modelling.

Throughout this paper, the final size of an epidemic is defined as the number of individuals who have been affected during the course of the transmission (2) The final size of an epidemic=S0S,(2) where S is the number of susceptible individuals who eventually escape from the epidemic. For the case that the number of initial infectious cases is small, the total population size N can be used to measure the initial size of susceptible population S0 [Citation12,Citation13].

The second aim of this paper is to study and compare the dynamical behaviours in early phases for different stochastic epidemic models with identical R0. Such stochastic epidemic models are formulated based on the corresponding deterministic compartmental models. Although the topic of stochastic modelling is huge and there are many types of stochastic models to choose [Citation22], the most commonly used and natural form is Markov jump processes, or the continuous-time Markov chain (CTMC) models [Citation1,Citation22]. When the size of initial infectious cases is very small, the CTMC models are appropriate to describe and simulate the epidemic outbreaks. The multivariate branching process (BP) analysis is usually applied to approximate the probability of a minor epidemic outbreak [Citation1,Citation2,Citation5,Citation6,Citation16]. The term ‘probability of a minor/major outbreak’ originates from [Citation33] to quantify whether an outbreak of a relatively large number of individuals being infected can be established. With the development of epidemics, deterministic epidemic models can be used to describe further dynamics. It has been suggested in several publications that the combination of deterministic modelling and stochastic modelling provides a more accurate description for the transmission [Citation14]. Some of the previous major work in stochastic modelling with applications in epidemiology include [Citation1,Citation2,Citation7,Citation16,Citation22,Citation24]. There is also some previous work discussing the stochastic epidemic modelling with heterogeneous mixing [Citation8,Citation30]. It has been shown in [Citation30] that the probability of a major outbreak is greater if the epidemic is initiated by the group of superspreaders. In this paper, with the constraint that all models share the identical R0, we formulate different CTMC models based on the corresponding deterministic compartmental models and analyse the outbreaks.

This paper is organized as follows: in Section 2, we formulate the deterministic epidemic modes and compare the predicted final epidemic sizes, then perform several sets of numerical simulations to confirm the analytical results. In Section 3, we formulate the CTMC models and apply the multi-type branching process (BP) theory to analyse the behaviours during the early outbreak stage for such stochastic models. The stochastic simulation algorithm (SSA) is then applied to generate exact realizations to verify the analytical results. All numerical simulations are implemented in Python package StochPy [Citation27].

2. Deterministic compartmental modelling

We first formulate the SIR epidemic model (Equation3) with homogeneous mixing. Based on this fundamental model, we consider the epidemic model (Equation7) with time-dependent contact rate, the epidemic model (Equation12) with heterogeneous mixing and the epidemic model (Equation19) that includes both the behavioural changes and the heterogeneity. For each epidemic model, we study the basic reproduction number R0 and the derived final size relations.

2.1. The original SIR epidemic model with homogeneous mixing

The original SIR epidemic model consists of three compartments. S(t) denotes the number of individuals who are susceptible to the disease at time t, I(t) denotes the number of individuals who are infectious at time t, and R(t) denotes the number of individuals who have recovered from the disease and are with full immunity against reinfection. The contact rate β is assumed to be a constant, which means that a member in the population makes contact sufficient to transmit infection with β others per unit time. The recovery rate from the compartment I to the compartment R is γ, which gives the average duration of the infectious period 1/γ. The population size N is constant.

The simple SIR epidemic model is (3) S=βSIN,I=βSINγI,R=γI.(3) The basic reproduction number R0=βγ is defined as the average number of secondary cases generated by introducing only one infectious individual into the whole population (see [Citation11,Citation14]). When R01, the disease will die out, while when R0>1, a major outbreak will occur. Under the condition of R0>1, the population size N is large and the initial size of infectious cases is small, the final epidemic size has been well studied in many publications (e.g. [Citation9,Citation14]) and can be expressed in the following implicit equation: (4) logNS=R0(1SN).(4) Once the basic reproduction number R0 is estimated, the final epidemic size can be approximated by Equation (Equation4). It can be observed in Equation (Equation4) that the final size is solely determined by the value of R0. Furthermore, S in Equation (Equation4) is uniquely determined by R0 and the estimation of its upper bound is available. The final epidemic size can be expressed as a monotonically increasing function of R0. With the larger value of R0, the predicted final epidemic size is larger. The following two lemmas summarize the results.

Lemma 2.1

S in Equation (Equation4) is uniquely determined by the value of R0, and has upper bound of NR0.

Proof.

We denote y:=NS, then the left-hand side of Equation (Equation4) is denoted by f1(y):=log(y), the right-hand side of (Equation4) is denoted by f2(y):=R0(11/y). For any fixed R0>1, we analyse the Equation (Equation4) step by step:

  1. f1(1)=0 and f2(1)=0,

  2. f1()= and f2()=R0,

  3. f1(y)=1/y and f2(y)=R0/y2. Both functions f1 and f2 are monotonically increasing functions of y. And we have that if y(1,R0), f1(y)<f2(y); while if y(R0,), f1(y)>f2(y). If we start from y = 1, the increase of function f1 is slower than the increase of function f2; and starting from yR0, f1 increases faster than f2.

Based on the properties of functions f1 and f2, there exists one and only one solution of Equation (Equation4), and y=NS>R0. Thus, S<NR0. For the case of R0<1, it is clear that two curves generated by f1 and f2 have no intersection for y>1. Only solution for such equation is that y = 1, which indicates there is no outbreak. Because of the linearity between the final size and S, it is straightforward that the final size is uniquely determined by the value of R0 as well. The following Figure  shows the case of R0=2.

Figure 1. The uniqueness of the final epidemic size, with respect to the basic reproduction number R0.

Figure 1. The uniqueness of the final epidemic size, with respect to the basic reproduction number R0.

Lemma 2.2

As R0>1 increases, S in (Equation4) decreases and the final epidemic size increases.

Proof.

It can be easily proved by considering S as a function about R0 and differentiate both sides of Equation (Equation4) with respect to R0. We have (R0N1S)S=1SN.Since we have 1SN>0 and R0N1S<0 from the previous lemma, we have S<0. This result is reasonable, because if R0 is larger and the underlying disease is more contagious, the transmission of the epidemic will leave less individuals untouched eventually. The epidemic model predicts a larger final size.

2.2. The epidemic model with behavioural response

We modify model (Equation3) by considering the time-dependent contact rate function (5) a(t):=β((1ϕ)eκt+ϕ)(5) with κ>0 (see [Citation18] for more details). β represents the initial contact rate. a(t) is bounded by the initial contact rate a(0) and the eventual contact rate a(), (6) β=a(0)a(t)a()=ϕβ.(6)

In papers [Citation12,Citation13], the behavioural response function is set to have the form βeκt. This is a special case by assuming that, under the circumstance of extremely dangerous infectious diseases, the final contact rate in the population will approach 0 as all individuals tend to avoid any type of contact with others. All other parameters in model (Equation7) are following the assumptions for the SIR epidemic model (Equation3). (7) S=a(t)SIN,I=a(t)SINγI,R=γI.(7) The basic reproduction number R0=βγ of model (Equation7) has the same value as R0 of model (Equation3). Because model (Equation7) is non-autonomous, the explicit formulation of final size relation is impossible to obtain (see [Citation12]). However, we are able to obtain the following result of comparison.

Theorem 2.1

Under the condition of R0>1, the final epidemic size for model (Equation7) is no greater than the final epidemic size for model (Equation3).

Proof.

First, we have a(t)β for 0t<. To distinguish populations for two models (Equation3) and (Equation7), we use the subscripts 1 and 2 for model (Equation3) and model (Equation7), respectively. Addition of the first two equations of model (Equation3) gives [S1(t)+I1(t)]=γI1(t),and the integration indicates that (8) NS1()=γ0I1(t)dt.(8) Similarly, for model (Equation7), (9) NS2()=γ0I2(t)dt.(9) Integration of first equations of model (Equation3) and model (Equation7) gives (10) logNS1()=βγ[1S1()N],logNS2()=0a(t)I2(t)dtNβγ[1S2()N].(10) Since logNx is a monotonic decreasing function of x, this implies S2()S1().We further have NS2()NS1(),as desired.

2.3. The epidemic model with heterogeneous mixing

As mentioned in [Citation12,Citation13,Citation18], another explanation for the smaller epidemic final size than expected is the heterogeneous mixing of the population. One typical way to consider the heterogeneous mixing is to assume that the population can be divided into two sub-populations based on different activity levels, one is with size N1 and the other one is with size N2 (see [Citation10]). We assume that individuals in group i make βi contacts in unit time and then define pij the probability that one contact made by an individual of group i that is with an individual of group j, for i, j = 1, 2. We have p11+p12=p21+p22=1,and the balance relation equation (11) p21β2N2=p12β1N1.(11) We assume that the recover rates for both sub-populations are equal, which suggests that the only source of heterogeneity is the different activity levels. Similar with assumptions made in [Citation10,Citation12,Citation13], we set β1>β2, which indicates that sub-population 1 is more active than sub-population 2. Sub-population 1 can be treated as the group of superspreaders who are responsible for the majority of transmissions.

The epidemic model with two sub-populations is (12) S1=S1[p11β1I1N1+p12β1I2N2],I1=S1[p11β1I1N1+p12β1I2N2]γI1,R1=γI1,S2=S2[p21β2I1N1+p22β2I2N2],I2=S2[p21β2I1N1+p22β2I2N2]γI2,R2=γI2.(12) By using the next generation matrix approach (described in [Citation20]), the basic reproduction number can be expressed as R0=12(p11β1+p22β2γ+(p11β1p22β2)2+4p12p21β1β2γ).As discussed in [Citation10], it is necessary to make some assumptions about the nature of mixing between two sub-populations in order to obtain more useful expression. One reasonable assumption is the proportional mixing, which suggests that mixing is random but constrained by the activity level. Thus, we assume (13) pij=βjNjβ1N1+β2N2,i,j=1,2.(13) Then we have p11=p21=:p1 and p12=p22=:p2. Furthermore, the basic reproduction number R0 is (14) R0=p1β1+p2β2γ.(14)

We explain the formula of R0 in Equation (Equation14) in the following way: suppose that the disease is initiated by an infectious outsider. Then with the probability of p1, the outsider contacts a susceptible individual in sub-population 1, and with the probability of p2, this outsider will contact with a susceptible individual in sub-population 2. Thus, the first infectious case of the wholly susceptible population will appear in sub-population 1 with probability p1 and will appear in sub-population 2 with probability p2. If the first infectious individual belongs to the sub-population 1, the number of secondary infectious cases is β1/γ, while if the first infectious individual belongs to sub-population 2, the number of the secondary infectious cases is β2/γ. So the number of total secondary infectious cases by introducing one infectious individual is p1β1+p2β2γ.

For the rest of this paper, we use the notations p1 and p2 to replace the parameters p11, p12, p21 and p22 in model (Equation12). As mentioned in [Citation12,Citation13], in order to guarantee that the basic reproduction number R0 in (Equation14) for model (Equation12) has the same value as the model (Equation3) with homogeneous mixing, we need to make (15) β=p1β1+p2β2,(15) which implies that the contact rate β for the homogeneous-mixing model (Equation3) is the weighted average of contact rates β1 and β2. Assume that the average contact rate β is estimated from the epidemiological data, then what are the actual activity levels for the group of superspreaders and for the group of non-superspreaders? To answer this question, we first denote k=N2N1 representing the ratio between the group size of the non-superspreaders and the group size of the superspreaders. Then we assume that β1=c1β with c1>1 and β2=c2β with c2<1. c1 and c2 account for the scaled activity levels for each group. Thus, we are able to express the proportion of transmissions due to the group of the superspreaders c1c1+kc2.

Moreover, formula (Equation15) leads to the equation (16) c1(c11)=kc2(1c2).(16) Since we have c2(0,1), it is straightforward to have kc2(1c2)(0,k4) with the critical point c2=12. This indicates the upper bound for c1 (17) max{c1}=1+1+k2.(17) The final size relation for model (Equation12) with heterogeneous mixing can be expressed as the following system of equations (see [Citation10]): (18) logN1S1()=p1β1γN1S1()N1+p2β1γN2S2()N2,logN2S2()=p1β2γN1S1()N1+p2β2γN2S2()N2.(18) The above final size relation (Equation18) for the epidemic model with multiple sub-groups has been studied in several papers, such as [Citation4,Citation10,Citation12,Citation13,Citation28,Citation29]. However, if the basic reproduction numbers R0 for model (Equation12) and model (Equation7) are set to be identical, what is the relation between the final epidemic size N(S1()+S2()) in Equation (Equation18) and the final size NS in Equation (Equation4)? Will the epidemic model with homogeneous mixing always predict the worst result with the largest final epidemic size?

To compare the value of S and the value of S1()+S2(), we first focus on Equation (Equation18). Simple calculations lead to logN1S1()/logN2S2()=β1β2.We further have (S1()N1)β2=(S2()N2)β1.As the assumption is that N1 represents the group of superspreaders, we have N1N2 and β1β2. If we denote σ:=β1β21, the above equation becomes S1()N1=(S2()N2)σ.For a relatively large value of σ, no matter what value S1()N1 is, the value of S2()N2 cannot be much smaller than 1. In other words, the group of non-superspreaders is less affected by the epidemic. Considering that the size of the group of non-superspreaders is relatively large, the overall population is less affected. Thus, we conclude that the total final size is smaller than predicted by the model with homogeneous mixing. For the larger σ, the discrepancy between predictions of both models with homogeneous mixing and heterogeneous mixing is more obvious. To support the statement, many combinations of parameter values β1, β2, N1 and N2 are tested in the subsection of numerical experiments. We use the following theorem to summarize our study. However, the detailed analytical (algebraic) proof for this conjecture remains for future work.

Conjecture 2.1

For epidemic models with the identical value of basic reproduction number R0, the final size for the epidemic model with homogeneous mixing is larger or equal to the final size of the epidemic model with heterogeneous mixing.

Following the study of final size relations in papers [Citation12,Citation13], we investigate more of this subject in Section 2.5 from the point view of numerical simulations. Several combinations of heterogeneous mixing parameters are studied to numerically test the above conjecture.

2.4. The epidemic model with behavioural changes and heterogeneity

(19) S1=S1a1(t)[p1I1N1+p2I2N2],I1=S1a1(t)[p1I1N1+p2I2N2]γI1,R1=γI1,S2=S2a2(t)[p1I1N1+p2I2N2],I2=S2a2(t)[p1I1N1+p2I2N2]γI2,R2=γI2.(19) The model is modified by considering two behavioural response functions a1(t) and a2(t), both functions are monotonically decreasing functions and have the expressions (20) a1(t):=β1((1ϕ1)eκ1t+ϕ1),a2(t):=β2((1ϕ2)eκ2t+ϕ2).(20) The initial contact rates β1 and β2 are varied and the decreasing rates κ1,2 for both sub-groups have different values as well. For sub-group with the faster response, the value of the decreasing rate is larger. The final contact rates are determined by the values of ϕ1 and ϕ2. For model (Equation19), the basic reproduction number R0 has the same value comparing as model (Equation12). Because the model is also non-autonomous, the explicit formulation of the final size relation for model (Equation19) is impossible to obtain. We argue that if the statement in Conjecture 2.1 is correct and considering the analytical results for model (Equation7), then model (Equation19) should predict the smallest final epidemic size.

2.5. Numerical simulations on deterministic models

2.5.1. Numerical simulation 1

We now perform several numerical experiments to compare the final sizes predicted by different epidemic models which have been discussed, most of the parameter values are taken from [Citation13]. The size of the population under study is N = 5000. It is also possible to test for cases with larger population sizes (see [Citation12,Citation13]). The recovery rate γ=15.61. For model (Equation3), the contact rate β=0.27, which is also the initial contact rate for model (Equation7). For model (Equation7), we set ϕ=0 and the decreasing rate κ in the response function is 0.0023. For models (Equation12) and (Equation19) with heterogeneous mixing, the population sizes are N1=1000 and N2=4000, respectively. The (initial) contact rates β1=1613β and β2=113β. By the definition of proportionate mixing, p1=0.8 and p2=0.2 can be calculated. The careful setup is to conform with the 20/80 rule which states that commonly roughly 20% of the individuals in a population are responsible for 80% of the disease cases (also discussed in [Citation13,Citation21]). Sub-group 1 is considered as the group of superspreaders. For model (Equation19), ϕ1=ϕ2=0 and κ1=κ2=0.0023, imply that individuals in both sub-groups have the same response to the epidemic. It is also plausible to assume that two response functions are different. The basic reproduction numbers R0 for all models can be calculated as R0=p1β1+p2β2γ=βγ=1.5147.Figure  shows the dynamics of the models. We summarize the values of final sizes for the models in Table .

Figure 2. The x-axis represents time (in days) and the y-axis represents the numbers of individuals in different compartments with respect to time. Outbreaks are initiated by one infectious individual. Plots on lower left and on lower right represent the sums of S1+S2, I1+I2 and R1+R2 for models (Equation12) and (Equation19), respectively.

Figure 2. The x-axis represents time (in days) and the y-axis represents the numbers of individuals in different compartments with respect to time. Outbreaks are initiated by one infectious individual. Plots on lower left and on lower right represent the sums of S1+S2, I1+I2 and R1+R2 for models (Equation12(12) S1′=−S1[p11β1I1N1+p12β1I2N2],I1′=S1[p11β1I1N1+p12β1I2N2]−γI1,R1′=γI1,S2′=−S2[p21β2I1N1+p22β2I2N2],I2′=S2[p21β2I1N1+p22β2I2N2]−γI2,R2′=γI2.(12) ) and (Equation19(19) S1′=−S1a1(t)[p1I1N1+p2I2N2],I1′=S1a1(t)[p1I1N1+p2I2N2]−γI1,R1′=γI1,S2′=−S2a2(t)[p1I1N1+p2I2N2],I2′=S2a2(t)[p1I1N1+p2I2N2]−γI2,R2′=γI2.(19) ), respectively.

Table 1. The final epidemic sizes for epidemic models (Equation3), (Equation7), (Equation12) and (Equation19).

Here are some observations regarding the plots in Figure :

  1. The final size predicted by the SIR epidemic model with homogeneous mixing is the largest of all models studied. With the considerations of behavioural response and heterogeneity, the final sizes predicted by the corresponding models are significantly smaller. This result is consistent with the previous analysis and with the numerical experiments performed in [Citation12,Citation13].

  2. For models with heterogeneous mixing, it does not matter whether the epidemic is initiated by an infectious superspreader or not, the predictions on the final sizes are identical. This is consistent with the final size relation in Equation (Equation18). The difference between two types of initiations are mainly discussed in stochastic epidemic modelling.

2.5.2. Numerical simulation 2

We simulate the transmissions of epidemics in different scenarios, where the population size ratios are k = 4, 9, 99, 999. In real life, the population size for the group of superspreaders is significantly smaller comparing with the other group. Thus, the value of k should be extremely large. For each case, we first set the range of c2(0,1) to guarantee that it represents the relatively lower activity level for the group of non-superspreaders. Formula (Equation16) implies that c1 is determined by c2 uniquely. p1 and p2 can be calculated based on known k, c2 and c1. We then run the numerical simulations for the epidemic model (Equation12) with heterogeneous mixing, with different combinations of parameter values. The predicted final sizes for different combinations are plotted in Figure . It is shown in the figure that the final size for the model with homogeneous mixing is always larger. For the epidemic model with heterogeneous mixing, the final sizes get larger and approach the one predicted by the homogeneous mixing model when c2 increases. This makes sense because the model with heterogeneous mixing will turn into homogeneous mixing if the heterogeneity becomes less obvious, indicated by the larger c2.

Figure 3. The comparison of final sizes predicted by the epidemic model (Equation3) with homogeneous mixing and the epidemic models (Equation12) with heterogeneous mixing, with different combinations of parameters. All models share the identical basic reproduction number R0.

Figure 3. The comparison of final sizes predicted by the epidemic model (Equation3(3) S′=−βSIN,I′=βSIN−γI,R′=γI.(3) ) with homogeneous mixing and the epidemic models (Equation12(12) S1′=−S1[p11β1I1N1+p12β1I2N2],I1′=S1[p11β1I1N1+p12β1I2N2]−γI1,R1′=γI1,S2′=−S2[p21β2I1N1+p22β2I2N2],I2′=S2[p21β2I1N1+p22β2I2N2]−γI2,R2′=γI2.(12) ) with heterogeneous mixing, with different combinations of parameters. All models share the identical basic reproduction number R0.

3. The continuous-time Markov chain (CTMC) models

Stochastic modelling is a natural choice to describe the early stage of an outbreak. There are two common choices of assumptions on distributions of infectious periods [Citation16]. The first type of assumption is that the infectious period is exponentially distributed, and the second type is to assume that the infectious period is non-random. In this paper, we focus on the first type and formulate the continuous-time Markov chain (CTMC) models. Such models are also called the stochastic compartmental models since they are constructed based on the deterministic compartmental models. In this section, following the setup of deterministic epidemic models discussed in the previous section, we formulate the corresponding CTMC models. We further analyse and compare the probabilities of a minor outbreak for different CTMC models.

3.1. The CTMC epidemic models and the corresponding probabilities of a minor outbreak

3.1.1. The basic CTMC model with homogeneous mixing

The random variable (St,It,Rt) is the state vector to represent the population sizes of the susceptible class, the infectious class and the removed (recovered) class at time t, respectively. For all time t0, St0, It0 and Rt0. The stochastic process can be described as (21) P((St+Δt,It+Δt,Rt+Δt)(St,It,Rt)=(1,1,0))=βStItNΔt+o(Δt),P((St+Δt,It+Δt,Rt+Δt)(St,It,Rt)=(0,1,1))=γItΔt+o(Δt),(21) with the complimentary probability P((St+Δt,It+Δt,Rt+Δt)(St,It,Rt)=(0,0,0))=1βStItNΔtγI(t)Δt+o(Δt).For each Markov jump process, it has exponentially distributed time between the jumps. All probabilities defined in (Equation21) are actually conditioned on the state at time t, (St,It,Rt). More detailed discussions regarding the stochastic compartmental modelling can be found in [Citation1,Citation2,Citation22]. For this CTMC model, if the basic reproduction number R0=βγ is greater than one, it is possible that there is a major outbreak. If R0 is less than one, It will reach zero relatively soon for certain and there is a minor epidemic outbreak. The branching process (BP) theory is then applied to estimate the probability of a minor outbreak.

The disease-free equilibrium is (S¯,I¯,R¯)=(N,0,0). We apply the probability generating function (pgf) to analyse the dynamics of the disease-related variable I. The domain of the pgf is u[0,1], and the pgf represents the number of ‘offspring’ generated by infected individuals. In order to apply pgfs from the BP approximation to this Markov-chain process, we assume that the initial number of infectious individuals is small and the ‘birth’ of new infectious individuals occur independently of each other (see [Citation1]). For the pgf, it is assumed that there is only one initial infectious individual I0=1, so the pgf is defined as in the birth and death process. For CTMC model (Equation21), the pgf has the following formula: (22) f(u)=βu2+γβ+γ.(22) The probability of a minor outbreak given that I0=1 is the smallest fixed point q of the offspring pgf (Equation22) in interval q(0,1] (see [Citation1–3]).

If R01, there is only one fixed point q(0,1] which is q = 1. While if R0>1 there are two fixed points, one is q = 1 and the other one is q<1 which is the probability of a minor epidemic outbreak. We solve q as follows: (23) βq2+γβ+γ=q,q=γβ=1R0.(23)

3.1.2. The CTMC epidemic model with heterogeneous mixing

The random variable (S1,t,I1,t,R1,t,S2,t,I2,t,R2,t) is the state vector to represent the sizes of the susceptible sub-population 1, the infectious sub-population 1, the recovery sub-population 1, the susceptible sub-population 2, the infectious sub-population 2 and the recovery sub-population 2 at time t, respectively. The stochastic compartmental model is formulated based on its deterministic counterpart model (Equation12) and can be expressed in (Equation24). For the purpose of simplicity, only related compartments are involved in probability definitions. (24) P((S1,t+Δt,I1,t+Δ)(S1,t,I1,t)=(1,1))=S1,tβ1[p1I1,tN1+p2I2,tN2]+o(Δ),P((I1,t+Δt,R1,t+Δ)(I1,t,R1,t)=(1,1))=γI1,t+o(δ),P((S2,t+Δt,I2,t+Δ)(S2,t,I2,t)=(1,1))=S2,tβ2[p1I1,tN1+p2I2,tN2]+o(Δ),P((I2,t+Δt,R2,t+Δ)(I2,t,R2,t)=(1,1))=γI2,t+o(Δ)(24) all complimentary probabilities are omitted. Slightly different than model (Equation21), we apply the multi-type BP theory to investigate the behaviour near the disease-free equilibrium, (S1¯,I1¯,R1¯,S2¯,I2¯,R2¯)=(N1,0,0,N2,0,0). The domain of the pgf is u[0,1]2 because there are two disease-related variables I1 and I2. For each pgf, it is assumed that there is only one initial infectious individual, either I1,0=1 or I2,0=1. The offspring pgf has the following formula: (25) f1(u)=p1β1u12+p1β2N2N1u1u2+γp1β1+p1β2N2N1+γ,f2(u)=p2β1N1N2u1u2+p2β2u22+γp2β1N1N2+p2β2+γ.(25)

Theorem 3.1

In the multi-type branching process approximation for the CTMC model (Equation24), assume there is one infectious individual in either sub-group 1 or sub-group 2. If the basic reproduction number R0=p1β1+p2β2γ<1, then the probability of a minor outbreak is P=1 and if R0>1, the probability of a minor outbreak is P=qi (i=1,2). qi is defined as the fixed point of Equation (Equation25).

The complete proof can be found in [Citation2,Citation30] by assuming the independence of multi-type branching process and by applying the backward Kolmogorov differential equations for the branching process approximation. In addition, it has been proved that if R0=p1β1+p2β2γ<1, the fixed point (1,1) is stable and if R0>1, the fixed point (q1,q2) in the domain (0,1)2 exists and it is stable. The fixed point (q1,q2) satisfies the following equations: (26) p1β1q12+p1β2N2N1q1q2+γp1β1+p1β2N2N1+γ=q1,p2β1N1N2q1q2+p2β2q22+γp2β1N1N2+p2β2+γ=q2.(26) Moreover, the probability of a minor outbreak by initially introducing i1 infectious superspreaders and i2 infectious non-superspreaders can be defined as follows (also discussed in [Citation30]) P=q1i1q2i2.It is noted that the probability of a major outbreak can simply be expressed as Pmajor=1Pminor.

Theorem 3.2

Assume R0=p1β1+p2β2γ>1 for the CTMC model (Equation24) with heterogeneous mixing, then the probability of a minor outbreak q2 initiated by an infectious non-superspreader is greater than the probability of a minor outbreak q1 initiated by an infectious superspreader. Moreover, q2 is determined by the positive root of the following quadratic equations: (27) p2(1β1β2)q22+[γβ2+(1+γβ2)β1β2]q2γβ1β22=0,(27) and q1 satisfies the following relation: (28) q1=q2(1β1β2)q2+β1β2.(28)

Proof.

We apply the balance Equation (Equation11) to Equation (Equation26) and rearrange it to obtain (29) p1β1q12+p2β1q1q2+γ=[p1β1+p2β1+γ]q1,p1β2q1q2+p2β2q22+γ=[p1β2+p2β2+γ]q2.(29) We then divide both sides of the first equation by q1 and divide both sides of the second equation by q2, (30) p1β1q1+p2β1q2+γq1=β1+γ,p1β2q1+p2β2q2+γq2=β2+γ.(30) It implies that (31) p1q1(β1β2)+p2q2(β1β2)+γ(1q11q2)=β1β2,γ(1q11q2)=(1p1q1p2q2)(β1β2).(31) The assumption of R0>1 indicates that q1<1, and p1+p2=1, so we have 1p1q1p2q2>0. Thus, when β1>β2, we have q1<q2. We further divide both sides of the first equation in (Equation30) by β1 and divide both sides of the second equation in (Equation30) by β2, (32) p1q1+p2q2+γq1β1=1+γβ1,p1q1+p2q2+γq2β2=1+γβ2.(32) This leads to (33) β2β1=11q211q1.(33) If we substitute Equation (Equation33) into the second equation of (Equation32), a cubic equation about q2 can be obtained. Since q2=1 is one of the real roots, we are able to reduce it to the quadratic Equation (Equation27), as desired.

Remark:

A similar result of q1<q2 in the first part of Theorem 3.2 has been proved in a different way in [Citation30]. The CTMC models considered in [Citation30] are formulated to account for different levels of infectivity or susceptibility. As our interest is the heterogeneity in individuals' activity levels, we only assume the difference of contact rates between the group of superspreaders and the group of non-superspreaders. The constraint to all models we considered is that all models share the identical R0.

The CTMC epidemic models with response functions can be formulated based on the deterministic models (Equation7) and (Equation19). Similar BP estimates can be applied to analyse the early outbreaks for both models. Since only the initial contact rate β(0) matters in the calculations of the probabilities of a minor outbreak for both models, similar results can be obtained with the CTMC models without response functions.

3.2. Numerical experiments

In this section, we perform several sets of numerical simulations for the CTMC model (Equation21) with homogeneous mixing and the CTMC model (Equation24) with heterogeneous mixing. All parameter values of the models are kept unchanged and are described in section (2.7). We apply the stochastic simulation algorithm (SSA or Gillespie's direct method) to generate the exact realizations for these CTMC models. All numerical simulations are implemented in Python package StochPy [Citation27].

We first perform numerical simulations to generate 5 realizations for CTMC models (Equation21) and (Equation24). Figure  is the plot of 5 realizations generated by SSA for CTMC model (Equation21). In Figure (a), 3 realizations are representing the major epidemic outbreaks and the other 2 realizations are presenting the minor epidemic outbreaks as the numbers of infectious individuals hit 0 very fast. The early stages of the 5 realizations are emphasized in Figure (b), it clearly illustrates the increasing numbers of infectious individuals for realizations standing for major outbreaks.

Figure 4. Five realizations generated by SSA for CTMC model (Equation21) with homogeneous mixing.

Figure 4. Five realizations generated by SSA for CTMC model (Equation21(21) P((St+Δt,It+Δt,Rt+Δt)−(St,It,Rt)=(−1,1,0))=βStItNΔt+o(Δt),P((St+Δt,It+Δt,Rt+Δt)−(St,It,Rt)=(0,−1,1))=γItΔt+o(Δt),(21) ) with homogeneous mixing.

Figure  represents 5 realizations generated by SSA for the CTMC model (Equation24) with heterogeneous mixing. The top two subfigures are with initial condition of 1 infectious superspreader and the bottom two subfigures are with initial condition of 1 infectious non-superspreader, respectively. For the top two subfigures, there are 3 realizations representing the major epidemic outbreaks as the numbers of infectious individuals increasing. The other 2 realizations represent the minor outbreaks. For the bottom two subfigures, there is only 1 realization out of 5 representing the successful establishment of the outbreak. Figure (a,b) emphasize the early stages of the epidemics for two cases, respectively. The preliminary result of simulations in Figure shows that initiations by superspreaders are more likely to cause the outbreaks.

Figure 5. Five realizations generated by SSA for CTMC model (Equation24) with heterogeneous mixing. The top two subfigures are with initial condition of 1 infectious superspreader. The bottom two subfigures are with initial condition of 1 infectious non-superspreader.

Figure 5. Five realizations generated by SSA for CTMC model (Equation24(24) P((S1,t+Δt,I1,t+Δ)−(S1,t,I1,t)=(−1,1))=S1,tβ1[p1I1,tN1+p2I2,tN2]+o(Δ),P((I1,t+Δt,R1,t+Δ)−(I1,t,R1,t)=(−1,1))=γI1,t+o(δ),P((S2,t+Δt,I2,t+Δ)−(S2,t,I2,t)=(−1,1))=S2,tβ2[p1I1,tN1+p2I2,tN2]+o(Δ),P((I2,t+Δt,R2,t+Δ)−(I2,t,R2,t)=(−1,1))=γI2,t+o(Δ)(24) ) with heterogeneous mixing. The top two subfigures are with initial condition of 1 infectious superspreader. The bottom two subfigures are with initial condition of 1 infectious non-superspreader.

The second set of numerical simulations is to generate 104 realizations by SSA for the CTMC model (Equation21) with homogeneous mixing and the CTMC model (Equation24) with two different initial conditions. We then estimate the probabilities of a minor outbreak by statistically summarizing the final epidemic sizes of all realizations. The final sizes are defined as S(initialtime)S(finaltime) for model (Equation21) and S1(initialtime)+S2(initialtime)S1(finaltime)S2(finaltime) for model (Equation24), respectively. If the final epidemic size is less than the pre-defined threshold 100, we consider there is a minor outbreak. While if the final epidemic size is larger, we consider there is a major outbreak. Other pre-defined thresholds with relatively small values also work. This is because the distributions of final sizes for stochastic epidemic models are well-known as bimodal (also discussed in [Citation22]), there are very few falling in between.

The result of numerical simulations of 104 realizations for CTMC model (Equation21) are presented in Figure . The distribution of final epidemic sizes indicates that the proportion of realizations with the final size smaller than the pre-defined value 100 is 0.6608. The estimated probability of a minor outbreak by BP is q=1R0=11.5147=0.6602.

Figure 6. The distribution of final epidemic sizes for 104 realizations for CTMC model (Equation21) with homogeneous mixing.

Figure 6. The distribution of final epidemic sizes for 104 realizations for CTMC model (Equation21(21) P((St+Δt,It+Δt,Rt+Δt)−(St,It,Rt)=(−1,1,0))=βStItNΔt+o(Δt),P((St+Δt,It+Δt,Rt+Δt)−(St,It,Rt)=(0,−1,1))=γItΔt+o(Δt),(21) ) with homogeneous mixing.

The result of numerical simulations of 104 realizations for CTMC model (Equation24) are presented in Figure . The top subfigure is for the case of initiation by 1 infectious superspreader and the bottom subfigure is for the case of initiation by 1 infectious non-superspreader. In Figure (a), the proportion of realizations with the final size smaller than the pre-defined value 100 is 0.6591. In Figure (b), the proportion of realizations with the final size smaller than the pre-defined value 100 is 0.9699. Thus, we estimate the probabilities of a minor outbreak by performing SSA are q1=0.6591 and q2=0.9699. This conforms the result of q1<q2 in Theorem 3.2, the initiation by the superspreaders is more likely to cause a major outbreak.

Figure 7. The distributions of final epidemic sizes for 104 realizations for CTMC model (Equation24) with heterogeneous mixing, (a) initiated by 1 infectious superspreader and (b) initiated by 1 infectious non-superspreader.

Figure 7. The distributions of final epidemic sizes for 104 realizations for CTMC model (Equation24(24) P((S1,t+Δt,I1,t+Δ)−(S1,t,I1,t)=(−1,1))=S1,tβ1[p1I1,tN1+p2I2,tN2]+o(Δ),P((I1,t+Δt,R1,t+Δ)−(I1,t,R1,t)=(−1,1))=γI1,t+o(δ),P((S2,t+Δt,I2,t+Δ)−(S2,t,I2,t)=(−1,1))=S2,tβ2[p1I1,tN1+p2I2,tN2]+o(Δ),P((I2,t+Δt,R2,t+Δ)−(I2,t,R2,t)=(−1,1))=γI2,t+o(Δ)(24) ) with heterogeneous mixing, (a) initiated by 1 infectious superspreader and (b) initiated by 1 infectious non-superspreader.

To verify the second part of Theorem 3.2, we estimate these two probabilities by the formulas in Equations (Equation27) and (Equation28) in the theorem. Based on the parameter values we used for simulations, q2 is determined by the following quadratic equation: (34) 3q22+144.74q2137.32=0.(34) By solving this equation, we pick the positive root for the estimate of the probability of a minor outbreak by BP, q2=0.9683, which is close to the value we have by SSA. The relation between q1 and q2 in Equation (Equation28) implies that (35) q1=q2(1β1β2)q2+β1β2=0.96830.9683×15+16=0.666.(35) Thus, estimates from BP and estimates by SSA show good agreement. We summarize the results of numerical experiments in Table .

Table 2. The probability of a minor outbreak for CTMC models (Equation21) with initially 1 infectious individual, and (Equation24) with initially 1 infectious super-spreader and 1 infectious non super-spreader, respectively. It is demonstrated that, even with identical R0, the probability of a minor outbreak can be varied by different choice of epidemic models.

Other numerical simulations with other combinations of parameter values are tested, the results on the probabilities of a minor outbreak estimated by BP are confirmed by performing SSA.

4. Conclusion and future work

Given the identical basic reproduction number R0, several deterministic epidemic models and continuous-time Markov chain (CTMC) epidemic models are formulated and studied. In the study of deterministic modelling, we mainly examine the final epidemic sizes predicted by the epidemic models. The relation between the final size and R0 is investigated as well. It has been proved that, by including the behaviour changes and the heterogeneity of activity levels, the predicted final sizes are smaller than the final size for the model with homogeneous mixing. We may conclude that the epidemic models with homogeneous mixing predict the worst scenarios. In CTMC modelling, as the behavioural change does not affect the early phases, two models with homogeneous mixing and heterogeneous mixing are mainly studied. For the CTMC model with homogeneous mixing, the well-known result on the probability of a minor outbreak is simply presented. For the CTMC model with heterogeneous mixing, only the heterogeneity caused by different activity levels is considered. For the CTMC model with two sub-groups, the probability of a minor outbreak which is initiated by superspreaders is smaller than the probability of a minor outbreak which is initiated by non-superspreaders. Both probabilities are calculated explicitly. We also perform several numerical simulation for deterministic modelling and apply stochastic simulation algorithm (SSA) to simulate the CTMC models. All theoretical results are confirmed by the numerical simulations.

It has been suggested in [Citation13] that, in order to measure the behavioural changes in the affected population, it is informative to estimate both the basic reproduction number R0 and the effective reproduction numbers with respect to time. The early identifications of the superspreaders and the qualifications and inferences of different activity levels are able to improve the formulations of models and thus the predictions on the final sizes. Taking more factors into consideration for the formulations of stochastic models can generate different results regarding the early phases. The heterogeneity of mixing makes predictions about the establishment of major outbreaks more difficult.

The study of heterogeneity of mixing in the population and the investigation on the impact of superspreaders on the transmissions of epidemics are worthy of much efforts. In this paper, we simply assumed that the population can be divided into two sub-groups, based on different activity levels. And for each sub-group, the averaged contact rate is assumed. This assumption is overly simple. It is discussed in [Citation25] that the degree of infectiousness may follow one type of continuous distributions in the population. Then if the infectious period is assumed to be exponential distributed, it is more reasonable to assume that the contact rate β is continuously distributed as well. The second direction of modelling superspreading is to assume that the number of superspreaders is extremely small. For this kind of stochastic model, the method of conditional moments (MCM) [Citation23] might be a good way to investigate.

Acknowledgments

The author thanks his Ph.D. supervisor Dr Fred Brauer from the University of British Columbia for discussions on this topic, reading the manuscripts and providing some proofs. The author also thanks two anonymous reviewers and the editor for their valuable comments and suggestions on the original manuscript. The work is funded by the Post-doctoral fellowship offered by Hausdorff Center for Mathematics and the University of Bonn.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

There is one infectious individual initially for all models. For two epidemic models (Equation12) and (Equation19) with heterogeneous mixing, the summation counts the numbers for the group of superspreaders and the group of non-superspreaders, respectively.

References

  • L.J.S. Allen, An Introduction to Stochastic Processes with Applications to Biology, 2nd ed., CRC Press, Boca Raton, FL, December 2010.
  • L.J.S. Allen, A primer on stochastic epidemic models: Formulation, numerical simulation, and analysis, Infect. Dis. Model. 2(2) (2017), pp. 128–142.
  • K.B. Athreya and P.E. Ney, Branching Processes, Springer, Berlin, Heidelberg, 1972.
  • F. Bai, Vaccination models in infectious diseases, Ph.D. thesis, University of British Columbia, 2016.
  • F. Bai and L.J.S. Allen, Probability of a major infection in a stochastic within-host model with multiple stages, Appl. Math. Lett. 87 (2019), pp. 1–6.
  • F. Bai, K.E.S. Huff, and L.J.S. Allen, The effect of delay in viral production in within-host models during early infection, J. Biol. Dyn. 13(sup1) (2018), pp. 47–73.
  • N.T.J. Bailey, The Elements of Stochastic Processes with Applications to the Natural Sciences, John Wiley & Sons, New York, 1990.
  • F. Ball and D. Clancy, The final size and severity of a generalised stochastic multitype epidemic model, Adv. Appl. Probab. 25(4) (1993), pp. 721–736.
  • F. Brauer, The Kermack–McKendrick epidemic model revisited, Math. Biosci. 198(2) (2005), pp. 119–131.
  • F. Brauer, Epidemic models with heterogeneous mixing and treatment, Bull. Math. Biol. 70(7) (2008), pp. 1869–1885.
  • F. Brauer, Mathematical epidemiology: Past, present, and future, Infect. Dis. Model. 2(2) (2017), pp. 113–127.
  • F. Brauer, Early estimates of epidemic final sizes, J. Biol. Dyn. 13 (2018), pp. 23–30.
  • F. Brauer, The final size of a serious epidemic, Bull. Math. Biol. 81(3) (2018), pp. 869–877.
  • F. Brauer and C. Castillo-Chavez, Mathematical Models in Population Biology and Epidemiology, Springer, New York, 2012.
  • F. Brauer, P. van den Driessche, and J. Wu, Mathematical Epidemiology, Springer, Berlin, Heidelberg, 2008.
  • T. Britton, Stochastic epidemic models: A survey, Math. Biosci. 225(1) (2010), pp. 24–35.
  • G. Chowell, C. Viboud, J.M. Hyman, etal, The Western Africa Ebola virus disease epidemic exhibits both global exponential and local polynomial growth rates, PLoS Curr. 7 (2015).
  • G. Chowell, C. Viboud, L. Simonsen, etal, Characterizing the reproduction number of epidemics with early subexponential growth dynamics, J. R. Soc. Interface 13(123) (2016), p. 20160659.
  • S.A. Colgate, E.A. Stanley, J.M. Hyman, S.P. Layne, and C. Qualls, Risk behaviour-based model of the cubic growth of acquired immunodeficiency syndrome in the united states, Proc. Natl Acad. Sci. USA 86(12) (1989), pp. 4793–4797.
  • P. van den Driessche and J. Watmough, Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission, Math. Biosci. 180(1–2) (2002), pp. 29–48.
  • A.P. Galvani and R.M. May, Dimensions of superspreading, Nature 438(7066) (2005), pp. 293–295.
  • P.E. Greenwood and L.F. Gordillo, Stochastic epidemic modeling, in Chowell, G., Hayman, J.M., Bettencourt, L.M.A., Castillo-Chavez, C. Mathematical and Statistical Estimation Approaches in Epidemiology, Springer, Netherlands, 2009, pp. 31–52.
  • J. Hasenauer, V. Wolf, A. Kazeroonian, and F.J. Theis, Method of conditional moments (MCM) for the chemical master equation, J. Math. Biol. 69(3) (2013), pp. 687–735.
  • V. Isham, Stochastic models for epidemics, in A. C. Davison and Yadolah Dodge and N. Wermuth. Celebrating Statistics, Oxford: Oxford University Press, 2005, pp. 27–54.
  • J.O. Lloyd-Smith, S.J. Schreiber, P.E. Kopp, and W.M. Getz, Superspreading and the effect of individual variation on disease emergence, Nature 438(7066) (2005), pp. 355–359.
  • Z. Ma, Y. Zhou, J. Wu, Modeling and Dynamics of Infectious Diseases, Co-Published With Higher Education Press, Beijing, 2009.
  • T.R. Maarleveld, B.G. Olivier, and F.J. Bruggeman, StochPy: A comprehensive, user-friendly tool for simulating stochastic biological processes, PLoS ONE 8(11) (2013), p. e79345.
  • P. Magal, O. Seydi, and G. Webb, Final size of an epidemic for a two-group SIR model, SIAM J. Appl. Math. 76(5) (2016), pp. 2042–2059.
  • P. Magal, O. Seydi, and G. Webb, Final size of a multi-group SIR epidemic model: Irreducible and non-irreducible modes of transmission, Math. Biosci. 301 (2018), pp. 59–67.
  • A. Nandi and L.J.S. Allen, Stochastic two-group models with transmission dependent on host infectivity or susceptibility, J. Biol. Dyn. 13(sup1) (2018), pp. 201–224.
  • C. Viboud, L. Simonsen, and G. Chowell, A generalized-growth model to characterize the early ascending phase of infectious disease outbreaks, Epidemics 15 (2016), pp. 27–37.
  • J. Wallinga and M. Lipsitch, How generation intervals shape the relationship between growth rates and reproductive numbers, Proc. R. Soc. B: Biol. Sci. 274(1609) (2006), pp. 599–604.
  • P. Whittle, The outcome of a stochastic epidemic – a note on Bailey's paper, Biometrika 42(1/2) (1955), p. 116.