1,987
Views
2
CrossRef citations to date
0
Altmetric
Review

On the reproduction number in epidemics

ORCID Icon
Pages 623-634 | Received 18 Mar 2021, Accepted 25 Oct 2021, Published online: 22 Nov 2021

Abstract

This note provides an elementary derivation of the basic reproduction number and the effective reproduction number from the discrete Kermack–McKendrick epidemic model. The derived formulae match those derived from the continuous version of the model; however, the derivation from discrete model is a bit more intuitive. The MATLAB functions for its calculation are given. A real case example is considered and the results are compared with those obtained by the R0 and the EpiEstim software packages.

Introduction

In addition to the incidence and prevalence [Citation1,Citation2], the reproductive number is one of the most useful epidemiologic metrics for monitoring and controlling the development of an epidemic [Citation3–9]. While the incidence and prevalence tell us something about the size of the epidemic, the reproduction number informs us of its spread rate.

There are two kinds of reproduction numbers: the basic reproduction number, R0, and the effective (or time-varying) reproduction number, R. R0 is defined only at the beginning of an outbreak and ‘represents the number of secondary cases that one case can produce if introduced to a susceptible population’ [Citation10]. R is defined over the entire period of an epidemic and represents ‘the average number of secondary infectious produced by typical infective during the entire period of infectiousness’ [Citation11]. Neither R0 norR is a natural constant, because their value depends on the epidemic model used. For various models used for analyzing properties and for calculation of R0, see [Citation12–24]. For properties and calculation of R, see [Citation25–30]. For critical consideration of R0, see [Citation31] and references therein.

We first review the formulas for calculating R0 and R based on the continuous Kermack and McKendrick epidemic model [Citation32,Citation33]. Heesterbeek and Dietz [Citation14] used the following form of the model (1) I(t)=S(t)0A(τ)I(tτ)dτ,(1) where I(t) is the incidence rate, S(t) is the density of susceptibles in the population, A(τ) is excepted infectivity of a person with infection age τ. They define (2) R0S00A(τ)dτ(2) and derived the following expressions (3) 1=S00A(τ)erτdτ(3) where S0 is the initial density of the susceptible population and r is the natural growth rate. From these expressions, Wallinga and Lipsitch [Citation34], by introducing generation time distribution (4) w(t)A(t)0A(τ)dτ,(4) deduce a formula for calculation of R0, which has the form (5) 1R0=0w(τ)erτdτ.(5)

For practical calculation, they offer a discrete version of this formula implemented in the R0 package [Citation35].

It seems that Fraser [Citation26] was the first who deduce a general formula for calculation of R from the Kermack and McKendrick model. He starts with a generalization of Equation (1) in the form of a continuous renewal equation (6) I(t)=0β(t,τ)I(tτ)dτ,(6) where β(t,τ) is the transmissibility. He defines the case reproduction number Rc(t) as ‘the average number of people someone infected at time t can expect to infect’, thus (7) Rc0β(t+τ,τ)dτ,(7) and the instantaneous reproduction number R(t) as ‘the average number of people someone infected at time t could expect to infect should conditions remain unchanged’, expressed by the formula, (8) R0β(t,τ)dτ.(8)

Assuming that β(t,τ) can be factorized as (9) β(t,τ)=R(t)w(τ),(9) he deduces (10) R(t)=I(t)0I(tτ)w(τ)dτ,(10) and from this formula, a discrete version of the formula for calculation of R and Rc. The method was implemented in the EpiEstim package [Citation36,Citation37].

We will follow the above ideas, but, unlike the authors mentioned, we will derive formulas for calculating the reproduction number from the discrete form of the Kermack–McKendrick epidemic model. In particular, this derivation does not need the factorization assumption Equation (9), and the derivation of the formula (4) for generation time is a bit more intuitive. For other discrete models, see [Citation38] and references therein.

Kermack–McKendrick epidemic model

Let In,k denote the number of infectious peoples at calendar time n and infection-age k. Let In,0 be the number of peoples who became infected at the calendar time n. We call them nth generation. The transition process of the epidemic for the nth generation is as follows In,0In+1,1In+k,k,and the whole epidemic process may be schematically represented in this way [Citation32,Citation33]: tτk/n|01230|I0,0I1,0I2,0I3,01|I1,1I2,1I3,12|I2,2I3,23|I3,3|

Here, t is calendar time and τ is the infection age.

Two hypotheses govern the epidemic process. The first one is that a decline in the number of infected in nth generation depends only on age. Thus, we have (11) In,kIn1,k1=ψk1In1,k1,(11) where 0ψk11 is the removal rate. From this, it follows that (12) In,k=BkInk,0.(12)

A non-dimensional parameter Bkj=1k(1ψkj) can be interpreted as the probability that an individual of age k is still infectious [Citation3,Citation19].

Before stating the second hypothesis, we decompose In,0 to local (In,0local) and imported (In,0imported) cases [Citation37]: (13) In,0=In,0local+In,0imported.(13)

The second hypothesis assumes that In,0 is proportional to the number of contacts each generation has with all susceptible left in time interval n [Citation32]: (14) In,0local=Snk=1nϕkIn,k (n=1,2,),(14) where ϕk is the infective rate at age k and Sn is the number of susceptible individuals at the time interval n. Using Equations (12), (13) and (14), we obtain (15) In,0=Snk=1nAkInk,0+In,0imported (n=1,2,),(15) where AkϕkBk is the infectivity at age k [Citation19,Citation39]; i.e. the number of new infectives in unit time per ineffective contact. Note that the Equation (15) assumes that ‘an individual is not infective at the moment of infection’ [Citation32], so, starting at the 0th generation, we assume (16) I0,0=κI0+I0,0imported,(16) where κ0 is a constant and I0 the number of infected individuals initially. Equations (16) and (15) represent the discrete Kermack–McKendrick epidemic model that we will use to derive the formulas for calculating R0 and R.

Basic reproduction number

To derive the formula for calculation of R0, we start with the epidemic process for 0th generation: I0,0I1,1I2,2.

Thus, the number of infectious individuals of 0th generation at each time interval is, by Equation (12), I0,0B1I0,0B2I0,0.and the new infected individuals produced by these infectives are, by Equation (15), (17) I0,0S1A1I0,0S2A2I0,0.(17)

The total number of secondary infections produced by 0th generation is thus (18) C0=I0,0k=1SkAk.(18)

If (19) S0=S1=,(19) i.e. the susceptible population remains the same (which would literally mean that the newly infected individual is immediately removed and replaced by a new susceptible, but practically that the population is large), and we set I0=1 and I0,0imported=0 in Equation (16), then I0,0=κ, and thus, by definition[Citation14], (20) R0κS0k=1Ak.(20)

This is a discrete form of Equation (2).

Note. We could set I0=0 and I0,0imported=1, but we want to keep κ in the expression in order to use an empirical law for I, which depends on the initial condition I0.

The sequence of infected individuals produced by one infective in a completely susceptible population is thus, by Equation (17), (21) 1κS0A1κS0A2.(21)

Dividing each term of this sequence by Equation (20), we obtain the sequence of the fractions of infected produced by one infected case: (22) w1,w2,,(22) where (23) wkAkj=1Aj and k=1wk=1.(23)

Thus, wk can be interpreted as the probability that an infective individuals, produces a new case at age k. Equation (23) is a discrete form of Equation (4), and thus, by definition, the generation-time distribution. For more on generation time, see [Citation40–42]. From Equations (20) and (23), we can express infectivity at age k as (24) Ak=R0κS0wk(24)

Substituting Equation (24) for Ak into Equation (15), and taking In,0imported=0 and S0=S1=, we obtain (25) R0=κIn,0k=1nwkInk,0.(25)

This formula cannot be used directly for calculation of R0 from observed incidences because these are assumed to be produced only by the 0th generation in a wholly susceptible population. Thus, to obtain the formula for computing of R0 from Equation (25), we assume that infection is a continuous process that in its initial phase is governed by the exponential growth rate (26) I(t)=I0ert,(26) where I0 and r can be determined from observable data by regression analysis. Using Equation (26) and integrating it over a single time interval, we obtain (27) In,0=1tn+1tntntn+1I(t)dt=I0nn+1ertdt=κI0ern,(27) where (28) κ=er1r.(28)

Substituting Equation (27) for In,0 and Equation (28) for κ into Equation (25), we obtain Wallinga–Lipsitch formula [Citation34]: (29) R0=r(er1)k=1nwkerk.(29)

This is a discrete form of Equation (5). Note that calculation of R0 by Equation (29) depends on n; however, practically wk=0, for k>K, so we should take n=K. For more on the calculation of R0 by Equation (29), see [Citation35].

The effective reproduction number

By similar reasoning as in the previous section, we will derive the formula for calculation of the effective reproduction number R. For the nth generation, we have the following transition process In,0In+1,1In+2,2.

Using Equation (12), the number of individuals who are still infective in the nth generation is In,0B1In,0B2In,0.

The sequence of newly infected individuals generated by these infectives is In,0Sn+1A1In,0Sn+2A2In,0.

The total number of secondary infections generated by the nth generation is therefore (30) Cn=k=1Sn+kAkIn,0.(30)

We have two possibilities regarding Sn. The first is that, similar to the calculation of the R0 for the 0th generation, where the population of the initial number of susceptibles does not change, we assume that the current population of the susceptibles does not change (31) Sn=Sn+1=.(31)

If we insert this and In,0=1 into Equation (30), then we can define the instantaneous reproduction number Rn as (32) RnSnk=1Ak.(32)

This is a discrete analogue of Equation (8). From Equation (15), we have (33) Sn=In,0localk=1nAkInk,0 (n=1,2,),(33)

Extracting Ak from Equation (23) and substituting Equation (33) into (32), we obtain the well-known Fraser formula [Citation26]: (34) Rn=In,0localk=1nwkInk,0.(34)

In particular, if In,0=1 (or any other constant) for all n, then Rn1 as n. When In,0local=0 for some n, then, for this n, we have Rn=0. If wk=1 for k=m and wk=0 for all other k, i.e. an infected person is practically infectious for only one day, then Rn=In,0/Inm,0. This relation can be used for a quick but crude estimate of Rn.

Equation (34) is a discrete form of Equation (10). Substituting Equation (24) into (32), and taking κ=1, we obtain (35) Rn=SnS0R0.(35)

This is a well-known result for the SIR model [Citation11]: the ratio of R and R0 represents the fraction of susceptibles who are still in the population.

If we abandon condition (31) and put In,0=1 into Equation (30), then we can define the case reproduction number Rc,n as (36) Rc,nk=1Sn+kAk.(36)

Using Equation (33) for Sn and substituting Equation (23) into (36), we obtain [Citation26] (37) Rc,n=k=1wkIn+k,0localj=1nwjIn+kj,0=k=1wkRn+k.(37)

We note that none of the formulas (29), (34) and (37), depend on the size of the initial susceptible population. Thus, they can thus be used to calculate the reproduction number from daily incidence reports once the distribution of generation time is known.

Examples

For practical calculation, we write a MATLAB programme using functions that are listed in the Appendix. These functions assume wk=0, k>K and In,0=0, n>N.

For a numerical example, we take data for Spanish influenza in Germany 1918–19 [Citation43] and compare results of this calculation with results obtained by the R0 package [Citation35] and the EpiEstim package [Citation36].

In Figure , we see that the calculation of Rc by Equation (37) and those obtained by the R0 package using the Wallinga–Teunis method [Citation25] are practically identical; the absolute difference is less than 107. The difference is noticeable towards the end of data; i.e. on the length of the vector giving the generation time. Calculation of Rc by formula (37) implicitly assumes that Rc,n=0; i.e. In,0=0 for n>N. In particular, this yields Rc,N=0. The calculation of Rc that includes only valid data is therefore limited to n<NK.

Figure 1. Comparison of the case reproduction number calculated by Equation (37) and R0 package using the Wallinga–Teunis method.

Figure 1. Comparison of the case reproduction number calculated by Equation (37) and R0 package using the Wallinga–Teunis method.

In Figure , we see that the absolute difference between the calculation of instantaneous reproduction number by Equation (34) and by function estimate_R from the EpiEstim package is in a range 102 to 101. The reason for the difference is that the function estimate_R implements a formula for calculating Rt, which, for statistical reasons, adds Gamma distribution parameters a in the numerator and 1/b in the denominator of Equation (34) [Citation36]. We note that this has a side effect: Rt>0, even if It=0; however, it prevents the possibility of an undefined 0/0 case.

Figure 2. Comparison of the instantaneous reproduction number calculated by Equation (34) and by the function estimate_R from EpiEstim package.

Figure 2. Comparison of the instantaneous reproduction number calculated by Equation (34) and by the function estimate_R from EpiEstim package.

Conclusion

We have shown that the well-known formulae for calculating the various epidemic reproduction numbers can be derived from the classical discrete Kermack–McKendrick epidemic model. An advantage of the model is that it is more intuitive and some of the assumptions of the continuous model can be omitted. In particular, as already mentioned, the derivation does not require a factorization assumption on the epidemic's transmissibility.

The practical example shows that case reproduction number calculated by the formula (37) gives numerically almost the same result as the reproduction number determined by the algorithmically much more demanding Wallinga–Teunis method.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • N.P. Jewell, Statistics for Epidemiology, Chapman & Hall/CRC, Boca Raton, 2004.
  • M. Woodward, Epidemiology Study Design and Data Analysis, CRC Press, Boca Raton, 2014.
  • J.A.P. Heesterbeek, A brief history of R0 and a recipe for its calculation. Acta Biotheor. 50 (2002), pp. 189–204.
  • J.M. Heffernan, R.J. Smith, and L.M. Wahl, Perspectives on the basic reproductive ratio. J. R Soc. Interface 2 (2005), pp. 281–293.
  • R. Breban, R. Vardavas, and S. Blower, Theory versus data: how to calculate R0? Plos One 2 (2007), pp. 1–4.
  • L.M.A. Bettencourt, and R.M. Ribeiro, Real time Bayesian estimation of the epidemic potential of Emerging infectious diseases. Plos One 3 (2008), pp. 1–9.
  • B. Ridenhour, J.M. Kowalik, and D.K. Shay, Unraveling R0: considerations for public health applications. Am. J. Public Health 108 (2018), pp. S445–S454.
  • P.L. Delamater, E.J. Street, T.F. Leslie, Y.T. Yang, and K.H. Jacobsen, Complexity of the basic reproduction number (R0). Emerg. Infect. Dis. 25 (2019), pp. 1–4.
  • N.C. Grassly, and C. Fraser, Mathematical models of infectious disease transmission. Nat. Rev. Microbiol. 6 (2008), pp. 477–487.
  • K. Dietz, Transmission and contol of arbovirus diseases, in Proceedings of SIMS Conference on Epidemiology Alta - Ut., July 8–12, 1974, D. Ludwig, K.L. Cooke, eds., Society for industrial and applied mathematics, Philadelphia, 1975. pp. 104–121.
  • H.W. Hethcote, The mathematics of infectious diseases. SIAM Rev. 42 (2000), pp. 599–653.
  • O. Diekmann, J.A.P. Heesterbeek, and J.A.J. Metz, On the definition and the computation of the basic reproduction ratio R0 in models for infectious–diseases in heterogeneous populations. J. Math. Biol. 28 (1990), pp. 365–382.
  • K. Dietz, The estimation of the basic reproduction number for infectious diseases. Stat. Methods Med. Res. 2 (1993), pp. 23–41.
  • J.A.P. Heesterbeek, and K. Dietz, The concept of R0 in epidemic theory. Stat. Neerl. 50 (1996), pp. 89–110.
  • P. van den Driessche, and J. Watmough, Reproduction numbers and sub–threshold endemic equilibria for compartmental models of disease transmission. Math. Biosci. 180 (2002), pp. 29–48.
  • C.P. Farrington, M.N. Kanaan, and N.J. Gay, Estimation of the basic reproduction number for infectious diseases from age–stratified serological survey data. J R Stat Soc C–Appl 50 (2001), pp. 251–283.
  • M.G. Roberts, and J.A.P. Heesterbeek, Model–consistent estimation of the basic reproduction number from the incidence of an emerging infection. J. Math. Biol. 55 (2007), pp. 803–816.
  • H. Nishiura, Correcting the actual reproduction number: A simple method to estimate R0 from early epidemic growth data. Int J Env Res Pub He 7 (2010), pp. 291–302.
  • J.L. Ma, Estimating epidemic exponential growth rate and basic reproduction number. Infect Dis Model 5 (2020), pp. 129–141.
  • L. Pellis, F. Ball, and P. Trapman, Reproduction numbers for epidemic models with households and other social structures. I. definition and calculation of R0. Math. Biosci. 235 (2012), pp. 85–97.
  • P. van den Driessche, Reproduction numbers of infectious disease models. Infect Dis Model 2 (2017), pp. 288–303.
  • G. Chowell, and F. Brauer, The basic reproduction number of infectious diseases: Computation and estimation Using compartmental epidemic models, in Mathematical and Statistical Estimation Approaches in Epidemiology, G. Chowell, J.M. Hyman, L.M.A. Bettencourt, C. Castillo–Chavez, ed., Springer, Dordrecht, 2009. pp. 363.
  • P. Yan, Separate roles of the latent and infectious periods in shaping the relation between the basic reproduction number and the intrinsic growth rate of infectious disease outbreaks. J. Theor. Biol. 251 (2008), pp. 238–252.
  • R. Ke, E. Romero-Severson, S. Sanche, and N. Hengartner, Estimating the reproductive number R0 of SARS–CoV–2 in the United States and eight european countries and implications for vaccination. J. Theor. Biol. 517 (2021), pp. 1–10.
  • J. Wallinga, and P. Teunis, Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures, (2004).
  • C. Fraser, Estimating individual and household reproduction numbers in an emerging epidemic. Plos One 2 (2007), pp. 1–12.
  • L.F. White, and M. Pagano, A likelihood-based method for real-time estimation of the serial interval and reproductive number of an epidemic. Stat. Med. 27 (2008), pp. 2999–3016.
  • G. Chowell, C. Viboud, L. Simonsen, and S.M. Moghadas, Characterizing the reproduction number of epidemics with early subexponential growth dynamics. J. R Soc. Interface 13 (2016), pp. 1–12.
  • G. Chowell, and H. Nishiura, Quantifying the transmission potential of pandemic influenza. Phys. Life. Rev. 5 (2008), pp. 50–77.
  • H. Nishiura, and G. Chowell, The effective reproduction number as a prelude to statistical estimation of time–dependent epidemic trends, in Mathematical and Statistical Estimation Approaches in Epidemiology, G. Chowell, J.M. Hyman, L.M.A. Bettencourt, C. Castillo–Chavez, ed., Springer, Dordrecht, 2009. pp. 103–122.
  • J. Li, D. Blakeley, and R.J. Smith, The failure of R0. Comput Math Method M 2011 (2011), pp. 1–17.
  • W.O. Kermack, and A.G. McKendrick, A contribution to the mathematical theory of epidemics, Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character 115 (1927), pp. 700–721.
  • A.G. McKendrick, Applications of mathematics to medical problems. Proceedings of the Edinburgh Mathematical Society 44 (1925), pp. 98–130.
  • J. Wallinga, and M. Lipsitch, How generation intervals shape the relationship between growth rates and reproductive numbers. P Roy Soc B–Biol Sci 274 (2007), pp. 599–604.
  • T. Obadia, R. Haneef, and P.Y. Boelle, The R0 package: a toolbox to estimate reproduction numbers for epidemic outbreaks. BMC Med Inform Decis 12 (2012), pp. 1–9.
  • A. Cori, N.M. Ferguson, C. Fraser, and S. Cauchemez, A new framework and software to estimate time–varying reproduction numbers during epidemics. Am. J. Epidemiol. 178 (2013), pp. 1505–1512.
  • R.N. Thompson, J.E. Stockwin, R.D. van Gaalen, J.A. Polonsky, Z.N. Kamvar, P.A. Demarsh, E. Dahlqwist, S. Li, E. Miguel, T. Jombart, J. Lessler, S. Cauchemez, and A. Cori, Improved inference of time–varying reproduction numbers during infectious disease outbreaks. Epidemics–Neth 29 (2019), pp. 1–11.
  • F.Z. Brauer, and C. Castillo–Chávez, Discrete epidemic models. Math. Biosci. Eng. 7 (2010), pp. 1–15.
  • O. Diekmann, and J.A.P. Heesterbeek, Mathematical Epidemiology of Infectious Diseases Model Building, Analysis and Interpretation, Wiley, Chichester, 2000.
  • A. Svensson, A note on generation times in epidemic models. Math. Biosci. 208 (2007), pp. 300–311.
  • S.W. Park, D. Champredon, J.S. Weitz, and J. Dushoff, A practical generation–interval–based approach to inferring the strength of epidemics from their speed. Epidemics–Neth 27 (2019), pp. 12–18.
  • H. Nishiura, N.M. Linton, and A.R. Akhmetzhanov, Serial interval of novel coronavirus (COVID–19) infections. Int. J. Infect. Dis. 93 (2020), pp. 284–286.
  • H. Nishiura, Time variations in the transmissibility of pandemic influenza in Prussia, Germany, from 1918–19. Theor. Biol. Med. Model. 4 (2007), pp. 1–9.

Appendix

MATLAB functions for calculation of the reproduction number.

function R0 = calcR0(w,r)

%CALCR0 Calculate basic reproduction number (Eq 29)

   R0 = r/((exp(r) - 1)*sum(w.*exp(-r*(1:length(w)))));

end

function R = calcR(w,I)

%CALCR Calculate instantaneous reproduction number (Eq 34)

   K = length(w);

   N = length(I);

   R = NaN(N,1);

   for n = 1:N

     s = 0;

  for k = 1:min(n,K)

    s = s + w(k)*I(n+1-k);

  end

  if s > 0

    R(n) = I(n)/s;

  end

end

end

function [Rc,R] = calcRc(w,I)

%CALCRC Calculate case reproduction number (Eq 37)

   R = calcR(w,I);

   K = length(w);

   N = length(I);

   Rc = NaN(N,1);

   for n = 1:N

     s = 0;

     for k = 1:min(N-n+1,K)

       s = s + w(k)*R(n+k-1);

     end

     Rc(n) = s;

   end

end